Pedestrian attribute recognition aims to predict a set of attribute labels for a pedestrian in surveillance scenarios. It is a challenging computer vision task because of poor image quality, continual appearance variations, and the diverse spatial distribution of imbalanced attributes. Because each pedestrian normally possesses many attributes, it is desirable to model the label dependencies between attributes to improve recognition performance. In this paper, we treat pedestrian attribute recognition as multi-label classification and propose a novel model based on the graph convolutional network (GCN). The model has two parts. First, a convolutional neural network (CNN) extracts pedestrian features, which is a standard step for image processing in deep learning. Second, the attribute labels are converted to word embeddings, and a correlation matrix between labels is constructed to help the GCN propagate information between nodes. The classifiers learned by the GCN are applied to the image representation extracted by the CNN, which makes the model end-to-end trainable. Experiments on a pedestrian attribute recognition dataset show that the approach clearly outperforms existing state-of-the-art methods.
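The two-part design described above can be sketched roughly as follows. This is a minimal NumPy forward pass, not the paper's implementation: the two-layer GCN, the symmetric adjacency normalization, the binary correlation matrix, all dimensions, and the random weights are assumptions chosen for illustration, and the CNN feature is replaced by a random vector.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (an assumed, commonly used choice)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_classifiers(label_emb, A, W1, W2):
    """Map label word embeddings to one classifier vector per attribute."""
    A_norm = normalize_adjacency(A)
    hidden = np.maximum(A_norm @ label_emb @ W1, 0.0)   # GCN layer 1 + ReLU
    return A_norm @ hidden @ W2                          # GCN layer 2: (num_labels, feat_dim)

def predict(image_feat, classifiers):
    """Apply the learned classifiers to a CNN image feature; sigmoid gives per-label scores."""
    logits = classifiers @ image_feat
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical toy sizes and inputs (not from the paper).
rng = np.random.default_rng(0)
num_labels, emb_dim, hidden_dim, feat_dim = 4, 5, 6, 8
label_emb = rng.normal(size=(num_labels, emb_dim))      # stand-in for word embeddings
A = np.array([[0, 1, 0, 0],                             # stand-in label correlation matrix
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W1 = 0.1 * rng.normal(size=(emb_dim, hidden_dim))
W2 = 0.1 * rng.normal(size=(hidden_dim, feat_dim))
image_feat = rng.normal(size=feat_dim)                  # stand-in for the CNN output

classifiers = gcn_classifiers(label_emb, A, W1, W2)
probs = predict(image_feat, classifiers)                # one score per attribute
```

Because the classifier vectors are produced by the GCN and multiplied directly with the image feature, a loss on `probs` would backpropagate into both branches, which is what makes such a design trainable end to end.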
The applicability domain of machine learning models trained on structural fingerprints for the prediction of biological endpoints is often limited by the lack of chemical diversity in the training data. In this work, we developed similarity-based merger models that combine the outputs of individual models trained on cell morphology (based on Cell Painting) and chemical structure (based on chemical fingerprints) with the structural and morphological similarities of the compounds in the test dataset to compounds in the training dataset. The merger models are logistic regression models that use these predictions and similarities as features. We used them to predict assay hit calls for 177 assays from ChEMBL, PubChem and the Broad Institute (where the required Cell Painting annotations were available). The similarity-based merger models outperformed the other models, reaching an AUC > 0.70 for 79 of 177 assays, about 20% more assays than the structural models (65 of 177) and well above the Cell Painting models (50 of 177). Our results demonstrate that similarity-based merger models combining structure and cell morphology models can more accurately predict a wide range of biological assay outcomes and further expand the applicability domain by better extrapolating to new structural and morphological spaces.
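One way such a merger model might be assembled is sketched below. Everything specific here is an assumption for illustration: Tanimoto similarity on sets of fingerprint bits, the maximum similarity to the training set as the similarity feature, the four-column feature layout, the toy numbers, and the small gradient-descent logistic regression that stands in for a library implementation.

```python
import math

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def max_train_similarity(query_bits, train_bits):
    """Similarity of a test compound to its nearest training compound."""
    return max(tanimoto(query_bits, t) for t in train_bits)

def sigmoid(z):
    # Numerically stable logistic function.
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Plain batch gradient descent on the logistic loss (illustrative only)."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical rows: [structure-model prob, morphology-model prob,
#                     structural similarity, morphological similarity]
X_train = [[0.9, 0.8, 0.7, 0.6], [0.8, 0.3, 0.9, 0.2], [0.2, 0.9, 0.1, 0.8],
           [0.1, 0.2, 0.6, 0.5], [0.3, 0.1, 0.8, 0.3], [0.2, 0.3, 0.2, 0.7]]
y_train = [1, 1, 1, 0, 0, 0]          # hypothetical assay hit calls
w, b = fit_logistic(X_train, y_train)

# Feature row for a hypothetical test compound.
train_fps = [{1, 2, 3, 4}, {2, 3, 5}]
s_struct = max_train_similarity({1, 2, 3}, train_fps)
p_hit = predict_proba(w, b, [0.85, 0.6, s_struct, 0.4])
```

Feeding the similarities in as features lets the merger model learn how much to trust each individual model depending on how close a test compound is to the training data in each space.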
Chemical mutagenicity is a serious issue that needs to be addressed in early drug discovery. Over a long period of time, medicinal chemists have manually summarized a series of empirical rules for the optimization of chemical mutagenicity. However, as the amount of data grows, it is becoming more difficult for medicinal chemists to identify more comprehensive chemical rules behind the biochemical data. Herein, we integrated a large Ames mutagenicity data set of 8576 compounds and used matched molecular pairs analysis to derive transformation rules for reversing Ames mutagenicity. A well-trained consensus model with a reasonable applicability domain was constructed, which performed favorably on the external validation set with an accuracy of 0.814. The model was used to assess the generalizability and validity of these mutagenicity transformation rules. The results demonstrated that the rules were highly practical and could provide inspiration for the structural modification of compounds with potential mutagenic effects. We also found that the local chemical environment of the attachment points of the rules was critical for successful transformation. To facilitate the use of these mutagenicity transformation rules, we integrated them into ADMETopt2 (http://lmmd.ecust.edu.cn/admetsar2/admetopt2/), a free webserver for the optimization of chemical ADMET properties. The approach could be extended to the optimization of other toxicity endpoints.
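The idea of mining transformation rules from matched molecular pairs can be illustrated with a toy sketch. It assumes the compounds have already been fragmented into a shared core and a single variable substituent. The record format, the placeholder names (`coreA`, `R1`, and so on) and the labels are hypothetical and do not come from the data set described above.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical pre-fragmented records: (core, substituent, is_mutagenic).
records = [
    ("coreA", "R1", True), ("coreA", "R2", False), ("coreA", "R3", True),
    ("coreB", "R1", True), ("coreB", "R2", False),
    ("coreC", "R1", True), ("coreC", "R2", True),
]

def mine_transformations(records):
    """Count, for each substituent swap on a shared core, how often it reverses mutagenicity.

    Returns {(from_sub, to_sub): (pairs_seen, pairs_reversed)}, considering only
    pairs whose starting compound is mutagenic.
    """
    by_core = defaultdict(list)
    for core, sub, mutagenic in records:
        by_core[core].append((sub, mutagenic))

    stats = defaultdict(lambda: [0, 0])
    for members in by_core.values():
        # Two compounds sharing a core and differing in one substituent form a matched pair.
        for (sub_a, mut_a), (sub_b, mut_b) in permutations(members, 2):
            if sub_a == sub_b or not mut_a:
                continue
            stats[(sub_a, sub_b)][0] += 1
            if not mut_b:
                stats[(sub_a, sub_b)][1] += 1
    return {swap: tuple(counts) for swap, counts in stats.items()}

rules = mine_transformations(records)
seen, reversed_count = rules[("R1", "R2")]
success_rate = reversed_count / seen
```

In this toy data the swap from `R1` to `R2` reverses mutagenicity on two of three cores. The failing core is a reminder that the same swap can behave differently depending on the local environment of the attachment point.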
Identification of structural alerts for toxicity is useful in drug discovery and in other fields such as environmental protection. With structural alerts, researchers can quickly identify potentially toxic compounds and learn how to modify them. Hence, it is important to determine structural alerts from a large number of compounds quickly and accurately. Many methods have already been reported for the identification of structural alerts, but how to evaluate those methods remains a problem. In this paper, we evaluated four methods for monosubstructure identification using three indices (accuracy rate, coverage rate, and information gain) to compare their advantages and disadvantages. The Kazius' Ames mutagenicity data set was used as the benchmark, and the four methods were MoSS (graph-based), SARpy (fragment-based), and two fingerprint-based methods, Bioalerts and the fingerprint (FP) method we previously used. The results showed that Bioalerts and FP could detect key substructures with high accuracy and coverage rates because they allowed unclosed rings and wildcard atom or bond types. However, this also led to redundancy, so their predictive performance was not as good as that of SARpy. SARpy was competitive in predictive performance on both the training set and the external validation set. These results might help users select appropriate methods and support the further development of methods for the identification of structural alerts.
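The three indices can be made concrete with a short sketch. The definitions used here are assumptions, since the text above does not spell them out: accuracy rate is taken as the fraction of alert-containing compounds that are toxic, coverage rate as the fraction of toxic compounds that contain the alert, and information gain as the entropy reduction from splitting the data set on alert presence. The example data are hypothetical.

```python
import math

def entropy(pos, neg):
    """Binary Shannon entropy (in bits) of a class count pair."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def alert_indices(has_alert, is_toxic):
    """Return (accuracy_rate, coverage_rate, information_gain) for one substructure."""
    pairs = list(zip(has_alert, is_toxic))
    tp = sum(1 for a, t in pairs if a and t)          # alert present, toxic
    fp = sum(1 for a, t in pairs if a and not t)      # alert present, non-toxic
    fn = sum(1 for a, t in pairs if not a and t)      # alert absent, toxic
    tn = sum(1 for a, t in pairs if not a and not t)  # alert absent, non-toxic
    n = tp + fp + fn + tn

    accuracy = tp / (tp + fp) if tp + fp else 0.0
    coverage = tp / (tp + fn) if tp + fn else 0.0
    h_before = entropy(tp + fn, fp + tn)
    h_after = ((tp + fp) / n) * entropy(tp, fp) + ((fn + tn) / n) * entropy(fn, tn)
    return accuracy, coverage, h_before - h_after

# Hypothetical compounds: does each contain the substructure, and is it toxic?
has_alert = [True, True, True, False, False, False, False, False]
is_toxic = [True, True, False, True, False, False, False, False]
accuracy, coverage, info_gain = alert_indices(has_alert, is_toxic)
```

A substructure that matches the toxic compounds exactly would score 1.0 on both rates and recover the full entropy of the data set as information gain, so the three indices together reward alerts that are both precise and broadly applicable.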
The steel ball counter is designed to count batches of small steel balls and to dispense a set number of balls so that they can be bagged separately, reducing the amount of labour. This work focuses on the design and realisation of the counting mechanism and the counting system. The counting mechanism is designed in SolidWorks 3D software and dynamically simulated. The dynamic counting system uses an STC12C5A60S2 processor, an infrared sensor for detection, an LCD 12864 liquid crystal display module for display, and a solenoid valve for real-time dynamic control; the human-machine interface is operated mainly through the keypad. Physical testing shows that the counting error of the designed product is within the permissible range. The design avoids the large errors of traditional metering by weighing and is simple, economical and practical.
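The counting logic implied by this architecture can be simulated in a few lines. This is a sketch under assumptions: one ball is counted on each clear-to-blocked transition of the infrared beam, and the solenoid valve closes when the batch size is reached. The class and method names are hypothetical, and firmware for the actual processor would be written differently.

```python
class BallCounter:
    """Simulated controller: infrared samples in, count and valve state out."""

    def __init__(self, batch_size):
        self.batch_size = batch_size     # target set by the operator (e.g. via keypad)
        self.count = 0                   # value that would be shown on the display
        self.valve_open = True           # solenoid valve lets balls through while open
        self._beam_blocked = False       # previous sensor sample

    def on_sample(self, beam_blocked):
        """Process one sensor sample; count on each clear-to-blocked edge."""
        rising_edge = beam_blocked and not self._beam_blocked
        if rising_edge and self.valve_open:
            self.count += 1
            if self.count >= self.batch_size:
                self.valve_open = False  # batch complete: stop the flow
        self._beam_blocked = beam_blocked

    def reset(self):
        """Start a new batch."""
        self.count = 0
        self.valve_open = True

# Hypothetical sample stream: 1 means the beam is blocked by a ball.
counter = BallCounter(batch_size=3)
for sample in [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0]:
    counter.on_sample(bool(sample))
```

Counting edges rather than levels means a ball that blocks the beam for several consecutive samples is still counted once.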
Traditional Chinese Medicine (TCM) has been practiced for thousands of years for treating human diseases. In comparison to modern medicine, one of the advantages of TCM is the principle of herb compatibility, embodied in TCM formulae. A TCM formula usually consists of multiple herbs combined to achieve the maximum treatment effect, and their interactions are believed to elicit the therapeutic effects. Despite being a fundamental component of TCM, the rationale for specific herb combinations remains unclear. In this study, we proposed a network-based method to quantify the interactions in herb pairs. We constructed a protein-protein interaction network for a given herb pair by retrieving the associated ingredients and protein targets, and determined multiple network-based distances, including the closest, shortest, center, kernel, and separation distances, at both the ingredient and the target levels. We found that frequently used herb pairs tend to have shorter distances than random herb pairs, suggesting that a therapeutic herb pair is more likely to affect neighboring proteins in the human interactome. Furthermore, we found that the center distance determined at the ingredient level improves the discrimination of the most frequent herb pairs from random herb pairs, which supports considering topologically important ingredients when inferring the mechanisms of action of TCM. Taken together, we have provided a network pharmacology framework to quantify the degree of herb interactions, which should help explore the space of herb combinations more effectively and identify synergistic compound interactions based on network topology.
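Three of the network-based distances can be sketched on a toy graph as follows. The formulas follow definitions commonly used in network medicine and are assumptions about the exact variants used here. The center and kernel distances are omitted, and the graph and target sets are hypothetical. The sketch also assumes a connected, unweighted network.

```python
from collections import deque

def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def bfs_distances(graph, source):
    """Shortest path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def shortest_distance(graph, A, B):
    """Mean shortest path length over all pairs (a, b)."""
    total = sum(bfs_distances(graph, a)[b] for a in A for b in B)
    return total / (len(A) * len(B))

def closest_sum(graph, src, dst, skip_self=False):
    """Sum over src of the distance to the nearest node in dst."""
    total = 0
    for s in src:
        dist = bfs_distances(graph, s)
        candidates = [dist[t] for t in dst if not (skip_self and t == s)]
        total += min(candidates) if candidates else 0
    return total

def closest_distance(graph, A, B):
    """Mean nearest-neighbor distance, taken in both directions."""
    return (closest_sum(graph, A, B) + closest_sum(graph, B, A)) / (len(A) + len(B))

def separation(graph, A, B):
    """Closest distance between sets minus the mean of the within-set closest distances."""
    d_aa = closest_sum(graph, A, A, skip_self=True) / len(A)
    d_bb = closest_sum(graph, B, B, skip_self=True) / len(B)
    return closest_distance(graph, A, B) - (d_aa + d_bb) / 2

# Hypothetical interactome: a six-protein chain; each herb targets two proteins.
graph = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")])
herb_1_targets, herb_2_targets = ["a", "b"], ["e", "f"]
d_shortest = shortest_distance(graph, herb_1_targets, herb_2_targets)
d_closest = closest_distance(graph, herb_1_targets, herb_2_targets)
d_separation = separation(graph, herb_1_targets, herb_2_targets)
```

Comparing such distances for an observed herb pair against those of randomly drawn pairs is the kind of test that the finding on frequently used herb pairs relies on.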