Predicting the Androgenicity of Structurally Diverse Compounds from Molecular Structure Using Different Classifiers

2009 
Many environmental and industrial chemicals are reported to have androgenic or antiandrogenic activity. Such chemicals may act as hormones and have the potential to disrupt the endocrine systems of wildlife and humans. In this study, three machine learning methods, the probabilistic neural network (PNN), support vector machine (SVM), and learning vector quantization (LVQ), were used to develop binary classification models that predict androgenicity directly from the molecular structures of organic compounds, each represented by only eleven numerical descriptors. Among the individual models, the PNN model achieved the best overall classification rate for the prediction set, 86.67%, with a Matthews correlation coefficient (MCC) of 0.64, and the LVQ model gave the lowest false negative rate, 0.00%, a property that tends to be given relatively high priority during toxicology evaluation. In addition, a consensus model was produced that integrated all three of the basic model types. This consensus model correctly predicted the androgenicity of 86.67% of the prediction set compounds, with a false negative rate of 0.00% and an MCC of 0.65, the highest of all the models. The results indicate that the proposed classification models could provide a feasible and practical tool for the rapid screening of potential androgens.
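The evaluation measures named in the abstract (overall classification rate, false negative rate, Matthews correlation coefficient) and the idea of a consensus over three binary classifiers can be illustrated with a minimal Python sketch. The abstract does not say how the three models were integrated, so the majority vote used here is an assumption, and the labels and predictions are hypothetical toy values, not data from the study.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = active, 0 = inactive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Overall classification rate: fraction of compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_negative_rate(tp, fn):
    """Fraction of truly active compounds predicted inactive."""
    return fn / (tp + fn) if (tp + fn) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def majority_vote(*predictions):
    """ASSUMPTION: consensus by simple majority; the paper's rule may differ."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in zip(*predictions)]

# Hypothetical toy data standing in for three models' outputs.
y_true  = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 1]
model_b = [1, 0, 1, 0, 1, 0]
model_c = [1, 1, 1, 1, 0, 0]

consensus = majority_vote(model_a, model_b, model_c)
tp, tn, fp, fn = confusion_counts(y_true, consensus)
print(accuracy(tp, tn, fp, fn), false_negative_rate(tp, fn), mcc(tp, tn, fp, fn))
```

In this toy case each model makes a different error, so the vote corrects all of them; real models often make correlated errors, and a consensus then helps less.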