A Study of Applications of Machine Learning Based Classification Methods for Virtual Screening of Lead Molecules.
2015
The ligand-based virtual screening of combinatorial libraries employs a number of statistical modeling and
machine learning methods. A comprehensive analysis of the application of these methods to the diversity-oriented virtual
screening of biological targets and drug classes is presented here. A number of classification models were built for
virtual screening using three types of input, namely structure-based descriptors, molecular fingerprints, and therapeutic
category. The activity and affinity descriptors of a set of inhibitors of four target classes (DHFR, COX, LOX, and
NMDA) were used to train six classifiers: Artificial Neural Network (ANN), k-nearest neighbor (k-NN), Support Vector
Machine (SVM), Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF). Among these classifiers, the ANN
performed best, with an AUC of 0.9 irrespective of the target. New molecular
fingerprints based on pharmacophore, toxicophore, and chemophore (PTC) were used to build the ANN models for each
dataset. An accuracy of 87.27% was obtained using 296 chemophoric binary fingerprints for the COX-LOX inhibitors,
compared with 67.82% for pharmacophoric and 70.64% for toxicophoric fingerprints. The methodology was validated on the
classical Ames mutagenicity dataset of 4337 molecules. To evaluate it further, the selectivity and promiscuity of molecules
from five drug classes (anti-anginal, anti-convulsant, anti-depressant, anti-arrhythmic, and anti-diabetic) were studied. The PTC
fingerprints computed for each category were able to capture the drug-class-specific features using the k-NN classifier.
These models can be useful for selecting optimal molecules for drug design.
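
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch classifies binary fingerprints with a k-NN vote and scores a ranking with AUC. It is not the authors' implementation: the use of Tanimoto similarity, the function names (`tanimoto`, `knn_predict`, `roc_auc`), the six-bit fingerprints, and the class labels "A" and "B" are all assumptions made up for this example.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto similarity of two equal-length binary fingerprints (assumed metric)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def knn_predict(train_fps, train_labels, query_fp, k=3):
    """Return the majority label among the k most similar training fingerprints."""
    ranked = sorted(
        zip(train_fps, train_labels),
        key=lambda pair: tanimoto(pair[0], query_fp),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy fingerprints: class "A" sets the first bits, class "B" the last.
train_fps = [
    (1, 1, 1, 0, 0, 0),
    (1, 1, 0, 0, 0, 0),
    (1, 0, 1, 0, 0, 0),
    (0, 0, 0, 1, 1, 1),
    (0, 0, 0, 1, 1, 0),
    (0, 0, 0, 0, 1, 1),
]
train_labels = ["A", "A", "A", "B", "B", "B"]

predicted = knn_predict(train_fps, train_labels, (1, 1, 0, 0, 0, 1), k=3)
auc = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
print(predicted, auc)
```

In practice the fingerprints would come from a cheminformatics toolkit and the classifiers from an established machine learning library; the pure-Python version here only makes the two ideas (similarity-based voting and rank-based AUC) concrete.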