Improvement of Prediction Performance with Conjoint Molecular Fingerprint in Deep Learning: A Case Study of Predicting Logarithm of Partition Coefficients (logP)

2020 
Accurate prediction of the physical properties and bioactivity of drug molecules with deep learning depends on how the molecules are represented. Many types of molecular descriptors have been developed for bioinformatics. However, each descriptor is optimized for a specific application and carries its own encoding preference. Because a standalone featurization method may capture only part of the information in a molecule, we propose building a conjoint fingerprint by combining two complementary fingerprints. We systematically evaluated the impact of the conjoint fingerprint and of each standalone fingerprint on predictive performance for the logarithm of the partition coefficient (logP), using machine learning and deep learning methods: random forest (RF), support vector regression (SVR), extreme gradient boosting (XGBoost), long short-term memory network (LSTM), and deep neural network (DNN). The results show that the conjoint fingerprint improved predictive performance, and in four of the five examined methods it even outperformed the consensus model built from the two standalone fingerprints. Because the conjoint fingerprint scheme is easy to extend and widely applicable, we expect it to create new opportunities for further improving the predictive performance of deep learning by harnessing the complementarity of different types of fingerprints.
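The difference between a conjoint fingerprint and a consensus model can be illustrated with a minimal sketch. The abstract does not say how the two fingerprints are combined or which fingerprints are used, so the sketch assumes simple feature-wise concatenation; the function names `conjoint_fingerprint` and `consensus_prediction` and the toy bit vectors are hypothetical and chosen only for illustration.

```python
def conjoint_fingerprint(fp_a, fp_b):
    """Join two fingerprints per molecule into one longer feature vector.

    Assumption: "combining" means concatenation; the abstract does not specify this.
    fp_a and fp_b are lists of bit vectors, one row per molecule, in the same order.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("both fingerprint sets must describe the same molecules")
    return [list(a) + list(b) for a, b in zip(fp_a, fp_b)]


def consensus_prediction(pred_a, pred_b):
    """Average the outputs of two models, each trained on one standalone fingerprint."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


# Made-up bit vectors for three molecules (not real fingerprints).
fp_a = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0]]                    # hypothetical 4-bit fingerprint
fp_b = [[0, 1, 0, 0, 1, 1], [1, 0, 0, 1, 0, 0], [0, 0, 1, 1, 0, 1]]  # hypothetical 6-bit fingerprint

conjoint = conjoint_fingerprint(fp_a, fp_b)
print(len(conjoint), len(conjoint[0]))  # 3 molecules, 10 features each
```

In the conjoint scheme a single model is trained on the joined vectors, so it can learn interactions between features from both fingerprints. In the consensus scheme two separate models are trained and only their predictions are merged afterwards.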