LightGBM-PPI: Predicting protein-protein interactions through LightGBM with multi-information fusion

2019 
Abstract: Protein-protein interactions (PPIs) play an important role in cell life activities such as transcriptional regulation, signal transduction and drug signal transduction, and the study of PPIs has become a research hotspot in bioinformatics. However, identifying PPIs with experimental methods is time-consuming and costly, so machine-learning-based PPI prediction is of considerable importance. This paper proposes a new PPI prediction method called LightGBM-PPI. First, pseudo amino acid composition, the autocorrelation descriptor, the local descriptor and the conjoint triad are employed to extract feature information. Second, the elastic net is used to select the optimal feature subset and eliminate redundant features. Finally, LightGBM is employed as the classifier to predict PPIs, yielding the LightGBM-PPI model. Five-fold cross-validation shows that the prediction accuracies on the Helicobacter pylori and Saccharomyces cerevisiae datasets are 89.03% and 95.07%, respectively. The prediction accuracies on the Caenorhabditis elegans, Escherichia coli, Homo sapiens and Mus musculus datasets are 90.16%, 92.16%, 94.83% and 94.57%, respectively, which exceed those of state-of-the-art prediction methods. To further evaluate the advantages and disadvantages of the model, we use it to predict PPIs on a one-core network and on the crossover network for the Wnt-related pathway, which can provide new ideas for drug design and disease prevention. The source code and all datasets are available at https://github.com/QUST-AIBBDRC/LightGBM-PPI/ .
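To make the feature-extraction step more concrete, the following is a minimal sketch of one of the four descriptors named in the abstract, the conjoint triad. It is not taken from the LightGBM-PPI repository. The seven-class residue grouping, the min-max normalization, and the concatenation of the two proteins' vectors are assumptions based on the commonly used formulation of the conjoint triad; the names `conjoint_triad` and `pair_features` are hypothetical.

```python
from itertools import product

# Assumed grouping of the 20 amino acids into 7 classes (the commonly used
# conjoint triad grouping by side-chain dipole and volume); not stated in the abstract.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# Every ordered triple of classes gets one feature slot: 7 ** 3 = 343 slots.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(sequence):
    """Return a 343-dimensional min-max normalized triad frequency vector."""
    counts = [0] * len(TRIAD_INDEX)
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three consecutive residues along the sequence.
    for i in range(len(classes) - 2):
        triad = tuple(classes[i:i + 3])
        if None in triad:
            continue  # skip windows containing a non-standard residue
        counts[TRIAD_INDEX[triad]] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.0] * len(counts)  # no usable triads, or all counts equal
    return [(c - lo) / (hi - lo) for c in counts]


def pair_features(seq_a, seq_b):
    """Hypothetical pair encoding: concatenate the two proteins' vectors."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)
```

In a pipeline like the one the abstract describes, vectors of this kind would be combined with the other three descriptors before elastic net feature selection and LightGBM classification; that combination step is not shown here.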