Accurate Prediction of Hot Spots with Greedy Gradient Boosting Decision Tree

2018 
Hot spot residues play a crucial role in protein-protein interactions, and identifying them supports drug discovery and rational drug design. Only a small number of amino acid residues contribute most of the binding free energy at a protein interface; these residues are called hot spots. This work predicts hot spot residues with an ensemble machine learning method, Gradient Boosting Decision Tree (GBDT), on the Alanine Scanning Energetics Database (ASEdb) and the Structural Kinetic and Energetic database of Mutant Protein Interactions (SKEMPI). Using features derived from the properties of each amino acid and of the protein complex chain that contains it, we design a greedy program that, in every iteration, discards the feature ranked least important by GBDT, and that does not stop until the last remaining feature has been discarded. Comparing the results shows that the greedy GBDT method gives better predictions of hot spot residues: the F-score, one of the evaluation criteria, reaches 0.808 on the ASEdb dataset.
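The greedy procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `f_score`, `greedy_elimination` and `toy_fit_and_score` are hypothetical, the F-score is assumed to be the standard F1 measure, and keeping the best-scoring subset seen along the way is also an assumption. In practice `fit_and_score` would train a GBDT model on the given feature subset and return its evaluation score together with the per-feature importances; here a toy stand-in is used so the example runs with the standard library alone.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for binary labels, where 1 marks a hot spot (assumed metric)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def greedy_elimination(
    features: List[str],
    fit_and_score: Callable[[List[str]], Tuple[float, Dict[str, float]]],
) -> Tuple[List[str], float]:
    """Discard the least important feature each round until none remain.

    Returns the best-scoring feature subset seen and its score.
    """
    current = list(features)
    best_subset: List[str] = list(current)
    best_score = float("-inf")
    while current:
        score, importances = fit_and_score(current)
        if score > best_score:
            best_subset, best_score = list(current), score
        # Drop the feature the model ranked least important in this round.
        weakest = min(current, key=lambda name: importances[name])
        current.remove(weakest)
    return best_subset, best_score


def toy_fit_and_score(subset: List[str]) -> Tuple[float, Dict[str, float]]:
    """Hypothetical stand-in for training a GBDT model on `subset`.

    Useful features add to the score; the 'noise' feature lowers it.
    """
    weights = {"a": 3.0, "b": 2.0, "noise": 0.1}
    score = sum(weights[f] for f in subset if f != "noise")
    if "noise" in subset:
        score -= 1.0
    return score, {f: weights[f] for f in subset}


if __name__ == "__main__":
    subset, score = greedy_elimination(["a", "b", "noise"], toy_fit_and_score)
    print(subset, score)
```

Running the loop down to an empty feature set and remembering the best subset mirrors the "does not stop until the last feature has been discarded" behaviour, at the cost of one model fit per feature.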