Sequence-based predictor of ATP-binding residues using random forest and mRMR-IFS feature selection

2014 
Abstract We develop a computational and statistical approach (ATPBR) for predicting ATP-binding residues in proteins from amino acid sequences, using random forests with a novel hybrid feature. The hybrid feature combines a new feature called PSSMPP, the predicted secondary structure, and orthogonal binary vectors. The mRMR-IFS feature selection method is used to construct the best prediction model. ATPBR achieves significantly improved performance over existing methods, with 87.53% accuracy and a Matthews correlation coefficient of 0.554. Further analysis demonstrates that PSSMPP distinguishes more effectively between ATP-binding and non-binding residues. In addition, the optimal features selected by the mRMR-IFS method improve prediction performance and may provide useful insights into the mechanisms of ATP-protein interactions.
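The abstract reports two metrics, accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how these are commonly computed from a binary confusion matrix (1 = binding residue, 0 = non-binding), the following may help. The function names and the toy counts are illustrative assumptions and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = binding, 0 = non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy counts (made up for illustration): a predictor that labels every
# residue as non-binding on a set with 10 binding and 90 non-binding residues.
trivial = dict(tp=0, tn=90, fp=0, fn=10)
trivial_accuracy = accuracy(**trivial)
trivial_mcc = mcc(**trivial)
```

In this toy case the trivial predictor has high accuracy but an MCC of zero, which is one reason the two metrics are often reported together when one class is much rarer than the other.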