CATH functional families predict protein functional sites

2020 
Motivation: Identification of functional sites in proteins is essential for functional characterisation, variant interpretation and drug design. Several methods are available for predicting either a generic functional site or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein-protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams). Results: FunSite's prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed all publicly available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found that FunSite's performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyse which structural and evolutionary features are most predictive of functional sites.