Evidence-based tailoring of bioinformatics approaches to optimize methods that predict the effects of nonsynonymous amino acid substitutions in glucokinase

2017 
Computational methods that predict the effects of nonsynonymous substitutions are an integral part of exome studies. Here, we validated and improved their specificity by performing a comprehensive bioinformatics analysis combined with experimental and clinical data, using glucokinase (GCK) as a model: 8835 putative variations, including 515 disease-associated variations from 1596 families with diagnoses of monogenic diabetes (GCK-MODY) or persistent hyperinsulinemic hypoglycemia of infancy (PHHI), and 126 variations with available or newly reported (19 variations) data on enzyme kinetics. We also showed that the high frequency of disease-associated variations found in patients is closely related to their evolutionary conservation. The prediction methods, run with their default settings, correctly predicted the effects of only some of the GCK-MODY-associated variations and completely failed to predict the normoglycemic or PHHI-associated variations. Therefore, we calculated evidence-based thresholds that significantly improved the specificity of predictions (≤75%). The combined prediction analysis even allowed us to distinguish activating from inactivating variations, and it identified a group of putatively highly pathogenic variations (EVmutation score 70), which were surprisingly underrepresented among MODY patients and thus under negative selection during molecular evolution. We suggested and validated the first robust evidence-based thresholds, which allow improved, highly specific predictions of disease-associated GCK variations.