Improving the informativeness of Mendelian disease pathogenicity scores for common disease

2020 
Despite considerable progress on pathogenicity scores prioritizing both coding and non-coding variants for Mendelian disease, little is known about the utility of these pathogenicity scores for common disease. Here, we sought to assess the informativeness of Mendelian disease pathogenicity scores for common disease, and to improve upon existing scores. We first applied stratified LD score regression to assess the informativeness of annotations defined by top variants from published Mendelian disease pathogenicity scores across 41 independent common diseases and complex traits (average N = 320K). Several of the resulting annotations were informative for common disease, even after conditioning on a broad set of coding, conserved, regulatory and LD-related annotations from the baseline-LD model. We then improved upon the published pathogenicity scores by developing AnnotBoost, a gradient boosting-based framework to impute and denoise pathogenicity scores using functional annotations from the baseline-LD model. AnnotBoost substantially increased the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores; our combined joint model included 3 published and 8 boosted scores. The boosted scores also significantly outperformed the corresponding published scores in classifying disease-associated, fine-mapped SNPs. Our boosted scores have high potential to improve candidate gene discovery and fine-mapping for common disease.
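The boosting idea described above can be illustrated with a minimal sketch. It is not the AnnotBoost implementation. It assumes a toy setup in which the top quarter of variants by published score are labeled positive, three hypothetical binary annotations (coding, conserved, enhancer) serve as features, and a small hand-written gradient-boosted ensemble of decision stumps stands in for a production gradient boosting library. The predicted probability is used as the boosted score, which also gives a score to variants the published score leaves unscored.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Least-squares depth-1 tree: returns (feature, threshold, left_value, right_value)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # not a real split
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return None if best is None else best[1:]

def fit_boosted(X, y, n_rounds=100, lr=0.3):
    """Gradient boosting for logistic loss: each stump fits the residuals y - p."""
    p0 = sum(y) / len(y)
    f0 = math.log(p0 / (1.0 - p0))  # start from the log-odds of the positive rate
    F = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, ml, mr = stump
        stumps.append(stump)
        F = [fi + lr * (ml if row[j] <= t else mr) for fi, row in zip(F, X)]
    return f0, lr, stumps

def predict_proba(model, row):
    f0, lr, stumps = model
    f = f0 + sum(lr * (ml if row[j] <= t else mr) for j, t, ml, mr in stumps)
    return sigmoid(f)

# Toy data (all hypothetical): 40 variants, 3 binary annotations, and a noisy
# "published" score that is missing (None) for every fifth variant.
annotations, published = [], []
for i in range(40):
    coding, conserved, enhancer = i % 2, (i // 2) % 2, (i // 4) % 2
    noise = ((i * 7) % 5 - 2) * 0.02
    annotations.append([coding, conserved, enhancer])
    score = 0.3 * coding + 0.5 * conserved + 0.1 * enhancer + noise
    published.append(None if i % 5 == 0 else score)

# Assumption: label the top quarter of scored variants as positives.
scored = [i for i, s in enumerate(published) if s is not None]
n_top = len(scored) // 4
top = set(sorted(scored, key=lambda i: published[i], reverse=True)[:n_top])

X_train = [annotations[i] for i in scored]
y_train = [1 if i in top else 0 for i in scored]
model = fit_boosted(X_train, y_train)

# Boosted score for every variant, including those with no published score.
boosted = [predict_proba(model, row) for row in annotations]
```

In this toy setup the boosted score depends only on the annotations, so per-variant noise in the published score is smoothed out and unscored variants inherit the score of variants with the same annotation profile.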