SVRare: discovering disease-causing structural variants in the 100K Genomes Project

2021 
Discovery of disease-causing structural variants (dcSVs) from whole genome sequencing data is difficult because of the high number of false positives and the lack of an efficient way to estimate allele frequency. Here we introduce SVRare, an application that aggregates structural variants (SVs) called by other tools and efficiently annotates rare SVs to aid dcSV discovery. Applied in the Genomics England (GEL) research environment to data from the 100K Genomes Project, SVRare aggregated 554,060,126 SVs called by Manta and Canvas across all 71,408 participants in the rare-disease arm. In a pilot study of 4313 families, SVRare identified 36 novel protein-coding-disrupting SVs in diagnostic-grade genes that may explain the proband's phenotype. We estimate that SVRare can increase SV-based diagnostic yield by at least 4-fold. We also performed a genome-wide association study and uncovered clusters of dcSVs in genes with known pathogenicity, such as PKD1/2 (cystic kidney diseases) and LDLR (familial hypercholesterolaemia).
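
The abstract does not describe how SVRare matches SVs across participants or estimates their frequency. As a rough illustration of the general idea only, the sketch below groups calls by reciprocal overlap and counts distinct carriers. The `SV` record, the 80% overlap threshold, the frequency cutoff, and all function names are assumptions made for this example, not SVRare's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record; real Manta/Canvas calls carry more fields."""
    sample: str
    chrom: str
    start: int
    end: int
    sv_type: str  # e.g. "DEL", "DUP"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two overlap fractions; 0.0 if type or chromosome differ."""
    if a.chrom != b.chrom or a.sv_type != b.sv_type:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def carrier_frequency(query: SV, cohort: list[SV], n_samples: int,
                      min_overlap: float = 0.8) -> float:
    """Fraction of samples carrying an SV that matches `query` (assumed threshold)."""
    carriers = {sv.sample for sv in cohort
                if reciprocal_overlap(query, sv) >= min_overlap}
    return len(carriers) / n_samples


def rare_svs(cohort: list[SV], n_samples: int, max_freq: float = 0.01,
             min_overlap: float = 0.8) -> list[SV]:
    """Keep SVs whose carrier frequency is at or below `max_freq`.

    This is a quadratic scan for clarity; a cohort-scale tool would need an index.
    """
    return [sv for sv in cohort
            if carrier_frequency(sv, cohort, n_samples, min_overlap) <= max_freq]


# Toy cohort of four samples (made-up coordinates).
cohort = [
    SV("s1", "chr16", 1000, 2000, "DEL"),
    SV("s2", "chr16", 1010, 1990, "DEL"),
    SV("s3", "chr16", 1000, 2000, "DEL"),
    SV("s4", "chr19", 5000, 9000, "DEL"),
]
print([sv.sample for sv in rare_svs(cohort, n_samples=4, max_freq=0.25)])
```

In this toy cohort the chr16 deletion is shared by three of four samples and is filtered out, while the chr19 deletion, seen in a single sample, is retained as rare.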