Towards Clinical Grade Genomes with Joint Bayesian Variant Identification

2013 
The precipitous fall in the cost of sequencing, spurred by innovations in high-throughput sequencing (HTS), is bringing the use of genome sequencing closer to the clinic. An important question yet to be answered is whether current HTS protocols provide data that meet clinical standards of quality. While false positives (FP) can be evaluated experimentally, false negatives are more difficult to assess due to the lack of an established gold standard. Sequencing of family pedigrees already enormously simplifies the identification of highly penetrant disease genes. However, joint analysis of family members' raw data could also provide a significant boost in variant calling accuracy, because related individuals share haplotype blocks. Here we present our Joint Bayesian Calling (JBC) method for pedigrees and show that it reduces false positives and false negatives and improves the accuracy of identified variants in trios and larger pedigrees. Our approach reduces Mendelian errors in trios to 0.1%, compared to 2% in singleton calling, and improves the specificity of de novo variant identification by reducing FP by 50%. We demonstrate the scalability of JBC to large pedigrees by analyzing sequencing data from a large CEPH pedigree in which the genomes of 17 individuals were sequenced to ~50X. As more pedigree members are added, accuracy improves, and we are also able to impute the genotypes of missing subjects. Our approach can inform trade-offs between depth of coverage and number of family members for research study planning and, coupled with a proprietary fast read mapping algorithm, can analyze a full-depth WGS trio in less than a day (hours for exomes) on commodity hardware. We believe these advances will be crucial for the adoption of genomes and exomes in clinical settings.
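To illustrate why joint analysis of relatives can suppress Mendelian errors, the following is a minimal sketch of a generic trio genotype model at a single biallelic site. It is not the JBC method itself, whose details are not given here. The function names, the Hardy-Weinberg prior, the alternate allele frequency, the de novo rate, and the example likelihoods are all assumptions chosen for illustration.

```python
from itertools import product

# Genotypes are coded as the number of alternate alleles: 0, 1 or 2.

def hardy_weinberg_prior(alt_freq):
    """Assumed population prior over a founder's genotype."""
    ref_freq = 1.0 - alt_freq
    return [ref_freq ** 2, 2 * ref_freq * alt_freq, alt_freq ** 2]

def mendelian_transmission(mother, father, child, de_novo_rate=1e-8):
    """P(child genotype | parental genotypes), with a small de novo term."""
    a = mother / 2.0  # chance the mother passes on the alternate allele
    b = father / 2.0  # chance the father passes on the alternate allele
    probs = [(1 - a) * (1 - b), a * (1 - b) + (1 - a) * b, a * b]
    # Mix with a uniform term so Mendelian-inconsistent calls stay possible.
    return (1 - de_novo_rate) * probs[child] + de_novo_rate / 3.0

def singleton_call(likelihoods, alt_freq):
    """Most probable genotype when one sample is called on its own."""
    prior = hardy_weinberg_prior(alt_freq)
    scores = [prior[g] * likelihoods[g] for g in range(3)]
    return scores.index(max(scores))

def joint_trio_posterior(lik_mother, lik_father, lik_child,
                         alt_freq=0.2, de_novo_rate=1e-8):
    """Posterior over (mother, father, child) genotype triples."""
    prior = hardy_weinberg_prior(alt_freq)
    weights = {}
    for m, f, c in product(range(3), repeat=3):
        weights[(m, f, c)] = (
            prior[m] * prior[f]
            * mendelian_transmission(m, f, c, de_novo_rate)
            * lik_mother[m] * lik_father[f] * lik_child[c]
        )
    total = sum(weights.values())
    return {trio: w / total for trio, w in weights.items()}

def joint_call(posterior):
    """Maximum a posteriori genotype triple."""
    return max(posterior, key=posterior.get)

# Hypothetical genotype likelihoods P(reads | genotype) for one site:
# both parents look confidently homozygous reference, while the child
# has weak evidence for a heterozygous genotype.
lik_mother = [1.0, 1e-6, 1e-12]
lik_father = [1.0, 1e-6, 1e-12]
lik_child = [0.1, 0.9, 1e-6]

child_alone = singleton_call(lik_child, alt_freq=0.2)
trio_call = joint_call(joint_trio_posterior(lik_mother, lik_father, lik_child))
print("child called alone:", child_alone)
print("joint trio call (mother, father, child):", trio_call)
```

With these made-up numbers, calling the child alone favors a heterozygous genotype, whereas the joint model prefers the Mendelian-consistent triple (0, 0, 0), because a de novo event is far less likely a priori than a weakly supported sequencing artifact. A real pedigree caller would also have to handle multiallelic sites, larger family structures, and shared haplotype blocks across neighboring sites, none of which this sketch attempts.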