PopDel identifies medium-size deletions jointly in tens of thousands of genomes

2019 
Thousands of genomic structural variants segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Here we present PopDel, which identifies and genotypes deletions of about 500 to at least 10,000 bp in length in many genomes jointly. PopDel scales to tens of thousands of genomes as demonstrated by our evaluation on data of up to 49,962 genomes. Compared to previous tools, PopDel reduces the computational time needed to analyze 150 genomes from weeks to days. The deletions detected by PopDel in a single sample show a large overlap with high-confidence reference call sets. On data of up to 6,794 trios, inheritance patterns suggest a low false positive rate at a high recall. PopDel reliably reports common, rare and de novo deletions and the deletions reflect reported population structure. Therefore, PopDel enables routine scans for deletions in large-scale sequencing studies.