SeeCiTe: a method to assess CNV calls from SNP arrays using trio data

2020 
Motivation: Single nucleotide polymorphism (SNP) genotyping arrays remain an attractive platform for assaying copy number variants (CNVs) in large population-wide cohorts. However, current tools for calling CNVs are still prone to extensive false positive calls when applied to biobank-scale arrays. Moreover, there is a lack of methods that exploit cohorts with trios available (e.g. nuclear families) to assist in quality control and downstream analyses following the calling.

Results: We developed SeeCiTe (Seeing Cnvs in Trios), a novel CNV quality control tool that post-processes the output of current CNV calling tools, exploiting child-parent trio data to classify calls into quality categories and to provide a set of visualizations for each putative CNV call in the offspring. We apply it to the Norwegian Mother, Father, and Child Cohort Study (MoBa) and show that SeeCiTe improves specificity and sensitivity compared with common empirical filtering strategies. To our knowledge, it is the first tool that utilizes probe-level CNV data in trios to systematically highlight potential artefacts and visualize signal intensities in a streamlined fashion suitable for biobank-scale studies.

Availability and Implementation: The software is implemented in R, with the source code freely available at https://github.com/aksenia/SeeCiTe.

Contact: Ksenia.Lavrichenko@mpi.nl, Stefan.Johansson@uib.no or Inge.Jonassen@uib.no