Applying Ancestry and Sex Computation as a Quality Control Tool in Targeted Next-Generation Sequencing

Patrick C. Mathias, Emily H. Turner, Sheena M. Scroggins, Stephen J. Salipante, Noah G. Hoffman, Colin C. Pritchard, Brian H. Shirts

2016

Objectives: To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. Methods: We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. Results: We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Conclusions: Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report.
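
The Methods sentence above can be illustrated with a minimal sketch: fit a principal component analysis on reference genotypes, project patient samples onto the same axes, classify them by k-nearest neighbors, and flag disagreements with self-report. This is not the authors' pipeline. The population labels "A" and "B", the allele frequencies, the variant count, the choice of k = 5, and the helper names `fit_pca`, `project`, and `knn_predict` are all assumptions made for illustration, and the genotypes are simulated rather than taken from any real reference panel.

```python
import numpy as np

def fit_pca(genotypes, n_components=2):
    """Fit PCA on a samples-x-variants matrix; return (mean, principal axes)."""
    mean = genotypes.mean(axis=0)
    # Rows of vt are the principal axes of the mean-centered data.
    _, _, vt = np.linalg.svd(genotypes - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(genotypes, mean, components):
    """Project samples onto previously fitted principal axes."""
    return (genotypes - mean) @ components.T

def knn_predict(train_pcs, train_labels, query_pcs, k=5):
    """Majority vote among the k nearest reference samples in PC space."""
    predictions = []
    for q in query_pcs:
        dists = np.linalg.norm(train_pcs - q, axis=1)
        nearest = np.argsort(dists)[:k]
        labels, counts = np.unique(train_labels[nearest], return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)

# Simulated data (hypothetical): two reference populations whose allele
# frequencies differ at every one of 200 variants.
rng = np.random.default_rng(0)
n_variants = 200
freq = {"A": np.full(n_variants, 0.2), "B": np.full(n_variants, 0.8)}

def simulate(label, n):
    # Genotype = alternate-allele count (0, 1, or 2) at each variant.
    return rng.binomial(2, freq[label], size=(n, n_variants))

reference = np.vstack([simulate("A", 50), simulate("B", 50)])
reference_labels = np.array(["A"] * 50 + ["B"] * 50)

patients = np.vstack([simulate("A", 3), simulate("B", 3)])
# The third patient's self-report is deliberately wrong to mimic a sample swap.
self_report = np.array(["A", "A", "B", "B", "B", "B"])

mean, components = fit_pca(reference)
predicted = knn_predict(
    project(reference, mean, components),
    reference_labels,
    project(patients, mean, components),
)
flagged = predicted != self_report  # True where computed ancestry disagrees
print(predicted, flagged)
```

In a real workflow the reference matrix would come from a genotyped panel with known ancestry, and a flagged sample would prompt review rather than automatic rejection, since self-report and computed ancestry can legitimately differ.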
