Ancestry prediction efficiency of the software GenoGeographer using a z-score method and the ancestry informative markers in the Precision ID Ancestry Panel
2019
Abstract
We compared the efficiency of the freely available software GenoGeographer, which includes a z-score based analysis, with that of a naive method based on the maximal likelihoods. Both used 164 of the 165 ancestry informative markers (AIMs) included in the commercially available Precision ID Ancestry Panel from Thermo Fisher Scientific. The AIM profiles were obtained from investigations with the Precision ID Ancestry Panel in our laboratory and from SNP data in the literature and publicly available databases. We established eight well-defined AIM reference population data sets from 3,603 AIM profiles. Six reference populations comprised profiles from multiple populations (Sub-Saharan Africa, North Africa, Middle East, Europe, South/Central Asia, East Asia), and two comprised individuals with admixed ancestry (Somalia and Greenland). Using GenoGeographer and naive calculations of the maximal likelihoods, we tested 566 AIM profiles from individuals who were not included in the reference populations and were expected to belong to one of the eight reference populations. An initial standard z-score based test with GenoGeographer showed that 22.4% of the individuals could not be assigned to any of the reference populations. Among the remaining 77.6% of the individuals, 83.6% were assigned to the reference population that was concordant with their specified population of origin, 8.2% had ambiguous assignments because they could belong to both the specified population of origin and one or more of the other populations, and 8.2% were assigned to a reference population that was discordant with the specified population of origin. A naive assignment based on the maximal likelihood resulted in 78.1% concordant and 21.9% discordant assignments.
The results demonstrate that the z-score analysis with GenoGeographer can reduce the error rate by a factor of almost three compared with the naive estimation based on the maximal likelihoods of the AIM profiles (8.2% discordant assignments among the assigned profiles versus 21.9%). The Precision ID Ancestry Panel is a useful kit for assigning ancestry among the eight investigated populations, which included two admixed populations. More AIMs with better discrimination and more data on the distribution of AIMs in relevant populations are needed to improve the efficiency of genogeographic prediction with AIMs on a worldwide basis.
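
The contrast between the two decision rules can be illustrated with a minimal Python sketch. It is not GenoGeographer's implementation: the abstract does not specify the test statistic, so the sketch assumes biallelic markers, Hardy-Weinberg genotype probabilities, a z-score defined as the standardized log-likelihood of a profile under one population, and a one-sided rejection threshold. The population names, allele frequencies, threshold, and function names are all hypothetical.

```python
import math

def genotype_probs(p):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 copies of an
    allele with frequency p (assumes 0 < p < 1 so every log is finite)."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]

def log_likelihood(profile, freqs):
    """Log-likelihood of a profile (allele counts per marker) in one population."""
    return sum(math.log(genotype_probs(p)[g]) for g, p in zip(profile, freqs))

def z_score(profile, freqs):
    """Standardize the log-likelihood by its mean and variance under the
    population itself (an assumed, simplified form of a z-score test)."""
    mean, var = 0.0, 0.0
    for p in freqs:
        probs = genotype_probs(p)
        logs = [math.log(x) for x in probs]
        m = sum(pr * lg for pr, lg in zip(probs, logs))
        mean += m
        var += sum(pr * (lg - m) ** 2 for pr, lg in zip(probs, logs))
    return (log_likelihood(profile, freqs) - mean) / math.sqrt(var)

def naive_assign(profile, refs):
    """Naive rule: always pick the population with the maximal likelihood."""
    return max(refs, key=lambda name: log_likelihood(profile, refs[name]))

def accepted_populations(profile, refs, threshold=1.645):
    """z-score rule: keep only populations in which the profile is not an
    outlier. The one-sided threshold is a hypothetical choice. An empty
    list means the profile is left unassigned."""
    return sorted(name for name, freqs in refs.items()
                  if z_score(profile, freqs) > -threshold)

# Hypothetical allele frequencies for two toy populations and four markers.
REFS = {
    "A": [0.9, 0.8, 0.1, 0.2],
    "B": [0.1, 0.2, 0.9, 0.8],
}
typical_profile = [2, 2, 0, 0]   # common genotypes in population A
atypical_profile = [0, 2, 2, 0]  # unusual in both populations

for profile in (typical_profile, atypical_profile):
    print(profile, naive_assign(profile, REFS), accepted_populations(profile, REFS))
```

In this toy setting the naive rule names a population for both profiles, whereas the z-score rule accepts population A for the typical profile and accepts none for the atypical one. This mirrors how a z-score test can leave some profiles unassigned instead of forcing a possibly discordant assignment.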