The mitochondrial control region was sequenced in a total of 444 individuals representing three ethnic groups (Albanians, Turks and Romanies) in the Republic of Macedonia. The mtDNA haplogroup composition differed between the three groups. Our results showed a relatively high frequency of haplogroup H12 in Albanians (8.8%) and a lower frequency in Turks (3.3%), while haplogroups M5a1 and H7a1a were dominant in Romanies (13.7% and 10.3%, respectively) but rare in the former two groups. This highlights the importance of regional sampling for forensic mtDNA databasing purposes. These population data will be available on EMPOP under accession numbers EMP00644 (Albanians), EMP00645 (Romanies) and EMP00646 (Turks).
Distinct, partly competing "waves" have been proposed to explain human migration in(to) today's Island Southeast Asia and Australia based on genetic (and other) evidence. The paucity of high-quality, high-resolution data has impeded insights so far. In this study, one of the first in a forensic environment, we used the Ion Torrent Personal Genome Machine (PGM) to generate complete mitogenome sequences via stand-alone massively parallel sequencing, and we describe a standard data validation practice. In this first representative investigation of the mitochondrial DNA (mtDNA) variation of the East Timor (Timor-Leste) population, including >300 individuals, we put special emphasis on the reconstruction of the initial settlement, in particular on the previously poorly resolved haplogroup P1, an indigenous lineage of the Southwest Pacific region. Our results suggest a colonization of southern Sahul (Australia) >37 kya, limited subsequent exchange, and a parallel incubation of initial settlers in northern Sahul (New Guinea) followed by westward migrations <28 kya. The temporal proximity and possible coincidence of these latter dispersals, which encompassed autochthonous haplogroups, with the postulated "later" events of (South) East Asian origin point to a highly dynamic migratory phase.
Pan-American mitochondrial DNA (mtDNA) haplogroup C1 has recently been subdivided into three branches, two of which (C1b and C1c) are characterized by ages and geographical distributions that are indicative of an early arrival from Beringia with Paleo-Indians. In contrast, the estimated ages of C1d, the third subset of C1, looked too young to fit the above scenario. To define the origin of this enigmatic C1 branch, we completely sequenced 63 C1d mitochondrial genomes from a wide range of geographically diverse, mixed, and indigenous American populations. The revised phylogeny not only brings the age of C1d within the range of that of its two sister clades, but also reveals that there were two C1d founder genomes for Paleo-Indians. Thus, the recognized maternal founding lineages of Native Americans number at least 15, indicating that the overall number of Beringian or Asian founder mitochondrial genomes will probably increase extensively when all Native American haplogroups reach the same level of phylogenetic and genomic resolution as obtained here for C1d.
Human head hair shape, commonly classified as straight, wavy, curly or frizzy, is an attractive target for Forensic DNA Phenotyping and other applications of human appearance prediction from DNA such as in paleogenetics. The genetic knowledge underlying head hair shape variation was recently improved by the outcome of a series of genome-wide association and replication studies in a total of 26,964 subjects, highlighting 12 loci of which 8 were novel and introducing a prediction model for Europeans based on 14 SNPs. In the present study, we evaluated the capacity of DNA-based head hair shape prediction by investigating an extended set of candidate SNP predictors and by using an independent set of samples for model validation. Prediction model building was carried out in 9674 previously used subjects (6068 from Europe, 2899 from Asia and 707 of admixed European and Asian ancestries) by considering a novel list of 90 candidate SNPs. For model validation, genotype and phenotype data were newly collected in 2415 independent subjects (2138 Europeans and 277 non-Europeans) by applying two targeted massively parallel sequencing platforms, Ion Torrent PGM and MiSeq, or the MassARRAY platform. A binomial model was developed to predict straight vs. non-straight hair based on 32 SNPs from 26 genetic loci that we identified as significantly contributing to the model. This model achieved prediction accuracies, expressed as AUC, of 0.664 in Europeans and 0.789 in non-Europeans; the statistically significant difference was explained mostly by the effect of one EDAR SNP in non-Europeans. Considering sex and age in addition to the SNPs increased the prediction accuracies slightly but not significantly (AUC of 0.680 and 0.800, respectively).
Based on the sample size and candidate DNA markers investigated, this study provides the most robust, validated, and accurate statistical prediction models and SNP predictor marker sets currently available for predicting head hair shape from DNA, providing the next step towards broadening Forensic DNA Phenotyping beyond pigmentation traits.
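To make the reported accuracy measure concrete, the following minimal sketch shows how a binomial (logistic) predictor of straight vs. non-straight hair can be scored from SNP allele counts, and how the AUC is obtained from such scores. It is an illustration only and not the study's model: the function names, the intercept, the per-SNP weights, and the six toy individuals are all hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) definition.

```python
import math

def predict_prob(allele_counts, weights, intercept):
    """Logistic (binomial) model: probability of the 'non-straight' class.

    allele_counts: per-SNP effect-allele counts (0, 1 or 2).
    weights, intercept: hypothetical coefficients, not values from the study.
    """
    linear = intercept + sum(w * g for w, g in zip(weights, allele_counts))
    return 1.0 / (1.0 + math.exp(-linear))

def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Synthetic toy data (assumption): predicted probabilities and true classes
# (1 = non-straight, 0 = straight) for six imaginary individuals.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)

# Hypothetical three-SNP genotype scored with made-up coefficients.
toy_prob = predict_prob([2, 0, 1], weights=[0.4, -0.3, 0.2], intercept=-0.5)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of the two classes, which is the scale on which the values of 0.664 and 0.789 reported above should be read.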