Human Reference Genome and a High Contiguity Ethnic Genome AK1

2019 
Studies have shown that the current human reference genome (GRCh38) may lack information for some populations, but exactly what is missing remains elusive because non-reference genomes have lower contiguity. We compared GRCh38 with a high-contiguity genome assembly, AK1, and show that ~1.8% (~53.4 Mbp) of AK1 sequences are missing from GRCh38 and that ~0.76% (~22.2 Mbp) map to ectopic chromosomes. The unique AK1 sequences harbor ~1,390 putative coding elements. Used as a reference, ~5.3 Mbp (~0.2%) of the AK1 sequences recovered 'unmapped' reads from fourteen individuals (5 East Asians, 4 Europeans, and 5 Africans). The regions to which these 'unmapped' reads aligned included 110 common (shared between ≥2 individuals) and 38 globally (≥7 individuals) missing regions, with 25 candidate coding elements. We verified that many of the common missing regions exist in multiple populations and in chimpanzee DNA. Our study illuminates not only the discovery of missing information but also the use of highly precise ethnic genomes in understanding human genetics.
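The common/global distinction above comes down to counting how many individuals have 'unmapped' reads aligning to each region. The following is a minimal sketch of that counting step, not the authors' pipeline: the function name `classify_regions`, the input layout, and the "private" label for regions seen in a single individual are assumptions made for illustration. Only the two thresholds (≥2 and ≥7 individuals) come from the abstract. A region that reaches the global threshold is labeled "global" only; whether the study counts such regions among the common ones as well is not stated.

```python
from collections import Counter

COMMON_MIN = 2  # "common": shared between >= 2 individuals (from the abstract)
GLOBAL_MIN = 7  # "globally missing": found in >= 7 individuals (from the abstract)


def classify_regions(hits):
    """Label each region by how many individuals have reads aligned to it.

    hits: dict mapping an individual ID to the set of region IDs that
          received that individual's previously unmapped reads
          (hypothetical input layout).
    Returns a dict mapping region ID to "global", "common", or "private".
    """
    # Count each region once per individual, regardless of read depth.
    counts = Counter(
        region for regions in hits.values() for region in set(regions)
    )
    labels = {}
    for region, n_individuals in counts.items():
        if n_individuals >= GLOBAL_MIN:
            labels[region] = "global"
        elif n_individuals >= COMMON_MIN:
            labels[region] = "common"
        else:
            labels[region] = "private"  # assumed label, not used in the source
    return labels
```

With the fourteen individuals described above, `hits` would hold fourteen entries, and counting the "global" and "common" labels in the result would give the kind of tallies the abstract reports.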