Human genetic clustering

Analysis of genetic clustering examines the degree to which regional groups differ genetically, the categorization of individuals into clusters, and what can be learned about human ancestry from these data. There is broad scientific agreement that a relatively small fraction of human genetic variation occurs between populations, continents, or clusters. Researchers of genetic clustering differ, however, on whether genetic variation is principally clinal or whether clusters inferred mathematically are important and scientifically useful.
One of the underlying questions regarding the distribution of human genetic diversity concerns the degree to which genes are shared between the observed clusters. It has been observed repeatedly that the majority of variation in the global human population is found within populations. This variation is usually calculated using Sewall Wright's fixation index (FST), which estimates the proportion of total genetic variation that is attributable to differences between groups. The estimated apportionment differs slightly depending on the type of gene studied, but in general it is common to claim that ~85% of genetic variation is found within groups, ~6–10% between groups within the same continent, and ~6–10% between continental groups. Ryan Brown and George Armelagos summarized this by noting that 'a host of studies concluded that racial classification schemes can account for only a negligible proportion of human genetic diversity,' including the studies listed in the table below.
These average numbers, however, do not mean that every population harbors an equal amount of diversity. In fact, some human populations contain far more genetic diversity than others, which is consistent with the likely African origin of modern humans. Populations outside of Africa may therefore have undergone serial founder effects that limited their genetic diversity. The FST statistic has been criticized by A. W. F. Edwards and by Jeffrey Long and Rick Kittles.
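To make the fixation index concrete, the following is a minimal sketch of one common textbook form of FST for a single biallelic locus, computed from expected heterozygosities. The function name `fst` and the equal weighting of subpopulations are assumptions made for illustration; published estimates use more elaborate estimators and many loci.

```python
def fst(freqs):
    """Illustrative F_ST for one biallelic locus.

    freqs: allele frequencies of the same allele in each subpopulation.
    Assumes equally weighted subpopulations (a simplification).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n                      # mean allele frequency
    h_total = 2 * p_bar * (1 - p_bar)           # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                              # no variation at this locus
    # mean expected heterozygosity within subpopulations
    h_within = sum(2 * p * (1 - p) for p in freqs) / n
    return (h_total - h_within) / h_total       # share of variation between groups


# Two hypothetical groups with allele frequencies 0.3 and 0.7
print(round(fst([0.3, 0.7]), 2))
```

In this toy case the between-group share is approximately 0.16, so roughly 84% of the variation at the locus lies within groups, which is of the same order as the ~85% figure cited above.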
British statistician and evolutionary biologist A. W. F. Edwards faulted Lewontin's methodology for basing its conclusions on a simple comparison of genes rather than on the more complex structure of gene frequencies. Long and Kittles' objection is also methodological: according to them, FST rests on the faulty underlying assumptions that all populations contain equally genetically diverse members and that continental groups diverged at the same time. Sarich and Miele have also argued that estimates of genetic difference between individuals of different populations understate differences between groups because they fail to take into account human diploidy. Keith Hunley, Graciela Cabana, and Jeffrey Long created a revised statistical model to account for unequally divergent population lineages and local populations with differing degrees of diversity. Their 2015 paper applies this model to the Human Genome Diversity Project sample of 1,037 individuals in 52 populations. They found that the least diverse population examined, the Surui, 'harbors nearly 60% of the total species’ diversity.' Long and Kittles had noted earlier that the Sokoto people of Africa contain virtually all of human genetic diversity. The 2015 analysis also found that non-African populations are a taxonomic subgroup of African populations, that 'some African populations are equally related to other African populations and to non-African populations,' and that 'outside of Africa, regional groupings of populations are nested inside one another, and many of them are not monophyletic.' Multiple studies since 1972 have backed up the claim that 'The average proportion of genetic differences between individuals from different human populations only slightly exceeds that between unrelated individuals from a single population.'
Edwards (2003) claims, 'It is not true, as Nature claimed, that "two random individuals from any one group are almost as different as any two random individuals from the entire world",' and Risch et al. (2002) state, 'Two Caucasians are more similar to each other genetically than a Caucasian and an Asian.' However, Bamshad et al. (2004) used the data from Rosenberg et al. (2002) to investigate the extent of genetic differences between individuals within continental groups relative to genetic differences between individuals from different continental groups. They found that, although these individuals could be classified very accurately into continental clusters, there was a significant degree of genetic overlap at the individual level: using 377 loci, individual Europeans were more genetically similar to East Asians than to other Europeans about 38% of the time. Witherspoon et al. (2007) have argued that even when individuals can be reliably assigned to specific population groups, it may still be possible for two randomly chosen individuals from different populations/clusters to be more similar to each other than to a randomly chosen member of their own cluster when a small number of SNPs is sampled (as in the case of scientists James Watson, Craig Venter and Seong-Jin Kim). They state that when around one thousand SNPs are used, individuals from different populations/clusters are never more similar to each other than to members of their own cluster, a result they note some may find surprising. Witherspoon et al. conclude that 'caution should be used when using geographic or genetic ancestry to make inferences about individual phenotypes'.
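The kind of pairwise comparison discussed by Bamshad et al. and Witherspoon et al. can be illustrated with a small sketch that asks how often a pair of individuals from different groups is more similar than a pair from the same group. This is a toy illustration, not the method of either paper: the function names, the simple mismatch-count distance, and the tiny genotype vectors are all assumptions.

```python
from itertools import combinations, product


def distance(a, b):
    """Toy distance: number of loci at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def dissimilarity_fraction(pop_a, pop_b):
    """Fraction of (between-group pair, within-group pair) comparisons in
    which the between-group pair is strictly more similar (smaller distance)."""
    within = [distance(x, y)
              for pop in (pop_a, pop_b)
              for x, y in combinations(pop, 2)]
    between = [distance(x, y) for x, y in product(pop_a, pop_b)]
    hits = sum(b < w for b, w in product(between, within))
    return hits / (len(between) * len(within))


# Hypothetical, well-separated groups typed at three loci
group_a = [(0, 0, 0), (0, 0, 1)]
group_b = [(1, 1, 1), (1, 1, 0)]
print(dissimilarity_fraction(group_a, group_b))
```

With only a handful of loci and overlapping allele frequencies this fraction can be substantial; as more informative loci are added to the genotype vectors, it tends to shrink, which is the qualitative pattern the text describes.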
