Gene frequencies for 32 populations show 3 Jewish and 4 gentile clusters. From phenotype frequencies the estimate of inbreeding F is .0110 for populations in the Middle East and .0002 for other populations. Differentiation of populations in the region is small (F =.026) compared with that of the species as a whole (F =.146). Kinship between populations is appreciable within clusters but much less within the total Jewish and gentile groups (.004 and .001, respectively). Isolation by distance is similar within and between groups. From the available data the major Jewish populations appear to have undergone admixture with gentiles at a rate of about 1% per generation, leading to total admixture rates of about 50%. Estimates of evolutionary size from these data suggest substantial founder effects. These results are qualitatively consistent with inferences of Cavalli-Sforza and Carmelli and with the nonparametric analysis of Karlin et al., but not with some interpretations of the latter.
Victor McKusick and the History of Medical Genetics. New York: Springer, Krishna R Dronamraju, Clair A Francomano (Eds); 2012. 232 pages, ISBN 978-1-4614-1677-7.
Over the years, Krishna Dronamraju has been associated in one way or another with some of the leading figures in 20th century genetics. He has produced several books documenting the issues, lives, and work of some of these scientists, perhaps most notably JBS Haldane. The current volume is another such book, which he has edited along with Clair Francomano.
Victor McKusick, who has been called a (if not the) Founding Father of Medical Genetics, was a leading researcher and database developer for the burgeoning young field of human medical genetics. In this manageably small volume, Dronamraju and Francomano have written, assembled and edited reminiscences and thoughtful commentaries on McKusick, both personally and in regard to the issues and his role in advancing them. Some chapters are very short, fond memorials, including warm musings by family members such as his wife Anne, as well as a variety of obituaries and eulogies to McKusick. These various sections provide vignettes of interactions with him over scientific as well as personal matters. In this way, the essentials of McKusick’s biography are covered very readably.
Among his long list of achievements, during his comparably long working life, McKusick did extensive primary research, especially in the Amish, to show the value of isolates and inbreeding to reveal genetic disease. He also provided foundational research on Marfan’s syndrome, dwarfism and numerous other heritable disorders. The chapters in this book provide some details of these topics directly, but also their role in the history of medical genetics and their standing today.
In addition to primary medical genetic research, McKusick organized and ran a leading academic and clinical program in human genetics at the Johns Hopkins medical school. He also led the assembly and maintenance of the catalog, now called Online Mendelian Inheritance in Man (OMIM), to make the rapidly growing, heterogeneous body of knowledge easily available to investigators. This was done first through several print editions that all of us in human genetics used on a regular basis. Then, as the database became unmanageably large and too rapidly changing for print (and web access had become widely available), it morphed into the current online OMIM database. Geneticists still use this on a daily basis.
Also included in this volume’s chapters are miscellaneous photos of McKusick and family or colleagues in various settings and gatherings that give the book a sense of personal history. The editors also provide a bibliography of McKusick’s publications, which spanned 1949 to 2008, an impressive total (even in today’s hasty world) of 772 papers. These papers cover an equally impressive diversity of topics, a testament in itself to his career.
This is not a book about controversies, nor does it offer many caveats about the approach that McKusick and others took during their active years in human genetics, caveats that more recent data have raised, such as the complexity of genetic causation. Only in the later part of his life did heavily scaled-up technology make it possible to go much beyond identifying and cataloguing ‘Mendelian’ traits, or exploring the mechanisms behind those that were simple enough to be tractable. But the mid-century human geneticists systematically laid the groundwork for modern human genetics.
In our era of short memory and little interest in conceptual history among current students, McKusick may already be fading as a figure or even a name. We are impatient, and legacy is a luxury that many may feel is not of much interest. However, history does sow the seeds of the present. At least, historians of 20th century genetics will find this a useful collection to consult, a single volume from which to get both a sense of McKusick as a man and as a figure in the history of the science. Unfortunately, for others this will prove to be prohibitively expensive even as an ebook, but it will be one that good libraries will want to have on their shelves.
The James V. Neel Papers document nearly every phase of the career of one of the founders of human population genetics in the United States. Neel was particularly thorough and organized, and retained virtually all of his significant scientific correspondence, committee reports, minutes of meetings, and drafts of manuscripts. The collection also includes data collected during Neel’s work among the Xavante, Yanomamo and other indigenous populations. In a career that spanned the period from the late work of Thomas Hunt Morgan and Charles B. Davenport to the contemporary world of molecular genetics and nucleic acids, Neel knew, worked with, and corresponded with many of the most influential 20th century practitioners of genetics. The collection begins in earnest in 1943, after Neel had decided to focus on human genetics. Neither Neel’s work with Drosophila nor any of his Drosophila manuscripts are found in the collection.
Currently, many Amerindian peoples, including European-Amerindian admixed groups such as Mexicans, are experiencing a major epidemic of a series of diseases which includes a tendency to become obese at an early adult age, adult onset diabetes mellitus, the formation of cholesterol gallstones, and gallbladder cancer, especially in females. Other cancer sites, and morbid consequences of these primary disorders, also occur at elevated rates. This epidemic has begun, or at least increased dramatically, since World War II, and seems to be due to an interaction between susceptible Amerindian genotype(s) and some recently changed aspect of the environment, probably involving dietary components. This paper reviews the epidemiology of this New World Syndrome (NWS), quantifying incidence, prevalence, and risk in susceptible genotypes. This pattern is distinct from the rise of similar diseases associated around the world with "westernization." The susceptible genes probably arose by virtue of a selective advantage during or before the initial peopling of the Americas.
Mitochondrial DNAs (mtDNAs) from 167 American Indians including 87 Amerind-speakers (Amerinds) and 80 Nadene-speakers (Nadene) were surveyed for sequence variation by detailed restriction analysis. All Native American mtDNAs clustered into one of four distinct lineages, defined by the restriction site variants: HincII site loss at np 13,259, AluI site loss at np 5,176, 9-base pair (9-bp) COII-tRNA(Lys) intergenic deletion and HaeIII site gain at np 663. The HincII np 13,259 and AluI np 5,176 lineages were observed exclusively in Amerinds and were shared by all such tribal groups analyzed, thus demonstrating that North, Central and South American Amerinds originated from a common ancestral genetic stock. The 9-bp deletion and HaeIII np 663 lineages were found in both the Amerinds and Nadene but the Nadene HaeIII np 663 lineage had a unique sublineage defined by an RsaI site loss at np 16,329. The amount of sequence variation accumulated in the Amerind HincII np 13,259 and AluI np 5,176 lineages and that in the Amerind portion of the HaeIII np 663 lineage all gave divergence times on the order of 20,000 years before present. The divergence time for the Nadene portion of the HaeIII np 663 lineage was about 6,000-10,000 years. Hence, the ancestral Nadene migrated from Asia independently and considerably more recently than the progenitors of the Amerinds. The divergence times of both the Amerind and Nadene branches of the COII-tRNA(Lys) deletion lineage were intermediate between the Amerind and Nadene specific lineages, raising the possibility of a third source of mtDNA in American Indians.
Polymorphism is variation at specific locations in the deoxyribonucleic acid (DNA) sequence. Some authors use the term only for variant alleles with at least 1% frequency in a population. This shows the population dependence of the concept itself. Variation arises initially by some form of mutation in a single individual but, because DNA is transmitted from parent to offspring, over time the frequency and geographical distribution of a genetic variant become population properties that depend on history.
In the domesticated animal all variations have an equal chance of continuance; and those which would decidedly render a wild animal unable to compete … are no disadvantage whatever …
A. R. Wallace, ‘On the Tendency of Varieties to Depart Indefinitely from the Original Type’ (1858)
We all like telling stories about our past, though often the distinction between true history and mythology is blurred. This may be harmless in some aspects of life, but not in science, which is supposed to be an attempt to understand things as they really are or were. But even science is susceptible to confusing mythology and history. An example is the way we describe human variation and its origins. Life is history, and the Darwinian revolution showed that it is a particular kind of history. Instead of being the results of a string of discrete creation events, the diverse species of the world are the result of continually acting processes that modify existing species, eventually generating new ones. As Darwin put it in his Origin of Species, divergence accumulating gradually over time leads first to "varieties" or subspecies, which, after further thousands of generations of divergence, become species. Darwin was entirely vague about how this phase transition between quantitative variation and qualitative differences actually occurs. Indeed, it is still an active area of research in evolutionary biology. The same issues apply closer to home, because a major objective in anthropology is to understand the origins of human variation. Since we're so similar overall that we're clearly a single species, a point repeatedly stressed by Darwin, the traditional anthropological approach to the ways in which we do differ is to typologically divide humankind into subspecies or races. However, identifying those categories has always been notoriously problematic. 
In Descent of Man, Darwin famously struggled with the attempt, then gave up on it, declaring that "Every naturalist who has had the misfortune to undertake the description of a group of highly varying organisms, has encountered cases (I speak after experience) precisely like that of man; and if of a cautious disposition, he will end by uniting all the forms which graduate into each other, under a single species; for he will say to himself that he has no right to give names to objects which he cannot define."1 Despite the well-known problems, anthropologists have persisted in dividing in order to conquer an understanding of human variation. Initially, these efforts rested on morphology. In the early twentieth century, the leading American anthropologist, E. A. Hooton, suggested how to categorize human variation (Fig. 1).2 As he put it, "A race is a great division of mankind, the members of which, though individually varying, are characterized as a group by a certain combination of morphological and metrical features…which have been derived from their common descent." Primary races are the product of "evolutionary factors," while secondary or "composite" races have been produced by "long-continued intermixture of two or more primary races." (Quotes from Ref. 2, p. 76).
Figure 1. Cover page of Hooton's article in Science.2 Photo of EA Hooton, public domain.
Figure 2. Results of simulation of pairwise genotypic identity levels within and between two evolving populations. Heat maps showing pairwise genotypic similarities between 1,000 individuals, 500 from each of two populations. A (left). Two completely isolated populations were simulated, showing little gene identity between individuals within each population, but much less identity among between-population pairs (one individual from each population). B (right). The same as in A, but with 1% gene flow per generation between the two populations. The figures are symmetric about the diagonal, so only one or the other half need be looked at. The vertically oriented legend bar at the left provides the color scale from high genetic identity (red, at the top) to low (violet, bottom). The diagonal is red, reflecting the 100% genetic identity of each individual to itself. Simulated with the ForSim5 program. Details are available from the authors.
In what today sounds naïvely informal, Hooton suggested that we should first group individuals intuitively, using expert judgment to identify races, then apply multivariate statistical analysis to define races more precisely in terms of those traits that differ to statistically significant degrees. Hooton argued that this procedure will make the distinguishing features "very apparent." "Pure" races will be identifiable; individuals that are admixtures of the pure races can then be characterized. Evolutionary change was due to genetic change, so in the dawning genetic age it seemed more modern to turn to genetics for the task of identifying different races. This thinking can be seen in the world's leading human genetics text at that time.3 Racial types should be defined by traits that are clearly inherited, as could be shown, for example, by their Mendelian appearance in families. Such traits are inherent and permanent, not blurred by environment or life experience, and are faithfully transmitted from parent to offspring. Based on such traits, the authors of this text asserted that "In reality…the races are sharply delimited."3:165 In fact, "Technically speaking, there is no such generalized being as 'man'; there are only men and women belonging to particular races or particular racial crossings."3:209 Yet a Mendelian trait, like eye color, varies among family members and over geographic space. Indeed the authors noted clearly that even within a racial group no two individuals are alike.3:100 This has always posed a curious problem for doing typology within an evolutionary context.
How can a trait be useful in defining discretely differing races if it varies even within them? In fact, though it was not expressed in this way at the time, it's easy to see how a 'type' can be variable when it is defined in genetic terms. Suppose we choose a set of genes to define a race. In the early twentieth century, these would have been genes with observable effects on traits, like eye color, and their alternate states, called alleles, would be transmitted in Mendelian fashion in the families that make up a race. This means that within a given race the various alleles would have some frequencies, which we can use to define our racial category objectively in what can be called a statistical typology, as follows. Each member of the race is composed of a random draw from this same set of frequencies. While each person may have drawn a different set of alleles, they are all drawn from the same gene pool, which, for those familiar with population genetics, means that the race is thus made of a population in multilocus Hardy-Weinberg genotype proportions.4 With this statistical definition, what is implied by saying that a race is 'pure' or 'homogeneous' is that its allele frequencies differ from those of other such races. This was implicit in both the early morphometric and genetic variable-type concepts of human races. Since these patterns were based on population concepts and purported to be the result of evolutionary processes, let's take a simple look at how it works. We simulated the evolution of 'pure' populations using a computer program that we have developed for this kind of purpose.5 We simulated 5,000 random-mating diploid individuals for 5,000 generations, and just three genes that accumulated variation by mutations arising at a rate typical for humans. The future frequency of each mutation was left to reproductive chance (genetic drift), as is generally the case in life. 
After 2,500 generations, the simulated population split into two subpopulations of equal size with no genetic exchange between them, each of which then persisted for another 2,500 generations. We then repeated the same conditions but with 1% gene flow (mate exchange) between the two populations every generation. These numbers are not as fickle as they may seem. In human terms, 5,000 generations corresponds roughly to 100,000 years, or about the age of our species, and 2,500 generations is about the time since the founding of the major continental human populations, the traditional major races. Under these conditions, 5,000 randomly mating individuals would also generate roughly the same amount of genetic variation as is actually seen in humans. Now we ask, how genetically similar to each other are individuals within and between these two simulated 'races'? To answer that question in a practical way, we randomly sampled 1,000 individuals, 500 from each population. For every pair of individuals, we computed similarity in the simulated genes: Since people have two copies of a gene, at any variable site a pair of individuals could be zero, 50%, or 100% identical. For each pair, we averaged their similarity scores over all the variable sites. Figure 2 shows these genetic similarities in the form of what is called a heat map whose color scale is given in the vertical column on the left. There are 1,000 horizontal and 1,000 vertical lines in each of the two figures, representing the sampled individuals. The color of the pixel where the lines intersect represents the genetic similarity between those two simulated individuals. Such figures are symmetric, since comparing X to Y is the same as comparing Y to X so one need look only above or below the diagonal. To understand these figures, you do not need to resolve the details: the overall gestalt tells the tale. Of course real history does not involve rigid population boundaries. 
But that, together with the fact that the simulation involves only three genes, whereas these days one can look at literally hundreds of thousands of variable sites in the genome, means that our simulation should yield much greater genotypic similarity than is found after real-life history. Yet individuals within these 'pure' populations simply are not very closely related to each other. In Figure 2A, the diagonal is a dark red line reflecting the 100% genetic similarity of each individual compared to itself. The upper left and lower right triangular sections reflect similarities between individuals within each of the two populations, with the light color showing that there is only modest similarity, reflecting the fact that, as in real life, no two individuals even of the same 'race' are genetically identical. In contrast, the upper right square compares individuals between the two populations, with the darker color showing, as would be expected on the basis of evolutionary theory, that when the two simulated 'races' evolve as separate populations, they indeed develop clearly greater, if far from complete, genetic divergence. Even the classical racial typologists were well aware of the roughly continuous geographical pattern of human variation. Even so, as shown in Figure 2B, even with just 1% gene flow between populations and rigid population boundaries, there is hardly more genetic similarity within groups than between them. Though not shown here, the same general patterns result if we have the simulated genes affect some trait that is driven by natural selection in opposite directions in the two populations. These simulation results are entirely consistent with evolutionary theory and data in the real human world. They differ from actual histories in that the population boundaries were made rigid, unlike the real world. 
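For readers who want to see the logic of the simulation in miniature, the following is a rough sketch, not the ForSim program and not the authors' actual settings. It assumes a much smaller population, loci that start at a frequency of 0.5 rather than accumulating mutations, independent loci tracked only by allele frequency, and a simple identity score of 0, 50%, or 100% per site computed from genotype counts. All function names and parameter values are hypothetical.

```python
import random


def drift(p, n_copies, rng):
    """One generation of drift: resample n_copies gene copies at frequency p."""
    return sum(rng.random() < p for _ in range(n_copies)) / n_copies


def simulate(n_diploid, shared_gens, split_gens, n_loci, m, rng):
    """Return one (p1, p2) allele-frequency pair per locus.

    n_diploid is the size of each daughter population; the ancestral
    population is twice that size. m is the gene flow per generation.
    """
    copies = 2 * n_diploid
    freqs = []
    for _ in range(n_loci):
        p = 0.5  # assumption: locus starts polymorphic, no mutation step
        for _ in range(shared_gens):          # single ancestral population
            p = drift(p, 2 * copies, rng)
        p1 = p2 = p
        for _ in range(split_gens):           # two populations after the split
            mixed1 = (1 - m) * p1 + m * p2    # symmetric mate exchange
            mixed2 = (1 - m) * p2 + m * p1
            p1, p2 = drift(mixed1, copies, rng), drift(mixed2, copies, rng)
        freqs.append((p1, p2))
    return freqs


def sample_genotypes(freqs, pop, n, rng):
    """Draw n individuals as random (Hardy-Weinberg) draws from one gene pool."""
    return [[(rng.random() < f[pop]) + (rng.random() < f[pop]) for f in freqs]
            for _ in range(n)]


def similarity(a, b, sites):
    """Average per-site identity: 1, 0.5, or 0 depending on genotype difference."""
    return sum(1 - abs(a[s] - b[s]) / 2 for s in sites) / len(sites)


def mean_similarities(freqs, n_per_pop, rng):
    """Mean pairwise similarity (within populations, between populations)."""
    g1 = sample_genotypes(freqs, 0, n_per_pop, rng)
    g2 = sample_genotypes(freqs, 1, n_per_pop, rng)
    everyone = g1 + g2
    # Only sites that vary in the pooled sample are scored.
    sites = [s for s in range(len(freqs)) if len({g[s] for g in everyone}) > 1]
    if not sites:
        raise ValueError("no variable sites in the sample")
    within = [similarity(a, b, sites)
              for grp in (g1, g2) for i, a in enumerate(grp) for b in grp[i + 1:]]
    between = [similarity(a, b, sites) for a in g1 for b in g2]
    return sum(within) / len(within), sum(between) / len(between)


if __name__ == "__main__":
    rng = random.Random(2024)
    for m in (0.0, 0.05):
        w, b = mean_similarities(simulate(50, 20, 100, 40, m, rng), 20, rng)
        print(f"gene flow {m}: within {w:.3f}, between {b:.3f}")
```

Because the populations here are tiny, drift is far stronger than in the simulation described above, so the gene-flow value is chosen for illustration only and is not comparable to the 1% used in Figure 2B. The qualitative pattern is what matters: the gap between within-group and between-group similarity shrinks once there is any steady exchange of mates.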
Yet it is striking how often, even in the technically sophisticated literature on human genetics, one reads of population isolates or even whole nations described as being genetically homogeneous and hence good populations to study (for example, in disease genetics). An important point is that typological thinking about human variation assumed the reality of the 'pure' populations in which everyone is either a member or an admixed offspring. One might expect that in modern anthropology based on modern evolutionary thinking, the idea of types had long ago been replaced by that of geographically continuous variation, a point long ago stressed in anthropological genetics.6 Would anyone still think that humans are either members of some pure type or admixed between them? The answer is that such views are widespread. The growing availability of extensive genetic data sampled from around the world has led to the development of powerful statistical methods to recognize and analyze global patterns of human genetic variation. One recently developed and now widely used type of analysis takes essentially the classical typological approach to human variation. It is generally called structure analysis, named after a computer program that initially took this approach.7 It asserts, in effect, that there are (or historically were) a set of parental populations, so that everyone in a test sample taken today is either a member of such a population or else is admixed among them. The parental populations are assumed to be variable types in the sense described earlier, which in times past were called pure races. The user of a structure program can provide prespecified parental population allele frequencies or the program can be asked to produce statistically best-estimates of those frequencies in parental populations, either being told or asked to estimate how many of those populations there are. 
The program then uses the resulting allele frequencies to estimate the admixture proportions for the other individuals in the sample. Investigators around the world are now routinely using this clearly typological approach. It uses sophisticated statistical digests of complex data that we might call [as-if] fictions: They describe data as if they reflect true evolutionary history. As-if fictions can be useful as analytic tools if everyone understands they are simply convenient statistical digests. But the phrasing of papers often suggests that the typological conclusions are being taken as if they represent actual history. Structure-analysis results are usually presented as shown in Figure 3A. Sampled individuals are arrayed linearly according to the geographic location from which they were sampled. The parental populations are assigned colors. A thin vertical line that uses these colors in segments represents each individual, and the segment lengths proportionally reflect the individual's estimated admixture from among the parental populations. A member of a parental population will have just that population's color, while admixed individuals will have two or more differently colored segments. To get the point of the results shown in Figure 3, you do not need to see the details (which are available in the original papers).
Figure 3. Structure analysis figures. A (left). Global analysis of human genetic variation showing pure 'parental' and admixed individuals. Color coding denotes the presumed parental populations. Sampled individuals are arranged geographically from Africa on the left to the Americas. Each individual is represented by a thin horizontal bar color coded in segments proportional to his or her pure or estimated admixed ancestry. From Wang and colleagues,11 http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.0030185. B (top right). When Europeans who are held to be a homogeneous parental population in part A are analyzed on their own, they manifest complex internal parental admixture structure, from Armenians on the left to Finns on the right. From ScienceDirect,8 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852743/?tool=pubmed. C (bottom right). Proportional admixture in American 'mestizos.' The color key on the right identifies presumed [parental] populations from Mexico on the top to those from southern South America on the bottom. Using this key in the same order, each vertical bar represents estimated average admixture components in "mestizo" populations from Mexico City (left) to southern South America (right). The arrow identifies an estimated admixture fraction for Mexico City "mestizos" from the Kaingang in southern South America. From Wang and coworkers,13 http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000037.
Figure 3A shows one prominent example, a global structure analysis of people sampled from Africa to the Americas. On the top end are people represented as having been sampled from essentially pure African and European parental populations (basically corresponding to classical race typology), from which Middle Easterners are shown as admixed descendants. Similar kinds of admixture patterns are seen as one moves eastward, with many individuals having more than two contributing parental populations. Structure figures like this graphically and esthetically reflect the change of genotype frequencies that we always find across geographic space. However, the sampled individuals in the parental populations are clearly not literally parental to the purportedly admixed individuals, since they are of the same age and living at the same time but thousands of kilometers apart. This as-if presentation might be said to be all right except that it is trivial to show that the parental populations are not themselves homogeneous even in the sense of statistical typologies.
When these populations are analyzed on their own by structure analysis rather than in a broader geographic context, they manifest the same or even greater internal complexity, with their own local parental populations and admixed individuals. That can be seen for Europe in Figure 3B;8-10 the same has been shown for the Americas11 and Africa,12 among others. It is indeed a strange kind of evolutionary 'race' (use whatever term you wish for it) that depends for its existence on the context in which it is studied. In terms of real history, does anyone truly think that Europe contains the mish-mash of different pure populations shown in Figure 3B or that Africa once had (much less currently has) 14 pure races that have subsequently admixed, as has recently been suggested?12 Does anyone think that if we had sampled the globe thousands of years ago there would have been such subdivided homogeneity within continents? It can hardly be so; certainly there is no precedent for such an expectation. That these as-if statistical digests are treated as reflecting true history can be seen in another example shown in Figure 3C, in which the authors used structure analysis of samples from purportedly unadmixed indigenous parental populations to estimate the admixture fractions of mixed ("mestizo") individuals from Central to South America.13 This bar chart shows the resulting estimated average admixture components of people in each "mestizo" population sample. The authors clearly point out that this shows the sensible pattern of greater affiliation of mixed people with their nearest indigenous neighbors. However, a literal historical interpretation would imply, for example, that people in Mexico City are truly admixed with southern Brazilian natives.
This is an as-if statistical digest, because the 'pure' populations themselves are from small samples from isolated parts of more complex indigenous cultures located hundreds or thousands of kilometers apart, all of adults of comparable age, probably not historically entirely unaffected by post-Columbian genetic interchanges. This is because, among countless others that might have been used, these were the only indigenous populations that were available for analysis. The reason structure analysis can give an illusion of classical types and admixture is that, like hilly topography, the geographic pattern of allele frequency change is continuous, but does not change smoothly like an inclined plane. Structure analysis detects these irregularities on the allele-frequency surface and treats them as-if they were discrete parental populations. But like geological hills, historically these irregularities arose in a continuous process. Here we are discussing the realities of human variation, not making politically correct statements of any kind. In fact, in opposing racist typology it is often said that any two copies of the human genome are more than 99% alike, and that hence we are all effectively alike. As a single species, we are obviously all pretty much alike, morphologically as well as genetically. Racial typologists knew this, but their focus was on the differences. Structure analysis is also a study of differences, though its authors do not use terms like 'type,' 'race,' or 'pure,' and we are not imputing to them any social racism whatsoever. However, it is likely that few users of the programs or readers of the results realize that they are still employing the typological approaches of our once and properly rejected past. 
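The as-if character of an admixture estimate can be seen in a small sketch. It is not the structure program, which uses Bayesian sampling; here a plain expectation-maximization estimate of maximum-likelihood admixture proportions is swapped in, with the 'parental' allele frequencies assumed to be known in advance. The function names, the number of loci, and the frequency ranges are all hypothetical choices made for illustration.

```python
import random


def estimate_admixture(genotype, parental_freqs, n_iter=200):
    """EM estimate of admixture proportions for one individual.

    genotype: allele counts (0, 1, or 2) per locus.
    parental_freqs: one list of allele frequencies per presumed parental
    population, in the same locus order. Returns proportions summing to 1.
    """
    k = len(parental_freqs)
    q = [1.0 / k] * k                      # start from equal proportions
    for _ in range(n_iter):
        totals = [0.0] * k
        for locus, g in enumerate(genotype):
            # g copies carry the counted allele, 2 - g carry the alternative
            for copies, carries_allele in ((g, True), (2 - g, False)):
                if copies == 0:
                    continue
                weights = []
                for j in range(k):
                    # clamp to avoid zero-probability alleles
                    p = min(max(parental_freqs[j][locus], 1e-6), 1 - 1e-6)
                    weights.append(q[j] * (p if carries_allele else 1 - p))
                norm = sum(weights)
                for j in range(k):         # expected origin of these copies
                    totals[j] += copies * weights[j] / norm
        q = [t / sum(totals) for t in totals]
    return q


def draw_genotype(freqs, rng):
    """One individual as a random draw from a single gene pool."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]


if __name__ == "__main__":
    rng = random.Random(7)
    n_loci = 500
    # Two samples taken from opposite ends of a geographic cline.
    end_a = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    end_b = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    # A local population halfway along the cline, with intermediate
    # frequencies and no admixture event anywhere in its simulated history.
    middle = [(a + b) / 2 for a, b in zip(end_a, end_b)]
    person = draw_genotype(middle, rng)
    print(estimate_admixture(person, [end_a, end_b]))
```

In this sketch, an individual from the middle of a smooth cline is reported as roughly half one 'parental' population and half the other, even though nothing in the way the data were generated involved two discrete parents meeting. The estimate is a perfectly good statistical digest of intermediate allele frequencies; it becomes misleading only when read as a record of events.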
Ironically, by ignoring the way population history actually works as one process from a common origin rather than as a string of creation events, structure analysis that seems to present variation in Darwinian evolutionary terms is fundamentally non-Darwinian.14 The ensnarled problems associated with race concepts are not going away anytime soon.15 Race is itself a cultural as-if digest that has important psychological and social meaning for many people. But we should be aware of the difference between what actually was and as-if reconstructions of our history. This is especially true in anthropology and human affairs, where culture and true history are important, the reason being that notions about human variation have substantial societal consequences in terms of access to resources and, sometimes, even survival. Classical social racists are still around, and we can't assume that any of us is immune to what may be a natural tendency to typological thinking. Why does typological thinking persist with so little change over so long a time, even when we understand its problems? In part, the answer probably is the longstanding American interest in estimating the admixture proportions in our African-, Hispanic-, and Native-American subpopulations, which resulted from the mass movement of distant peoples to these shores since Columbus. In this case, the idea of admixture makes intuitive sense. Admixture analysis is also relevant and indeed important in searches for disease-related genes. 
It also can help detect internal structure that could cause false identification of susceptibility genes in case-control studies.7 However, typological concepts that reconstruct scenarios for which there is no direct evidence and plenty of reason to doubt are not needed for understanding or graphically portraying the geographic pattern of the actual evidence of human variation and the evolutionary processes that brought it about.14 We clearly have not shaken a century of attempts to view human variation that changes roughly continuously over geographic space as if it were instead packaged into internally homogeneous products of evolution, with admixture among them. Still, is it just being persnickety to take issue with that? If we are careful to understand that this is an as-if abstraction, does it matter that it's not our real history? Perhaps not, if you are the scientist whose analysis is putting people into packages. But you may feel differently if you're the one being boxed in by somebody else. In that case, in matters of history, history matters.
With Anne Buchanan, KMW maintains a blog on relevant topics at EcoDevoEvo.blogspot.com. We thank Anne, Jennifer Wagner, and John Fleagle, as well as two anonymous reviewers, for helping us develop this manuscript, which has been enabled in part by financial assistance from funds provided to Penn State Evan Pugh professors and by NIH grants MH063749 and MH084995 to support ForSim simulation.