Genetic Chaos

Wednesday, March 23, 2005

Genetic Structure of Human Populations

We studied human population structure using genotypes at 377 autosomal microsatellite loci in 1056 individuals from 52 populations. Within-population differences among individuals account for 93 to 95% of genetic variation; differences among major groups constitute only 3 to 5%. Nevertheless, without using prior information about the origins of individuals, we identified six main genetic clusters, five of which correspond to major geographic regions, and subclusters that often correspond to individual populations. General agreement of genetic and predefined populations suggests that self-reported ancestry can facilitate assessments of epidemiological risks but does not obviate the need to use genetic information in genetic association studies.

PDF file

Supplementary information
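
To make the "93 to 95% within populations" figure in the abstract above concrete, here is a minimal sketch of how genetic variation can be apportioned at a single locus. It assumes a biallelic locus and equally weighted populations, and uses the classical heterozygosity-based Fst. The allele frequencies are hypothetical; the paper itself analyzed multiallelic microsatellites, and this is not its method.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(freqs: list[float]) -> float:
    """Fst = (H_T - H_S) / H_T, with all populations weighted equally."""
    h_s = sum(heterozygosity(p) for p in freqs) / len(freqs)  # mean within-population
    p_bar = sum(freqs) / len(freqs)                           # pooled allele frequency
    h_t = heterozygosity(p_bar)                               # total heterozygosity
    if h_t == 0.0:
        raise ValueError("locus is monomorphic in the pooled sample")
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two populations (not from the paper).
hypothetical_freqs = [0.4, 0.6]
among = fst(hypothetical_freqs)
print(f"among populations: {among:.1%}, within populations: {1 - among:.1%}")
```

Even a frequency difference as visible as 0.4 versus 0.6 leaves only about 4% of the variation among populations, which is why a small among-group share and clearly detectable clusters are not contradictory.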

DNA Polymorphism in a Worldwide Sample of Human X Chromosomes

DNA sequence data from humans can provide insight into the history of modern humans and the genetic variability in human populations. We report here a study of human DNA sequence variation at an X-linked noncoding region of 10,346 bp. The sample consists of 62 X chromosomes from Africa, Europe, and Asia. Forty-four polymorphic sites were found among the 62 sequences, resulting in 23 different haplotypes. Statistical analyses of the data led to the following inferences. (1) There is strong evidence of human population expansion in the relatively recent past, and this population expansion has had a significant effect on the pattern of polymorphism at this locus. (2) Non-African populations were unlikely to have been derived from a very small number of African lineages. (3) There was considerable geographic subdivision in the ancient human population, which could be an important reason why many studies failed to detect population expansion. (4) The long-term effective population size of humans is between 12,000 and 15,000. And (5) a non-African-specific variant was found at a frequency of 35% in non-Africans, an estimate supported by the genotyping of an additional 80 non-African and 106 African X chromosomes. This variant could have arisen in Eurasia more than 140,000 years ago, predating the emergence of modern humans. Moreover, this haplotype and all other haplotypes coalesced to the most recent common ancestor of the sample, which was estimated to be older than 490,000 years. Therefore, this region may have a long history in Eurasia.

PDF file
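
The counts quoted in the abstract above (44 polymorphic sites, 62 sequences, 10,346 bp) are enough to compute one standard diversity summary, Watterson's estimator of theta. The sketch below treats every polymorphic site as a single segregating site, which is an assumption, and it is not necessarily the estimator the authors used.

```python
def harmonic_number(n: int) -> float:
    """Sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))


def watterson_theta(segregating_sites: int, n_sequences: int, length_bp: int):
    """Return Watterson's theta per locus and per site.

    theta_W = S / a_n, where a_n = sum_{i=1}^{n-1} 1/i.
    """
    a_n = harmonic_number(n_sequences - 1)
    theta_locus = segregating_sites / a_n
    return theta_locus, theta_locus / length_bp


# Counts taken from the abstract: 44 sites, 62 chromosomes, 10,346 bp.
theta_locus, theta_site = watterson_theta(44, 62, 10_346)
print(f"theta per locus: {theta_locus:.2f}")
print(f"theta per site:  {theta_site:.2e}")
```

Turning theta into an effective population size would additionally require a mutation rate and the X-linked scaling (theta = 3 * Ne * mu), neither of which is given in the abstract, so that step is left out here.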

X chromosome evidence for ancient human histories

Diverse African and non-African samples of the X-linked PDHA1 (pyruvate dehydrogenase E1 alpha subunit) locus revealed a fixed DNA sequence difference between the two sample groups. The age of onset of population subdivision appears to be about 200 thousand years ago. This predates the earliest modern human fossils, suggesting the transformation to modern humans occurred in a subdivided population. The base of the PDHA1 gene tree is relatively ancient, with an estimated age of 1.86 million years, a late Pliocene time associated with early species of Homo. PDHA1 revealed very low variation among non-Africans, but in other respects the data are consistent with reports from other X-linked and autosomal haplotype data sets. Like these other genes, but in conflict with microsatellite and mitochondrial data, PDHA1 does not show evidence of human population expansion.

PDF file

Genetic Structure, Self-Identified Race/Ethnicity, and Confounding in Case-Control Association Studies

We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multiethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with self-identified race/ethnicity — as opposed to current residence — is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.

PDF file

X-chromosome as a marker for population history: linkage disequilibrium and haplotype study in Eurasian populations

Linkage disequilibrium (LD) structure is still unpredictable because the interplay of regional recombination rate and demographic history is poorly understood. We have compared the distribution of LD across two genomic regions differing in crossing-over activity – Xq13 (0.166 cM/Mb) and Xp22 (1.3 cM/Mb) – in 15 Eurasian populations. Demographic events predicted to increase the LD level – genetic drift, bottleneck and admixture – had a very strong impact on extent and patterns of regional LD across Xq13 compared to Xp22. The haplotype distribution of the DXS1225–DXS8082 microsatellites from Xq13 exhibiting strong association in all populations was remarkably influenced by population history. European populations shared one common haplotype with a frequency of 25–40%. The Volga-Ural populations studied, living at the geographic borderline of Europe, showed elevated LD and harbored a significant fraction of haplotypes originating from East Asia, thus reflecting their past migrations and admixture. In the young Kuusamo isolate from Finland, a bottleneck has led to allelic associations between loci and shifted the haplotype distribution, but has affected single microsatellite allele frequencies much less, compared to the main Finnish population. The data show that the footprint of a demographic event is preserved longer in the haplotype distribution within a region of low crossing-over rate than in the information content of a single marker or between actively recombining markers. As the knowledge of LD patterns is often chosen to assist association mapping of common disease, our conclusions emphasize the importance of understanding the history, structure and variation of a study population.

PDF file
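
The claim in the abstract above, that a demographic footprint lasts longer where crossing-over is rare, follows from the textbook decay of LD, D_t = D_0 (1 - c)^t. The sketch below plugs in the two recombination rates quoted in the abstract. The 1 Mb marker spacing and the 100 generations are hypothetical, the conversion c ≈ cM/100 only holds for short distances, and the fact that the X chromosome recombines only in females is ignored.

```python
def ld_remaining(rate_cm_per_mb: float, distance_mb: float, generations: int) -> float:
    """Fraction of initial LD (D_t / D_0) left after the given number of generations."""
    # For small map distances, 1 cM corresponds to a recombination fraction of about 0.01.
    c = rate_cm_per_mb * distance_mb / 100.0
    return (1.0 - c) ** generations


# Rates from the abstract; spacing and time span are illustrative assumptions.
SPACING_MB = 1.0
GENERATIONS = 100

xq13 = ld_remaining(0.166, SPACING_MB, GENERATIONS)
xp22 = ld_remaining(1.3, SPACING_MB, GENERATIONS)
print(f"Xq13: {xq13:.2f} of initial LD remains")
print(f"Xp22: {xp22:.2f} of initial LD remains")
```

Under these assumptions most of the initial association survives in the low-recombination region, while most of it is gone in the high-recombination one.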

Tuesday, March 08, 2005

Genetic Evidence for the Expansion of Arabian Tribes into the Southern Levant and North Africa

In a recent publication, Bosch et al. (2001) reported on Y-chromosome variation in populations from northwestern (NW) Africa and the Iberian peninsula. They observed a high degree of genetic homogeneity among the NW African Y chromosomes of Moroccan Arabs, Moroccan Berbers, and Saharawis, leading the authors to hypothesize that "the Arabization and Islamization of NW Africa, starting during the 7th century AD, ... [were] cultural phenomena without extensive genetic replacement" (p. 1023). H71 (Eu10) was found to be the second-most-frequent haplogroup in that area. Following the hypothesis of Semino et al. (2000), the authors suggested that this haplogroup had spread out from the Middle East with the Neolithic wave of advance. Our recent findings (Nebel et al. 2000, 2001), however, suggest that the majority of Eu10 chromosomes in NW Africa are due to recent gene flow caused by the migration of Arabian tribes in the first millennium of the Common Era (CE).

PDF file

Ancient mtDNA analysis and the origin of the Guanches

The prehistoric colonisation of the Canary Islands by the Guanches (native Canarians) has aroused great interest in their origin ever since the European conquest of the Archipelago. Here, we report mitochondrial DNA analysis (HVRI sequences and RFLPs) of aboriginal remains around 1000 years old. The sequences retrieved show that the Guanches possessed U6b1 lineages that are found in the present-day Canarian population, but not in Africans. In turn, U6b, the phylogenetically closest ancestor found in Africa, is not present in the Canary Islands. Comparisons with other populations relate the Guanches to the current inhabitants of the Archipelago and to Moroccan Berbers. This shows that, despite the continuous changes undergone by the population (Spanish colonisation, slave trade), aboriginal mtDNA lineages constitute a considerable proportion of the Canarian gene pool. Although the Berbers are the most probable ancestors of the Guanches, it is deduced that important human movements have reshaped Northwest Africa after the migratory wave to the Canary Islands.

PDF file

Palaeolithic Populations and Waves of Advance

The wave-of-advance model has been previously applied to Neolithic human range expansions, yielding good agreement to the speeds inferred from archaeological data. Here, we apply it for the first time to Palaeolithic human expansions by using reproduction and mobility parameters appropriate to hunter-gatherers (instead of the corresponding values for preindustrial farmers). The order of magnitude of the predicted speed is in agreement with that implied by the AMS radiocarbon dating of the late glacial human recolonization of northern Europe (14.2–12.5 kyr BP). We argue that this makes it implausible for climate change to have limited the speed of the recolonization front. It is pointed out that a similar value for the speed can be tentatively inferred from the archaeological data on the expansion of modern humans into the Levant and Europe (42–36 kyr BP).

PDF file
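
For readers unfamiliar with wave-of-advance models, the classical starting point is Fisher's front speed, v = 2 * sqrt(a * D), where a is the population growth rate and D the diffusion coefficient. The sketch below uses that classical formula only; the paper's own model may include further corrections, and the parameter values are hypothetical placeholders rather than the hunter-gatherer values the authors used.

```python
from math import sqrt


def fisher_front_speed(growth_rate_per_yr: float, diffusion_km2_per_yr: float) -> float:
    """Classical Fisher wave-of-advance speed in km/yr: v = 2 * sqrt(a * D)."""
    if growth_rate_per_yr < 0 or diffusion_km2_per_yr < 0:
        raise ValueError("growth rate and diffusion coefficient must be non-negative")
    return 2.0 * sqrt(growth_rate_per_yr * diffusion_km2_per_yr)


# Hypothetical parameters, chosen only to show the calculation.
a = 0.02   # intrinsic growth rate, 1/yr (assumption)
D = 50.0   # diffusion coefficient, km^2/yr (assumption)

speed = fisher_front_speed(a, D)
print(f"front speed: {speed:.2f} km/yr")
```

Because the speed depends on the square root of the product a * D, a fourfold change in either parameter only doubles or halves the predicted front speed.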

A Predominantly Neolithic Origin for Y-Chromosomal DNA Variation in North Africa

We have typed 275 men from five populations in Algeria, Tunisia, and Egypt with a set of 119 binary markers and 15 microsatellites from the Y chromosome, and we have analyzed the results together with published data from Moroccan populations. North African Y-chromosomal diversity is geographically structured and fits the pattern expected under an isolation-by-distance model. Autocorrelation analyses reveal an east-west cline of genetic variation that extends into the Middle East and is compatible with a hypothesis of demic expansion. This expansion must have involved relatively small numbers of Y chromosomes to account for the reduction in gene diversity towards the West that accompanied the frequency increase of Y haplogroup E3b2, but gene flow must have been maintained to explain the observed pattern of isolation-by-distance. Since the estimates of the times to the most recent common ancestor (TMRCAs) of the most common haplogroups are quite recent, we suggest that the North African pattern of Y-chromosomal variation is largely of Neolithic origin. Thus, we propose that the Neolithic transition in this part of the world was accompanied by demic diffusion of Afro-Asiatic speaking pastoralists from the Middle East.

PDF file

Tuesday, March 01, 2005

Y-Chromosomal Diversity in Europe Is Clinal and Influenced Primarily by Geography, Rather than by Language

Clinal patterns of autosomal genetic diversity within Europe have been interpreted in previous studies in terms of a Neolithic demic diffusion model for the spread of agriculture; in contrast, studies using mtDNA have traced many founding lineages to the Paleolithic and have not shown strongly clinal variation. We have used 11 human Y-chromosomal biallelic polymorphisms, defining 10 haplogroups, to analyze a sample of 3,616 Y chromosomes belonging to 47 European and circum-European populations. Patterns of geographic differentiation are highly non-random, and, when they are assessed using spatial autocorrelation analysis, they show significant clines for five of six haplogroups analyzed. Clines for two haplogroups, representing 45% of the chromosomes, are continent-wide and consistent with the demic diffusion hypothesis. Clines for three other haplogroups each have different foci and are more regionally restricted and are likely to reflect distinct population movements, including one from north of the Black Sea. Principal components analysis suggests that populations are related primarily on the basis of geography, rather than on the basis of linguistic affinity. This is confirmed in Mantel tests, which show a strong and highly significant partial correlation between genetics and geography but a low, nonsignificant partial correlation between genetics and language. Genetic barrier analysis also indicates the primacy of geography in the shaping of patterns of variation. These patterns retain a strong signal of expansion from the Near East but also suggest that the demographic history of Europe has been complex and influenced by other major population movements, as well as by linguistic and geographic heterogeneities and the effects of drift.

PDF file
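
The Mantel tests mentioned in the abstract above compare two distance matrices by permuting the population labels of one of them. Below is a minimal sketch of a simple (not partial) Mantel test in plain Python. The paper reports partial correlations that also control for a third matrix, which this sketch does not do, and the toy distance matrices are hypothetical.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after relabeling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(dist_x, dist_y, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    identity = list(range(len(dist_x)))
    y_vals = upper_triangle(dist_y, identity)
    r_obs = pearson(upper_triangle(dist_x, identity), y_vals)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of the first matrix
        if pearson(upper_triangle(dist_x, order), y_vals) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five populations on a line, genetic distance
# exactly proportional to geographic distance.
geo = [[abs(i - j) for j in range(5)] for i in range(5)]
gen = [[2 * abs(i - j) for j in range(5)] for i in range(5)]
r, p = mantel_test(gen, geo)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Whole rows and columns are permuted together, rather than individual matrix entries, because pairwise distances that share a population are not independent of one another.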

Phylogeography of Y-Chromosome Haplogroup I Reveals Distinct Domains of Prehistoric Gene Flow in Europe

To investigate which aspects of contemporary human Y-chromosome variation in Europe are characteristic of primary colonization, late-glacial expansions from refuge areas, Neolithic dispersals, or more recent events of gene flow, we have analyzed, in detail, haplogroup I (Hg I), the only major clade of the Y phylogeny that is widespread over Europe but virtually absent elsewhere. The analysis of 1,104 Hg I Y chromosomes, which were identified in the survey of 7,574 males from 60 population samples, revealed several subclades with distinct geographic distributions. Subclade I1a accounts for most of Hg I in Scandinavia, with a rapidly decreasing frequency toward both the East European Plain and the Atlantic fringe, but microsatellite diversity reveals that France could be the source region of the early spread of both I1a and the less common I1c. Also, I1b*, which extends from the eastern Adriatic to eastern Europe and declines noticeably toward the southern Balkans and abruptly toward the periphery of northern Italy, probably diffused after the Last Glacial Maximum from a homeland in eastern Europe or the Balkans. In contrast, I1b2 most likely arose in southern France/Iberia. Similarly to the other subclades, it underwent a postglacial expansion and marked the human colonization of Sardinia ~9,000 years ago.

PDF file

Towards the understanding of post-glacial spread of human mitochondrial DNA haplogroups in Europe and beyond: a phylogeographic approach

PDF file

The Etruscans: A Population-Genetic Study

The origins of the Etruscans, a non-Indo-European population of preclassical Italy, are unclear. There is broad agreement that their culture developed locally, but the Etruscans’ evolutionary and migrational relationships are largely unknown. In this study, we determined mitochondrial DNA sequences in multiple clones derived from bone samples of 80 Etruscans who lived between the 7th and the 3rd centuries B.C. In the first phase of the study, we eliminated all specimens for which any of nine tests for validation of ancient DNA data raised the suspicion that either degradation or contamination by modern DNA might have occurred. On the basis of data from the remaining 30 individuals, the Etruscans appeared as genetically variable as modern populations. No significant heterogeneity emerged among archaeological sites or time periods, suggesting that different Etruscan communities shared not only a culture but also a mitochondrial gene pool. Genetic distances and sequence comparisons show closer evolutionary relationships with the eastern Mediterranean shores for the Etruscans than for modern Italian populations. All mitochondrial lineages observed among the Etruscans appear typically European or West Asian, but only a few haplotypes were found to have an exact match in a modern mitochondrial database, raising new questions about the Etruscans’ fate after their assimilation into the Roman state.

PDF file

Clinal patterns of human Y chromosomal diversity in continental Italy and Greece are dominated by drift and founder effects

We explored the spatial distribution of human Y chromosomal diversity on a microgeographic scale, by typing 30 population samples from closely spaced locations in Italy and Greece for 9 haplogroups and their internal microsatellite variation. We confirm a significant difference in the composition of the Y chromosomal gene pools of the two countries. However, within each country, heterogeneity is not organized along the lines of clinal variation deduced from studies on larger spatial scales. Microsatellite data indicate that local increases of haplogroup frequencies can be often explained by a limited number of founders. We conclude that local founder or drift effects are the main determinants in shaping the microgeographic Y chromosomal diversity.

PDF file