Thursday, October 26, 2006

Population structure and history in East Asia

Archaeological, anatomical, linguistic, and genetic data have suggested that there is an old and significant boundary between the populations of north and south China. We use three human genetic marker systems and one human-carried virus to examine the north/south distinction. We find no support for a major north/south division in these markers; rather, the marker patterns suggest simple isolation by distance.

Evidence for Archaic Asian Ancestry on the Human X Chromosome

The human RRM2P4 pseudogene has a pattern of nucleotide polymorphism that is unlike any locus published to date. A gene tree constructed from a 2.4-kb fragment of the RRM2P4 locus sequenced in a sample of 41 worldwide humans clearly roots in East Asia and has a most-recent common ancestor approximately 2 Myr before present. The presence of this basal lineage exclusively in Asia results in higher nucleotide diversity among non-Africans than among Africans. A global survey of a single-nucleotide polymorphism that is diagnostic for the basal, Asian lineage in 570 individuals shows that it occurs at frequencies up to 53% in south China, whereas only one of 177 surveyed Africans carries this archaic lineage. We suggest that this ancient lineage is a remnant of introgressive hybridization between expanding anatomically modern humans emerging from Africa and archaic populations in Eurasia.

Tuesday, October 17, 2006

Analysis of Mitochondrial DNA Lineages in Yakuts

To study the mitochondrial gene pool structure in Yakuts, polymorphism of mtDNA hypervariable segment I (16,024–16,390) was analyzed in 191 people sampled from the indigenous population of the Sakha Republic. In total, 67 haplotypes of 14 haplogroups were detected. Most (91.6%) haplotypes belonged to haplogroups A, B, C, D, F, G, M*, and Y, which are specific for East Eurasian ethnic groups; 8.4% haplotypes represented Caucasian haplogroups H, HV1, J, T, U, and W. A high frequency of mtDNA types belonging to Asian supercluster M was peculiar for Yakuts: mtDNA types belonging to haplogroup C, D, or G and undifferentiated mtDNA types of haplogroup M (M*) accounted for 81% of all haplotypes. The highest diversity was observed for haplogroups C and D, which comprised respectively 22 (44%) and 18 (30%) haplotypes. Yakuts showed the lowest genetic diversity (H= 0.964) among all Turkic ethnic groups. Phylogenetic analysis testified to common genetic substrate of Yakuts, Mongols, and Central Asian (Kazakh, Kyrgyz, Uighur) populations. Yakuts proved to share 21 (55.5%) mtDNA haplotypes with the Central Asian ethnic groups and Mongols. Comparisons with modern Paleoasian populations (Chukcha, Itelmen, Koryaks) revealed three (8.9%) haplotypes common for Yakuts and Koryaks. The results of mtDNA analysis disagree with the hypothesis of an appreciable Paleoasian contribution to the modern Yakut gene pool.
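A diversity value such as the H quoted above is commonly computed as Nei's unbiased haplotype diversity, H = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not say which estimator was used, so the following is only a minimal sketch under that assumption. The function name and the haplotype labels and counts are hypothetical illustrations, not the Yakut data.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    Assumed estimator (not confirmed by the abstract):
        H = n / (n - 1) * (1 - sum(p_i ** 2))
    where p_i is the sample frequency of haplotype i and n is the sample size.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies (the probability that two draws
    # with replacement carry the same haplotype).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: 10 individuals carrying three made-up haplotypes.
sample = ["hapA"] * 5 + ["hapB"] * 3 + ["hapC"] * 2
print(round(haplotype_diversity(sample), 3))
```

Values near 1 mean that almost every sampled individual carries a different haplotype, so even the "lowest" value reported for a group can still be high in absolute terms.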

Mitochondrial DNA evidence for admixed origins of central Siberian populations

The Yakuts of northeastern Siberia are a Turkic-speaking population of horse- and cattle-breeders surrounded by Tungusic-speaking reindeer-herders and hunter-gatherers. Archaeological and ethnohistorical data suggest that Yakuts stem from a common ancestral population with the Buryats living near Lake Baikal. To address this hypothesis, we obtained sequences of the first hypervariable segment (HV1) of the mitochondrial DNA control region from Yakuts and Buryats and compared these with sequences from other Eurasian populations. The mtDNA results show that the Buryats have close affinities with both Central Asian Turkic groups and Mongols, while the Yakuts have close affinities with northeastern Siberian, Tungusic-speaking Evenks and south Siberian, Turkic-speaking Tuvans. This different ancestry of the Yakuts and the Tuvans (compared with other Turkic-speaking groups) most likely reflects extensive admixture that occurred between Turkic-speaking steppe groups and Evenks as the former migrated into Siberia. Moreover, the Yakuts are unique among Siberian populations in having a high number of haplotypes shared exclusively with Europeans, suggesting, contrary to the historical record, that occasionally Yakut men took Russian women as wives.

Investigating the effects of prehistoric migrations in Siberia: genetic variation and the origins of Yakuts

The Yakuts (also known as Sakha), Turkic-speaking cattle- and horse-breeders, inhabit a vast territory in Central and northeastern Siberia. On the basis of the archaeological, ethnographic and linguistic evidence, they are assumed to have migrated north from their original area of settlement in the vicinity of Lake Baykal in South Siberia under the pressure of the Mongol expansion during the thirteenth to fifteenth century AD. During their initial migration and subsequent expansion, the ancestors of the Yakuts settled in the territory originally occupied by Tungusic- and Uralic-speaking reindeer-herders and hunters. In this paper we use mtDNA and Y-chromosomal analyses to elucidate whether the Yakut immigration and expansion was accompanied by admixture with the indigenous populations of their new area of settlement or whether the Yakuts displaced the original inhabitants without intermarriage. The mtDNA results show a very close affinity of the Yakuts with Central Asian and South Siberian groups, which confirms their southern origin. There is no conclusive evidence for admixture with indigenous populations, though a small amount cannot be excluded on the basis of the mtDNA data alone. The Y-chromosomal results confirm previous findings of a very strong bottleneck in the Yakuts, the age of which is in good accordance with the hypothesis that the Yakuts migrated north under Mongol pressure. Furthermore, the genetic results show that the Yakuts are a very homogenous population, notwithstanding their current spread over a very large territory. This confirms the historical accounts that they spread over their current area of settlement fairly recently.

An Indian Ancestry: a Key for Understanding Human Diversity in Europe and Beyond

A recent African origin of modern humans, although still disputed, is supported now by a majority of genetic studies. To address the question when and where very early diversification(s) of modern humans outside of Africa occurred, we concentrated on the investigation of maternal and paternal lineages of the extant populations of India, southern China, Caucasus, Anatolia and Europe. Through the analyses of about 1000 mtDNA genomes and 400 Y chromosomes from various locations in India we reached the following conclusions, relevant to the peopling of Europe in particular and of the Old World in general. First, we found that the node of the phylogenetic tree of mtDNA, ancestral to more than 90 per cent of the present-day typically European maternal lineages, is present in India at a relatively high frequency. Inferred coalescence time of this ancestral node is slightly above 50,000 BP. Second, we found that haplogroup U is the second most abundant mtDNA variety in India as it is in Europe. Summing up, we believe that there are now enough reasons not only to question a 'recent Indo-Aryan invasion' into India some 4000 BP, but alternatively to consider India as a part of the common gene pool ancestral to the diversity of human maternal lineages in Europe. Our results on Y-chromosomal diversity of various Indian populations support an early split between Indian and east of Indian paternal lineages, while on a surface, Indian (Sanskrit as well as Dravidic speakers) and European Y-chromosomal lineages are much closer than the corresponding mtDNA variants.

Influence of language and ancestry on genetic structure of contiguous populations: A microsatellite based study on populations of Orissa

Background: We have examined genetic diversity at fifteen autosomal microsatellite loci in seven predominant populations of Orissa to decipher whether populations inhabiting the same geographic region can be differentiated on the basis of language or ancestry. The studied populations have diverse historical accounts of their origin, belong to two major ethnic groups and different linguistic families. Caucasoid caste populations are speakers of Indo-European languages and comprise Brahmins, Khandayat, Karan and Gope, while the three Australoid tribal populations include two Austric speakers, Juang and Saora, and a Dravidian-speaking population, Paroja. These divergent groups provide a varied substratum for understanding variation of genetic patterns in a geographical area resulting from differential admixture between migrant groups and aboriginals, and the influence of this admixture on population stratification.

Results: The allele distribution pattern showed uniformity in the studied groups with approximately 81% genetic variability within populations. The coefficient of gene differentiation was found to be significantly higher in tribes (0.014) than caste groups (0.004). Genetic variance between the groups was 0.34% in both ethnic and linguistic clusters and statistically significant only in the ethnic apportionment. Although the populations were genetically close (FST = 0.010), the contemporary caste and tribal groups formed distinct clusters in both Principal-Component plot and Neighbor-Joining tree. In the phylogenetic tree, the Orissa Brahmins showed close affinity to populations of North India, while Khandayat and Gope clustered with the tribal groups, suggesting a possibility of their origin from indigenous people.

Conclusions: The extent of genetic differentiation in the contemporary caste and tribal groups of Orissa is highly significant and constitutes two distinct genetic clusters. Based on our observations, we suggest that since genetic distances and coefficient of gene differentiation were fairly small, the studied populations are indeed genetically similar and that the genetic structure of populations in a geographical region is primarily influenced by their ancestry and not by socio-cultural hierarchy or language. The scenario of genetic structure, however, might be different for other regions of the subcontinent where populations have more similar ethnic and linguistic backgrounds and there might be variations in the patterns of genomic and socio-cultural affinities in different geographical regions.

Molecular insight into the genesis of ranked caste populations of western India based upon polymorphisms across nonrecombinant and recombinant regions in genome


Large-scale trade and cultural contacts between coastal populations of western India and Western-Eurasians paved the way for extensive immigration and the genesis of a wide spectrum of admixed gene pools. To trace admixture and genesis of caste populations of western India, we have examined polymorphisms across non-recombining 20 Y-SNPs, 20 Y-STRs, 18 mtDNA diagnostic sites, HVS-1 plus HVS-2 regions; and recombining 15 highly polymorphic autosomal STRs in four predominant caste populations: upper-ranking Desasth-brahmin and Chitpavan-brahmin; a middle-ranking Kshatriya Maratha; and a lower-rank peasant Dhangar.


The generated genomic data was compared with putative parental populations (Central Asians, West Asians and Europeans) using AMOVA, PC plot, and admixture estimates. Overall, disparate uniparental ancestries, and 1.1% GST value for biparental markers among four studied caste populations linked well with their exchequer demographic histories. Marathi-speaking ancient Desasth-brahmin shows substantial admixture from Central Asian males, but a Paleolithic maternal component supports their Scytho-Dravidian origin. Chitpavan-brahmin demonstrates a younger maternal component and substantial paternal gene flow from West Asia, thus giving credence to their recent Irano-Scythian ancestry from Mediterranean or Turkey, which correlated well with European-looking features of this caste. This also explains their untraceable ethno-history before 1000 years, brahminization event and later amalgamation by Maratha. The widespread Palaeolithic mtDNA haplogroups in Maratha and Dhangar highlight their shared Proto-Asian ancestries. Maratha males harboured Anatolian-derived J2 lineage corroborating the blending of farming communities. Dhangar heterogeneity is ascribable to predominantly South-Asian males and West-Eurasian females.


The genomic data-sets of this study provide ample genomic evidence of diverse origins of four ranked castes and synchronization of caste stratification with asymmetrical gene flows from Indo-European migration during Upper Paleolithic, Neolithic, and later dates. However, subsequent gene flows among these castes living in geographical proximity have diminished significant genetic differentiation as indicated by AMOVA and structure.

Genetic affinities among the lower castes and tribal groups of India: inference from Y chromosome and mitochondrial DNA

India is a country with enormous social and cultural diversity due to its positioning on the crossroads of many historic and pre-historic human migrations. The hierarchical caste system in the Hindu society dominates the social structure of the Indian populations. The origin of the caste system in India is a matter of debate with many linguists and anthropologists suggesting that it began with the arrival of Indo-European speakers from Central Asia about 3500 years ago. Previous genetic studies based on Indian populations failed to achieve a consensus in this regard. We analysed the Y-chromosome and mitochondrial DNA of three tribal populations of southern India, compared the results with available data from the Indian subcontinent and tried to reconstruct the evolutionary history of Indian caste and tribal populations.

No significant difference was observed in the mitochondrial DNA between Indian tribal and caste populations, except for the presence of a higher frequency of west Eurasian-specific haplogroups in the higher castes, mostly in the north western part of India. On the other hand, the study of the Indian Y lineages revealed distinct distribution patterns among caste and tribal populations. The paternal lineages of Indian lower castes showed significantly closer affinity to the tribal populations than to the upper castes. The frequencies of deep-rooted Y haplogroups such as M89, M52, and M95 were higher in the lower castes and tribes, compared to the upper castes.

The present study suggests that the vast majority (>98%) of the Indian maternal gene pool, consisting of Indo-European and Dravidian speakers, is genetically more or less uniform. Invasions after the late Pleistocene settlement might have been mostly male-mediated. However, Y-SNP data provides compelling genetic evidence for a tribal origin of the lower caste populations in the subcontinent. Lower caste groups might have originated with the hierarchical divisions that arose within the tribal groups with the spread of Neolithic agriculturalists, much earlier than the arrival of Aryan speakers. The Indo-Europeans established themselves as upper castes among this already developed caste-like class structure within the tribes.

Genetic demography of Antioquia (Colombia) and the Central Valley of Costa Rica

We report a comparative genetic characterization of two population isolates with parallel demographic histories: the Central Valley of Costa Rica (CVCR) and Antioquia (in northwest Colombia). The analysis of mtDNA, Y-chromosome and autosomal polymorphisms shows that Antioquia and the CVCR are genetically very similar, indicating that closely related parental populations founded these two isolates. In both populations, the male ancestry is predominantly European, whereas the female ancestry is mostly Amerind. In agreement with their isolation, the Amerindian mtDNA diversity of Antioquia and the CVCR is typical of ethnically-defined native populations and is markedly lower than in other Latin American populations. A comparison of linkage disequilibrium (LD) at 18 marker pairs in Antioquia and the CVCR shows that markers in LD in both populations are located at short genetic distances (<~1 cM), whereas markers separated by greater distances are in LD only in the CVCR. This difference probably reflects stochastic variation of LD at the limited number of genome regions compared. The genetic similarity of the populations from Antioquia and the CVCR together with differences in LD between them should be exploitable for the identification and fine mapping of shared disease-related gene variants.

The Evolution and Genetics of Latin American Populations

Demography, genetic diversity, and population relationships among Argentinean Mapuche Indians

Fertility, mortality and migration data from four Mapuche Indian communities located along a 215-km NE-SW linear area in the Province of Río Negro, Argentina, were collated with genetic information furnished by nine blood group systems and by mtDNA haplogroups. The demographic and genetic data indicated a clear dichotomy, which split the four populations into two groups of two. Differing degrees of non-Indian exchange were probably the main determining factor for this separation. Total genetic variability was very similar in all groups, and the interpopulational variability accounted for only 10% of the total variability. A low prevalence of the Diego(a) antigen among the Mapuche was confirmed. The fact that significant genetic heterogeneity and population clusters were found in such a small territorial region attests to the sensitivity of demographic and genetic approaches in unraveling human history.

Admixture dynamics in Hispanics: A shift in the nuclear genetic ancestry of a South American population isolate

Although it is well established that Hispanics generally have a mixed Native American, African, and European ancestry, the dynamics of admixture at the foundation of Hispanic populations is heterogeneous and poorly documented. Genetic analyses are potentially very informative for probing the early demographic history of these populations. Here we evaluate the genetic structure and admixture dynamics of a province in northwest Colombia (Antioquia), which prior analyses indicate was founded mostly by Spanish men and native women. We examined surname, Y chromosome, and mtDNA diversity in a geographically structured sample of the region and obtained admixture estimates with highly informative autosomal and X chromosome markers. We found evidence of reduced surname diversity and support for the introduction of several common surnames by single founders, consistent with the isolation of Antioquia after the colonial period. Y chromosome and mtDNA data indicate little population substructure among founder Antioquian municipalities. Interestingly, despite a nearly complete Native American mtDNA background, Antioquia has a markedly predominant European ancestry at the autosomal and X chromosome level, which suggests that, after foundation, continuing admixture with Spanish men (but not with native women) increased the European nuclear ancestry of Antioquia. This scenario is consistent with historical information and with results from population genetics theory.
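The reasoning in this abstract (a nearly all-native mtDNA pool alongside a mostly European nuclear genome) can be illustrated with a very simple recurrence. The sketch below is not the authors' model: it assumes, purely for illustration, that every generation all mothers come from the local population while a fixed fraction of fathers are new European immigrants. The function name, the starting ancestry of 0.5, and the immigration rate of 0.1 are hypothetical values chosen for the example.

```python
def autosomal_european_ancestry(initial, male_immigration, generations):
    """Mean autosomal European ancestry under male-only immigration.

    initial          -- European ancestry fraction in the founding
                        generation (hypothetical input)
    male_immigration -- fraction of fathers per generation who are new
                        European immigrants (hypothetical input)
    Returns a list with the ancestry fraction for generation 0..generations.
    """
    if not 0.0 <= initial <= 1.0 or not 0.0 <= male_immigration <= 1.0:
        raise ValueError("fractions must lie between 0 and 1")
    ancestry = initial
    history = [ancestry]
    for _ in range(generations):
        # Mothers are all local, so they carry the current mean ancestry.
        mothers = ancestry
        # Fathers are a mix of immigrants (ancestry 1.0) and local men.
        fathers = male_immigration + (1.0 - male_immigration) * ancestry
        # Each child receives half of its autosomes from each parent.
        ancestry = 0.5 * (mothers + fathers)
        history.append(ancestry)
    return history


# Hypothetical scenario: founders are European men and native women
# (autosomal ancestry 0.5), then 10% immigrant fathers per generation.
history = autosomal_european_ancestry(0.5, 0.1, 10)
for generation, value in enumerate(history):
    print(generation, round(value, 3))
```

Under these assumptions the autosomal European fraction climbs every generation, while the mtDNA pool stays as native as it was at foundation because no immigrant women enter the population. The X chromosome would need separate bookkeeping for males and females and is left out of this sketch.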

