Eighteen binary polymorphisms and 16 multiallelic short-tandem-repeat (STR) loci from the nonrecombining portion of the human Y chromosome were typed in 718 male subjects belonging to 12 ethnic groups of Pakistan. These markers identified 11 stable haplogroups and 503 combined binary-marker/STR haplotypes. Haplogroup frequencies were generally similar to those in neighboring geographical areas, and the Pakistani populations speaking a language isolate (the Burushos), a Dravidian language (the Brahui), or a Sino-Tibetan language (the Balti) resembled the Indo-European-speaking majority. Nevertheless, median-joining networks of haplotypes revealed considerable substructuring of Y variation within Pakistan, with many populations showing distinct clusters of haplotypes. These patterns can be accounted for by a common pool of Y lineages, with substantial isolation between populations and drift in the smaller ones. Few comparative genetic or historical data are available for most populations, but the results can be compared with oral traditions about origins. The Y data support the well-established origin of the Parsis in Iran, the suggested descent of the Hazaras from Genghis Khan's army, and the origin of the Negroid Makrani in Africa, but do not support traditions of Tibetan, Syrian, Greek, or Jewish origins for other populations.
Polarity and Temporality of High-Resolution Y-Chromosome Distributions in India Identify Both Indigenous and Exogenous Expansions and Reveal Minor Genetic Influence of Central Asian Pastoralists
Although considerable cultural impact on social hierarchy and language in South Asia is attributable to the arrival of nomadic Central Asian pastoralists, genetic data (mitochondrial and Y chromosomal) have yielded dramatically conflicting inferences on the genetic origins of tribes and castes of South Asia. We sought to resolve this conflict, using high-resolution data on 69 informative Y-chromosome binary markers and 10 microsatellite markers from a large set of geographically, socially, and linguistically representative ethnic groups of South Asia. We found that the influence of Central Asia on the pre-existing gene pool was minor. The ages of accumulated microsatellite variation in the majority of Indian haplogroups exceed 10,000–15,000 years, which attests to the antiquity of regional differentiation. Therefore, our data do not support models that invoke a pronounced recent genetic input from Central Asia to explain the observed genetic variation in South Asia. R1a1 and R2 haplogroups indicate demographic complexity that is inconsistent with a recent single history. Associated microsatellite analyses of the high-frequency R1a1 haplogroup chromosomes indicate independent recent histories of the Indus Valley and the peninsular Indian region. Our data are also more consistent with a peninsular origin of Dravidian speakers than with a source close to the Indus and with significant genetic input resulting from demic diffusion associated with agriculture. Our results underscore the importance of marker ascertainment for distinguishing phylogenetic terminal branches from basal nodes when attributing ancestral composition and temporality to either indigenous or exogenous sources. Our reappraisal indicates that pre-Holocene and Holocene-era expansions, not Indo-European ones, have shaped the distinctive South Asian Y-chromosome landscape.