Genome-wide distribution of linkage disequilibrium in the population of Palau and its implications for gene flow in Remote Oceania
Linkage disequilibrium (LD) between alleles on the same human chromosome results from various evolutionary processes and is thus informative about the history of populations. Recently, LD has garnered substantial interest for its value in mapping and fine-mapping disease genes. We examine the distribution of LD between short tandem repeat alleles on autosomes and sex chromosomes in the Remote Oceanic population of Palau to evaluate whether the data are consistent with a recent hypothesis about the origins of genetic variation in Palau, specifically that the population experienced extensive male-biased gene flow following initial settlement. Consistent with evolutionary theory based on effective population size, LD between X-linked alleles is stochastically greater than LD between autosomal alleles; however, small but detectable LD occurs for autosomal markers separated by substantial distances. By contrast, while Y-linked alleles experience only one third the effective population size of X-linked alleles, their mean value for pairwise LD is only slightly larger than that of X-linked alleles. For a small population known to have experienced at least two extreme bottlenecks, 56 six-locus Y haplotypes exhibit remarkable diversity (0.96), comparable to the Y diversity of Europeans; however, autosomal and X-linked markers display significantly less diversity, as measured by heterozygosity (4.1% less). Palauan Y haplotypes also fall into distinct clusters, again unlike those of Europe. We argue these data are consistent with waves of male-biased gene flow.
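As an illustration of the kind of haplotype diversity figure quoted above, the following is a minimal sketch of Nei's unbiased gene diversity, n/(n-1) × (1 − Σ pᵢ²), computed over multi-locus haplotypes. The abstract does not say which estimator the authors used, so the choice of estimator is an assumption, and the function name `gene_diversity` and the toy haplotypes are hypothetical rather than taken from the Palau data.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity for a sample of haplotypes.

    Each haplotype is any hashable value, e.g. a tuple of STR repeat
    counts, one entry per locus. Assumed estimator, not necessarily
    the one used in the paper.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy sample: four six-locus Y-STR haplotypes,
# two of which are identical.
toy_sample = [
    (14, 12, 23, 10, 13, 11),
    (14, 12, 23, 10, 13, 11),
    (15, 12, 24, 10, 13, 11),
    (14, 13, 23, 11, 14, 12),
]
toy_diversity = gene_diversity(toy_sample)
```

The value approaches 1 when nearly every sampled chromosome carries a distinct haplotype and falls to 0 when all chromosomes share one haplotype.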
Heterogeneous Patterns of Variation Among Multiple Human X-Linked Loci: The Possible Role of Diversity-Reducing Selection in Non-Africans
Studies of human DNA sequence polymorphism reveal a range of diversity patterns throughout the genome. This variation among loci may be due to natural selection, demographic influences, and/or different sampling strategies. Here we build on a continuing study of noncoding regions on the X chromosome in a panel of 41 globally sampled humans representing African and non-African populations by examining patterns of DNA sequence variation at four loci (APXL, AMELX, TNFSF5, and RRM2P4) and comparing these patterns with those previously reported at six loci in the same panel of 41 individuals. We also include comparisons with patterns of noncoding variation seen at five additional X-linked loci that were sequenced in similar global panels. We find that, while almost all loci show a reduction in non-African diversity, the magnitude of the reduction varies substantially across loci. The large observed variance in non-African levels of diversity results in the rejection of a neutral model of molecular evolution with a multi-locus HKA test under both a constant size and a bottleneck model. In non-Africans, some loci harbor an excess of rare mutations over neutral equilibrium predictions, while other loci show no such deviation in the distribution of mutation frequencies. We also observe a positive relationship between recombination rate and frequency spectra in our non-African, but not in our African, sample. These results indicate that a simple out-of-Africa bottleneck model is not sufficient to explain the observed patterns of sequence variation and that diversity-reducing selection acting at a subset of loci and/or a more complex neutral model must be invoked.
Gene Losses during Human Origins
Pseudogenization is a widespread phenomenon in genome evolution, and it has been proposed to serve as an engine of evolutionary change, especially during human origins (the “less-is-more” hypothesis). However, there has been no comprehensive analysis of human-specific pseudogenes. Furthermore, it is unclear whether pseudogenization itself can be selectively favored and thus play an active role in human evolution. Here we conduct a comparative genomic analysis and a literature survey to identify 80 nonprocessed pseudogenes that were inactivated in the human lineage after its separation from the chimpanzee lineage. Many functions are involved among these genes, with chemoreception and immune response being outstandingly overrepresented, suggesting potential species-specific features in these aspects of human physiology. To explore the possibility of adaptive pseudogenization, we focus on CASPASE12, a cysteinyl aspartate proteinase participating in inflammatory and innate immune response to endotoxins. We provide population genetic evidence that the nearly complete fixation of a null allele at CASPASE12 has been driven by positive selection, probably because the null allele confers protection from severe sepsis. We estimate that the selective advantage of the null allele is about 0.9% and the pseudogenization started shortly before the out-of-Africa migration of modern humans. Interestingly, two other genes related to sepsis were also pseudogenized in humans, possibly by selection. These adaptive gene losses might have occurred because of changes in our environment or genetic background that altered the threat from or response to sepsis. The identification and analysis of human-specific pseudogenes open the door for understanding the roles of gene losses in human origins, and the demonstration that gene loss itself can be adaptive supports and extends the “less-is-more” hypothesis.
Disentangling the Effects of Demography and Selection in Human History
Demographic events affect all genes in a genome, whereas natural selection has only local effects. Using publicly available data from 151 loci sequenced in both European-American and African-American populations, we attempt to distinguish the effects of demography and selection. To analyze large sets of population genetic data such as this one, we introduce “Perlymorphism,” a Unix-based suite of analysis tools. Our analyses show that the demographic histories of human populations can account for a large proportion of effects on the level and frequency of variation across the genome. The African-American population shows both a higher level of nucleotide diversity and more negative values of Tajima’s D statistic than does a European-American population. Using coalescent simulations, we show that the significantly negative values of the D statistic in African-Americans and the positive values in European-Americans are well explained by relatively simple models of population admixture and bottleneck, respectively. Working within these nonequilibrium frameworks, we are still able to show deviations from neutral expectations at a number of loci, including ABO and TRPV6. In addition, we show that the frequency spectrum of mutations (corrected for levels of polymorphism) is correlated with recombination rate only in European-Americans. These results are consistent with repeated selective sweeps in non-African populations, in agreement with recent reports using microsatellite data.
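For readers unfamiliar with the statistic discussed above, here is a minimal sketch of Tajima's D following the standard formula from Tajima (1989). It is not the Perlymorphism implementation; the function name `tajimas_d` and the toy alignments are hypothetical, and the input is assumed to be a list of equal-length aligned sequences with no missing data.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gaps assumed)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    seg_sites = 0       # S: number of segregating sites
    pair_diffs = 0.0    # total pairwise differences summed over sites
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pair_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    pi = pair_diffs / (n * (n - 1) / 2)  # mean pairwise differences

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignments.
# Only singleton variants: an excess of rare alleles gives D < 0.
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
# Variants at intermediate frequency give D > 0.
d_intermediate = tajimas_d(["AA", "AA", "TT", "TT"])
```

Negative values reflect an excess of rare variants and positive values an excess of intermediate-frequency variants, which is why the abstract can attribute the contrasting signs in the two samples to different demographic models.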
Genome Scans of DNA Variability in Humans Reveal Evidence for Selective Sweeps Outside of Africa
The last 50,000-150,000 years of human history have been characterized by rapid demographic expansions and the colonization of novel environments outside of sub-Saharan Africa. Mass migrations outside the ancestral species range likely entailed many new selection pressures, suggesting that genetic adaptation to local environmental conditions may have been more prevalent in colonizing populations outside of sub-Saharan Africa. Here we report a test of this hypothesis using genome-wide patterns of DNA polymorphism. We conducted a multilocus scan of microsatellite variability to identify regions of the human genome that may have been subject to continent-specific hitchhiking events. Using published polymorphism data for a total of 624 autosomal loci in multiple populations of humans, we used coalescent simulations to identify candidate loci for geographically restricted selective sweeps. We identified a total of 13 loci that appeared as outliers in replicated population comparisons involving different reference samples for Africa. A disproportionate number of these loci exhibited reduced levels of relative variability in non-African populations alone, suggesting that recent episodes of positive selection have been more prevalent outside of sub-Saharan Africa.