DISSERTATION RESEARCH: Validation of positive selection on gene regulation at the genome-wide scale
Harvard University, Cambridge MA
Investigators
Abstract
Variation in the regulation of human genes is thought to have played a critical role in the evolution of organismal complexity, so characterizing the regulatory variants selected for in the human genome is broadly informative for understanding the evolution of complex multicellular organisms as a whole. The direct product of this research is the identification of variants that may lead to population-specific phenotypes, such as resistance to endemic infectious threats, adaptation to high sun exposure or high altitude, or heightened sensory function in situations of predation. Follow-up work on the identified variants could include field studies in Asia and Africa that will promote the development of scientific infrastructure in the developing world. Because this research will require close partnerships with both computational and experimental biologists, it will initiate productive scientific collaborations between these synergistic fields of evolutionary biology.
Positive, or Darwinian, selection is an evolutionary force driving the creation of genetic diversity that enhances the survival and/or reproductive fitness of an organism. The availability of human whole-genome sequences marks an exciting time in which signatures of positive selection can be detected directly from primary genomic sequence without bias toward the biological functions affected or the molecular mechanisms involved. Because tests for positive selection are limited by linkage disequilibrium between the causal variant and the many non-causal variants that "hitchhike" along with it during a selective sweep, the Composite of Multiple Signals (CMS) test was recently developed. The CMS test identifies genomic regions under positive selection with 100-fold improved resolution and, when it was applied to human whole-genome sequencing data, 90% of causal variants were predicted to be regulatory.
This is intriguing because most examples of positive selection identified to date involve coding regions, and positive selection on regulatory variants remains largely uncharted territory. One class of regulatory variants is single-nucleotide polymorphisms in miRNA binding sites (mirSNPs). Building on preliminary in silico prediction and low-throughput validation of several mirSNPs identified by CMS, this project will use a massively parallel reporter assay to functionally test all mirSNPs residing within CMS regions. By combining in silico prediction with high-throughput validation, the research will drastically improve the ability to detect mirSNPs that have driven recent positive selection in the human lineage. High-throughput validation allows multiple linked variants to be compared in tandem, providing a more direct assessment of which variant(s) likely contributed to the selective sweep. This study will provide much-needed insight into the impact of positive selective forces on regulatory variation, particularly variation in miRNA binding sites. In addition to uncovering the causal allele within CMS regions, the project will increase understanding of the role this distinct functional class of regulatory elements plays in adaptive evolution.