Genome-wide mutation models to decipher function

$261,273 | R01 | FY2013 | GM | NIH

University of Colorado Denver, Aurora, CO

Investigators

Linked publications & trials

Paper 29062121, Paper 27028523, Paper 26244060, Paper 25737491, Paper 25671092, Paper 25670730, Paper 25504731, Paper 25121584, Paper 24706894, Paper 24297902, Paper 24297900, Paper 24060849, Paper 23537068, Paper 22976081

Abstract

DESCRIPTION (provided by applicant): Complete vertebrate genomes are accumulating rapidly, and the pace of accumulation will only increase. This is excellent news, because the utility of comparative analysis depends heavily on the diversity of species sampling. There are, however, substantial challenges to exploiting the full potential of such extensive data: development of novel methods and analytical approaches is needed. We aim to develop and extend our capacity to analyze the dynamic evolutionary processes (across regions and through time) that have shaped extant genomes. We will achieve this goal using a Bayesian evolutionary analysis approach we recently developed that allows us many orders of magnitude speed advantage over competing approaches, and which scales well with model complexity and data size. Many of the studies we propose are based on biologically realistic paradigms that previously were impossible to consider or test because of computational limitations. We propose to comprehensively delineate the repetitive contents of a selected set of vertebrate genomes, including annotation of ancient elements from the dark matter of genomes (the currently unannotated portion). The transposable elements in this set of repeat sequences will be used to build the first complete genome-wide models of context-dependent substitution processes. We will consider contexts such as recombination, rearrangement, expression, and local nucleotide content, as well as unknown contexts, and analyze how the evolutionary processes influenced by these contexts have changed over time. These context-dependent substitution models will provide a powerful tool for identifying and annotating functional regions in interspecific comparisons of vertebrate genomes, and for differentiating and characterizing fitness-based effects in proteins.
The core concept is that if we better understand genome-wide patterns of background nucleotide substitution, then we will be able to more accurately identify genomic regions that are likely functional, and to understand how selection directs the evolution of proteins.
