New statistical methods for neutral phylogenetic reconstruction
University of Washington, Seattle, WA
Investigators
Abstract
Phylogenetic inference comprises a set of important statistical techniques that allow researchers to reconstruct the evolutionary histories of genomic sequences. Most model-based phylogenetic methods implicitly assume neutral evolution of molecular sequences. However, selection often drives genetic material to the same state, leading phylogenetic methods to conclude erroneously that some sequences are more closely related than they really are. Although it is widely accepted that such selection-driven convergent evolution biases phylogenetic estimation, no statistically rigorous treatment of this problem exists. The investigators of this project develop a radically novel neutral phylogenetic reconstruction method that remedies the convergent evolution problem by excluding signatures of selection from observed data. The method is based on a new framework for estimating partially observed continuous-time Markov chains. Although the focus of this proposal is evolutionary biology, the novel estimation principle opens a new direction in the field of inference for stochastic processes and covers a wide range of Markov chain applications. The investigators apply the newly developed framework to several important problems in evolutionary biology. First, they incorporate neutral phylogenetic reconstruction into population genetics, which critically depends on the assumption of neutrality. Next, the investigators use their new estimation framework to distinguish recombination from convergent evolution, an urgent problem in evolutionary studies of human pathogens. Lastly, the investigators employ their neutral phylogenetic reconstruction method to study the functional role of convergent evolution in the core genes of Escherichia coli. This last analytical task is accompanied by deep sequencing of hundreds of genes across a large number of pathogenic Escherichia coli strains.
Viruses and bacteria can rapidly adapt their genetic material in response to changes in the environment. This remarkable genetic plasticity enables many human pathogens to develop drug resistance and escape the immune system, even when the latter is assisted by appropriate vaccines. Therefore, understanding the evolution of viruses and bacteria is critical for developing therapies and vaccines aimed at suppressing the harmful effects of these pathogens on the human body. The investigators of this proposal develop novel statistical tools aimed at improving our ability to reconstruct the evolutionary histories of rapidly evolving pathogens. These new methods, implemented in freely available software packages, will find extensive use in biological laboratories that study human pathogens such as hepatitis viruses, influenza virus, human immunodeficiency virus, Escherichia coli, and Salmonella. In addition to the theoretical part of the project, the investigators will undertake an extensive analysis of Escherichia coli genes in order to illuminate the evolutionary mechanisms that allow this bacterium to adapt to new environments. This new knowledge will allow the investigators to shed light on the biochemical processes that transform Escherichia coli, normally a harmless resident of the human intestine, into a dangerous pathogen in other parts of the human body.