GGrantIndex
← Search

AF: Small: Algorithms for Reconstructing Complex Evolutionary History with Discordant Phylogenetic Trees

$256,796FY2011CSENSF

University Of Connecticut, Storrs CT

Investigators

Abstract

Algorithms for Reconstructing Complex Evolutionary History with Discordant Phylogenetic Trees PI: Yufeng Wu, University of Connecticut Many important computational formulations in biological data analysis are inherently intractable. Many existing computational approaches are either too slow or not able to grasp biological complexity needed for more faithful modeling of the underlying biology. In the meantime, the size of biological data is growing rapidly. Therefore, currently there is a need to develop more efficient and accurate algorithms for analyzing large amount of data and solving more complex computational formulations. The study of the complex evolutionary history of species and populations is the main theme of this project. Evolutionary history is often modeled by phylogenetic tree, the so-called ?tree of life?, which has been studied extensively. Recently, a more complex model, phylogenetic network, has been proposed and studied to accommodate various evolutionary processes, including horizontal gene transfer, recombination and hybrid speciation. As a generalization of the phylogenetic tree model, phylogenetic network is a directed graph with nodes with two or more parents (in addition to nodes with a single parent as in the tree model). There are also other less studied computational formulations for a complex evolutionary process called incomplete lineage sorting. Roughly speaking, incomplete lineage sorting causes discordance of evolutionary histories in different parts of genomes. This can greatly complicate phylogenetic study. This project is focused on developing new algorithms for hard optimization problems arising in reconstructing complex evolutionary histories, such as those modeled by phylogenetic networks. These algorithms consider multiple discordant phylogenetic trees, which model the evolutionary histories of different genomic regions. This project will develop algorithms to compare two or more correlated phylogenetic trees and infer the plausible phylogenetic networks that explain the given trees. Although this formulation is generally intractable, this project will develop practical algorithms that can give optimal solutions for data within certain range. One approach to be taken in this project is finding efficiently computable close lower and upper bounds of optimal solutions. This may help to quantify the range of solutions when the optimal solutions are difficult to compute directly. This project will also explore related computational problems arising in the study of incomplete lineage sorting. The expected project outcome will include efficient algorithms for the above computational problems, related open-source software tools that are readily usable by biologists, and rigorous methodologies for both theoretical and empirical evaluation of the algorithms. Developed software tools will be made available freely to the multi-disciplinary research community, and are expected to enable novel biological applications in studying complex evolution. Research results will be integrated into classroom teaching. The proposed educational and outreach activities include reaching out to students with various backgrounds, and training of future researchers with interdisciplinary skills.

View original record on NSF Award Search →