Model Choice and Model Averaging in Molecular Phylogenetics

$200,000 | FY2005 | BIO | NSF

University of California-San Diego, La Jolla, CA

Investigators

Abstract

Phylogenetic analysis, the effort to uncover the genealogical relationships among organisms, makes strong but testable assumptions about how molecular evolution occurs. Today, a typical phylogenetic analysis first assumes a particular model of molecular evolution and then finds the phylogenetic tree that best describes the relationships among the organisms of interest. The resulting tree is therefore conditioned on the model chosen in the first step, and that model is usually picked from a small set of candidates. In this project, we will vastly expand the set of candidate models of molecular evolution and devise automated ways to search simultaneously for the best model of molecular evolution and the best phylogeny. We will use a variant of Markov chain Monte Carlo to search among models and phylogenies. As a result, the biologist will be able to find not only the tree(s) that best explain the data but also the best model(s) of evolution. We intend to explore two methods for choosing among models. The first is a Bayesian method that calculates the probabilities of models using Bayes' theorem. The second uses information criteria to find a model that balances the fit of the model to the observed data against the number of free parameters in the model. We will apply the methods to alignments of DNA and amino acid sequences.
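
As a rough illustration of the two model-choice approaches described above, the following Python sketch scores a handful of candidate substitution models. It rests on several assumptions that are not part of the award record: the abstract does not say which information criteria will be used, so AIC and BIC are shown only as two common examples; the function names are hypothetical; and the log-likelihoods and marginal likelihoods are made-up numbers, not results. The free-parameter counts cover the substitution model only (0 for JC69, 4 for HKY85, 8 for GTR) and ignore branch lengths for simplicity.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: penalizes fit by 2 per free parameter."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: the penalty grows with sample size n."""
    return k * math.log(n) - 2 * log_lik


def posterior_model_probs(log_marginal_liks, priors=None):
    """Bayes' theorem over a finite set of models, computed in log space.

    P(M_i | D) = P(D | M_i) P(M_i) / sum_j P(D | M_j) P(M_j).
    Priors must be strictly positive; a uniform prior is assumed by default.
    """
    m = len(log_marginal_liks)
    if priors is None:
        priors = [1.0 / m] * m
    log_post = [ll + math.log(p) for ll, p in zip(log_marginal_liks, priors)]
    # Subtract the maximum before exponentiating to avoid underflow.
    mx = max(log_post)
    weights = [math.exp(x - mx) for x in log_post]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    n_sites = 1000  # hypothetical alignment length, used as n in BIC

    # name -> (maximized log-likelihood, free substitution parameters)
    # The log-likelihoods are illustrative values only.
    candidates = {
        "JC69": (-5230.0, 0),
        "HKY85": (-5105.0, 4),
        "GTR": (-5101.0, 8),
    }
    for name, (log_lik, k) in candidates.items():
        print(name, aic(log_lik, k), bic(log_lik, k, n_sites))

    # Hypothetical log marginal likelihoods for the same three models.
    print(posterior_model_probs([-5240.0, -5120.0, -5122.0]))
```

Under either criterion the model with the lowest score is preferred, whereas the Bayesian calculation returns a probability for each model. Those probabilities can in principle also serve as weights when averaging over models rather than selecting a single one.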
