Mathematical and computational analysis for species tree inference

$390,758 · R01 · FY2016 · GM · NIH

University Of Alaska Fairbanks, Fairbanks AK

Abstract

DESCRIPTION (provided by applicant): Understanding the evolutionary relationships between organisms is fundamental in a wide variety of problems in biology. This project investigates and develops new methods for inferring species relationships from genetic data, utilizing probabilistic models of gene trees conditional on a species tree. Its main goals are (1) to advance the mathematical understanding of these models, with a view toward species tree inference; (2) to develop improved methods for species tree inference by considering new and underutilized data types derived from gene trees, including clades, splits, unrooted gene trees, and ranked gene trees; (3) to validate theoretical, computational, and statistical properties of these new methods; (4) to produce software for use by empirical biologists. The project will identify gene tree summary statistics on which accurate inference can be based, and will employ these statistics to develop practical methods that can be used in the presence of missing data and under violations of model assumptions. The mathematical, statistical, and computational properties of both new and current methods will be studied to enable comparisons that can guide empirical applications. The model-based, probabilistic approach of this work provides a foundation for enhancing species tree inference from gene tree samples, and thus from genetic sequence data. The project addresses a promising methodological middle ground between computationally intensive full likelihood and Bayesian analyses, which are often infeasible for genomic-scale data sets, and tractable combinatorial methods, which often lack desirable statistical behaviors. The work will advance phylogenetic analysis by deepening knowledge of probabilistic models of gene tree discordance through analysis of the behavior of summary statistics.
It will improve the practice of species tree inference by introducing new statistically consistent approaches and by developing theoretical and experimental understanding of the robustness of methods. Further, its use of mathematical techniques from probability, combinatorics, and algebraic statistics, as well as computational experiments employing simulation, will enhance mathematical evolutionary biology more generally.
