RESEARCH-PGR: Illuminating the Plant Protein Interactome with Genome-wide Signatures of Coevolution
Colorado State University, Fort Collins CO
Investigators
Abstract
Sequencing plant genomes has unlocked a massive catalog of genes, yet our ability to understand the specific role of each gene lags far behind: functions have been assigned to only a fraction of known plant genes. Individual genes rarely work alone, and a critical aspect of a given gene's function is its interacting partners. Deciphering genetic interactions throughout plant genomes will advance our understanding of the function of a large number of uncharacterized genes; however, most existing methods for identifying genetic interactions are laborious, expensive, and difficult to scale to the entire genome. This project will develop scalable, efficient computational analyses tailored to identifying genetic interactions based on gene-specific rates of evolution across plant genomes. These computational tools will serve as resources applicable to any group of plants and will be used to generate databases of genetic interactions for several scientifically and agriculturally important plant groups, which plant geneticists and molecular biologists can search to identify new functions for genes of interest. The results will also be used to probe the emergent properties of entire networks of genetic interactions, which will illuminate the higher-level functional architecture of plant genomes. The impact of these resources will be maximized by offering workshops that train researchers interested in using these tools and by engaging with first-year undergraduates to broaden participation in the field of computational biology.
Genetic interactions are an important indicator of gene function, but the existing tools for identifying the genome-wide assemblage of interactions (i.e., the interactome) in plants provide only a partial view. The phylogenetic signature of evolutionary rate covariation (ERC) between interacting proteins has been successfully applied in two settings: on small sets of genes in plants, to demonstrate coevolution between known subunits within enzyme complexes, and on a genome-wide all-by-all basis in non-plant lineages, to discover novel interaction partners. However, ERC has never been applied at genome-wide scale in plants, likely due to phylogenetic challenges caused by the especially frequent gene and genome duplications that occur in plants. This project will develop a novel pipeline tailored to ERC analyses of plant genomes, which will employ existing theory on reconciling complex histories of duplication across gene trees. This pipeline will be used to perform ERC analyses at multiple levels of plant evolution and to compile the results into a web-based plant interactome database, which can be queried for specific genetic interactions. Further, the resulting ERC-based interaction data will be integrated with existing binding assay and coexpression-based interactions to functionally validate ERC results. Finally, network analytics will identify modules of interacting cofunctional proteins and determine how network structure is driven by gene function. This work will create a resource that will empower genome biologists to add a needed layer of information to genome annotation efforts and provide molecular biologists with a wealth of ready-to-test functional hypotheses. All project outcomes will be freely accessible through CyVerse and long-term repositories such as GitHub and Dryad.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
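As a rough illustration of the ERC idea described in the abstract, the sketch below correlates branch-specific relative rates between two genes. It is a minimal sketch under stated assumptions, not the project's pipeline: it assumes each gene tree has already been reduced to one branch length per species-tree branch (that is, duplication histories have already been reconciled), it normalizes by a species-tree branch length, and it uses a plain Pearson correlation. The function names, branch labels, and branch lengths are all hypothetical.

```python
from math import sqrt


def relative_rates(gene_branches, species_branches):
    """Divide each gene-tree branch length by the matching species-tree
    branch length, giving a per-branch relative rate (assumed normalization)."""
    return {
        branch: length / species_branches[branch]
        for branch, length in gene_branches.items()
        if species_branches.get(branch, 0) > 0
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three shared branches")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("rates are constant; correlation is undefined")
    return sxy / sqrt(sxx * syy)


def erc(gene_a, gene_b, species_branches):
    """Correlate the relative rates of two genes over their shared branches."""
    rates_a = relative_rates(gene_a, species_branches)
    rates_b = relative_rates(gene_b, species_branches)
    shared = sorted(set(rates_a) & set(rates_b))
    return pearson([rates_a[b] for b in shared], [rates_b[b] for b in shared])


# Hypothetical branch lengths, invented purely for illustration.
species = {"b1": 0.1, "b2": 0.2, "b3": 0.1, "b4": 0.4}
gene_a = {"b1": 0.1, "b2": 0.4, "b3": 0.3, "b4": 1.6}
gene_b = {"b1": 0.2, "b2": 0.8, "b3": 0.6, "b4": 3.2}
gene_c = {"b1": 0.4, "b2": 0.6, "b3": 0.2, "b4": 0.4}

score_ab = erc(gene_a, gene_b, species)
score_ac = erc(gene_a, gene_c, species)
```

In this toy input, gene_a and gene_b speed up and slow down on the same branches, so their score is strongly positive, while gene_c shows the opposite pattern and scores negative. A genome-wide analysis would compute such scores for all gene pairs and would need to account for the gene and genome duplications the abstract highlights, which this sketch deliberately leaves out.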