Collaborative Research: TRTech-PGR: PlantSynBio: FuncZyme: Building a pipeline for rapid prediction and functional validation of plant enzyme activities
Colorado State University, Fort Collins CO
Investigators
Abstract
Over a thousand plant genomes have already been sequenced and this number is rapidly increasing. While genome sequencing, assembly and gene annotation are less of a bottleneck for researchers today, predicting and validating gene functions is still a major challenge. This is especially the case for genes in large families, such as those encoding metabolic enzymes. Such enzymes are associated with critical primary and specialized metabolic pathways, and the current lack of meaningful annotation for them is a major barrier to pathway discovery. Chemistry is the language of the plant world: metabolites mediate defenses against pests, pathogens and abiotic stresses, attract mutualists and play a role in defining growth patterns and crop yield. Societally, plant metabolites are important for foods, drugs, cosmetics and numerous other products. Improving metabolic gene annotation is therefore crucial not just for understanding fundamental plant biology, but also for societal impacts by aiding crop breeding/engineering and synthetic biology. This project, focusing on ten of the largest plant enzyme families, will (1) facilitate deposition of hundreds of published enzyme activities into public repositories such as the UniProt and Gene Ontology databases; (2) develop computational pipelines for predicting enzyme function from high-quality sequenced genomes; (3) develop and apply synthetic biology-based tools for rapid validation of predicted enzyme function; and (4) derive novel evolutionary and functional insights from the accumulated datasets. Research efforts will be coupled with activities that improve inclusive undergraduate participation in research and an art exhibition to demonstrate the power of synthetic biology in creating dynamic, living art pieces.
In most plant genomes, genes involved in metabolism belong to large gene families with dozens of members and are poorly annotated. This creates a barrier for dissecting the genetic basis of metabolic traits such as yield, fruit ripening, stress response, and mutualistic interactions. Three critical bottlenecks stymie these efforts: (1) although thousands of enzyme activities have been published, only a minuscule fraction of these are logged into protein function databases and available for use by powerful function prediction programs and machine learning approaches; (2) existing vocabularies and tools for function transfer are not based on substrate chemistry and do not take into account enzyme promiscuity; and (3) synthetic biology (SynBio) tools for rapid functional validation of computational predictions are insufficiently developed. To address these challenges, this project will (1) develop a Cas9-based SynBio tool using RNA vectors and synthetic transcription factors, enabling high-throughput gene function validation in three angiosperm species; (2) facilitate one of the largest depositions of published plant enzyme activities, covering 10 targeted enzyme families, into the UniProt and GO databases, as well as develop a computational workflow to predict substrate classes of the targeted enzyme family members from 150 high-quality plant genomes; and (3) apply these workflows to investigate in vivo roles and evolution of these enzyme families. With respect to training and outreach, the project will engage undergraduate students in pathway discovery studies where students will sample biochemical diversity in flora and probe underlying metabolic pathways of non-reference/medicinal plants. In addition, the project will work with faculty in Colorado State University’s Department of Art and Art History to develop novel SynBio-generated dynamic living art pieces where plants will be used as “canvases” painted with natural colors/pigments synthesized in planta using RNA vectors.
All project outcomes, including new computational tools, biological resources and datasets, will be shared broadly through public access repositories and through training workshops at national plant science conferences. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.