A New Platform to Identify Utilized Coding Sequences

$256,919 · FY2010 · BIO · NSF

University Of Minnesota-Twin Cities, Minneapolis MN

Investigators

Igor G Libourel (contact), Michael J Sadowsky, Michael Travisano

Abstract

Intellectual Merit: The rapid development of inexpensive, high-fidelity, and high-throughput sequencing technologies has facilitated the determination of countless genomes. Sequencing data have led to valuable insights into the descent of species and have played a key role in the annotation of genes in related organisms. However, elucidating the direct relationship between a gene and its precise function is still laborious. Similarly, although functional domains in proteins are often known, determining which amino acids are essential and which amino acid substitutions are allowed still requires an enormous time investment. This project leverages high-throughput sequencing technology to measure genome-wide utilization of genes at the nucleotide level. Bacterial cells will be grown under well-defined experimental conditions in the continual presence of a strong mutagen. This treatment will result in the accumulation of mutations in unutilized genes or in parts of utilized genes coding for replaceable amino acids. The introduced variation will be detected by DNA sequencing, and the results should distinguish utilized genes from unutilized ones, and replaceable amino acids from those required for protein function.

Broader Impacts: This platform will be widely applicable for both industrial and academic purposes, enabling protocol-driven gene discovery. The method does not require much upfront knowledge about metabolism and uses biocomputing to assess gene function. The project will provide educational opportunities for a postdoctoral fellow and for undergraduate students.
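The detection idea described under Intellectual Merit (mutations accumulate where sequence is not needed, so low mutation density suggests utilization) can be illustrated with a minimal sketch. This is not the project's actual analysis: the `Gene` record, the `classify_genes` function, the `ratio_threshold` cutoff, and the toy counts are all hypothetical, chosen only to show the comparison of per-gene mutation density against a genome-wide background.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    length_bp: int   # coding sequence length in base pairs
    mutations: int   # mutations observed in this gene after mutagenesis


def mutation_density(gene: Gene) -> float:
    """Mutations per base pair for one gene."""
    return gene.mutations / gene.length_bp


def classify_genes(genes, ratio_threshold=0.5):
    """Label each gene by comparing its density with the pooled background.

    ratio_threshold is an assumed, arbitrary cutoff: a gene whose density is
    below ratio_threshold times the background is called "utilized".
    """
    total_mutations = sum(g.mutations for g in genes)
    total_length = sum(g.length_bp for g in genes)
    background = total_mutations / total_length

    labels = {}
    for g in genes:
        depleted = mutation_density(g) < ratio_threshold * background
        labels[g.name] = "utilized" if depleted else "unutilized"
    return labels


# Made-up illustrative counts, not experimental data.
genes = [
    Gene("geneA", 1000, 2),
    Gene("geneB", 1000, 40),
    Gene("geneC", 2000, 70),
]
labels = classify_genes(genes)
print(labels)
```

In this toy input only `geneA` is strongly depleted of mutations relative to the background, so it is the only one labeled as utilized. A real analysis would also need to account for sequencing error, uneven mutagen action, and coverage, none of which this sketch models.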
