
CAREER: A Machine Learning Framework for Metagenomic Relationships

$679,704 | FY2009 | BIO | NSF

Drexel University, Philadelphia PA


Abstract

(This award is funded through the American Recovery and Reinvestment Act of 2009: Public Law 111-5). This is a CAREER award to support the research of Dr. Gail Rosen in the Department of Computer and Electrical Engineering at Drexel University. Dr. Rosen is a third-year, tenure-track Assistant Professor. She is developing a computational framework that enables the identification of microorganisms and the comparison of those microorganisms to the environmental factors in their habitats. With recent technologies, DNA can be extracted directly from the millions of cells in any environment and sequenced in vast amounts, an approach known as metagenomics. The ability to analyze these metagenomic datasets depends on identifying the content of this fragmented mixture, which is composed of thousands or millions of genomes. Machine learning, with its ability to recognize patterns in complex data, is well suited to this task. Dr. Rosen believes a machine learning approach to analyzing metagenomic datasets will allow the vast majority of the unculturable microbial species in an environment to be studied. For example, machine learning may enable biologists to determine the combinations of microbes and genetic capabilities that promote soil health and increase crop yield. Typically, sequenced DNA fragments are identified by scoring their alignment to previously sequenced organisms. Unfortunately, annotation protocols employed for single-genome analysis do not hold for a mixture of environmental DNA. The Rosen lab is developing a general classification system to identify the genomic origin of sequenced fragments; methods to reconstruct fragment taxonomy and infer functional relationships through discriminative classification methods; and a genomic word-frequency model to predict feature sparseness as a function of fragment length and database complexity.
This research will also address fundamental biological questions about global genomic features and their effect on taxonomic and functional relationships. All tools developed in this project will be posted on Dr. Rosen's website: http://www.ece.drexel.edu/gailr/. As a part of her CAREER plan, Dr. Rosen recognizes that this research endeavor is naturally interdisciplinary, drawing on concepts from electrical engineering, computer science, and biology. Therefore, her lab is developing interdisciplinary graduate and undergraduate bioinformatics curricula (in collaboration with a molecular ecologist) and K-12 modules to incorporate into an NSF-funded K-12 program. A particularly creative activity includes image and audio processing applications for the classroom that illustrate math and science concepts through effects used in the Photoshop and GarageBand applications. For example, the students are asked to transcribe particular musical chords and, as a parallel, "translate" codons to their amino acids. This activity illustrates the parallel between the genetic code and piano chords.
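The abstract mentions a genomic word-frequency model that predicts feature sparseness as a function of fragment length. The award record does not describe that model, so the following is only a minimal sketch of the general idea, assuming that "words" are overlapping k-letter substrings (k-mers) and that "sparseness" means the fraction of possible words never seen in a fragment. The function names `kmer_counts` and `sparseness`, the choice of k, and the toy sequence are all hypothetical and are not taken from the Rosen lab's tools.

```python
from collections import Counter


def kmer_counts(fragment: str, k: int) -> Counter:
    """Count overlapping k-letter words (k-mers) in a DNA fragment.

    Words containing anything other than A, C, G, or T are skipped.
    """
    fragment = fragment.upper()
    counts = Counter()
    for i in range(len(fragment) - k + 1):
        word = fragment[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def sparseness(fragment: str, k: int) -> float:
    """Fraction of the 4**k possible words that never occur in the fragment.

    This definition is an assumption made for illustration only.
    """
    observed = len(kmer_counts(fragment, k))
    return 1.0 - observed / 4 ** k


if __name__ == "__main__":
    # Toy sequence for illustration; not real metagenomic data.
    toy = "ACGTTGCAAGCTTAGGCTAACGT" * 20
    for length in (25, 100, 400):
        print(length, round(sparseness(toy[:length], k=3), 3))
```

Under these assumptions, a shorter fragment can contain fewer distinct words, so its word-frequency profile is sparser. This is one reason very short sequenced fragments are harder to assign to a source genome.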
