Toward RNA Genomics: A Pilot Study in the Analysis, Design, and Prediction of RNA Structures
New York University, New York NY
Investigators
Abstract
We propose a pilot study to analyze, design, and predict RNA structures using a combination of graph theory and computational methods. The design and prediction work aims to expand the set of available RNA sequences and structures, and hence our knowledge of RNA functions. By representing RNA secondary structures as tree or pseudoknot graphs, we can exploit results from graph theory for graph comparison, enumeration, and construction. This approach allows us to enumerate all possible RNA graphs, which may represent both natural and hypothetical RNA topologies. Our proposed investigations consist of three related stages: (1) survey and analyze existing RNAs; (2) design RNA sequences for novel RNA tree topologies; and (3) predict the three-dimensional structures of designed RNA sequences. In survey and analysis (goal 1), we will systematically search for occurrences of RNAs within larger RNA systems, especially ribosome structures; use RNA topological characteristics to survey and classify functional RNA families; and develop graph theory algorithms to estimate the probable size and diversity of various functional RNA groups within genomes. Our RNA search tools, survey results, and classification methods will be made available to the public through the Nucleic Acid Database. To generate novel RNA topologies for three-dimensional structure prediction (goal 2), we will apply two strategies: sequence mutation of known RNAs and de novo sequence design. Finally, we will predict the three-dimensional structures of designed sequences (goal 3) by developing empirical force fields and folding algorithms for reduced RNA models. For this challenging goal, we will consider several computational approaches, including elastic energy models with Brownian dynamics of the kind employed for long DNAs.
RNA molecules play vital and diverse biological roles in the cell. In parallel with protein genomics efforts, the development of RNA genomics, or ribonomics (a systematic analysis and prediction of RNA sequence/structure relationships), will help advance our understanding of the biological functions of RNAs. In marked contrast to proteins, only a small number of distinct RNA structures is currently known. The methodologies currently used to search for novel functional RNAs are restricted either to small RNAs or to known RNA classes. To overcome these limitations, we have developed an approach that combines graph theory, a branch of mathematics, with computational biology methods to allow a comprehensive search of RNA's structural possibilities. Since known RNA structures represent only a small subset of all distinct structures, we postulate that some of the missing structures represent novel functional RNAs that either occur naturally in cells but are as yet unidentified, or do not yet exist but could be generated in the laboratory. The three-dimensional structures of novel RNAs will be predicted through design and folding methodologies. Our approaches will likely expand the number of RNA types and lead to the identification of unknown RNA families. Designed RNAs with novel properties have potential applications as catalysts, molecular sensors, and therapeutic agents in biotechnology and molecular medicine. In addition to RNA structure prediction, graph theory will allow us to systematically characterize and classify existing RNAs and to establish structural and functional relationships between them. The results and tools developed for this analysis will be made available through the publicly accessible Nucleic Acid Database. Our proposed investigations will contribute to the development of RNA genomics, a field currently lagging behind protein genomics or proteomics.
This grant is made under the Joint DMS/NIGMS Initiative to Support Research Grants in the Area of Mathematical Biology, a joint competition sponsored by the Division of Mathematical Sciences (DMS) at the National Science Foundation and the National Institute of General Medical Sciences (NIGMS) at the National Institutes of Health.
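As an illustration of the tree-graph representation mentioned in the abstract, the following Python sketch converts a pseudoknot-free secondary structure in dot-bracket notation into a tree. It is a minimal sketch under assumed conventions, not the investigators' method: each loop region (including the exterior region that holds the 5' and 3' ends) becomes a vertex, and each stem of stacked base pairs becomes an edge. The function names `pair_table`, `tree_graph`, and `degree_sequence`, the dot-bracket input format, and the cloverleaf-like example string are all hypothetical choices made for this example.

```python
from collections import defaultdict


def pair_table(db):
    """Map every paired position to its partner (pseudoknot-free dot-bracket)."""
    stack, partner = [], {}
    for i, ch in enumerate(db):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return partner


def tree_graph(db):
    """Return tree edges (parent_loop, child_loop); vertex 0 is the exterior region.

    Assumed convention: one vertex per loop region, one edge per stem.
    """
    partner = pair_table(db)
    edges = []
    next_id = 1

    def scan(loop_id, start, end):
        nonlocal next_id
        i = start
        while i < end:
            if i in partner and partner[i] > i:  # first pair of a stem
                a, b = i, partner[i]
                # Follow directly stacked pairs to the innermost pair of the stem.
                while a + 1 < b - 1 and partner.get(a + 1) == b - 1:
                    a, b = a + 1, b - 1
                child = next_id
                next_id += 1
                edges.append((loop_id, child))
                scan(child, a + 1, b)      # the loop enclosed by this stem
                i = partner[i] + 1         # continue after the stem closes
            else:
                i += 1

    scan(0, 0, len(db))
    return edges


def degree_sequence(edges):
    """Sorted vertex degrees: a crude topological descriptor of the tree."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.values(), reverse=True)


# Hypothetical cloverleaf-like structure: one junction with three hairpin arms.
CLOVERLEAF = "((((...((((...))))..((((...))))..((((...))))...))))"
print(tree_graph(CLOVERLEAF))                   # [(0, 1), (1, 2), (1, 3), (1, 4)]
print(degree_sequence(tree_graph(CLOVERLEAF)))  # [4, 1, 1, 1, 1]
```

The degree sequence is only a coarse invariant: trees with different degree sequences cannot share a topology, but equal sequences do not guarantee that two trees are isomorphic, so a full comparison would need a proper tree-isomorphism test.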