CAREER: Sparse Spatial Reasoning for High-Throughput Protein Structure Determination

$425,703 | FY2004 | CSE | NSF

Dartmouth College, Hanover NH

Investigators

Abstract

This is a Faculty Early Career Development (CAREER) award. The research will develop new methods for analyzing the structure of protein molecules, interpreting spatial data sets containing significant noise and sparse information content. Despite many experimental and computational advances, traditional structure determination protocols remain very difficult, expensive, and time-consuming. Consequently, in order to increase the throughput of structure determination, researchers are pursuing minimalist techniques that provide much less structure information much faster; examples include mutation studies, indicating at which positions amino acid substitutions significantly affect the protein's function; cross-linking mass spectrometry, providing crude proximity information for some positions in the protein; and electron microscopy, elucidating the protein's surface/volume at relatively low resolution. These minimalist experiments then place more burden on associated algorithms for experiment planning and data interpretation.

This project pursues new theory, representations, and algorithms to address data interpretation and experiment design problems in domains characterized by sparse spatial data. A significant component of the research is the case study application of minimalist protein structure determination. The education plan addresses the need to build bridges between computer science and the life sciences in order to attack problems of this combined computational-experimental kind.

A spatial reasoner will be developed, leveraging key problem structure to efficiently and effectively plan and interpret experiments. It will represent data, models, and biophysical knowledge with multi-level, multi-dimensional topological and geometric objects and constraints. This representation will allow algorithms to match features of data and models, overcome problems of noise and scarcity by uncovering consistent feature sets, target clarifying queries in response to conflicts, and plan additional experiments. This approach will thus support closed-loop integration of modeling and experiment -- experimental evidence will trigger evaluation of model features and even optimization of models themselves, while model analysis will trigger specific data interpretation questions and even new experiments.

The education component of this project brings together students from computer science and the life sciences to train them for interdisciplinary computational biology research. Additional and revised coursework, to be developed in conjunction with the Computer Science Department and Computational Science and Engineering program at Purdue, will combine advanced computational techniques and biological applications. The training will provide life science students with the necessary algorithmic background and computer science students with the necessary exposure to and experience with motivating biological problems. Research opportunities, course projects, and other learning opportunities will further involve students in the many challenging and fascinating biological problems requiring advanced computational techniques.

This CAREER award recognizes and supports the early career-development activities of a teacher-scholar who is likely to become an academic leader of the twenty-first century. The research will lead to scientific contributions in the structural and functional understanding of biomolecular machinery. The challenges faced in developing, applying, and extending algorithms for this application will lead to core contributions in reasoning about physical systems, where many similar tasks in planning, modeling, predicting, and controlling face similar problems with sparse, noisy spatial data.
