Identifying the building blocks of protein structures
Trustees Of Boston University, Boston
Abstract
A Supersecondary Structure Unit (SSU) is defined as three or more secondary structures in a given fold that pack against one another. When such SSUs occur in different folds, they are referred to as "legos". Numerous anecdotal examples of legos, such as the 4-α-helix bundle and the 3-β-corner, have been identified, but there has been no systematic attempt to explore the entire structural database for a full list of recurrent structural patterns from which known protein folds might be constructed. This project will develop the necessary methods to carry out such a search and to fully characterize the resulting lego set. Such an effort will provide an entirely new view of protein folds. Specifically, the project will seek to answer the following questions: 1. How large is the lego set? 2. To what extent can the lego set cover known folds? 3. What are the rules governing lego-lego interactions? 4. Can one construct novel protein folds by following the rules? 5. How conserved are sequences representing the same lego? 6. Do legos correlate with functional sites? This project will be an important complement to current approaches in functional genomics. Most current approaches infer protein function from sequence and structural similarities to existing folds. The results obtained from this project will provide a basis to relate different folds and to extend functional inference beyond the boundary of known folds. The project will lead to the development of an array of lego-based algorithms for predicting protein structure and function. The legos and their various properties will constitute a publicly available database. Via the World Wide Web, researchers will be able to perform two types of searches. A protein structure can be submitted to discover the legos it encompasses. The user can also search a protein sequence against all lego profiles; the matched legos may provide information otherwise unavailable for novel sequences.
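The SSU and lego definitions above can be illustrated with a small sketch. This is not the project's method, which the abstract does not describe. It assumes that "pack against one another" means pairwise contact, that an SSU can be summarized by the sorted types of its elements (H for helix, E for strand), and that a lego is any such signature found in at least two folds. The fold names, element names, contact lists, and both function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_ssus(contacts, size=3):
    """Return every group of `size` elements that are pairwise in contact.

    Assumption: mutual packing is modeled as a clique in a contact graph.
    """
    elements = sorted({e for pair in contacts for e in pair})
    packed = {frozenset(pair) for pair in contacts}
    return [
        group
        for group in combinations(elements, size)
        if all(frozenset(pair) in packed for pair in combinations(group, 2))
    ]


def find_legos(folds, size=3):
    """Return SSU signatures that occur in at least two different folds.

    Assumption: a signature is just the sorted element types, e.g. "HHH".
    Real structural comparison would need geometry, not only types.
    """
    seen = defaultdict(set)
    for fold_name, (types, contacts) in folds.items():
        for group in find_ssus(contacts, size):
            signature = "".join(sorted(types[e] for e in group))
            seen[signature].add(fold_name)
    return {sig: names for sig, names in seen.items() if len(names) >= 2}


# Hypothetical toy data: element types plus packing contacts per fold.
folds = {
    "fold_A": (
        {"a1": "H", "a2": "H", "a3": "H", "b1": "E"},
        [("a1", "a2"), ("a2", "a3"), ("a1", "a3"), ("a3", "b1")],
    ),
    "fold_B": (
        {"h1": "H", "h2": "H", "h3": "H"},
        [("h1", "h2"), ("h2", "h3"), ("h1", "h3")],
    ),
    "fold_C": (
        {"s1": "E", "s2": "E", "h1": "H"},
        [("s1", "s2"), ("s2", "h1"), ("s1", "h1")],
    ),
}

legos = find_legos(folds)
print(legos)
```

In this toy data only the three-helix signature recurs across folds, so it is the only lego reported; the helix-plus-two-strands unit in fold_C is an SSU but not a lego because it appears in a single fold.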