AF: Small: Algorithms for Accurate Prediction of Protein Interaction Sites by Integrating Sequence, Structure, and Network Data
Trustees Of Boston University, Boston
Investigators
Abstract
Proteins are the main building blocks and functional molecules of the cell, yet the biological functions of most proteins encoded in genomes are not well characterized. Recent advances in structural genomics have generated a wealth of data regarding the three-dimensional structure of individual proteins. At the same time, proteins rarely act alone in the cell; rather, they form complex networks of protein-protein interactions and other types of biomolecular interactions from which intricate yet robust cellular behavior emerges. Identifying the amino acid sites involved in these biomolecular interactions is an essential first step towards understanding the molecular basis of protein function. Despite their biological significance, the amino acid sites mediating protein-protein interactions are difficult to elucidate experimentally, so computational algorithms are needed to predict them accurately.
Intellectual Merit. The objective of this proposal is to develop novel computational algorithms that integrate a wide spectrum of publicly available protein sequence, structure, and network data to accurately predict amino acid sites mediating protein interactions. In particular, two new algorithms will be developed. The first will accurately and efficiently identify amino acid residues on the surface of proteins that evolve more slowly than expected. The second will identify short sequence motifs that are enriched among non-homologous proteins with a common interacting partner. These amino acid residues and sequence motifs are strong candidates for mediating protein interactions. An innovative and unifying feature of this proposal is that both algorithms will take into account the powerful spatial constraints imposed on these amino acid sites by the three-dimensional structure of proteins. The proposed work is significant in that it addresses a fundamental question in molecular systems biology: identifying the amino acid residues and sequence motifs that mediate interactions in biological networks.
The execution of this proposal will provide a set of algorithms, tools, and datasets that maximize the impact of high-throughput approaches on systems and network biology research, and that researchers can use to address a wide variety of questions ranging from biomedical to evolutionary. Finally, this proposal develops a novel computational paradigm that integrates a wide spectrum of biological data (protein sequences, protein-protein interaction network graphs, and protein three-dimensional structures) to predict amino acid sites mediating protein interactions with high accuracy. These novel algorithms for fundamental problems in computational biology contribute directly to the core mission of the NSF CISE/CCF program.
Broader Impacts. The proposed research will further strengthen the interdisciplinary ties between the PI in the Boston University Bioinformatics Program and the collaborating experimentalists in the Boston University School of Medicine. These ties provide invaluable opportunities for cross-disciplinary research experiences for both graduate and undergraduate trainees. The educational plan aims to bridge traditional teaching and mentoring methods between biology, chemistry, and computer science at the K-12, undergraduate, and graduate levels, and to bring the latest research findings and methods to the classroom. The PI will continue to play a key role in the curriculum development and improvement of the Bioinformatics Program at Boston University.