
Understanding the Book of Life: Bayesian Protein Secondary Structure Analysis and its Application to Protein Function Prediction

$173,395 | FY2005 | CSE | NSF

Georgia Tech Research Corporation, Atlanta GA

Abstract

The identification of all protein-coding genes in a genome sequence, and the determination of the cellular functions of the proteins those genes encode, can only be accomplished by combining powerful computational tools with a variety of experimental approaches. It is unrealistic to expect that every gene and protein will ever be studied experimentally. However, relatively cheap and fast computational approaches can usually predict the protein-coding regions in DNA sequences, and the functions of the encoded proteins, reliably. Within this context, the investigator improves the accuracy of secondary structure predictions by using statistical analysis to discover the most significant dependencies. In addition, the investigator incorporates non-local correlations that arise from hydrogen bonding interactions in beta-strands. This is achieved by extracting hydrogen bonding statistics from interacting beta-strand segments and by implementing a hidden Markov model (HMM) that combines non-local hydrogen bonding patterns with local physicochemical interactions.
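
For readers unfamiliar with HMM-based secondary structure prediction, the general idea can be illustrated with a minimal sketch. This is not the investigator's model: it is a plain first-order HMM with three hidden states (helix `H`, strand `E`, coil `C`) decoded with the standard Viterbi algorithm, so it captures only local dependencies and does not model the non-local beta-strand hydrogen bonding described in the abstract. The seven-residue alphabet and every probability below are invented for illustration and are not taken from any data set.

```python
import math

STATES = ("H", "E", "C")  # helix, strand, coil

# All numbers below are made-up toy values (assumption, not fitted to data).
START = {"H": 0.3, "E": 0.2, "C": 0.5}
TRANS = {
    "H": {"H": 0.80, "E": 0.05, "C": 0.15},
    "E": {"H": 0.05, "E": 0.75, "C": 0.20},
    "C": {"H": 0.20, "E": 0.20, "C": 0.60},
}
# Emission probabilities over a reduced, hypothetical residue alphabet.
EMIT = {
    "H": {"A": 0.25, "L": 0.25, "E": 0.25, "V": 0.08, "I": 0.07, "G": 0.05, "P": 0.05},
    "E": {"A": 0.07, "L": 0.08, "E": 0.05, "V": 0.35, "I": 0.35, "G": 0.05, "P": 0.05},
    "C": {"A": 0.10, "L": 0.10, "E": 0.10, "V": 0.10, "I": 0.10, "G": 0.25, "P": 0.25},
}


def viterbi(seq):
    """Return the most probable state string for seq under the toy HMM."""
    if not seq:
        return ""
    # Log-probability of the best path ending in each state at position 0.
    score = [{s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for residue in seq[1:]:
        prev = score[-1]
        col, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            col[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][residue])
            ptr[s] = best
        score.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    print(viterbi("AALLEEVIVIVGPG"))
```

Working in log space avoids numerical underflow on long sequences. A model along the lines the abstract describes would additionally need states or scoring terms that couple distant positions in paired beta-strands, which a first-order chain like this one cannot express.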
