INTERACTION PATTERN BASED PREDICTOR OF PROTEIN STRUCTURE

$182,074 · R01 · FY2000 · GM · NIH

Donald Danforth Plant Science Center, Saint Louis MO

Investigators

Linked publications & trials

Paper 27283949 Paper 27208174 Paper 26962440 Paper 26414808 Paper 26225536 Paper 26175011 Paper 26057345 Paper 26027735 Paper 26022780 Paper 25857669 Paper 25703118 Paper 25690787 Paper 25484448 Paper 25336501 Paper 24936211 Paper 24677212 Paper 24204237 Paper 23690621 Paper 23587325 Paper 23516343 Paper 23415854 Paper 23335017 Paper 23240691 Paper 22923291 Paper 22574683 Paper 22355140 Paper 22272723 Paper 22105797 Paper 22004759 Paper 21655593 Paper 21365685 Paper 21287609 Paper 21149688 Paper 21044605 Paper 20958088 Paper 20853887 Paper 20850544 Paper 20635423 Paper 20624782 Paper 20455261 Paper 20080513 Paper 19827144 Paper 19731377 Paper 19639638 Paper 19503616 Paper 19361344 Paper 19356921 Paper 19324930 Paper 19289038 Paper 18559081 Paper 18487301 Paper 18293308 Paper 18214965 Paper 18196502 Paper 18172838 Paper 18165317 Paper 18004783 Paper 17905848 Paper 17705276 Paper 17680687 Paper 17496016 Paper 17469193 Paper 17166279 Paper 16963505 Paper 16551468 Paper 16524716 Paper 16485037 Paper 16478803 Paper 16463265 Paper 16187349 Paper 15849316 Paper 15653774 Paper 15576349 Paper 15476259 Paper 15454459 Paper 15229883 Paper 15126668 Paper 15048836 Paper 14764543 Paper 14636603 Paper 14579335 Paper 14568541 Paper 12885659 Paper 12799350 Paper 12609891 Paper 12463416 Paper 12360525 Paper 11504922 Paper 11391776 Paper 11331242 Paper 11316881 Paper 11151004 Paper 10651034

Abstract

DESCRIPTION: The long-term objective of this proposal is the development of algorithms capable of predicting low-resolution tertiary structure of globular proteins based on a threading-based folding approach, termed the topology fingerprint method. To facilitate development, five distinct testing criteria will be formulated. These will allow for the rapid and objective assessment of the efficacy of a given protein representation and energy parameterization. To identify the crucial variables responsible for fold recognition, a "reverse engineering" approach will be employed. Here, one first assumes that the property of interest is accurately known. If inclusion of this property in the threading algorithm greatly enhances sequence-structure recognition, then an attempt is made to predict it at the requisite level of accuracy. If the property proves to be irrelevant, then it is not included. To date, reverse engineering strongly suggests that a major error of our current approach is the failure to include the correct identity of the interacting residue pairs when threading is done with gaps in the sequence. Thus, better treatment of pair contributions to the potential will be developed. Furthermore, a principal limitation of contemporary threading algorithms, that an example of the global fold already be known, will be addressed. To accomplish this and generate all possible topologies consistent with known knowledge-based rules for the arrangement of supersecondary structure, the protein is viewed as composed of "U" turns, where the chain reverses global direction, and secondary structural elements or blocks between such "U" turns. These quantities can be predicted with rather high accuracy using our recently developed algorithms. Once the number of topological elements has been predicted, graph theory is used to enumerate all topologies consistent with this prediction. The predicted structures will be constructed from fragments excised from proteins and recombined.
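As a rough illustration of the enumeration step described above, the following sketch lists candidate arrangements for a predicted number of blocks. It is not the proposal's graph-theoretic method: it is a brute-force toy that assumes all blocks lie in a single layer, treats an arrangement and its mirror reversal as equivalent, and uses a hypothetical `max_jump` rule (how many spatial slots apart two sequence-adjacent blocks may sit) as a stand-in for the knowledge-based rules. The function name and parameters are illustrative assumptions.

```python
from itertools import permutations


def enumerate_topologies(n_blocks, max_jump=None):
    """List spatial orderings of n_blocks blocks placed side by side.

    Each result is a tuple in which position i holds the sequence index of
    the block occupying spatial slot i. An ordering and its reverse describe
    the same arrangement, so only one member of each pair is kept.

    max_jump is a hypothetical rule: blocks that are neighbours in sequence
    (joined by one "U" turn) may sit at most max_jump slots apart.
    """
    results = []
    for order in permutations(range(n_blocks)):
        # Keep one representative of each (ordering, reversed ordering) pair.
        if n_blocks > 1 and order[0] > order[-1]:
            continue
        if max_jump is not None:
            slot = {block: i for i, block in enumerate(order)}
            too_far = any(
                abs(slot[b] - slot[b + 1]) > max_jump
                for b in range(n_blocks - 1)
            )
            if too_far:
                continue
        results.append(order)
    return results
```

With `max_jump=1` only the simple meander, in which every "U" turn joins spatially adjacent blocks, survives; relaxing the rule admits the crossover arrangements as well. A realistic enumerator would need multiple layers, block orientation, and the actual supersecondary-structure rules, none of which are specified in the abstract.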
Full-atom models will be built and validated using their threading energy to select the predicted native fold. Finally, a divide-and-conquer strategy is proposed for the rapid screening of massive sequence libraries. A cascade of sequence-based, mixed sequence-threading, and full threading algorithms will be assembled. Those sequences whose topology is identified with high reliability by a given protocol are successively filtered out to leave the most difficult cases. Some of these may be assigned with high reliability to a given topology, while for others, a set of possible folds will be proposed. Thus, a robust protocol capable of handling the plethora of sequences provided by the human genome project will be developed.
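The cascade described above can be pictured with a minimal sketch, assuming each stage is a function that returns a fold label and a confidence score for a sequence. The names `run_cascade`, `stages`, and `threshold`, and the 0.9 cutoff, are hypothetical and are not taken from the proposal; the stage functions themselves are left to the caller.

```python
def run_cascade(sequences, stages, threshold=0.9):
    """Pass sequences through successively more expensive predictors.

    stages is an ordered list of (stage_name, predictor) pairs, cheapest
    first. Each predictor takes a sequence string and returns a tuple
    (fold_label, confidence). A sequence is assigned at the first stage
    whose confidence meets the threshold; the rest move to the next stage.

    Returns (assigned, remaining): a dict mapping each assigned sequence to
    (fold_label, stage_name), and the list of sequences no stage resolved.
    """
    assigned = {}
    remaining = list(sequences)
    for stage_name, predictor in stages:
        unresolved = []
        for seq in remaining:
            fold, confidence = predictor(seq)
            if confidence >= threshold:
                assigned[seq] = (fold, stage_name)
            else:
                unresolved.append(seq)
        # Only the harder cases are handed to the next, costlier stage.
        remaining = unresolved
    return assigned, remaining
```

Ordering the stages from cheapest to most expensive means the full threading step runs only on sequences that the faster sequence-based steps could not place, which is the point of the divide-and-conquer design. The sequences left in `remaining` correspond to the difficult cases for which a set of possible folds would be proposed.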
