CAREER: Improving the Performance of Motif Finding Tools through Novel Reliable Significance Estimation and a Study of DNA Replication Origins
Cornell University, Ithaca NY
Investigators
Abstract
A Cornell University researcher is awarded an NSF CAREER grant to uncover regulatory motifs in DNA sequences. Finding these motifs remains a fundamental problem in computational biology, because identifying regulatory elements is central to understanding the regulation of gene expression. A recent extensive comparative study showed that existing tools have considerable room for improvement in detecting real binding sites. The first goal of this project is to develop a reliable significance analysis for profile-based de novo motif finders. This analysis can then help delineate the theoretical limit of these finders and push their performance toward that limit. Over the last few years, several new types of data have emerged and been successfully integrated into motif finders. In particular, with the increased availability of sequence data from closely related species, a class of phylogeny-aware motif finders has been developed. The second major goal is to develop a new, efficient significance analysis for the results of motif finders that integrate phylogeny or localization data, and thereby improve their performance. The third goal is to obtain, in collaboration with a molecular biologist, a better characterization of replication origins in yeast species: in particular, to characterize the sequence elements that account for the variability among replication origins in yeast, and to detect and analyze new replication origins in related species. The education component of this proposal will train and enrich students at all academic levels from Computer Science, Biology, and Statistics, exposing each of them to all three disciplines, which together define computational biology.