Sequence and Structural Patterns in RNA Splicing
Purdue University, West Lafayette IN
Investigators
Abstract
Purdue University is awarded a grant to further develop the MAASE resource (a database of Manually Annotated Alternative Splice Events). MAASE is a unique repository of highly validated splicing information, and the project will apply machine-learning techniques to it to identify signals that explain the control of alternative splicing. First, both profile-based probabilistic approaches and word-based enumerative approaches will be used to identify short, imperfectly conserved subsequences that are correlated with alternative splicing. The MAASE resource makes it possible to generate datasets covering many distinct classes of alternative splicing, so that class-specific effects can be identified. It is likely that these signals, if present, are weak and act in groups (as in eukaryotic promoters). Methods that allow multiple motifs to be combined will therefore be used, and new methods will be developed. Second, because secondary and tertiary structure are known to be important for RNA function, and because conserved sequence motifs are weak and difficult to find, it appears highly likely that structural motifs are involved in the control of RNA splicing. Methods to identify conserved secondary or tertiary structures are not well developed. The PIs will use existing methods and develop novel methods capable of identifying the greatest common structure in a training set of sequences. Such shared structures are strong candidates for structural regulatory elements. The datasets produced and used in this study will be made publicly available to other research groups. This project also provides a strong vehicle for training students at the boundary of biology and computation.
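As an illustration of what a word-based enumerative approach can look like, the following is a minimal sketch and not the project's actual method. It enumerates every word of length k, counts how many sequences in each set contain it, and ranks words by a smoothed log-odds score. The function names, the choice of k = 6, the pseudocount, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def kmer_presence(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # A set ensures each word is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enrichment(alt_seqs, const_seqs, k=6, pseudo=1.0):
    """Rank k-mers by log2 odds of presence in alt_seqs versus const_seqs.

    The pseudocount (an assumption of this sketch) keeps the ratio finite
    when a word is absent from one of the two sets.
    """
    alt = kmer_presence(alt_seqs, k)
    const = kmer_presence(const_seqs, k)
    scores = {}
    for word in set(alt) | set(const):
        p_alt = (alt[word] + pseudo) / (len(alt_seqs) + 2 * pseudo)
        p_const = (const[word] + pseudo) / (len(const_seqs) + 2 * pseudo)
        scores[word] = log2(p_alt / p_const)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up toy sequences, not taken from MAASE.
alt_seqs = ["GGAUGCAUGA", "CCUGCAUGUU", "AUGCAUGGCC"]
const_seqs = ["GGAUCCAAGA", "CCUUGGAAUU", "AACCGGUUAA"]

ranked = enrichment(alt_seqs, const_seqs, k=6)
print(ranked[:3])
```

On real data, a ranking like this would only be a starting point: weak signals that act in groups call for combining several words, and any score would need a significance test against a suitable background.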