III: Small: From Regular Expressions to Nested Words in Complex Event and Semistructured Information Processing

$499,999 | FY2011 | CSE | NSF

University Of California-Los Angeles, Los Angeles CA

Investigators

Abstract

The need for more powerful and efficient query languages to find complex patterns in stored sequences and data streams is shared by a wide spectrum of applications, including software analysis, complex event processing, identification of RNA structures, temporal databases and XML queries. The goal of this project is to develop a unified framework that supports highly expressive pattern languages and their query optimization techniques across different application domains. To achieve this goal, the project (i) uses nested Kleene-closure (K*) constructs to give the query languages greater expressive power, and (ii) uses Nested Words and Visibly Pushdown Automata as the basis for their unified implementation and query optimization across application domains. This K*-based approach was previously applied successfully to relational sequences and is now generalized to different computing environments and application domains.

Through the unified framework, the project designs and demonstrates XML and temporal query languages that compare favorably with existing ones in expressive power and performance. It then demonstrates the use of the unified framework in new application areas; in particular, it develops efficient query languages for RNA structures and software analysis. These research results will have a significant impact on many applications, such as software analysis, genomic databases, complex event processing, digital government and scientific studies.

This project supports Ph.D. students pursuing research in the areas of advanced query languages and data stream management systems. A new graduate-level course covering these areas and integrating the research results from the project is introduced into the curriculum. Publications, technical reports, software and experimental data from this research are available via the project web site at: http://yellowstone.cs.ucla.edu/nsf-projects/RegExpr2NestedWords.html.
