CAREER: Towards Unifying Database Systems and Information Retrieval Systems
Cornell University, Ithaca NY
Investigators
Abstract
The goal of this research project is to develop techniques for unifying database systems and information retrieval systems. The main benefit of this approach is a general data management solution for both structured (database domain) and unstructured (information retrieval domain) data. This goal is achieved by (a) using the XML data model to capture structured, semi-structured, and unstructured data, and (b) supporting both database-style and information retrieval-style queries over XML data. The specific components of the research project are (1) supporting ranked keyword search queries (i.e., information retrieval-style queries) over a mix of structured and unstructured XML data, (2) supporting complex structured queries (i.e., database-style queries) over semi-structured XML data, and (3) integrating these two components into a unified data management system. The research ideas are implemented in a prototype database system and evaluated using real and synthetic data sets, and the resulting software is made publicly available. These results will have broad impact through their application to commercial data management systems. Additional information on this project is available at http://www.cs.cornell.edu/database/XML/XML.htm. This research also lays the foundation for the education plan, whose goal is to develop a new curriculum that presents unified data management concepts (as opposed to the standard practice of teaching largely unrelated courses on database systems and information retrieval systems). This goal is achieved by developing new undergraduate and graduate courses on the management of both structured and unstructured data.