GrantIndex
Representing and Acquiring Knowledge of Genome Regulation

$322,604 · R01 · FY2006 · LM · NIH

University Of Michigan At Ann Arbor, Ann Arbor MI

Abstract

DESCRIPTION (provided by applicant): Knowledge in molecular biology consists of assertions about the relationships among molecular entities, qualified by context that describes when and where those assertions apply. The vast majority of this knowledge resides in the primary research literature, and only a small fraction is currently accessible through well-structured databases. This is a pilot project to develop automated knowledge extraction technology, using the regulation of gene expression in hematopoiesis as a test domain. Knowledge acquisition will proceed through a multi-stage process: parsing the document and sentence structure; recognizing the names of known biological entities; matching sentences to verb-based templates to capture assertions (e.g., "A binds B", "A contains B", or "A regulates B"); and matching preposition templates to capture the context in which these assertions apply. A multi-disciplinary approach will be used, drawing on experts in bioinformatics, databases, information science, and computational linguistics. Four unique aspects of this project are the definition of a multi-dimensional description of molecular biological context; the use of preposition templates and hierarchical document structure to capture and make inferences about context; the development of domain-specific parsing techniques; and the use of probabilistic representations, explicitly represented in XML, throughout text processing, parsing, knowledge acquisition, and information integration.
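
The template-matching stages described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the entity lexicon, the verb and preposition tables, the context dimension names (`location`, `stage`), the XML element names, and the example sentence are all hypothetical, and the probability is simply passed in by the caller rather than estimated from text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical lexicon of known entity names (illustrative only).
ENTITIES = ["GATA1", "EPOR", "FOG1"]
# Hypothetical verb templates: surface verb -> relation label.
VERBS = {"binds": "binds", "contains": "contains", "regulates": "regulates"}
# Hypothetical preposition templates: preposition -> context dimension.
PREPOSITIONS = {"in": "location", "during": "stage"}

ENTITY_RE = "|".join(re.escape(e) for e in ENTITIES)
VERB_RE = "|".join(VERBS)
PREP_RE = "|".join(PREPOSITIONS)

# "<entity> <verb> <entity>", e.g. "A regulates B".
ASSERTION_RE = re.compile(rf"\b({ENTITY_RE})\s+({VERB_RE})\s+({ENTITY_RE})\b")
# "<preposition> <phrase>", where the phrase ends at the next preposition,
# at punctuation, or at the end of the sentence.
CONTEXT_RE = re.compile(
    rf"\b({PREP_RE})\s+([\w\- ]+?)(?=\s+(?:{PREP_RE})\b|[.,;]|$)"
)


def extract(sentence):
    """Return the first verb-template match and its context, or None."""
    m = ASSERTION_RE.search(sentence)
    if m is None:
        return None
    subject, verb, obj = m.groups()
    # Only text after the assertion is scanned for context phrases.
    context = {
        PREPOSITIONS[prep]: phrase.strip()
        for prep, phrase in CONTEXT_RE.findall(sentence[m.end():])
    }
    return {
        "subject": subject,
        "relation": VERBS[verb],
        "object": obj,
        "context": context,
    }


def to_xml(assertion, probability):
    """Serialize an assertion to XML with a caller-supplied probability."""
    elem = ET.Element(
        "assertion",
        relation=assertion["relation"],
        probability=f"{probability:.2f}",
    )
    ET.SubElement(elem, "subject").text = assertion["subject"]
    ET.SubElement(elem, "object").text = assertion["object"]
    for dimension, value in assertion["context"].items():
        ET.SubElement(elem, "context", dimension=dimension).text = value
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    # Illustrative sentence, not drawn from the project's corpus.
    found = extract("GATA1 regulates EPOR in erythroid cells during differentiation.")
    print(to_xml(found, probability=0.8))
```

Regular expressions stand in here for the domain-specific parsing the abstract proposes; a real system would work from parsed sentence structure rather than surface patterns, and would carry probabilities through every stage instead of attaching one at the end.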

View original record on NIH RePORTER →