Encoding Semantic Knowledge in Vector Space for Biomedical Information
University of Texas Health Science Center at Houston, Houston, TX
Abstract
DESCRIPTION (provided by applicant): Effective information retrieval from the biomedical literature is essential to supporting evidence-based clinical practice. However, research suggests that clinicians have difficulty generating sufficiently specific queries using existing interfaces to electronic information resources. Given the rapid growth of the biomedical research literature, tools are needed to help clinicians and researchers find and retrieve documents of interest. The vector space model, in which documents are represented as vectors in a high-dimensional space, is well established in information retrieval. However, because this model indexes documents by terms (or, in variants of the model, by concepts), the specificity with which documents can be queried is limited. In our recent research, we have developed Predication-based Semantic Indexing (PSI), a vector-based model that encodes into vector space knowledge in the form of object-relation-object triplets (or predications) extracted from MEDLINE by the SemRep system. In the proposed research we will develop and evaluate a new model of information retrieval based on PSI. This model will enable searching for documents using concepts and relations, in order to answer specific questions such as "what is used to treat tuberculosis?". This model represents a new direction in information retrieval research, and our hypothesis is that document representations based on predications will enable the specification of queries that are more precise than is possible with existing models. To test this hypothesis, the model will be evaluated using the OHSUMED test set and compared to the traditional vector space model using standard performance metrics.
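The contrast between term-based and predication-based indexing can be illustrated with a minimal sketch. It is not the PSI model itself, whose encoding is not specified in this abstract. It assumes a toy setup in which each document is a sparse count vector and similarity is the cosine measure. The two documents, their hand-written predications, and the helper names (`cosine`, `term_vector`, `predication_vector`) are hypothetical and serve only as illustration; in the proposed work the predications would come from SemRep.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy documents invented for illustration; predications are hand-written
# stand-ins for what an extraction system would produce.
docs = {
    "doc_a": {
        "terms": ["isoniazid", "treats", "tuberculosis"],
        "predications": [("isoniazid", "TREATS", "tuberculosis")],
    },
    "doc_b": {
        "terms": ["tuberculosis", "causes", "cough",
                  "aspirin", "treats", "headache"],
        "predications": [("tuberculosis", "CAUSES", "cough"),
                         ("aspirin", "TREATS", "headache")],
    },
}


def term_vector(terms):
    # Traditional vector space model: one dimension per term.
    return Counter(terms)


def predication_vector(predications):
    # Assumed simplification: one dimension per (relation, object) pair,
    # so a query can leave the first argument unspecified.
    return Counter((rel, obj) for _, rel, obj in predications)


# Query: "what is used to treat tuberculosis?"
term_query = term_vector(["treats", "tuberculosis"])
pred_query = Counter([("TREATS", "tuberculosis")])

term_scores = {name: cosine(term_query, term_vector(d["terms"]))
               for name, d in docs.items()}
pred_scores = {name: cosine(pred_query, predication_vector(d["predications"]))
               for name, d in docs.items()}

print("term-based:", term_scores)
print("predication-based:", pred_scores)
```

In this toy setup the term-based query gives both documents a nonzero score, because both mention "treats" and "tuberculosis", while the predication-based query matches only the document that asserts a TREATS relation with tuberculosis as its second argument.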