
Semantic and Machine Learning Methods for Mining Connections in the UMLS

$153,203 · R21 · FY2008 · LM · NIH

Columbia University Health Sciences, New York NY

Abstract

DESCRIPTION (provided by applicant): The Unified Medical Language System (UMLS) is an invaluable resource for the biomedical community. One of the intended uses of the UMLS Metathesaurus is to support the translation of terms from a source terminology into terms in a target terminology. It is evident from the research literature on the UMLS that users generally need to perform broader types of "translations": finding the terms with the closest meaning to a source term (mapping); finding terms that are related to a source term and can serve as a proxy for various functions (e.g. information retrieval, knowledge discovery); or finding target terms that satisfy some structural or semantic constraint (e.g. information-theoretic distance). The methods for finding such "translations", or connections between terms in the Metathesaurus (other than the case of one-to-one synonymy), are not at all clear. Previous attempts to exploit such connections have depended either on manual selection of relevant connections or on problem-specific algorithms that use expert knowledge about the relative suitability of various inter-concept relationships. We believe that machine learning techniques offer automated, generalizable approaches that are appropriate for use with the UMLS, given the large set of potential connections and the need for a problem-independent approach. We hypothesize that learning strategies that exploit the relational features, scale-free properties, and probabilistic dependencies of connections in the UMLS will identify meaningful inter-term relationships, and that a combined approach will perform better across different problem domains than any of the approaches in isolation. We will evaluate the proposed learning algorithms with training connections from a variety of problem domains in biomedicine.
We will disseminate the successful algorithms via the UMLS Knowledge Source API toolkit for mining and visualizing the connections. We believe that the UMLS provides uniquely fertile ground for developing novel semantic relational mining methods and for advancing our understanding of mining large biomedical concept graphs.
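To make the notion of a "connection" under a structural constraint concrete, the sketch below searches a small concept graph for targets reachable from a source term within a fixed number of relationship hops, optionally restricted to certain relationship types. It illustrates the problem setting only, not the learning methods the applicants propose. The graph contents, the relation labels, and the function name `connections_within` are all hypothetical and are not drawn from the UMLS or its API.

```python
from collections import deque

# Hypothetical toy graph: concept -> list of (relation, neighbor) pairs.
# Concept names and relation labels are illustrative, not real UMLS content.
TOY_GRAPH = {
    "myocardial infarction": [
        ("isa", "heart disease"),
        ("finding_site", "myocardium"),
    ],
    "heart disease": [("isa", "cardiovascular disease")],
    "myocardium": [("part_of", "heart")],
    "heart": [],
    "cardiovascular disease": [],
}


def connections_within(graph, source, max_hops, allowed_relations=None):
    """Breadth-first search for concepts reachable from `source`.

    Returns a dict mapping each reachable concept to the list of relation
    labels along one shortest path, using at most `max_hops` edges. If
    `allowed_relations` is given, only edges with those labels are followed.
    """
    found = {}
    seen = {source}
    queue = deque([(source, [])])
    while queue:
        concept, path = queue.popleft()
        if len(path) == max_hops:
            continue  # structural constraint: do not expand past the hop limit
        for relation, neighbor in graph.get(concept, []):
            if allowed_relations is not None and relation not in allowed_relations:
                continue
            if neighbor in seen:
                continue
            seen.add(neighbor)
            found[neighbor] = path + [relation]
            queue.append((neighbor, path + [relation]))
    return found


if __name__ == "__main__":
    # All concepts within two hops of the source, following any relation.
    print(connections_within(TOY_GRAPH, "myocardial infarction", 2))
    # Only hierarchical ("isa") connections within two hops.
    print(connections_within(TOY_GRAPH, "myocardial infarction", 2, {"isa"}))
```

A fixed hop limit and a hand-picked relation filter are exactly the kind of manual, problem-specific choices the abstract describes in earlier work; the proposal's premise is that such choices could instead be learned from training connections.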
