
CIF21 DIBBS: Porting Practical Natural Language Processing (NLP) and Machine Learning (ML) Semantics from Biomedicine to the Earth, Ice and Life Sciences

$1,497,785 | FY2014 | CSE | NSF

University Of Colorado At Boulder, Boulder CO

Investigators

Abstract

Semantics is the study of word-based information. The sciences are filled with word-based descriptive data: field observations, materials and habitat identifications, parameter names and units, events and processes. Semantics is also important in medicine, where the human body and illnesses have to be described. To enhance interoperability among these word-based (semantic) systems, and to more readily explore the rapidly growing quantities of semantic data, there has been a movement towards organizing word-based data in ways that allow machine-assisted, automated analysis. Biomedicine has made great progress in organizing and using semantic information because of substantial funding investments. This project builds upon extensive investments in the biomedical field, providing an opportunity to rapidly develop the organization of semantic concepts for other domain sciences. A toolkit developed by the Center for Computational Language and Education Research (CLEAR TK) will be used to build semantic resources (taxonomies, ontologies, and semantic networks) for three science domains (geology, cryology, and biology). CLEAR TK is a state-of-the-art natural language processing (NLP) and machine learning (ML) system that also has essential tools for machine-assisted annotation, validation, document tagging, and event extraction. The CLEAR TK system has been used operationally for biomedical semantic applications, including in high-profile hospitals. In this project, development focuses on the science fields of geology, ice and snow, and biology. In these fields, accurate extraction of semantic information from the word-based data is required so users can quickly find the data they really need. This project provides a valuable opportunity to expand and evaluate semantic capabilities in conjunction with several scientific domain experts.
