NLP Foundational Studies & Ontologies for Syndromic Surveillance from ED Reports
University Of Pittsburgh At Pittsburgh, Pittsburgh PA
Investigators
Linked publications & trials
Abstract
DESCRIPTION (provided by applicant): Many NLP applications have been successfully developed to extract information from text. Most of the applications have focused on identifying individual clinical conditions in textual records, which is the first step in making the conditions available to computerized applications. However, identifying individual instances of clinical conditions is not sufficient for many medical informatics tasks - the context surrounding the condition is crucial for integrating the information within the text to determine the clinical state of a patient. We propose to perform in-depth studies on NLP issues requiring knowledge of the context of clinical conditions in clinical records. We will focus our research by using syndromic surveillance from emergency department (ED) reports as a case study. For this proposal, we will test the following hypothesis: An NLP system that indexes clinical concepts and integrates contextual information modifying the concepts can identify acute clinical conditions from ED reports as well as physicians can. We will identify clinical concepts necessary for surveillance of seven syndromes, including respiratory, gastrointestinal, neurological, rash, hemorrhagic, constitutional, and botulinic. To evaluate the hypothesis, we will perform the following specific aims: Aim 1. Perform in-depth, foundational studies on four NLP topics to gain a deeper understanding of the pertinent NLP research capabilities required for identification of acute clinical conditions from ED reports, including negation, uncertainty, temporal discrimination, and finding validation; Aim 2. Apply the knowledge learned from the foundational studies to develop and evaluate an automated application for ED reports that will determine the values for clinical variables relevant to identifying patients with any of seven syndromes. The research is innovative, because it will generate an in-depth study of multiple NLP topics crucial to understanding a patient's clinical state from textual records and will focus on contextual understanding and analysis. The research will be guided by linguistic principles, by the semantics and discourse structure of ED reports, and by the application area of biosurveillance. Because we will develop research methods and tools that are customized to a particular domain, we will constrain the research space, which will provide direction and enhance the chance for success. However, the methods and tools generated by this research should be extensible to other clinical report types and to other domain applications, because we will explicitly specify and study NLP concepts and relationships that are common to many application areas.
View original record on NIH RePORTER →