Pharmacovigilance using Natural Language Processing, Statistics, and the EHR
Columbia University Health Sciences, New York NY
Abstract
DESCRIPTION (provided by applicant): The long-term objective of this proposal is to advance patient safety and reduce the cost of medical care by discovering novel adverse drug events (ADEs) through the use of automated methods. We will apply natural language processing (NLP) and data mining methodologies to vast quantities of clinical data in electronic health records (EHRs) to detect novel ADE signals. ADEs are a major problem worldwide: they cause hospitalizations and deaths and impose a huge cost on health care. Therefore, continued post-marketing surveillance encompassing large and varied patient populations is crucial for patient safety. EHRs contain comprehensive clinical information, which, if harnessed properly, would be invaluable for pharmacovigilance. We have already demonstrated that we can accurately encode information in clinical reports using the NLP system MedLEE, and that we can accurately detect associations among clinical events using statistical methods that we developed. Therefore, this is an excellent opportunity to build on our research accomplishments and to advance the state of the art in pharmacovigilance. More specifically, MedLEE will be used to map comprehensive clinical information in the EHR to codified data, and then statistical methods will be used to generate an extensive knowledge base of disease-symptom, disease-drug, drug-drug, and drug-symptom associations, which will be used to discover new ADEs. Additionally, we will develop methods to determine the correct sequence of drug, disease, and symptom events, which is critical for detecting ADEs. We will also develop methods to map fine-grained concepts into higher-level concepts, which is important for optimizing the statistical methods. The performance of our discovery methods will be evaluated by testing them on drugs currently in use with known ADEs, and also by using historical rollback.
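As a rough illustration of the kind of drug-symptom association described above, the sketch below counts co-occurrence of a coded drug and a coded symptom across patients and computes an odds ratio with an approximate 95% confidence interval. This is a generic disproportionality measure chosen for illustration, not the investigators' statistical method, which the abstract does not specify. The function names, the concept labels (`drugX`, `rash`), and the one-set-of-concepts-per-patient input format are all assumptions, and the sketch ignores the temporal ordering of events that the abstract identifies as critical.

```python
from math import exp, log, sqrt


def contingency(records, drug, symptom):
    """Build a 2x2 table from per-patient sets of coded concepts.

    Returns (a, b, c, d):
      a = drug and symptom, b = drug only,
      c = symptom only,     d = neither.
    `records` is a hypothetical input format: one set of concept
    codes per patient, e.g. as produced by an NLP encoding step.
    """
    a = b = c = d = 0
    for concepts in records:
        has_drug = drug in concepts
        has_symptom = symptom in concepts
        if has_drug and has_symptom:
            a += 1
        elif has_drug:
            b += 1
        elif has_symptom:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that
    empty cells do not cause division by zero.
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Toy data only: eight hypothetical patients.
    records = [
        {"drugX", "rash"}, {"drugX", "rash"}, {"drugX"}, {"rash"},
        {"cough"}, {"cough"}, set(), {"drugY"},
    ]
    table = contingency(records, "drugX", "rash")
    print(table, odds_ratio_ci(*table))
```

A signal would typically be flagged only when the lower confidence bound exceeds 1, and in practice the counts would be restricted to cases where the drug exposure precedes the symptom.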
We will first focus on the discovery of short-term events using inpatient records, and then on longer-term events using records of outpatient office visits. This proposal is well positioned to overcome problems associated with existing automated methods based on spontaneous reporting databases and administrative databases. We are confident the methods will be effective because a strong infrastructure is in place for us to build upon. Most importantly, the methodology developed in this proposal presents an excellent chance to dramatically improve patient safety and reduce costs.