Crowdsourcing Mark-up of the Medical Literature to Support Evidence-Based Medicine and Develop Automated Annotation Capabilities

$240,650 · UH2 · FY2016 · CA · NIH

Northeastern University, Boston MA

Investigators

Byron Casey Wallace (contact), Zachary Ives, Ani Nenkova

Linked publications & trials

Paper 30306147, Paper 30305770, Paper 29314757, Paper 29093611, Paper 28677322, Paper 28541493, Paper 27883904, Paper 27810004

Abstract

DESCRIPTION (provided by applicant): Evidence-based medicine (EBM) promises to transform the way that physicians treat their patients, resulting in better quality and more consistent care informed directly by the totality of relevant evidence. However, clinicians do not have the time to keep up to date with the vast medical literature. Systematic reviews, which provide rigorous, comprehensive and transparent assessments of the evidence pertaining to specific clinical questions, promise to mitigate this problem by concisely summarizing all pertinent evidence. But producing such reviews has become increasingly burdensome (and hence expensive) due in part to the exponential expansion of the biomedical literature base, hampering our ability to provide evidence-based care.

If we are to scale EBM to meet the demands imposed by the rapidly growing volume of published evidence, then we must modernize EBM tools and methods. More specifically, if we are to continue generating up-to-date evidence syntheses, then we must optimize the systematic review process. Toward this end, we propose developing new methods that combine crowdsourcing and machine learning to facilitate efficient annotation of the full texts of articles describing clinical trials. These annotations will comprise mark-up of sections of text that discuss clinically relevant fields of importance in EBM, such as patient characteristics, interventions studied and potential sources of bias. Such annotations would make literature search and data extraction much easier for systematic reviewers, thus reducing their workload and freeing more time for them to conduct thoughtful evidence synthesis.

This will be the first in-depth exploration of crowdsourcing for EBM. We will collect annotations from workers with varying levels of expertise and cost, ranging from medical students to workers recruited via Amazon Mechanical Turk. We will develop and evaluate novel methods of aggregating annotations from such heterogeneous sources, and we will use the acquired manual annotations to train machine learning models that automate this mark-up process. Models capable of automatically identifying clinically salient text snippets in full-text articles describing clinical trials would be broadly useful for biomedical literature retrieval tasks and would have impact beyond our immediate application of EBM.
