A deep learning algorithm to detect signs of cognitive impairment in electronic health records
Massachusetts General Hospital, Boston MA
Investigators
Abstract
Alzheimer's Disease and Related Dementias (AD/ADRD) outcomes from real-world data, such as electronic health records (EHR), offer the possibility of examining a wide variety of research questions that cannot be answered efficiently, or at all, in other settings. A key challenge is that AD/ADRD is under-recognized in the community, under-diagnosed by healthcare professionals, and under-coded in claims data, and can be mislabeled in any setting. Thus, approaches relying on dementia diagnosis codes or medications suffer from inaccuracies in these data. EHR has a wealth of information in clinical notes, patient health history, and health system interactions that often contain signs of cognitive decline. Deep learning algorithms can leverage and learn from these complex text and data patterns in EHR. In this proposal, we aim to develop and evaluate a deep learning algorithm to improve the detection of cognitive impairment due to underlying AD/ADRD pathophysiology (including cognitive concerns, mild cognitive impairment, and dementia) using the EHR of three large healthcare institutions. For training and evaluation of the algorithm, we will use a "seed" reference standard set with detailed chart review and adjudication of cognitive diagnosis by an expert clinician (n=1,000), and then apply active learning strategies with diversity sampling to better reflect the characteristics of US older adults and iteratively increase sample size to n=20,000. We will rigorously evaluate the algorithm using EHR from all three institutions, and develop openly available guidelines and resources for the research community. Our specific aims are: 1) To develop and evaluate a deep learning NLP tool to identify patients with cognitive impairment using EHR at one institution; 2) To refine and evaluate the performance of our EHR deep learning algorithm at two other healthcare institutions; and 3) To develop open guidelines, resources, and tools for EHR data use in dementia research.
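As a rough illustration of the active learning step with diversity sampling described above, the sketch below picks the next batch of charts for expert review. It is a minimal sketch under stated assumptions, not the proposal's method: the abstract does not specify the selection strategy, so uncertainty sampling combined with round-robin selection across strata is assumed here, and the function name `select_for_review`, the tuple layout, and the toy pool are all hypothetical.

```python
from collections import defaultdict


def select_for_review(pool, batch_size):
    """Pick charts for expert review: most uncertain first, spread across strata.

    pool: list of (patient_id, predicted_probability, stratum) tuples.
    The stratum (e.g. a demographic group) is an assumed way to encode diversity.
    """
    by_stratum = defaultdict(list)
    for patient_id, prob, stratum in pool:
        # Uncertainty is highest when the predicted probability is near 0.5.
        uncertainty = 1.0 - abs(prob - 0.5) * 2.0
        by_stratum[stratum].append((uncertainty, patient_id))

    # Within each stratum, order candidates from most to least uncertain.
    for candidates in by_stratum.values():
        candidates.sort(key=lambda c: (-c[0], c[1]))

    selected = []
    strata = sorted(by_stratum)
    # Round-robin over strata so that no single group dominates the batch.
    while len(selected) < batch_size and any(by_stratum[s] for s in strata):
        for s in strata:
            if by_stratum[s] and len(selected) < batch_size:
                selected.append(by_stratum[s].pop(0)[1])
    return selected


# Toy pool with made-up identifiers and probabilities (illustrative only).
pool = [
    ("p1", 0.51, "A"), ("p2", 0.95, "A"), ("p3", 0.45, "A"),
    ("p4", 0.10, "B"), ("p5", 0.60, "B"),
]
batch = select_for_review(pool, 3)
```

In each round, the selected charts would be reviewed and adjudicated, added to the reference standard set, and the model retrained before the next selection.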
We will measure the marginal improvement in accuracy of our deep learning-based classification relative to models based on diagnosis codes and medications alone, and characterize the predictors of poor model performance, both to improve the model and to understand potential biases. As such, our tool will provide a better understanding of the limitations of using diagnosis codes and/or medications in dementia research. Cutting-edge deep learning algorithms have been applied to many real-world tasks, but only in a limited manner to AD/ADRD. We anticipate that our state-of-the-art deep learning algorithm, which will be rigorously developed and validated with large representative datasets at multiple institutions, will more efficiently and accurately detect signs of cognitive impairment and can be readily deployed by practitioners. Improved screening of cognitive impairment in EHR will enhance dementia research studies and enable large-scale pragmatic trials. We hope that, in the future, the proposed tool will also be useful in clinical settings to flag patients with cognitive impairment who could benefit from an evaluation or be referred to specialist care.
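The comparison against diagnosis codes and medications alone can be sketched as follows. This is a minimal sketch under assumptions: the abstract does not name the accuracy metrics, so sensitivity and specificity against the chart-review reference standard are assumed, the function name `sensitivity_specificity` is hypothetical, and the labels are made-up toy values, not study results.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) of binary predictions vs. reference labels."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (illustrative only): 1 = cognitive impairment per chart review.
reference = [1, 1, 1, 1, 0, 0, 0, 0]
codes_only = [1, 0, 0, 0, 0, 0, 0, 0]   # assumed baseline: diagnosis codes/medications
deep_model = [1, 1, 1, 0, 0, 0, 0, 1]   # assumed deep learning classifier output

base_sens, base_spec = sensitivity_specificity(reference, codes_only)
model_sens, model_spec = sensitivity_specificity(reference, deep_model)

# Marginal change of the model relative to the code-based baseline.
sens_gain = model_sens - base_sens
spec_change = model_spec - base_spec
```

Computing the same quantities within patient subgroups is one way to look for the predictors of poor model performance mentioned above.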