Discovering and Applying Knowledge in Clinical Databases

$386,810 | R01 | FY2010 | LM | NIH

Columbia University Health Sciences, New York NY

Investigators

Linked publications & trials

Paper 39638783 Paper 38782170 Paper 38712282 Paper 38605048 Paper 38547660 Paper 38519026 Paper 38501749 Paper 38370787 Paper 38260285 Paper 38175665 Paper 38010062 Paper 37984547 Paper 37847668 Paper 37604272 Paper 37565138 Paper 37208468 Paper 37128407 Paper 36935011 Paper 36518108 Paper 36188566 Paper 36117719 Paper 36108816 Paper 35998962 Paper 35873596 Paper 35752163 Paper 35680274 Paper 35653017 Paper 35559254 Paper 35482996 Paper 35392566 Paper 35345821 Paper 35196735 Paper 34981073 Paper 34937726 Paper 34899334 Paper 34415908 Paper 34333606 Paper 34304580 Paper 34083350 Paper 33975825 Paper 33882595 Paper 33791740 Paper 33791732 Paper 33775125 Paper 33725121 Paper 33661754 Paper 33653035 Paper 33585936 Paper 33554609 Paper 33367288 Paper 33319713 Paper 33269356 Paper 33164065 Paper 33140068 Paper 33120430 Paper 33114631 Paper 33107944 Paper 33099616 Paper 33024121 Paper 33012341 Paper 32909033 Paper 32864627 Paper 32827027 Paper 32734169 Paper 32632237 Paper 32587982 Paper 32568364 Paper 32511507 Paper 32511443 Paper 32471884 Paper 32379955 Paper 32374408 Paper 32335224 Paper 32134687 Paper 32065600 Paper 31866433 Paper 31668726 Paper 31642211 Paper 31454628 Paper 31365089 Paper 31325501 Paper 30646124 Paper 30414475 Paper 30395248 Paper 30312445 Paper 30172760 Paper 30082302 Paper 29779949 Paper 29531023 Paper 29523157 Paper 29369797 Paper 29337804 Paper 29079501 Paper 29040596 Paper 29024976 Paper 28456512 Paper 28448498 Paper 28410982 Paper 28334070 Paper 28269874

Abstract

DESCRIPTION (provided by applicant): The long-term goal of our ongoing project, "Discovering and applying knowledge in clinical databases", is to learn from data in the electronic health record (EHR) and to apply that knowledge to relevant problems. The increasing adoption of the EHR promises to provide data for clinical research and informatics research, but secondary use of the data has been limited. Challenges include the complexity, incompleteness, and inaccuracy of the record. We propose to study the EHR from an information-theoretic point of view, treating the EHR as a natural object worthy of study, and applying methods from non-linear time series analysis. Armed with a better understanding of the record, we hope to measure and account for data completeness and to improve interpretation and use of the data. We hypothesize that we can characterize an electronic health record using a formal information-theoretic framework, and that the measured properties can help answer informatics and clinical questions. Our aims are to (1) develop an information-theoretic framework for characterizing the electronic health record, (2) use the information-theoretic framework to study EHR and sampling issues, and (3) use the framework and traditional data mining to answer clinical and informatics questions. We will approach the EHR as a complex time series and characterize the information in the record using univariate sequential mutual information (the degree to which observations of a variable predict future observations) and a network of pairwise mutual information among all variables, discrete and continuous. The result will be a measure of the predictability of the record and a set of associations among clinical features. We will use the predictability results to study the completeness of a patient's record, the appropriateness of a clinician's sampling rate, outlier data points, and changes in patient acuity.
We will use predictability and associations to link narrative abstractions with their primary data, to interpret narrative modifiers, to cluster terms, to find associations (in the context of phenome-wide association studies), and to carry out exploratory analyses of defining phenotype profiles and of mutual information-based surveillance.
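
The two quantities named in the abstract, univariate sequential mutual information and a network of pairwise mutual information, can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names, the plug-in (frequency-count) estimator, the equal-width binning of continuous values, and the fixed lag are all assumptions made here for illustration, and the sketch ignores the irregular sampling that real EHR data would require handling.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def discretize(values, n_bins=4):
    """Map continuous values to equal-width bin indices (an illustrative choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def sequential_mutual_information(series, lag=1):
    """MI between a discrete series and itself shifted by `lag` observations."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    return mutual_information(series[:-lag], series[lag:])


def pairwise_mutual_information(columns):
    """MI for every unordered pair of named, equal-length discrete variables."""
    return {
        (a, b): mutual_information(columns[a], columns[b])
        for a, b in combinations(sorted(columns), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy data, not drawn from any clinical record.
    alternating = [0, 1] * 50
    print(sequential_mutual_information(alternating, lag=1))
    toy = {"a": [0, 1, 0, 1], "b": [0, 1, 0, 1], "c": [0, 0, 1, 1]}
    print(pairwise_mutual_information(toy))
```

In this sketch a perfectly alternating series has sequential mutual information close to 1 bit (each observation almost fully predicts the next), while a constant series has 0 bits. The pairwise dictionary can be read as a weighted edge list for the association network described above.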
