Improving precision health approaches through large-scale EHRs and biobanks

$1,338,175 | ZIA | FY2022 | HG | NIH

National Human Genome Research Institute

Abstract

During the current reporting period we have focused on these major projects:
1. All of Us Smoking PheWAS
The All of Us Research Program (All of Us) is a multi-site, national study with the goal of recruiting at least one million participants to further precision medicine. We assessed the extent to which the currently available All of Us dataset can replicate known findings reported in meta-analyses of a commonly studied environmental exposure, cigarette smoking, as defined in EHR data and in survey responses completed upon enrollment. We conducted three phenome-wide association studies (PheWAS): on EHR-reported smoking history, on survey responses of ever having smoked, and on survey responses indicating current smoking status. In a manuscript currently in preparation, we report that, in each of the three PheWAS, the majority of phenome-wide significant phenotypes that could be matched to a meta-analysis replicated the direction of at least one previously published meta-analysis effect size. Survey-based smoking status provides a more sensitive and detailed measure of smoking behavior, but both EHR and survey smoking measures replicated known associations well.
2. Study of hypertension traits in large-scale biobanks
According to the CDC, nearly half of adults in the United States (108 million, or 45%) have hypertension, defined as a systolic blood pressure >=130 mm Hg or a diastolic blood pressure >=80 mm Hg, or are taking medication for hypertension. We sought to evaluate associations of geographic genetic ancestry with hypertension and underlying blood pressure traits.
In collaboration with others from the Million Veteran Program (MVP), we tested genetically inferred ancestry proportions from five 1000 Genomes reference populations (GBR, PEL, YRI, CHB, and LWK) for association with four continuous blood pressure (BP) traits (SBP, DBP, PP, and MAP) and with the dichotomous outcomes hypertension and apparent treatment-resistant hypertension in European American, African American, and Hispanic American individuals from the MVP database. We demonstrated that risk for BP traits varies significantly by genetic ancestry. More recently, as part of a multi-institution collaborative effort, we have been exploring the genetic determinants of hypertension in African-ancestry individuals in the All of Us Research Program, taking advantage of the rich diversity of that database to identify African ancestry-specific loci that may contribute to the development of hypertension. In a related study, our group and others sought to estimate the effect of BP traits and BP-lowering medications (via genetic proxies) on peripheral artery disease (PAD). Genome-wide association study (GWAS) summary statistics were obtained for BP, PAD, and coronary artery disease. Causal effects of BP on PAD were estimated by two-sample Mendelian randomization (MR) using a range of pleiotropy-robust methods. In a manuscript published this year, we found that higher BP is likely to cause PAD. BP lowering through β-blockers, loop diuretics, and thiazide diuretics (as proxied by genetic variants) was associated with decreased risk of PAD.
3. Personalized drug therapy/drug repurposing
Electronic health records (EHRs) contain a dense, comprehensive collection of disease history, medication exposure, and demographic data that can be used for personalized drug therapy and for repurposing existing medications to treat other conditions.
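As an illustration of the two-sample MR analyses mentioned in this abstract, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-variant summary statistics. IVW is the standard baseline estimator and is not itself pleiotropy-robust; the abstract does not say which specific methods were used, so this is not a reconstruction of the authors' pipeline. The function name `ivw_estimate` and the toy numbers are assumptions made for illustration only.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW causal estimate and its standard error.

    Each index is one genetic variant (instrument). Weights are the inverse
    variance of the variant-outcome association.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if len(beta_exposure) == 0:
        raise ValueError("at least one variant is required")

    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("variant-exposure effects are all zero")

    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy, made-up summary statistics (hypothetical, not from any study).
bx = [0.10, 0.20, 0.30]   # variant -> exposure (e.g. BP) effects
by = [0.05, 0.10, 0.15]   # variant -> outcome (e.g. PAD) effects
se = [0.01, 0.01, 0.01]   # standard errors of the outcome effects
estimate, std_err = ivw_estimate(bx, by, se)
print(f"IVW estimate = {estimate:.3f}, SE = {std_err:.3f}")
```

Pleiotropy-robust estimators such as the weighted median or MR-Egger take the same inputs but relax the assumption that every variant is a valid instrument.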
In a recently published study, we and our co-authors incorporated tissue-specific predicted gene expression summary statistics to proxy the therapeutic effects of two lipid-control medications on type 2 diabetes (T2D) risk using two-sample MR methods. Sex, race, and ethnicity can affect the efficacy, pharmacokinetics, and side effects of certain medications. In another study currently underway, we are leveraging EHR data in All of Us to identify demographic-based disparities in the efficacy of anti-hypercholesterolemia, anti-hypertensive, and anti-diabetic drugs. Preliminary data suggest that this study design is an effective way to evaluate drug efficacy and establish the potential of real-world evidence, in the form of EHR data, to guide medication therapy selection. Our results demonstrate systematically lower antihypertensive drug efficacy in Black All of Us participants, due partially to a higher systolic blood pressure at the start of medication.
4. COVID-19
Variation in outcomes for patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (COVID-19), is still poorly understood. Epidemiological investigation of disease status and mortality with EHR-based outcomes may elucidate phenotypic patterns to improve the accuracy of prognoses. The National COVID Cohort Collaborative (N3C) provides researchers with a wealth of observational data to help them understand disease progression and treatment. N3C currently consists of more than 15 million patient EHRs from over 75 institutions. The goal of one sub-project of our lab has been to conduct a PheWAS to find common phenotypes associated with COVID-19 in the N3C enclave. The most statistically significant PheWAS findings were symptoms notably associated with COVID-19 morbidity and mortality.
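The PheWAS design used in several of these projects can be sketched as a loop that tests one exposure (for example, a smoking indicator or a COVID-19 diagnosis) against every phenotype code and applies a multiple-testing correction. The version below is a deliberately simplified, unadjusted sketch: it uses a Pearson chi-square test on a 2x2 table and a Bonferroni threshold, whereas PheWAS in practice typically fit a regression per phenotype with covariate adjustment. The function `phewas_scan`, the participant IDs, and the phenotype codes are all hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def phewas_scan(
    participants: Set[str],
    exposed: Set[str],
    phenotype_cases: Dict[str, Set[str]],
    alpha: float = 0.05,
) -> List[Tuple[str, float, float, bool]]:
    """Return (code, odds_ratio, p_value, significant), sorted by p-value."""
    if not phenotype_cases:
        return []
    threshold = alpha / len(phenotype_cases)  # Bonferroni correction
    exposed = exposed & participants
    results = []
    for code, cases in phenotype_cases.items():
        cases = cases & participants
        a = len(cases & exposed)                  # exposed, with phenotype
        b = len(exposed - cases)                  # exposed, without phenotype
        c = len(cases - exposed)                  # unexposed, with phenotype
        d = len(participants - exposed - cases)   # unexposed, without phenotype
        n = a + b + c + d
        margins = (a + b) * (c + d) * (a + c) * (b + d)
        # Pearson chi-square statistic for a 2x2 table (1 degree of freedom).
        chi2 = n * (a * d - b * c) ** 2 / margins if margins else 0.0
        # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
        p_value = math.erfc(math.sqrt(chi2 / 2.0))
        # Odds ratio with the Haldane-Anscombe 0.5 correction for empty cells.
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        results.append((code, odds_ratio, p_value, p_value < threshold))
    return sorted(results, key=lambda row: row[2])


# Toy cohort with made-up IDs and codes (hypothetical data).
people = {f"p{i}" for i in range(100)}
exposed_ids = {f"p{i}" for i in range(50)}
cases_by_code = {
    "code_A": {f"p{i}" for i in range(40)},
    "code_B": {f"p{i}" for i in range(10)} | {f"p{i}" for i in range(50, 60)},
}
for row in phewas_scan(people, exposed_ids, cases_by_code):
    print(row)
```

In this toy cohort, `code_A` is concentrated among exposed participants and passes the Bonferroni threshold, while `code_B` is evenly split and does not.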
We also investigated common phenotypic patterns associated with COVID-19 across the N3C and All of Us Research Program data enclaves. In addition to directly comparing phecode frequencies between All of Us and N3C, we stratified the data by demographic information and COVID-19 diagnosis, and we are seeking explanations for the differences in phecode occurrence rates across these strata. Another recent COVID-related project sought to discover the relationships within and between quarantine- and pandemic-related coping mechanisms and their effects on mental health outcomes, using the All of Us COVID-19 Participant Experience (COPE) survey data. A manuscript for this analysis is now in development.
5. GWAS in All of Us
With the release of whole genome sequencing data for nearly 100,000 participants in the All of Us dataset this past March, many in our lab turned their focus to genetic studies in this phenotypically rich and racially/ethnically diverse cohort. Genome-wide association studies are currently underway to investigate the genetic variants associated with conditions such as clonal hematopoiesis of indeterminate potential (CHIP), endometriosis, primary hypothyroidism, syndrome of inappropriate antidiuretic hormone secretion (SIADH), neurofibromatosis types 1 and 2, multiple sclerosis, and uterine fibroids, among others.
6. PheWAS and PheRS of Mendelian Diseases
We have previously demonstrated a scalable approach to discovering new phenotypes associated with rare genetic disorders using EHR data. This year we published a manuscript that used PheWAS to replicate known gene-phenotype relationships for hereditary cancer syndromes from the eMERGE Network and also identified a set of novel phenotypes that were subsequently replicated in an independent EHR-derived cohort. Pathogenicity of rare variants was accurately predicted using phenotype risk scores (PheRS).
We have found that a similar approach can be applied to other datasets, such as UK Biobank and All of Us, and to other Mendelian diseases. For example, in a new collaborative effort, we are using these techniques to aid identification of pathogenic variants in genes associated with familial Mediterranean fever (FMF). In this and other such studies currently underway, we hope to validate PheRS as an approach for improved variant interpretation and for identifying participants with Mendelian disease across multiple racial and ethnic groups.
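A phenotype risk score of the kind referred to above is commonly computed as a weighted sum of the clinical features of a Mendelian disease that appear in a person's record, with each feature weighted by the log of its inverse prevalence in the cohort, so that rarer features count more. The sketch below follows that general definition; it is not taken from the authors' code, and the function names, feature labels, and toy cohort are hypothetical.

```python
import math
from typing import Dict, Set


def phers_weights(
    cohort: Dict[str, Set[str]], disease_features: Set[str]
) -> Dict[str, float]:
    """Weight each disease feature by log(1 / prevalence) in the cohort."""
    n = len(cohort)
    weights = {}
    for feature in disease_features:
        count = sum(1 for codes in cohort.values() if feature in codes)
        if count > 0:  # features never observed get no weight
            weights[feature] = math.log(n / count)
    return weights


def phers(individual_codes: Set[str], weights: Dict[str, float]) -> float:
    """Sum the weights of the disease features present in one record."""
    return sum(w for feature, w in weights.items() if feature in individual_codes)


# Toy cohort: participant ID -> set of phenotype codes (hypothetical data).
toy_cohort = {
    "p1": {"feature_A", "feature_B"},
    "p2": {"feature_A"},
    "p3": set(),
    "p4": {"feature_A", "feature_C"},
}
toy_weights = phers_weights(toy_cohort, {"feature_A", "feature_B"})
toy_scores = {pid: phers(codes, toy_weights) for pid, codes in toy_cohort.items()}
print(toy_scores)
```

Here `feature_B` is rarer than `feature_A`, so the participant carrying both receives the highest score, and `feature_C` is ignored because it is not in the disease feature set.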