Center for Alzheimer's and Related Dementias (CARD): Harmonized Data-Derived Resources for the Alzheimer's Disease and Related Dementias Community

$15,505,643 · ZIA · FY2023 · AG · NIH

National Institute On Aging

Abstract

A major collaborative project at CARD Advanced Analytics, carried out with Konica Minolta and its Invicro subdivision, is standardizing and harmonizing longitudinal imaging data, comprising hundreds of thousands of brain MRI images, from the UK Biobank and multiple neurodegenerative-disease-specific resources such as ADNI and PPMI (https://invicro.com/invicro-dti-and-nih-bring-imaging-and-genomics-data-together/). Early deliverables include machine-learning-derived maps of brain atrophy and the integration of these outcomes with genomics data at scale, which will soon result in publicly shared code, data and manuscripts (Dadu et al., 2023). CARD has curated all currently available public-domain AD/ADRD genetics data and its relevant metadata, and is now shifting focus to accessing more deep molecular data. To make all of this data easily discoverable across repositories such as local NIH resources (the Biowulf cluster) and commercial cloud resources (Terra.bio and the Alzheimer's Disease Data Workbench), we have begun building a lightweight tool to locate the data we have curated, based on the existing REDCap infrastructure in place at NIA. This easy-to-use, low-cost-of-entry tool is called the Data file Inventory and Verification Environment for Research (DIVER). DIVER interfaces with the National Library of Medicine's common data elements (CDEs) library to aid in harmonization of AD/ADRD-relevant data (https://cde.nlm.nih.gov/home). DIVER also aids harmonization projects underway as part of collaborations with the University of Mississippi Medical Center on harmonizing extant NIA studies of cognitive aging, such as the Baltimore Longitudinal Study of Aging and the Health, Aging and Body Composition Study. In particular, the CDEs curated for DIVER have allowed us to accelerate the research of collaborators at the UK Dementia Research Institute and to facilitate early work on automated metadata harmonization across global repositories.
The goal of CARD's Advanced Analytics team is not to reinvent the wheel by building a new data sharing and analysis platform, but to leverage current gold-standard tools, selected after an internal systematic review of similar public offerings. Curated and harmonized data, including deep molecular data from iNDI and clinical and genetic data from GP2, as well as tools, have been shared appropriately to GitHub, Terra.bio and the Alzheimer's Disease Data Workbench. We have been liaising with internal NIA teams as well as external/extramural teams to ensure we are following best practices and to receive feedback on what we can improve. As part of this data harmonization strategy, we aimed to facilitate biobank-scale collaborations by standardizing electronic medical record codes for the UK Biobank, the Finnish Biobank and the All of Us Study, with special attention paid to AD/ADRD-relevant data. We are currently beginning collaborations to accomplish similar harmonization and analysis efforts with the Welsh Biobank in Cardiff (SAIL).
One early deliverable of these efforts is an analysis of viral exposures associated with risk of neurodegeneration up to 15 years prior to disease manifestation. We identified and replicated twenty-two novel pairs of viruses and neurodegenerative diseases in over 500,000 biobank samples, and also replicated the previously reported association between Epstein-Barr exposure and multiple sclerosis; this work was published recently in Neuron (Levine et al., 2022). The follow-up to this report includes an in-depth analysis of sleep disturbances as a major contributor to risk of neurodegeneration, using an expanded version of this data and codebase.
Longitudinal data harmonization and analysis pose a unique set of challenges. We have built a democratized and easily deployable longitudinal data analysis pipeline tailored for genomics data. We are currently expanding its functionality and usability to identify AD/ADRD-related imaging and CSF biomarker associations with genetics, to provide insights into the genetics of disease progression (https://longitudinal-gwas-pipeline.readthedocs.io/en/latest/). Some proofs of concept for this pipeline include evaluations of cognitive decline in Parkinson's disease datasets, as well as mortality and depressive symptom studies (Tan et al., 2020). In parallel to our work on genetic clustering across diseases mentioned above, we have also used harmonized clinical and genomic data to identify progression phenotypes in ALS/FTD and Parkinson's disease, with Lewy body dementia and Alzheimer's disease underway (Faghri et al., 2022). Work showcasing biomarker discovery with these tools, which identified potential new targets for CSF pTau (Ta et al., 2023, preprint), has recently been published, as has the application of unsupervised learning within a longitudinal context to various types of biomedical data (Dadu et al., 2023).
From data management and discoverability to aggregation and harmonization, CARD Advanced Analytics' proof of concept for this aspect of our scope of work is our current multi-ancestry analysis of Alzheimer's disease genetic risk. This project accurately quantifies risk heterogeneity across diverse continental ancestries, evaluates the generalizability of risk prediction, and discovers two novel risk loci while leveraging genetic diversity to fine-map genetic risk at nine loci (Lake et al., 2023).
Finally, making harmonized datasets, tools and web resources is only useful if the research community can actually use them. CARD Advanced Analytics has been working with external collaborators and CARD's own newly formed Training Team to support hackathons, office hours and one-on-one interactions with members of the research community from a variety of backgrounds, not only to show them the resources available but also to help them understand and use these resources efficiently.
It is our goal to help democratize complex data science research in the biomedical space at CARD and to understand the needs of the research community we are part of.
Additional preprints that have resulted from this work:
Alvarado CX, Makarious MB, Weller CA, Vitale D, Koretsky MJ, Bandres Ciga S, Iwaki H, Levine K, Singleton A, Faghri F, Nalls MA, Leonard H. omicSynth: an Open Multi-omic Community Resource for Identifying Druggable Targets across Neurodegenerative Diseases. medRxiv Preprint. 2023 Jul 14:2023.04.06.23288266. doi: 10.1101/2023.04.06.23288266. PMID: 37090611; PMCID: PMC10120805.
Alvarado CX, Weller CA, Johnson N, Leonard HL, Singleton AB, Reed X, Blauwendraat C, Nalls MA. Human brain single nucleus cell type enrichments in neurodegenerative diseases. medRxiv Preprint. 2023.06.30.23292084. doi: 10.1101/2023.06.30.23292084.
Ta M, Blauwendraat C, Antar T, Leonard HL, Singleton AB, Nalls MA, Iwaki H; Alzheimer's Disease Neuroimaging Initiative (ADNI); Fox Investigation for New Discovery of Biomarkers. Genome-wide meta-analysis of CSF biomarkers in Alzheimer's disease and Parkinson's disease cohorts. medRxiv Preprint. 2023 Jun 19:2023.06.13.23291354. doi: 10.1101/2023.06.13.23291354. PMID: 37398091; PMCID: PMC10312859.
