GrantIndex

Predicting Phenotype by Deep Learning Heterogeneous Multi-Omics Data

$368,260 · R01 · FY2025 · LM · NIH

University Of Texas Hlth Sci Ctr Houston, Houston TX

Abstract

Project Summary

Numerous genetic/epigenetic, transcriptomic, and single-cell omics datasets, as well as phenotypic and medical record data, have been generated. Many important findings regarding potential genetic markers, their related mechanisms, and promising drug targets have been reported. However, the high complexity of brain diseases such as Alzheimer’s disease (AD) and the currently limited size of the data in genetic and functional studies make feature engineering difficult. Current analytical approaches primarily focus on individual data types or simple integration strategies, which lack sufficient power to identify potential biomarkers or reveal the complex genetic architecture of aging-related diseases such as AD. Recently, multi-scale models and cross-modality deep learning (DL) algorithms have proven powerful for identifying patterns in complex, heterogeneous data that are difficult for humans to discern. In this renewal R01 project on deep learning of heterogeneous multi-omics data, we hypothesize that multi-scale modeling can effectively detect regulatory modules at the cell-type and sub-phenotype levels, leading to a precision medicine strategy. We bridge spatial scales (molecules, cells, and tissues) and temporal scales. Specifically, we will develop a novel, timely, AI-based framework, MICA-Brain: Multi-scale, Integrated, and Contextualized Approaches for Brain Aging and Disease. We propose three specific aims that contextualize cognitive function risks within their specific cell types and regulatory maps, in order to understand the shared and unique etiological mechanisms across brain regions, cell types, and disease stages. The framework will be applied to AD and other brain diseases.
Aim 1: We will build a brain molecular chronological age predictor by leveraging a single-cell generative pre-trained transformer (GPT) model trained on millions of brain cells from normally aging individuals, and subsequently predict the chronological age of cells from individuals with mild cognitive impairment (MCI) and AD.
Aim 2: We will develop a self-attention-based single-cell multiomics GPT model to capture the dynamics of gene regulatory networks (GRNs) related to normal aging and AD.
Aim 3: We will apply the methods to AD and other brain diseases. We will use AD as a case study because of our extensive experience in AD data collection and discovery. We will then apply the methods to other brain diseases by comparing the cellular and regulatory features among three neurodegenerative diseases and major neuropsychiatric disorders using three cohorts: BioVU, UK Biobank (UKBB), and All of Us. We will further develop BrainGeneBot, an advanced GPT-powered Retrieval-Augmented Generation (RAG) tool, to customize the genomic analysis of brain disease.
MICA-Brain will help characterize and transfer genetic signals across multiple molecular levels to identify the shared and unique molecular mechanisms across brain regions, cell types, and disease stages. It will enable us to identify the critical molecular regulators at the cell-type and sub-phenotype levels and to gain deep insights into the progression dynamics of aging and brain disease (e.g., AD).

View original record on NIH RePORTER →