Predicting Phenotype by Deep Learning Heterogeneous Multi-Omics Data

$368,260 | R01 | FY2025 | LM | NIH

University Of Texas Hlth Sci Ctr Houston, Houston TX

Investigators

Linked publications, trials & patents

Paper 39424325 Paper 39229048 Paper 39149497 Paper 38940937 Paper 38925326 Paper 38807368 Paper 38796699 Paper 38737008 Paper 38660773 Paper 38651367 Paper 38496574 Paper 38326887 Paper 38306043 Paper 38280850 Paper 38167548 Paper 38063850 Paper 38041473 Paper 38037121 Paper 37950041 Paper 37886464 Paper 37841839 Paper 37790454 Paper 37720047 Paper 37479893 Paper 37425929 Paper 37372402 Paper 37127280 Paper 36863698 Paper 36796294 Paper 36754068 Paper 36720042 Paper 36526218 Paper 36522785 Paper 36451277 Paper 36282535 Paper 36253801 Paper 35885993 Paper 35883662 Paper 35879967 Paper 35777578 Paper 35774010 Paper 35763977 Paper 35640139 Paper 35610053 Paper 35545758 Paper 35470070 Paper 35451016 Paper 35412907 Paper 35198011 Paper 34848761 Paper 34716463 Paper 34430929 Paper 34284104 Paper 34214162 Paper 34155559 Paper 34104955 Paper 34048560 Paper 33977077 Paper 33923155 Paper 33788962 Paper 33741950 Paper 33619490 Paper 33371872 Paper 33330858 Paper 33324010 Paper 33300042 Paper 33288841 Paper 33272935 Paper 33234712 Paper 33211888 Paper 33178176 Paper 33137204 Paper 32615059 Paper 32578842 Paper 32528130 Paper 32510566 Paper 32241273 Paper 32241259 Paper 32170004 Paper 32091591 Paper 31965022 Paper 31727022 Paper 31680168 Paper 31672653 Paper 31598702 Paper 31589286 Paper 31481703 Paper 31262291 Paper 31141125 Paper 31122291 Paper 31024628 Paper 30886153 Paper 30824912 Paper 30793038 Paper 30712509 Paper 30704473 Paper 30649169 Paper 30625331 Paper 30577873 Paper 30577846

Abstract

Project Summary

Large volumes of genetic/epigenetic, transcriptomic, and single-cell omics data, as well as phenotypic and medical record data, have been generated. Many important findings regarding potential genetic markers, their related mechanisms, and promising drug targets have been reported. However, the high complexity of brain diseases, such as Alzheimer's disease (AD), and the currently limited size of the data in genetic and functional studies make feature engineering difficult. Current analytical approaches primarily focus on individual data types or simple integration strategies, which lack sufficient power to identify potential biomarkers or reveal the complex genetic architecture of aging-related diseases such as AD. Recently, multi-scale models and cross-modality deep learning (DL) algorithms have proven powerful for identifying patterns in complex, heterogeneous data that are difficult for humans to discern. In this renewal R01 project for deep learning heterogeneous multi-omics data, we hypothesize that multi-scale modeling can effectively detect regulatory modules at the cell-type and sub-phenotype levels, leading to a precision medicine strategy. We bridge spatial scales (molecules, cells, and tissues) and temporal scales. Specifically, we will develop a novel, timely, AI-based framework, MICA-Brain: Multi-scale, Integrated, and Contextualized Approaches for Brain Aging and Disease. We propose three specific aims to contextualize cognitive function risks within their specific cell types and regulatory maps, and to understand in depth the shared or unique etiological mechanisms across brain regions, cell types, and disease stages. The resulting methods will be applied to AD and a broad range of other brain diseases.
Aim 1: We will build a brain molecular chronological age predictor by leveraging a single-cell generative pre-trained transformer (GPT) model trained with millions of brain cells from normal aging individuals, and subsequently predict the chronological age of cells from individuals with mild cognitive impairment (MCI) and AD. Aim 2: We will develop a self-attention-based single-cell multiomics GPT model to harness the dynamics of gene regulatory networks (GRNs) related to normal aging and AD. Aim 3: We will apply the methods to AD and a broad range of brain diseases. We will use AD as a case study because of our rich experience in its extensive data collection and discovery. We will then apply the methods to other brain diseases by comparing the cellular and regulatory features among three neurodegenerative diseases and major neuropsychiatric disorders using three cohorts: BioVU, UK Biobank (UKBB), and All of Us. We will further develop BrainGeneBot, an advanced GPT-powered Retrieval Augmented Generation (RAG) tool, to customize the genomic analysis of brain disease. MICA-Brain will help characterize and transfer the genetic signals at multi-molecular levels to identify the shared and unique molecular mechanisms across brain regions, cell types, and disease stages. MICA-Brain will enable us to identify the critical molecular regulators at the cell-type and sub-phenotype levels and help us gain deep insights into aging and brain disease (e.g., AD) progression dynamics.
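
For readers unfamiliar with the self-attention mechanism named in Aim 2, the following is a minimal sketch of generic scaled dot-product self-attention in plain Python. It is not the project's model: the function names (`softmax`, `self_attention`), the toy two-dimensional "gene token" embeddings, and the use of the raw embeddings as queries, keys, and values (with no learned projection matrices) are all assumptions made purely for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Each argument is a list of equal-length float vectors.
    Returns (outputs, weights), one row per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors
        out = [
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs, weights

# Hypothetical embeddings for three gene tokens in one cell (made-up numbers).
# Simplification: Q = K = V = the embeddings; a real transformer would first
# apply learned linear projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

In this sketch each row of `weights` sums to 1 and describes how strongly one token attends to every other token. Pairwise weights of this kind are what an attention-based model could, in principle, relate to regulatory relationships between genes.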
