Statistical approaches for integrating multi-view, multi-section and multi-sample spatial transcriptomics data to decipher disease mechanisms

$384,722 | R35 | FY2025 | GM | NIH

Emory University, Atlanta GA

Investigators

Abstract

Project Summary

Recent technological advances in Spatial Transcriptomics (ST) have enabled gene expression profiling with spatial information in tissues. Data from ST technologies are typically complemented by high-resolution images of the same tissue section, which greatly assist the examination of cellular morphology. As ST technologies become more available and affordable, a number of studies have generated large ST datasets with multiple sections collected from multiple samples that profile a broad range of disease conditions and stages. Population-scale ST data have made it possible to examine intra- and inter-patient heterogeneity in the tissue microenvironment and hold the potential to uncover biological insights that are not detectable when individual sections are analyzed separately. However, analyzing multi-view, multi-section, and multi-sample (3M) ST data presents significant challenges due to its rich information and complex data structure. This proposal aims to provide researchers with a set of analytic tools to tackle these challenges sequentially, beginning with multi-view analysis, progressing to multi-section integration and multi-sample comparison, and ultimately achieving cohort-scale ST data analysis. Building upon my expertise in statistical genomics, I propose the following aims. In Aim 1, we plan to develop a pattern-similarity-aware model to jointly denoise multi-view ST data. We expect our method to selectively utilize the mutual information across gene expression and morphological features derived from H&E images by considering their spatial coherence. This approach should provide high-quality denoised data while minimizing spurious artifacts, ensuring the quality of downstream analysis. In Aim 2, we will develop a machine learning method to integrate sequential multi-section ST data. We anticipate that our method will reconstruct the 3D tissue structure and uncover new findings through 3D volumetric measurements, which cannot be achieved with 2D analysis. In Aim 3, we will develop a label transfer method to extend pathologists' annotations from a limited number of samples to all samples. This will facilitate the construction of a comprehensive, annotated multi-sample ST data atlas. Subsequently, we aim to establish a systematic framework for cohort-scale comparisons of the ST data atlas. By analyzing the heterogeneity within specific regions across samples at varying disease stages, we can gain a comprehensive understanding of various disease mechanisms from multiple perspectives. Such knowledge is essential for the development of precision medicine, therapeutic strategies involving small molecules, and their targeted delivery to affected regions. Upon completion of the proposed projects, we will have developed the very first methods for population-level ST analysis. Our proposed computational tools will help researchers make full use of the rich information in 3M ST data and establish a pipeline for systematic comparisons between multi-sample ST data collected at different stages of disease progression.

View original record on NIH RePORTER →