
CAREER: Statistical Models and Parallel-computing Methods for Analyzing Sparse and Large Single-cell Chromatin Interaction Datasets

$559,251 · FY2023 · BIO · NSF

Rowan University, Glassboro NJ

Abstract

The 3D organization of nuclear chromatin plays a critical role in maintaining normal cellular functions. The recent development of single-cell Hi-C (scHi-C) technologies allows researchers to delineate genome-wide chromatin interactions in individual cells and answer fundamental biological questions. However, computational methods for analyzing scHi-C data lag behind because of data sparsity, high diversity, and the complicated hierarchy of chromosomal organization. This project will address these challenges by building a suite of models and parallel tools that both experimental and computational biologists can use to analyze sparse and large scHi-C datasets, helping them understand 3D chromatin interactions and gain deep insights into functional outcomes. The saturation model produced by this project will lead to a major improvement in experimental quality, enhanced data analysis, and reduced experimental costs. The integrated research and educational activities include training a diverse group of students and developing curricula for both undergraduate and graduate courses in interdisciplinary subjects spanning computer science, statistics, bioinformatics, and biology.

The objectives of this project are to study cell-specific chromatin organization at different scales by using stochastic theory, wavelet transforms, and parallel-computing techniques. First, a stochastic model will be established to understand the saturation status of scHi-C data by evaluating the major protocol steps and sequencing depth. Second, multiscale methods will be designed for detecting topologically associating domains and protein-mediated loops, for cell clustering, and for comparative analysis. Third, all models and tools will be fully parallelized for processing large datasets. The high performance and running speed of these methods will give researchers a one-stop service for making biological discoveries, saving time and money on both experiments and computational analysis.

The software pipelines will be implemented in Python and R and made publicly accessible through the GitHub repository https://github.com/chenyongrowan and a web server at Rowan University.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
