Developing novel machine learning approaches to studying cell development
The University of Texas Health Science Center at Houston, Houston, TX
Investigators
Abstract
The overall goal of this research is to develop novel machine learning (ML) methods that use single-cell multi-omics data to accurately characterize the dynamic process and states of the cell cycle, to model lineage commitment along the differentiation process, and to predict the key elements that regulate the cell cycle and differentiation. This will lead to insights into the mechanisms that couple the cell cycle and differentiation during cell growth and development. The tools will significantly improve our understanding of stem cells, germinal cells, and tissue development and function. The ML methods can be modified and applied to other general problems in biology. Graduate and undergraduate students will work on this project and gain experience in leading-edge research.
The project will develop two ML approaches: an integrated sinusoidal and piecewise autoencoder (SPAE) for cell cycle estimation and the study of cell development, and a cell cycle-aware domain separation network (CADSN) to study the dynamic coupling of the cell cycle and differentiation. SPAE employs a distinct sinusoidal autoencoder to characterize the circular process of the cell cycle and a piecewise autoencoder to capture the inherent nonlinear structure that single cells sampled from various stages of a periodic process form in high-dimensional space. The model can also characterize the connections between the cell cycle and cell development, while effectively unfolding the circular manifold onto a nonlinear space to obtain precise pseudo-time. Second, CADSN is proposed for multi-omics single-cell data integration and label transfer. This autoencoder-based network is designed to predict and remove cell cycle effects from the integrated multi-omics single-cell data while preserving cell type-specific heterogeneity. It is the first computational model to study the cell cycle effect in the integrated analysis of multi-omics single-cell data.
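The idea of placing cells on a circle and reading off a pseudo-time can be illustrated with a minimal sketch. This is not the project's SPAE implementation: the function names `circular_pseudotime` and `sinusoidal_decode`, the two-dimensional embedding, and the per-gene amplitude, phase, and offset parameters are all assumptions made for illustration, and no training step is shown.

```python
import numpy as np


def circular_pseudotime(embedding):
    """Convert a 2-D circular embedding of cells into pseudo-time.

    Hypothetical helper: returns each cell's position as a fraction of
    one cycle (values wrap around at 1).
    """
    embedding = np.asarray(embedding, dtype=float)
    # Center the points so angles are measured around the middle of the circle.
    centered = embedding - embedding.mean(axis=0)
    angles = np.arctan2(centered[:, 1], centered[:, 0])
    return np.mod(angles, 2 * np.pi) / (2 * np.pi)


def sinusoidal_decode(pseudotime, amplitude, phase, offset):
    """Map pseudo-time back to periodic per-gene expression.

    Hypothetical decoder: each gene follows
    offset + amplitude * cos(2*pi*t - phase).
    Returns an array of shape (n_cells, n_genes).
    """
    theta = 2 * np.pi * np.asarray(pseudotime, dtype=float)[:, None]
    return offset + amplitude * np.cos(theta - phase)


# Toy data (an assumption): cells sampled around a noisy unit circle.
rng = np.random.default_rng(0)
true_t = rng.uniform(0.0, 1.0, size=500)
points = np.column_stack([np.cos(2 * np.pi * true_t), np.sin(2 * np.pi * true_t)])
points += rng.normal(scale=0.05, size=points.shape)

estimated_t = circular_pseudotime(points)
expression = sinusoidal_decode(
    estimated_t,
    amplitude=np.array([1.0, 0.5]),
    phase=np.array([0.0, np.pi / 2]),
    offset=np.array([2.0, 1.0]),
)
```

In this toy setting the angle around the circle plays the role of cell cycle pseudo-time, and the sinusoidal decoder shows why a periodic latent variable naturally yields periodic expression profiles.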
Finally, some of the inferred results will be experimentally validated using the team's established models. Software prototypes and the gene biomarkers regulating the cell cycle and differentiation will be made publicly available to the research community via a project website at https://ccsm.uth.edu/NSF-SSL. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.