Collaborative Research: CDS&E: Scalable Inference for Spatio-Temporal Markov Random Fields

$249,998 · FY2022 · MPS · NSF

Regents of the University of Michigan - Ann Arbor, Ann Arbor, MI

Abstract

Modern systems are known to be massive in scale, with a hierarchy of complex, dynamic, and unknown topologies. For example, in genomics, the interactions among genes can be modeled via spatio-temporal gene regulatory networks across different cells. The inference of temporally and spatially rewired gene expression networks carries enormous implications for dynamic disease processes, offering key mechanistic insights into the dynamic variations of interacting biological processes in space and time. The behavior of such interconnected systems can be captured via spatio-temporal graphical models. The existing methods for inferring these models suffer from several statistical and computational drawbacks which render them impractical in realistic settings. With the goal of bridging this knowledge gap, this project aims at developing efficient computational tools for the inference of spatio-temporal graphical models that are not only provably optimal, but also adaptive, parallelizable, and implementable at meaningful scales. The methods developed in this proposal will be studied in the context of inferring gene networks underlying oncogenesis. The datasets generated through these efforts will be accompanied by well-developed analytics tools to derive mechanistic insights into the nature of gene networks underlying biological processes. More broadly, the proposed machinery will give rise to models that are interpretable by domain experts, and will lead to a rich set of publicly available datasets that can be used as a test bed for different inference methods, resulting in broader artificial intelligence (AI)-human collaborations.
Much of the progress in the inference of graphical models is based on maximum likelihood estimation (MLE) with relaxed regularization, which neither yields ideal statistical properties nor scales to the dimensions encountered in spatio-temporal settings. This project will address these challenges by departing from the regularized MLE paradigm and resorting to a new class of constrained optimization problems of a combinatorial nature that can systematically capture the hidden-but-useful structure of spatio-temporal graphical models. Due to the prohibitively complex nature of the MLE-based methods, their practical implementations cannot simultaneously guarantee computational efficiency and favorable statistical performance. Therefore, the proposed approach will be the first systematic inference framework that can achieve the best of both worlds in a unified fashion. The new class of estimation methods will have a profound impact on statistical learning: it will lead to a renewed interest in the use of tractable discrete approaches and their statistical properties, and will pave the way towards the discovery of new inference methods suitable for large-dimensional and spatio-temporal settings. In addition, the proposed project will be the first systematic study of a class of discrete optimization problems that are currently poorly understood, thus contributing to the combinatorial and mixed-integer optimization communities as well. Given its interdisciplinary nature, the project will also contribute substantially to the training of future generations of researchers in data science.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
