Statistical and Computational Foundations of Deep Generative Models
New York University, New York NY
Abstract
Complex data are generated daily across all areas of science and engineering, from photographs and news articles to biological and cosmological experiments. To extract meaningful information from this stream of material, it is necessary to build statistical models that faithfully represent each data modality. Such statistical models are critical for assessing the expected performance of data analysis methods on future events, and they form a key component of several data processing pipelines known as 'inverse problems'. For example, removing noise and defects from an image, or predicting the most likely folding of a protein, are instances of inverse problems that at their core require a faithful statistical model of the desired output. The main goal of this project is to advance the theoretical foundations of statistical models based on neural networks. Such models provide greater flexibility than traditional statistical models, but as a result are harder to analyze and manipulate. The investigators bring a wide background in machine learning, probability, statistics, and mathematical physics; their combined expertise will yield guiding principles for incorporating neural networks into theoretically sound statistical modeling, as well as novel algorithms with statistical guarantees. The research outcomes will be directly applicable to a wide range of problems in science and engineering, including cosmology, climate modeling, chemistry, and signal processing, and they will be tightly integrated into educational courses. The success of deep learning (DL) across science and engineering suggests that Deep Neural Networks (DNNs) are effective function approximation models for complex high-dimensional data, yet the reasons for this capability are still poorly understood. To make headway on this problem, this project focuses on generative probabilistic modeling.
Understanding the inner workings of these generative models is essential to explaining the success of DL on typical problem instances, as opposed to worst-case (too pessimistic) or unstructured (too simplistic) data distributions. Additionally, probabilistic models are at the core of computational tools used in many scientific disciplines, yet they often rely on domain expertise, which prevents them from scaling efficiently with dimension. This project puts forward a unified view of generative modeling that simultaneously addresses approximation, estimation, and optimization aspects. Specifically, it covers both explicit modeling, given by Boltzmann-Gibbs distributions, and implicit modeling, given by transport-based models (Generative Adversarial Networks, Normalizing Flows). It will establish guarantees for learning and sampling from these models when DNNs are used as function approximators. This project will rely on importance-sampling methods developed in the computational sciences (such as Replica Exchange and Thermodynamic Integration) and upgrade them to operate alongside DNNs. It will also derive novel algorithms that combine implicit with explicit generative modeling. Finally, it will exploit physical priors such as symmetries and multiscale structure, and assess their benefits in challenging domains such as molecular prediction, turbulence, statistical mechanics, and exploration for reinforcement learning. The investigators have combined expertise in all these areas, making them well qualified to carry out the project. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.