Mean Field Asymptotics in Statistical Inference: Variational Approach, Multiple Testing, and Predictive Inference
University of California-Berkeley, Berkeley, CA
Investigators
Abstract
The era of big data poses unprecedented statistical and computational challenges in high-dimensional statistical inference. One challenge is the “dual objective” nature of various statistical inference tasks: statisticians hope to design procedures that achieve near-optimal statistical efficiency and satisfy desired validity guarantees even under model misspecification. Furthermore, many statistical inference procedures involve a Bayesian component, and performing exact Bayesian inference on large-scale datasets is computationally challenging. This project will address these challenges in some high-dimensional statistical inference tasks. The techniques and methods developed in the project will further advance the interplay between a broad range of areas including high-dimensional statistics, statistical physics, optimization, information theory, and statistical machine learning. Results from this project are anticipated to have applicability in computational biology, computer vision, neuroscience, natural language processing, and multiple testing. Graduate and undergraduate students will be exposed to these results through involvement in the project, and the results will be incorporated in courses.
This project aims to resolve statistical and computational challenges in multiple testing and predictive inference, using the mean field asymptotic theory of statistical inference. Focusing on a few stylized problems, the program consists of three major research thrusts: 1) analyze the non-convex landscape of Thouless-Anderson-Palmer (TAP) variational inference objective functions and design efficient algorithms for optimizing these functions; 2) in the task of false discovery rate (FDR) control, design procedures that maximize the number of discoveries when models are correctly specified while controlling the frequentist FDR even under model misspecification; and 3) in the task of predictive inference, design procedures that give reasonably small prediction sets while maintaining the frequentist validity of coverage in the presence of model misspecification. This research will develop new techniques for studying the mean field asymptotics of high-dimensional statistical models, which will likely be applicable beyond the specific statistical models studied and will be relevant in other areas of science and engineering.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.