AF:Small: Semidefinite Programming for High-dimensional Statistics

$450,000FY2020CSENSF

University Of California-Berkeley, Berkeley CA

Investigators

Abstract

With the deluge of data in almost all areas of human endeavor, there is a growing need for efficient algorithms to uncover underlying patterns and make inferences from the data. The field of statistics has long studied the task of making inferences from data, while largely ignoring issues of computational efficiency, but instead focusing on the quantity of data needed. As data gets increasingly high-dimensional, noisy and corrupted, computational efficiency becomes a vital concern. This project will use the powerful technique of semidefinite programming arising from optimization towards computational tasks in high-dimensional statistics. Specifically, the goal of this research is to develop a systematic approach for using semidefinite programming to design algorithms in high-dimensional statistics. The project will lead to an exchange of problems, definitions and technical tools between the areas of algorithms, statistics and statistical physics. The project includes synergistic educational and outreach activities such as devising a graduate class on complexity of statistics, facilitating undergraduate and graduate research work. There are three central themes to this research work. First, exploiting structure, how can algorithms based on semidefinite programming (SDP) exploit distributional assumptions about the input? More precisely, what is the correct way to encode information about the prior distribution of the input into the SDP-based algorithm? Second, how can algorithms be designed that are robust to noise, outliers or large deviations in the data? In its extreme form, how algorithms be designed with guarantees even in presence of an overwhelming number of outliers? Finally, using techniques inspired by statistical physics, many statistical problems are conjectured to exhibit a sudden change in their computational complexity -- a "computational phase transition". Can the lens of SDP be used to shed more light on these computational phase transitions? The proposed research would not only lead to a flurry of new SDP-based algorithms for this class of problems, but also potentially yield algorithms that provably match the performance of belief-propagation/message-passing algorithms, an algorithmic approach believed to be optimal for Bayesian inference. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →