CAREER:Reducibility among high-dimensional statistics problems: information preserving mappings, algorithms, and complexity.

$650,000FY2020CSENSF

Massachusetts Institute Of Technology, Cambridge MA

Investigators

Abstract

Large-scale data collection and analysis is transforming science, engineering, and industry. At their core, applications rely on statistical inference whereby meaning is extracted from noisy data. This is challenging due not only to the sheer scale, but also the complex underlying structure that must often be exploited. The two basic resources for carrying out an inference task are (1) data, which can vary in quality and quantity; and (2) computation. There appears to be a trade-off: The minimum amount of data needed by computationally infeasible algorithms is often far less than what is required by all known computationally efficient algorithms. As this trade-off is poorly understood, the objective of the project is to create the mathematical foundations for reasoning about the optimal computational and statistical efficiency of algorithms for structured inference problems. Significant societal impact may be possible via new fundamental insights relevant to basic procedures and methods used across business, economics, medical technology, social network analysis, genetics, neuroscience, and many other domains. Furthermore, as part of a synergistic research and education plan, the project will take a multi-faceted approach to STEM education, including: (1) mentorship of both undergraduate and graduate students in research; (2) outreach to both high school and undergraduate groups; (3) creation of a collaborative research experience for undergraduate students on topics related to the project; and (4) development of new courses at MIT, both at the advanced undergraduate and PhD levels. The project aims to develop a comprehensive average-case complexity theory of statistical inference, analogous to the classical P versus NP theory for combinatorial problems, characterizing when inference is or is not efficiently solvable and showing strong equivalences between problems. The bulk of the technical approach is based on developing new techniques for average-case reductions that yield precise relationships between different problems. Average-case reductions transform one problem into another, implying a comparison between the data and computation resources required by each. In addition to characterizing fundamental limits, the approach pursued in this project will result in algorithms and insights. By creating a web of reductions between problems, it will be possible to translate algorithms for one problem into algorithms for other problems. Furthermore, strong equivalences obtained via such two-way reductions will show that the same phenomenon appears in seemingly different problems because they are at their core the same problem. The project has the potential to change the basic methodology for studying statistical inference problems. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →