III: Small: Combinatorial Algorithms for High-dimensional Learning
College Of William And Mary, Williamsburg VA
Investigators
Abstract
This project designs a new suite of statistical learning algorithms for high-dimensional problems in which the number of observations is insufficient to support conventional methods. Solving these problems is vital for modern scientific discoveries in social and environmental issues that rely on both complex models and massive data. For example, many empirical asset pricing models can use only 10 years of daily market data, for a total of approximately 3,000 observations, to fit significantly more than 3,000 parameters. The algorithms created in the project integrate key tools from theoretical computer science with data to deliver more accurate predictions with significantly fewer samples. The suite of new algorithms will assist with data-centric problems in financial econometrics, social network analysis, recommender systems, and skillset inferences. From a technical standpoint, this project uses graph-based algorithms and average case analysis to exploit statistical patterns that cannot be addressed by existing tools. These patterns include, for instance, block diagonal structures of the learnable parameters in vector regression models, and covariance matrices of mildly correlated features that exhibit heavy-tailed spectra. The project consists of two thrusts. First is the design of learning algorithms by relating high-dimensional problems to graph-learning problems, and generalizing graph-learning techniques. Second is the construction of new algorithms based on average case analysis, motivated by (i) the avoidance of over-conservativeness by using distributional assumptions, and (ii) the use of ``promises'', a notion borrowed from theoretical computer science, that guides prediction power using input structures. Utilizing weak distributional assumptions and promises together in turn allows the design of effective algorithms for a multitude of problems. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
View original record on NSF Award Search →