CAREER: Semidefinite Programming with Applications in Statistical Learning

$376,073FY2009ENGNSF

Princeton University, Princeton NJ

Investigators

Abstract

CAREER: Semidefinite Programming with Applications in Statistical Learning The research objective of this Faculty Early Career Development (CAREER) project is the design of a set of scalable information extraction algorithms that can turn large-scale data sets into sparse, hence interpretable, models. Many intensely active research topics such as sparse recovery in coding theory, compressed sensing and basis pursuit in signal processing, lasso and covariance selection in statistics, feature selection in machine learning, all revolve around the core idea that seeking sparse models is a meaningful way of simultaneously stabilizing statistical inference procedures, and highlighting structure in the underlying data set. More specifically, this project stems from two fundamental questions in statistical learning. One is about variable selection: Is a particular variable key to the modeling of our observations? The other question is about model structure: Is the relationship between any two variables key to explain these observations? In this spirit, this project combines results in statistical learning and information theory with recent mathematical programming techniques to produce realistic performance bounds on sparse statistical estimation and decoding algorithms. From a theoretical perspective, these results should help shed light on a fundamental tradeoff in statistics between model consistency on one hand and computational complexity on the other. Early results have clearly illustrated the significance of this tradeoff on a few particular problem instances, but systematic results are scarce. Statistical problems also pose an entirely new set of algorithmic challenges as they require solving very large-scale problems with relatively coarse precision targets, which is the exact opposite of classical assumptions in mathematical programming. The results of this project will thus improve our understanding of optimization algorithms in this context. From a practical perspective, efficient sparse inference algorithms will make the output of classic statistical techniques directly interpretable by non-experts and should help us highlight key structural patterns in complex data sets.

View original record on NSF Award Search →