Fast First-Order Methods for Large-Scale Structured and Sparse Optimization
Columbia University, New York NY
Investigators
Abstract
Algorithms for large-scale optimization have traditionally exploited sparsity and structure in problem data. Many important optimization problems today, such as those that arise in statistical machine learning (ML) and in compressive sensing (CS) are extremely large-scale convex problems with completely dense and/or unstructured problem data. However, there is often sparsity and structure in the solutions to these problems. The goal of this research project is the development of first-order algorithms, including gradient methods for non-smooth functions, smoothed penalty methods for constrained problems, multiple splitting methods, alternating-direction augmented-Lagrangian methods, and block coordinate descent methods, for extremely large-scale convex optimization problems that take advantage of solution structure and/or sparsity. Rigorous convergence analysis for these methods will be provided and robust software implementations will be developed. Although these methods are expected to have wide applicability, the focus will be on applications in CS and ML. Specifically, the investigators propose to develop and analyze new scalable algorithms for (i) CS signal recovery, including algorithms that are able to exploit more detailed a priori knowledge in addition to sparsity; (ii) matrix rank minimization, the matrix analog of CS, and its variants; and (iii) a broad array of ML problems that exploit the special sparsity/structure of the solutions to these problems. The research that is proposed under this grant is focused on the development of algorithms with provable performance guarantees that are capable of solving extremely large scale optimization problems whose solutions are either sparse or have special structure. Such problems arise under the paradigm of compressive sensing, which allows signals (e.g., radar) and images (e.g., CT and MRI scans) to be obtained with far fewer measurements than predicted by traditional theory, various extensions of CS, and in a broad array of problems in machine learning. All of these problems are aimed at extracting a "sparse" or low-dimensional true model from a high dimensional or dense empirical model or data. They have important applications in extracting information from surveillance video and hyper-spectral images, face recognition, medical imaging and data mining,as well as many other areas of strategic interest such as national security and biotechnology.
View original record on NSF Award Search →