Collaborative Research: Algorithms for Large-scale Stochastic and Nonlinear Optimization

$136,371 · FY2016 · MPS · NSF

University of Colorado at Boulder, Boulder, CO

Investigators

Abstract

The promise of artificial intelligence has been a topic of both public and private interest for decades. Since the 1990s, the field has benefited from the rapidly evolving and expanding field of machine learning. The intelligent systems born of machine learning, such as search engines, recommendation platforms, and speech and image recognition software, have become an indispensable part of modern society. Rooted in statistics and relying heavily on the efficiency of numerical algorithms, machine learning techniques capitalize on increasingly powerful computing platforms and the availability of very large datasets. One of the pillars of machine learning is mathematical optimization, which, in this context, involves computing the parameters of a system designed to make decisions based on as-yet-unseen data. The goal of this project is to develop new optimization algorithms that will enable the continuing rise of the field of machine learning. The research consists of two thematically related projects that address the solution of optimization problems that are nonlinear, high-dimensional, and stochastic, that involve very large datasets, and that in some cases are non-convex. Two families of algorithms will be developed to garner the benefits of both stochastic gradient methods and batch methods while avoiding their shortcomings. The first uses a gradient aggregation approach that reuses gradient values computed at previous iterations. The challenge is to design an algorithm that is efficient in minimizing testing error, not just training error. The second employs adaptive sampling techniques to reduce the noise in stochastic gradient approximations as the optimization progresses.
An important aspect of this research is the design of an efficient strategy for incorporating second-order information that captures the curvature of the loss function being optimized, even when Hessian estimates are based on inaccurate gradients. In all cases, the goal of the research is to design and implement algorithms in software and to test them on realistic machine learning applications.
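
As an illustration of the gradient aggregation idea mentioned in the abstract, the following is a minimal sketch in the style of the well-known stochastic average gradient (SAG) method, applied to a small least-squares problem. It is not the algorithm developed under this award: the function name `sag_least_squares`, the step size, the toy data, and the choice of a least-squares loss are all assumptions made purely for illustration.

```python
import random


def sag_least_squares(X, y, step, epochs, seed=0):
    """Illustrative SAG-style solver (hypothetical helper, not from the award).

    Minimizes (1 / 2n) * sum_i (x_i . w - y_i)^2 by storing the most recent
    gradient seen for each sample and stepping along their average.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    stored = [[0.0] * d for _ in range(n)]  # last gradient computed per sample
    total = [0.0] * d                       # running sum of the stored gradients
    for _ in range(epochs * n):
        i = rng.randrange(n)
        residual = sum(X[i][j] * w[j] for j in range(d)) - y[i]
        grad_i = [residual * X[i][j] for j in range(d)]
        # Swap sample i's old gradient for the new one inside the aggregate.
        for j in range(d):
            total[j] += grad_i[j] - stored[i][j]
        stored[i] = grad_i
        # Step along the average of the stored (partly stale) gradients.
        for j in range(d):
            w[j] -= step * total[j] / n
    return w


# Toy data (assumed): y is generated exactly by w = [2, -3].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
y = [2.0, -3.0, -1.0, 5.0]
print(sag_least_squares(X, y, step=1.0 / 32.0, epochs=1000))
```

Each iteration evaluates only one per-sample gradient, as a stochastic gradient method does, yet the step direction aggregates information from all samples, as a batch method does. The price is one stored gradient per sample.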
