CAREER: Optimization Landscape for Non-convex Functions - Towards Provable Algorithms for Neural Networks

$399,999 | FY2019 | CSE | NSF

Duke University, Durham NC

Investigators

Abstract

Deep learning, a machine learning method based on artificial neural networks, has greatly improved the performance of learning algorithms on many tasks that involve understanding complicated data such as natural images, videos, and language. Products based on deep learning have already made a real-life impact in face recognition and machine translation, and have shown promise for further applications such as self-driving cars. However, despite the practical success of deep learning, theoretical understanding of why these algorithms work has been scarce. One of the main difficulties is that these algorithms must solve very complicated optimization problems that seek the best way to connect the neurons. In their most general form, these optimization problems are known to be intractable. This research project will identify properties of real-world problems that make them special and tractable, and will provide new optimization algorithms with theoretical guarantees that are applicable to deep learning. The materials developed in the project will be disseminated through conferences and workshops that aim to connect different research communities, and will be used to create new machine learning courses. The algorithms designed in the project will also be implemented in standard deep learning frameworks and made publicly available.

The specific approach of this project revolves around the new concept of the optimization landscape. For an optimization problem, understanding its optimization landscape means having a clear picture of the locations and values of its local and global optimal solutions. The research goals are divided into three categories. First, the research project will focus on a class of locally optimizable functions, for which all local minima are globally optimal. The research project will develop simple and efficient algorithms for optimizing such functions, as well as a new framework for proving that several problems of practical interest are locally optimizable. Second, the project will develop stronger optimization algorithms that work even when the optimization landscape is less ideal. Finally, the research will focus on optimization problems that arise in deep learning and show how the techniques developed in the first two parts can be applied. These projects will bring more theoretical insight into the heuristics used for training neural networks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
