CAREER: From Shallow to Deep Representation Learning: Global Nonconvex Optimization Theories and Efficient Algorithms

$633,280 | FY2022 | CSE | NSF

Regents Of The University Of Michigan - Ann Arbor, Ann Arbor MI

Investigators

Abstract

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).
Machine learning is transforming many fields of science and engineering. However, as data sets continue to grow in both volume and dimension, the performance of modern machine-learning methods has become critically dependent on the data representations being used. Over the past decade, many deep-representation learning methods have enjoyed remarkable empirical success, yet the principles underlying this success remain largely a mystery, which has hindered further development and broader adoption. A major difficulty stems from the nonlinearities of data representation models, which often lead to complicated and challenging non-convex optimization problems. This project aims to advance the theoretical foundations from shallow-representation learning (e.g., learning sparsifying dictionaries) to deep-representation learning (e.g., learning deep neural networks) by exploiting the geometric properties of non-convex optimization landscapes and the intrinsic structures of the data. The impact of this research will take the form of new guiding principles for better model and architecture design, optimization, and robustness in both supervised and unsupervised scenarios. The research program will be integrated with education activities that include training at both the undergraduate and graduate levels and designing machine-learning modules appropriate for dissemination to K-12 students.
This project seeks to bridge the gap between the theory and practice of representation learning by developing a principled and unified mathematical framework based on recent developments in global non-convex optimization theory. The mathematical foundations for both shallow and deep representation learning will be advanced by leveraging both the geometric properties of the corresponding non-convex optimization landscapes and the low-dimensional structures of high-dimensional data. The resulting geometric insights will clarify the kinds of representations that can be learned through optimization and will guide the development of efficient and globally convergent training algorithms. The proposed framework will be applied to the study of a wide spectrum of representation learning problems in supervised, self-supervised, and unsupervised learning; correspondingly, the generalization and robustness of these methods will be investigated by understanding the learned representations.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
