CAREER: Toward a Foundation of Over-Parameterization
University of Washington, Seattle, WA
Investigators
Abstract
This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2). Over-parameterized models, a modern machine learning technique for making predictions, are revolutionizing application domains including computer vision, natural language processing, and robotics. However, these powerful models have not yet become the predominant method in many data-driven domains because they are extremely resource-hungry: recent models cost millions of dollars to train. This project will tackle this challenge by thoroughly characterizing the theoretical properties of over-parameterization. Based on this foundation, the investigator will design resource-efficient methods to make modern machine learning technologies accessible to a broader audience. An education plan is integrated into this project: the investigator will mentor students, develop new courses, organize workshops, and develop course materials for a high school machine learning curriculum.
This project has three major components. The first thrust characterizes over-parameterization from width. The grand goal is to develop a unified theoretical framework, based on which the investigator will design new width shrinkage methods to reduce the resource requirements of over-parameterized models. The second thrust will develop principles on the acceleration and regularization effects of over-parameterization from depth. These insights will help design more explicit optimizers and regularizers to compress the model, so that shallow neural networks can achieve performance similar to that of deep ones. The third thrust will identify conditions under which pre-trained over-parameterized models can learn representations that improve sample efficiency in downstream tasks. The investigator will then design methods that select a small subset of data in pre-training without degrading performance, reducing computational resource requirements. In addition to theoretical developments, the project also aims to implement all algorithms developed as open-source software, evaluate them on standard benchmarks, and deploy them in real-world applications in the transportation domain.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.