Algorithms and Theory for Compressing Deep Neural Networks

$276,611 | FY2022 | MPS | NSF

SUNY at Albany, Albany, NY

Investigators

Abstract

Deep neural networks (DNNs) have been the main driving force behind recent advances in artificial intelligence (AI) technology, profoundly impacting society in transportation, public safety, entertainment, health care, and other areas of public life. One of the biggest obstacles to AI's even broader impact on daily life is the typically enormous power consumption of DNNs upon deployment. The aim of this project is to develop mathematical and computational approaches for DNN compression that enable fast and efficient deployment of AI systems on mobile platforms with low power budgets, such as smartphones. Results of this work will have a variety of applications, including video security systems, autopilot, smart robots, and face identification. The project will involve training of graduate students, development of data science courses, and collaboration with industry.

The PI plans to (1) develop and analyze coarse gradient algorithms, featuring a biased first-order oracle, for the discretization of various neural architectures, including transformer-based networks; (2) develop and analyze efficient thresholding-based algorithms for compressing networks via structured sparsity on both balanced and unbalanced data; and (3) investigate the model capacity of compressed DNNs and establish universal finite-sample expressivity theory. The proposed research will also explore applications of coarse gradient algorithms to other machine learning problems with discrete-valued loss functions and advance knowledge in discrete optimization.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
