Collaborative Research: SCALE MoDL: Advancing Theoretical Minimax Deep Learning: Optimization, Resilience, and Interpretability
University of Utah, Salt Lake City, UT
Investigators
Abstract
The past decade has witnessed the great success of deep learning in broad societal and commercial applications. However, conventional deep learning relies on fitting data with neural networks, which is known to produce models that lack resilience. For instance, models used in autonomous driving are vulnerable to malicious attacks: putting an art sticker on a stop sign can cause the model to classify it as a speed limit sign. Models used in facial recognition are known to be biased toward people of a certain race or gender, and models in healthcare can be hacked to reconstruct the identities of patients whose data were used to train those models. The next-generation deep learning paradigm needs to deliver resilient models that promote robustness to malicious attacks, fairness among users, and privacy preservation. This project aims to develop a comprehensive learning theory to enhance the model resilience of deep learning. The project will produce fast algorithms and new diagnostic tools for training, enhancing, visualizing, and interpreting model resilience, all of which can have broad research and societal significance. The research activities will also generate positive educational impacts on undergraduate and graduate students. The materials developed by this project will be integrated into courses on machine learning, statistics, and data visualization and will benefit interdisciplinary students majoring in electrical and computer engineering, statistics, mathematics, and computer science. The project will actively involve underrepresented students and integrate research with education for undergraduate and graduate students in STEM. It will also produce introductory materials for K-12 students to be used in engineering summer camps.
In this project, the investigators will collaboratively develop a comprehensive minimax learning theory that advances the fundamental understanding of minimax deep learning from the perspectives of optimization, resilience, and interpretability. These complementary theoretical developments, in turn, will guide the design of novel minimax learning algorithms with substantially improved computational efficiency, statistical guarantees, and interpretability. The research includes three major thrusts. First, the investigators will develop a principled non-convex minimax optimization theory that supports scalable, fast, and convergent gradient-descent-ascent algorithms for training complex minimax deep learning models. The theory will focus on analyzing the convergence rate and sample complexity of the developed algorithms. Second, the investigators will formulate a measure of vulnerability of deep learning models and study how minimaxity can enhance their resilience against data, model, and task deviations. This theory will focus on the statistical limits of deep learning. Lastly, the investigators will establish the mathematical foundations for a set of novel visual analytics techniques that increase the model interpretability of minimax learning. In particular, the theory will provide guidance on visualizing and interpreting model resilience. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
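For readers unfamiliar with the gradient-descent-ascent algorithms named in the first thrust, the following is a minimal sketch on a toy problem. It is not the project's algorithm: the objective f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, the step size, and the function names are all assumptions chosen for illustration, and the objective is deliberately convex in x and concave in y, whereas the project targets the harder non-convex setting.

```python
def f_grads(x, y):
    """Gradients of the assumed toy objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2."""
    grad_x = x + y   # df/dx
    grad_y = x - y   # df/dy
    return grad_x, grad_y


def gradient_descent_ascent(x, y, lr=0.1, steps=200):
    """Simultaneous gradient descent on x (the min player) and ascent on y (the max player)."""
    for _ in range(steps):
        grad_x, grad_y = f_grads(x, y)
        # Both players update from the same iterate (simultaneous GDA).
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y


# Hypothetical starting point; the unique saddle point of this toy objective is (0, 0).
x_star, y_star = gradient_descent_ascent(1.0, -1.0)
```

On this toy objective the iterates contract toward the saddle point at the origin. On non-convex objectives, plain gradient descent-ascent can cycle or diverge, which is one reason convergence guarantees of the kind described in the first thrust are of interest.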