Robust Preconditioned Gradient Descent Algorithms for Deep Learning

$335,978 · FY2022 · MPS · NSF

University Of Kentucky Research Foundation, Lexington KY

Abstract

Deep learning is at the forefront of research in artificial intelligence and machine learning, impacting a variety of applications in data science such as computer vision, speech recognition, natural language processing, and bioinformatics. A key challenge in deep neural network learning is model optimization, which is used for network training. However, traditional optimization algorithms are not applicable, primarily due to the high complexity and nonlinearity of deep neural networks. The goal of this project is to develop novel robust optimization algorithms that effectively address these difficulties and train deep learning models more efficiently in practice. The project also involves the application of this work to the translation of equivalent chemical representations used in drug design as well as Bayesian inference for uncertainty quantification. As part of this project, graduate and undergraduate students will be trained in deep learning research, and software will be developed and made freely available.

This project includes the development of two new classes of optimization algorithms that are built on the frameworks of traditional preconditioning and conjugate gradient methods but incorporate ideas from some successful specialized deep learning optimizers such as normalization methods and momentum methods. Specifically, the project will develop a new class of preconditioning methods as a widely applicable alternative to the normalization methods and a new class of adaptive momentum methods as a robust alternative to the fixed momentum methods. Related convergence theory will be established, and the new methods will be adapted to state-of-the-art neural network architectures such as transformers and graph neural networks. The novel algorithms developed in this project are intended to bring some of the most fruitful ideas in numerical analysis to the advancement of neural network optimization.
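For readers unfamiliar with the classical idea the project builds on, the following is a minimal sketch of gradient descent with a fixed diagonal (Jacobi) preconditioner and optional fixed heavy-ball momentum, applied to a toy ill-conditioned quadratic. It is not the project's algorithm: the function name `preconditioned_gd`, its parameters, and the test problem are all assumptions chosen for illustration, and the matrix is deliberately diagonal so that the effect of preconditioning is easy to see.

```python
import numpy as np


def preconditioned_gd(grad, x0, precond_diag, lr=1.0, momentum=0.0, steps=100):
    """Hypothetical helper: gradient descent with a fixed diagonal preconditioner.

    Update rule (P is the diagonal matrix with entries precond_diag):
        v <- momentum * v - lr * P^{-1} grad(x)
        x <- x + v
    With precond_diag = 1 and momentum = 0 this is plain gradient descent.
    """
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        v = momentum * v - lr * g / precond_diag
        x = x + v
    return x


# Toy problem (an assumption for illustration): f(x) = 0.5 x^T A x - b^T x
# with condition number 1e4. The minimizer solves A x = b.
A = np.diag([1.0, 100.0, 10000.0])
b = np.array([1.0, 1.0, 1.0])
x_star = np.linalg.solve(A, b)


def grad(x):
    return A @ x - b


# Plain gradient descent: the step size is limited by the largest eigenvalue,
# so progress along the smallest-eigenvalue direction is very slow.
plain = preconditioned_gd(grad, np.zeros(3), np.ones(3), lr=1e-4, steps=200)

# Jacobi preconditioning rescales each coordinate by the diagonal of A,
# which removes the ill-conditioning entirely in this diagonal toy case.
precond = preconditioned_gd(grad, np.zeros(3), np.diag(A), lr=1.0, steps=1)

err_plain = np.linalg.norm(plain - x_star)
err_precond = np.linalg.norm(precond - x_star)
```

In this toy case the preconditioned iteration reaches the minimizer in a single step, while plain gradient descent is still far from it after 200 steps. Real neural network losses are neither quadratic nor diagonal, which is part of why designing robust preconditioners for them is a research problem.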
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
