Random Neural Networks
Texas A&M University, College Station TX
Investigators
Abstract
Neural networks are algorithms that, over the past several years, have achieved state-of-the-art results in a variety of important machine learning tasks, ranging from computer vision (e.g. self-driving cars) to natural language processing (e.g. Echo, Alexa, Google Translate) and reinforcement learning (e.g. AlphaGo and AlphaStar). Despite these impressive successes, it is not clear why neural nets work so well. In this project, the PI will use tools from probability to develop the theoretical understanding of neural networks. The goal is to explain why neural nets are so efficient at overcoming challenges in optimization and high-dimensional data analysis. These theoretical insights will, in turn, inform the intuition of engineers building the next generation of neural net-based machine learning systems.

Mathematically, the study of neural networks is a cross between approximation theory and optimization, touching on topics from random matrix theory, Gaussian processes, hyperplane arrangements, tensor decompositions, and optimal transport, to name a few. The PI will focus specifically on (i) relating the stability of gradient-based optimization of neural networks to both the linear statistics and spectral asymptotics of random matrix ensembles given by products of many random matrices, in the regime where both the number of terms in the product and the sizes of the matrices grow simultaneously, and (ii) computing the correlation functions of neural networks at initialization (i.e. with random weights and biases). Questions of type (i) give quantitative information on the numerical stability of neural network architectures at initialization. Questions of type (ii), in contrast, aim at principles for data-driven architecture selection.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
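As a rough numerical illustration of the regime described in (i), the sketch below multiplies a unit vector by a product of many independent random matrices and measures how much the logarithm of the resulting squared norm fluctuates across random draws. It is not the PI's method: the Gaussian entries, the 1/width variance scaling, and the names `log_norm_after_product` and `spread_of_log_norm` are all assumptions made for illustration. Under these assumptions, one would expect the fluctuations to be larger when the number of factors is large relative to the matrix size.

```python
import numpy as np


def log_norm_after_product(depth, width, rng):
    """Return log ||W_depth ... W_1 x||^2 for a random unit vector x.

    Assumption (illustrative only): each W_k is width x width with
    i.i.d. N(0, 1/width) entries, so that E||W x||^2 = ||x||^2.
    """
    x = rng.standard_normal(width)
    x /= np.linalg.norm(x)
    log_sq_norm = 0.0
    for _ in range(depth):
        w = rng.standard_normal((width, width)) / np.sqrt(width)
        x = w @ x
        norm = np.linalg.norm(x)
        log_sq_norm += 2.0 * np.log(norm)
        x /= norm  # renormalize so long products neither overflow nor underflow
    return log_sq_norm


def spread_of_log_norm(depth, width, trials=200, seed=0):
    """Standard deviation of the log squared norm over independent draws."""
    rng = np.random.default_rng(seed)
    samples = [log_norm_after_product(depth, width, rng) for _ in range(trials)]
    return float(np.std(samples))


if __name__ == "__main__":
    # Same number of factors, two different matrix sizes.
    for width in (8, 64):
        print(width, spread_of_log_norm(depth=20, width=width))
```

Comparing the two printed values gives a crude sense of how the ratio of depth to width affects stability at initialization in this toy setting. It says nothing about trained networks or about nonlinear activations.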