Collaborative Research: Probabilistic, Geometric, and Topological Analysis of Neural Networks, From Theory to Applications

$299,860 · FY2022 · MPS · NSF

Boston College, Chestnut Hill MA

Investigators

Kathryn A Lindsey (contact), Julia E Grigsby

Abstract

One of the most exciting technical developments of the last decade is the widespread adoption of a family of algorithms called neural networks, used in cutting-edge industrial applications ranging from self-driving cars to the prediction of the three-dimensional shapes of proteins from their amino acid sequences. The goals of this project are twofold. First, the investigators seek to use tools from mathematics (specifically probability and combinatorics) to better understand how neural networks behave and then to fashion this understanding into new, more efficient, and safer algorithms. This involves a collaborative effort between mathematicians, computer scientists, and electrical engineers. The project team seeks to unravel a fundamental mystery: why is it that neural networks appear to be incredibly complex yet, despite their seeming intricacy, still learn parsimonious and useful ways of making predictions? Put another way, the investigators aim to define and analyze different mathematical notions of neural network complexity and then to use them as theoretically grounded guides in the search for ever more efficient and interpretable algorithms related to neural networks. The second goal is to create a series of educational resources, ranging from videos to course notes, that will enable various segments of society at large (e.g., students, policy makers, and scientists) to engage with and gain a usable appreciation for the ideas, challenges, and opportunities surrounding modern neural networks.

The research in this project consists of three interconnected parts. The first is a probabilistic analysis of a variety of neural network complexity measures before, during, and after training. Relevant tools come from probability, functional analysis, information theory, and geometry. Key theoretical questions include quantifying implicit bias and bounding generalization error for learning structured functions. The second is a topological and geometric analysis of both individual ReLU network functions and spaces of ReLU networks. Relevant tools come from Morse theory and low-dimensional topology. Key theoretical questions hinge on understanding topological implicit bias and topological depth separation. Finally, the investigators seek theory-guided insights for applied deep learning via (i) principled, efficient neural architecture search using average-case complexity measures as surrogates for practical expressivity, trainability, and generalization and (ii) novel approaches to model compression and scaling via topological expressivity of ReLU networks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
