Collaborative Research: FMitF: Track I: Differentiable Probabilistic Programming with Recursive Structured Models
Indiana University, Bloomington IN
Investigators
Abstract
Symbols (like the letters of the alphabet) and structures (like words formed out of letters) are natural for humans to work with: they are ubiquitous in daily life, they are easy for us to understand, and it is easy to write programs that work with them. But current artificial intelligence (AI) systems learn by making many small changes to see which ones improve the performance of the system; they are therefore good at working with representations that allow small changes, like numbers, and not so good with symbols and structures, like letters and words. This can be an obstacle both to building AI systems and to understanding why they work. A typical way for an AI system to learn to work with symbols and structures is to consider all choices and make small changes to their probabilities. But what if there are not 26 choices, but 26 trillion? For example, the grammatical structure of a sentence can be represented by a tree, one out of a large or even infinite number of possible trees. In such cases -- which are the rule rather than the exception -- one can resort to approximations, like randomly selecting a few thousand possibilities, or one can use carefully constructed algorithms to consider all of them. But it is not easy to do the latter or even to know when it is possible. This project's novelty is to develop a new programming framework to make it easy to code such algorithms, so that writing a program that learns to use trees can be as easy as writing a program that uses trees. If successful, the project's impact is to help make machine learning an everyday part of computer programming, not only for researchers but even for beginners. This project draws on and contributes to the fields of machine learning, programming languages, and formal language theory. In machine learning, there is growing interest in neural networks that make probabilistic decisions about discrete structures such as trees that represent the possible grammatical structures of a sentence. In programming language research, there has been much work on probabilistic programs and operations on them that preserve meaning exactly. However, in existing frameworks for both neural networks and probabilistic programs, it is still difficult to represent distributions over recursive structures exactly and to efficiently perform operations on them like differentiation. This project uses ideas from formal language theory to bridge this gap, making it easy to work on these distributions exactly and efficiently. The project has three stages: First, it is extending and vectorizing exact transformations on probabilistic programs so that they work on programs parameterized by differentiable tensors. Second, the project is using hyperedge replacement graph grammars (HRGs) to represent distributions over recursive structures. HRGs generalize both graphical models and string/tree automata, providing a single highly expressive formalism for structured models. Methods for efficient inference on HRGs are also being developed. Third, the team is automating the translation of probabilistic code that uses recursive data structures into HRGs. The techniques developed are being implemented in an open-source deep-learning framework. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
View original record on NSF Award Search →