FMitF: Track I: Principles for Modular Probabilistic Programming and Inference

$750,000FY2022CSENSF

Northeastern University, Boston MA

Investigators

Abstract

The concept of a model is central to the field of machine learning: it is a simplified representation of the world that facilitates automated reasoning and learning. Designing these models is a critical part of any machine-learning project, and there is a pressing need for tools that make it easier and faster for practitioners in the field to apply machine learning to a relentlessly broadening set of applications. Probabilistic programming languages (PPLs) are an answer to this pressing need: they endow a typical programming language with the ability to manipulate and reason about probability distributions, empowering software developers who are less comfortable with the standard tools and techniques of machine learning with the ability to make their own machine-learning models. However, programming in PPLs today remains very difficult, even for experienced programmers. This project tackles one of the main shortcomings of today's PPLs: a lack of strong notions of modularity. Modularity is a critical property for designing scalable and efficient software systems. This project's impact is to develop a new foundation for scalable and modular probabilistic programming languages that empower a broader audience to create and maintain machine learning and AI models, transforming the way machine learning models are developed and deployed today. This project's novelties are (1) the development of a new PPL called ModPPL that enables programmers to develop modular machine learning models; and (2) new methods for creating large-scale ModPPL programs. The primary aim of this project is to develop a new family of PPLs with a rich and general-purpose notion of modularity. Accomplishing this will require deeply integrating techniques from formal methods into the foundations of the language. The core of the approach will be a new separation logic for specifying the independence structure of probabilistic programs. The investigators will integrate this logic into ModPPL, borrowing ideas from Hoare Type Theory; in doing so, the type system itself will capture the necessary structure to empower programmers to reason locally about their programs. Next, the structure of this type system will be leveraged to state and certify correctness criteria for heterogeneous inference algorithms, and further leveraged to design new and more scalable heterogeneous inference strategies. Finally, the type system will be enriched by making it cost-aware. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →