CAREER: Scalable Black-box Optimization for Scientific Discovery

$553,510 | FY2022 | CSE | NSF

University of Pennsylvania, Philadelphia, PA

Investigators

Abstract

Scientists across the natural sciences and engineering increasingly rely on data-driven approaches to assist them in making their discoveries. Searching for a new scientific discovery can frequently be cast as an optimization problem. For example, a biochemist searching for new therapeutics might seek to optimize the antiviral activity of a new molecule, or an engineer might optimize the aerodynamic efficiency of a new vehicle. Qualities like antiviral activity are difficult to estimate in advance and require experimentation to measure, making these optimization problems “black-box” and particularly challenging. This project will build novel technologies that enable practitioners to leverage large quantities of data to aid in solving these challenging problems, even for highly complex and structured objects like molecules or vehicles. This will empower scientists to more rapidly design the next generation of therapeutics, energy technologies, and more. This research is coupled with education and outreach to a broad set of stakeholders, including (a) professionals in the natural sciences and engineering through direct collaboration, public tutorials, and open-source software and (b) students interested in data-driven scientific discovery, from outreach at the middle school level to the development of new undergraduate curricula designed to reach the broadest possible science and engineering audiences.

This project will focus broadly on two key research challenges: (1) developing methods for black-box optimization that scale to high-dimensional and structured optimization problems over challenging domains like molecules, and (2) developing novel large-scale probabilistic machine learning methods that enable careful consideration of the exploitation versus exploration trade-offs inherent in these optimization problems. A core theme through both of these challenges will be reducing complex discrete search spaces into well-organized continuous latent spaces extracted by deep neural networks. This project will develop novel deep representation models tailored specifically to the optimization domain, leveraging large quantities of unsupervised and multi-task data to enable optimization over broad classes of objects that can be represented as graphs, strings, point clouds, or images. The project will be grounded in concrete, specific applications and collaborations in therapeutic design, cognitive science for learning and Alzheimer's disease, and the development of energy technology.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
