CDS&E: Chemist-Machine Collaborations for Reaction Mechanism Discovery
Regents Of The University Of Michigan - Ann Arbor, Ann Arbor MI
Abstract
With support from the Chemical Theory, Models and Computational Methods program in the Division of Chemistry, Professor Paul Zimmerman of the University of Michigan is developing so-called chemist-in-the-loop machine learning approaches for predicting the outcomes of chemical reactions. Chemical reactions can be controlled and tuned toward desirable outcomes when deep knowledge of their mechanisms is available. This knowledge can be gained through first-principles computational tools, which are well known for their ability to explore and explain reaction mechanisms. In recent years, alternatives to conventional computational models have appeared, in particular machine learning models, which are highly effective at recognizing patterns hidden within datasets. Since first-principles methods serve as an oracle for chemical behavior, this project envisions combining machine learning methods with first-principles techniques to explore chemical space quickly and thoroughly. The Zimmerman research group will develop and test new hybrid strategies involving the two method types, and apply them to emerging, poorly understood chemical transformations. The overall project strategy will not only lead to the discovery of chemical reaction mechanisms, but will also provide an environment to train young scientists in the art of mechanism development. In particular, graduate students on the project will learn data science methods and software development, and will help mentor neurodiverse interns who will participate in the research activities. These learning activities are crucial to the development of a next-generation workforce, where individuals from diverse backgrounds work together using complementary skillsets to advance scientific outcomes. The Zimmerman group will combine advanced machine learning methods with graph-based reaction discovery tools, where the latter provide the data needed to train the former.
These two strategies can be combined through active-transfer learning and chemist-in-the-loop learning to overcome the data scarcity that is inherent in reaction discovery tasks. Active-transfer learning leverages small datasets to quickly train machine learning models, and is best informed by expert knowledge from chemists, who can separate physically relevant, causal inferences from mere statistical correlations. These approaches therefore support each other, forming a strategy for reducing the computational burden of first-principles evaluation of hypothetical elementary steps within complicated reaction networks. With these fundamental strategies in hand, the Zimmerman group will work to develop a reaction explorer interface that stitches together the new approaches and makes them easily accessible to chemists. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
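As an illustration of how an active learning loop can reduce the number of first-principles evaluations, the following is a minimal sketch and not the Zimmerman group's method. It covers only the active-learning part of the strategy; transfer learning and chemist input are omitted. Everything in it is an assumption made for illustration: the one-dimensional descriptor, the hypothetical `oracle_barrier` function standing in for an expensive calculation, and the nearest-neighbor surrogate whose uncertainty is approximated by the distance to the closest already-evaluated candidate.

```python
import math


def oracle_barrier(x):
    """Hypothetical stand-in for an expensive first-principles calculation."""
    return 20.0 + 10.0 * math.sin(3.0 * x)


def nearest_labeled(x, labeled):
    """Return (distance, value) for the labeled point closest to x."""
    return min((abs(x - xl), yl) for xl, yl in labeled.items())


def predict(x, labeled):
    """Surrogate prediction: copy the value of the nearest labeled point."""
    return nearest_labeled(x, labeled)[1]


def active_learning(pool, seed_points, budget, oracle):
    """Pool-based active learning with a fixed budget of oracle calls."""
    # Start from a few evaluated candidates (each costs one oracle call).
    labeled = {x: oracle(x) for x in seed_points}
    while len(labeled) < budget:
        unlabeled = [x for x in pool if x not in labeled]
        if not unlabeled:
            break
        # Query the candidate the surrogate is least sure about,
        # here approximated as the one farthest from any labeled point.
        query = max(unlabeled, key=lambda x: nearest_labeled(x, labeled)[0])
        labeled[query] = oracle(query)
    return labeled


# Toy pool of 21 hypothetical elementary steps described by one number each.
pool = [i / 20 for i in range(21)]
labeled = active_learning(pool, seed_points=[pool[0]], budget=6,
                          oracle=oracle_barrier)

max_error = max(abs(predict(x, labeled) - oracle_barrier(x)) for x in pool)
print(f"oracle calls: {len(labeled)} of {len(pool)} candidates")
print(f"largest surrogate error: {max_error:.2f}")
```

In a chemist-in-the-loop variant, the selection step is where expert judgment could enter, for example by filtering or reweighting the unlabeled candidates before the query is chosen.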