
Statistical Inference of Molecular Ensembles and Application to Crystalline Solids

$556,697 | FY2024 | MPS | NSF

University Of Colorado At Boulder, Boulder CO

Investigators

Abstract

Prof. Michael Shirts of the University of Colorado Boulder is supported by an award from the Chemical Theory, Models and Computational Methods program in the Division of Chemistry to improve approaches for predicting molecular crystal properties using generative modeling that can accurately explore molecular conformational ensembles. Understanding the thermodynamics of small molecule crystals and quantitatively predicting their properties is vital for faster and cheaper pharmaceutical development pipelines, as most drugs are distributed as pills in crystalline form. Dr. Shirts’s group will use generative machine learning models to augment physics-based methods to estimate crystal thermodynamics. The methods developed will also be useful for computational studies of a broad range of other complex crystalline materials such as pigments, agrochemicals, food additives, and electronic materials. These methods will be developed in concert with open-source molecular mechanics and machine learning software, with extensive tutorials provided for easier adoption. Dr. Shirts will also work to expand the Living Journal of Computational Molecular Science (LiveCoMS), a free, open-source journal for scientific articles that can and should be updated, by incorporating coverage of machine learning best practices and reviews in molecular science. Dr. Shirts will also work to improve the inclusiveness of the LiveCoMS effort and to expand existing online educational resources for computational drug design techniques.

Dr. Shirts will explore ways to use generative machine learning methods, such as Boltzmann generators, to model crystalline systems and capture the relevant configurational ensembles. Crystal thermodynamics provides ideal test cases for developing better generative models for molecular ensembles, because crystals are relatively simple yet remain of significant scientific and practical interest.
The planned work will introduce sampling from multiple thermodynamic states to improve configuration space coverage for generative models, and will develop molecular crystal-specific mapping approaches that learn the differences from physically relevant mappings rather than being forced to learn the mappings from generator latent space de novo. The proposed work also adapts these approaches to bridge between simulated polymorph ensembles, including learning the differences between configurational ensembles generated with differing levels of chemical theory, such as from force fields to quantum mechanical potentials. Finally, work is proposed to extend these generative methods to model systems with differing degrees of freedom between polymorphs, such as hydrates. To be of practical use, these methods will be distributed to practitioners through existing open-source molecular mechanics and machine learning tools, as well as in stand-alone Python implementations with extensive tutorials for easier adoption.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
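The idea of bridging ensembles generated at different levels of theory can be illustrated with a deliberately simple toy, which is not taken from the award itself. The sketch below assumes two one-dimensional harmonic potentials standing in for a cheap "force field" and a more accurate potential, and estimates the free energy difference between them by exponential reweighting (the Zwanzig relation) of samples drawn from the cheap ensemble. The function name `zwanzig_free_energy`, the spring constants, and the sample count are all hypothetical choices made for illustration. For real crystals the two ensembles often overlap poorly, which is the kind of gap a learned mapping would presumably be meant to close.

```python
import math
import random


def zwanzig_free_energy(u_ref, u_target, kT=1.0):
    """Estimate F_target - F_ref from configurations sampled in the reference ensemble.

    u_ref and u_target hold the two potential energies evaluated on the
    same configurations. Hypothetical helper, written for this toy only.
    """
    # Exponents -(U_target - U_ref) / kT, one per sampled configuration.
    a = [-(ut - ur) / kT for ur, ut in zip(u_ref, u_target)]
    # Shift by the maximum before exponentiating, for numerical stability.
    shift = max(a)
    mean_exp = sum(math.exp(v - shift) for v in a) / len(a)
    return -kT * (shift + math.log(mean_exp))


# Toy setup (assumed): harmonic wells U(x) = 0.5 * k * x**2 with kT = 1.
random.seed(0)
kT = 1.0
k_cheap, k_accurate = 1.0, 2.0

# Sample the cheap ensemble exactly: its Boltzmann distribution is Gaussian
# with standard deviation sqrt(kT / k_cheap).
sigma = math.sqrt(kT / k_cheap)
xs = [random.gauss(0.0, sigma) for _ in range(200_000)]

u_cheap = [0.5 * k_cheap * x * x for x in xs]
u_accurate = [0.5 * k_accurate * x * x for x in xs]

estimate = zwanzig_free_energy(u_cheap, u_accurate, kT)
# Analytical result for two 1D harmonic wells: (kT / 2) * ln(k_accurate / k_cheap).
exact = 0.5 * kT * math.log(k_accurate / k_cheap)

print(f"reweighted estimate: {estimate:.4f}")
print(f"analytical value:    {exact:.4f}")
```

In this toy the two ensembles overlap well, so plain reweighting converges quickly; the estimate degrades rapidly as the target potential moves away from the sampled one.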
