NSF Center for Computer-Assisted Synthesis
University of Notre Dame, Notre Dame, IN
Abstract
The NSF Center for Computer-Assisted Synthesis (C-CAS) is a nexus of collaboration, innovation, and education that brings together data science and chemical synthesis. The highly interdisciplinary C-CAS team, composed of synthetic organic chemists, computational chemists, and computer scientists, is developing data science tools and computational workflows that will likely shape the future of synthetic chemistry and the fields it enables, such as medicine, materials science, and energy research. The Center's impact is further amplified by an extensive network of academic, industrial, and non-profit partners and research centers, and its data chemistry tools are shared with the research community through open-source clearinghouses. Together, these give C-CAS a unique opportunity to develop, exchange, and evaluate ideas in the field of data chemistry, and its shared tools and training will empower students, practicing chemists, and the chemical industry to apply data science effectively to their own chemical research. Led by organic chemists at every stage, C-CAS focuses on use-inspired data science research that drives the development of new data types and machine learning (ML) methods, which in turn enable the discovery of novel reactions and yield new scientific insights. The four scientific thrusts are (i) developing effective ML tools for optimizing chemical reactions, (ii) gaining mechanistic understanding through interpretable statistical models and electronic structure calculations, (iii) predicting reaction outcomes to anticipate and discover new reactivity, and (iv) integrating these tools for the efficient planning and execution of multistep syntheses of complex molecules.
To accomplish these goals, three themes are interwoven into each of the thrusts: (a) new structured data types that are designed from the ground up for high-throughput experimentation and predictive modeling, going beyond the information available in commonly used databases, (b) molecular and reaction representations that bridge descriptor-based and structure-based deep learning paradigms, and (c) algorithms specifically designed for the low-data regimes prevalent throughout chemistry. Through these integrated research themes and thrusts, C-CAS constructs and shares data chemistry platforms that are expected to enable chemists to tackle ambitious challenges that the field is currently under-equipped to pursue. These platforms will also open up new training opportunities through partnerships with the Data Chemists Network at primarily undergraduate institutions. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.