Collaborative Research: C1: Learning the Universal Free Energy Function
Columbia University, New York NY
Investigators
Abstract
NONTECHNICAL SUMMARY This award brings materials science and materials engineering together with data science to develop data-intensive methods to create phase diagrams, or "roadmaps," of materials. The discovery and design of new materials require the ability to predict how different chemical elements combine to form different compounds depending on the temperature. One example of great technological relevance is metallic alloys, which form by combining multiple metallic elements at elevated temperatures. Over the past century, materials scientists have measured such compound-formation processes for many materials systems, but the available data still represent only a tiny fraction of the entire space of possible combinations of chemical elements and temperatures. Meanwhile, machine learning and data science have made great strides in discovering new patterns and connections and in “filling in” missing information from large data sets. The research team will extend and develop state-of-the-art machine learning approaches and apply them to mathematical models and data for metallic alloys, in order to learn new connections between chemical elements and discover new alloys. If successful, the research will enable the development of new and improved lightweight structural alloys and longer-lived, higher power density batteries. All of the developed software tools will have publicly available implementations throughout the funding period to accelerate such developments. The research team’s approach uses close collaboration between domain and data scientists, with strong “cross-training,” to develop the next generation of scientists, engineers, and data scientists who can bring convergent approaches to the challenging problems of science and engineering.
TECHNICAL SUMMARY This award brings together materials science and engineering and data science to develop data-intensive methods to determine materials phase diagrams.
Design and discovery of new materials rely extensively on phase diagrams, which quantify what phase(s) are stable at a given temperature and chemical composition; that stability is determined by the free energies of the competing phases. Moreover, many equilibrium material properties are derived from free energies or free-energy differences. Extensive resources have been devoted to the experimental determination of phase diagrams for many material systems, but despite these efforts only a tiny fraction of the entire space of possible materials has been explored. High-throughput computational approaches have added to our knowledge, but it is time-consuming to extrapolate from the easy-to-compute zero-temperature results to experimentally relevant finite-temperature results. While some qualitative chemical and structural trends have been identified (the periodic table being the best-known example), leveraging them for quantitative predictions is difficult. Simultaneously, significant developments in machine learning have expanded the range of non-linear functions that can be interpolated with uncertainty quantification, advanced the field of dimensionality reduction, and revealed new underlying patterns in data. The continual expansion of computational and experimental open data sets of materials thermodynamics brings the field to a tipping point at which constructing machine-learned models for thermodynamic extrapolation becomes feasible and offers a significant advance beyond high-throughput methods alone. The research team will develop a novel thermodynamic machine learning engine and demonstrate it for the modeling of materials at relevant conditions, with a focus on (1) lightweight metallic alloys, predicting phase diagrams at new compositions, and (2) extension to native oxide thermodynamics.
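As a minimal illustration of how free energies determine phase stability, the sketch below picks the stable phase at a given temperature as the one with the lowest free energy G = H - T*S. The two phases and their enthalpy and entropy values are hypothetical placeholders chosen for illustration; they are not data from this award, and the sketch is not the machine learning engine the team proposes.

```python
# Hypothetical, temperature-independent enthalpy H (J/mol) and entropy S (J/(mol K))
# for two made-up phases at one fixed composition. Illustrative values only.
PHASES = {
    "alpha": {"H": 0.0, "S": 1.0},    # lower enthalpy, lower entropy
    "beta": {"H": 500.0, "S": 1.5},   # higher enthalpy, higher entropy
}


def free_energy(phase, temperature):
    """Return G = H - T*S for the named phase at the given temperature (K)."""
    params = PHASES[phase]
    return params["H"] - temperature * params["S"]


def stable_phase(temperature):
    """Return the name of the phase with the lowest free energy."""
    return min(PHASES, key=lambda name: free_energy(name, temperature))


def transition_temperature(phase_a, phase_b):
    """Temperature at which the two free energies are equal: dH / dS."""
    delta_h = PHASES[phase_b]["H"] - PHASES[phase_a]["H"]
    delta_s = PHASES[phase_b]["S"] - PHASES[phase_a]["S"]
    return delta_h / delta_s


if __name__ == "__main__":
    for t in (500.0, 1500.0):
        print(t, stable_phase(t))
    print(transition_temperature("alpha", "beta"))
```

With these placeholder numbers the low-entropy phase is stable at low temperature and the high-entropy phase takes over above the crossing point. A real phase diagram repeats this comparison across composition as well as temperature, which is why a free energy function that covers unexplored compositions is valuable.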
The PIs will employ a combination of semi-supervised learning, a generative adversarial network framework for discriminative and generative learning, and functional quantile learning including uncertainty quantification. If successful, the thermodynamic machine learning engine can be expanded to other material spaces, including high-temperature alloys and battery and fuel cell materials, and it can guide future high-throughput computation and experiment. The team will interact with TRIPODS centers for dissemination, discussions, and collaborations as it develops deeper connections with data science driven by the challenges of domain science and engineering. Developing an accurate, predictive, and computationally efficient free energy function for the full range of materials space is a transformative innovation for the design and discovery of materials. The dimensionality reduction inherent in the universal free energy function permits the discovery of new relationships between chemical elements and solid phases, beyond existing qualitative relationships. Uncertainty quantification can identify unexplored but valuable regions of chemical and structure space, providing a new paradigm in which high-throughput computation and experimental methods optimally expand our knowledge of materials and chemical relationships. The data science innovations will extend the scope of Gaussian process-based modeling, enable machine learning with functional data and couple it with recent advances in data depth, advance generative adversarial networks and related Bayesian studies for functional data generative models with uncertainty quantification, and extend quantile regression to function-valued responses. The Division of Materials Research, the Division of Mathematical Sciences, the Civil, Mechanical, and Manufacturing Innovation Division, and the Office of Advanced Cyberinfrastructure contribute funds to this award.
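For readers unfamiliar with quantile regression, the sketch below shows the standard pinball (quantile) loss for ordinary scalar responses, which is the usual starting point for the technique. It is an assumption-labeled illustration only: the function name is hypothetical, and the extension to function-valued responses described above is the proposed research and is not shown here.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau in (0, 1).

    Under-predictions (y > prediction) are weighted by tau and
    over-predictions by (1 - tau), so minimizing this loss pushes the
    prediction toward the tau-th quantile of the responses.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # tau * diff when diff >= 0, (tau - 1) * diff otherwise; both are >= 0.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)
```

For a high quantile such as tau = 0.9, predicting too low costs nine times as much as predicting too high by the same amount, which is what makes the fitted curve act as an upper band usable for uncertainty quantification.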
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.