Conference: Community Development of the Material Core Metadata (MatCore) Standard
University of Minnesota-Twin Cities, Minneapolis, MN
Abstract
This award supports a community effort to develop a metadata standard for computational materials science (CMS) datasets on solid-state materials. Metadata is information about a dataset, which may include what kind of data it is, how it was created, and technical details relevant to how it can be used. Effective, community-required metadata allows the data products of individual research efforts to be used by investigators other than those who created the data, and by the public as well. This project responds to the memorandum issued on August 25, 2022, by the Office of Science and Technology Policy (OSTP). The OSTP memo tasks federal funding agencies to “develop strategies to make federally funded publications, data, and other such research outputs and their metadata findable, accessible, interoperable, and re-useable, to the American public and the scientific community in an equitable and secure manner.” A good CMS metadata standard also enables meaningful data sharing that can advance materials research generally, and it supports the Materials Genome Initiative (MGI). MGI seeks to accelerate and sustain the process from materials discovery to deployment in products, so that the process takes half the time at a fraction of the cost. This would be enabled in part by developing advanced infrastructure for computation, experiment, and data, and innovative ways to combine computation, experiment, and data-centric approaches, including machine learning. A second goal of the standard is to promote reproducibility in CMS by enabling others to regenerate datasets. The MatCore Standard includes a required minimal set of metadata that would accompany all CMS datasets, and additional optional metadata specific to particular CMS techniques.
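The split between a required minimal set and optional technique-specific metadata could be pictured roughly as follows. This is only an illustrative sketch: the field names (`title`, `creator`, `method`, `software`, `created`, `technique_specific`) and the helper `missing_required` are hypothetical and are not taken from the MatCore Standard, whose actual contents are to be determined by the community process described here.

```python
from typing import Any

# Hypothetical required fields, chosen only for illustration.
# The real MatCore Standard defines its own required set.
REQUIRED_FIELDS = ("title", "creator", "method", "software", "created")


def missing_required(record: dict[str, Any]) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# A placeholder record: required core fields plus an optional block
# holding metadata specific to one (hypothetical) CMS technique.
example_record = {
    "title": "Example dataset",
    "creator": "Example Author",
    "method": "example-technique",
    "software": "example-code 1.0",
    "created": "2024-01-01",
    "technique_specific": {"example_parameter": 1.0},  # optional part
}

incomplete_record = {"title": "Example dataset"}

print(missing_required(example_record))
print(missing_required(incomplete_record))
```

A check of this kind only tests that the core fields are present; it says nothing about whether their values are sufficient to regenerate the dataset, which is the harder question the standard aims to address.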
Development of the MatCore Standard involves determining what information about a CMS dataset must be recorded, and in what form, to make it findable, accessible, interoperable, and reusable (FAIR), and ideally reproducible. This requires a careful assessment, within each CMS domain, of the nature of the scientific computation that was performed to generate the data. Understanding of this type lies at the heart of a fundamental principle of the scientific method: scientific research must be repeatable. An extensive review will be carried out to identify previous work on characterizing CMS datasets, with the intent to build on existing approaches and standards wherever possible. This project will stimulate discussion in the materials community. Conference and town-hall-style activities are included so that community input can be incorporated into the released MatCore Standard. If the result of this project is broadly adopted, it will affect virtually all aspects of CMS, which can contribute to a wide range of technologically and socially important domains. The development of the MatCore Standard may help researchers build rapidly and effectively on each other’s work and may make materials science more equitable by providing easy and equal access to everyone. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.