EAGER: Cyberinfrastructure Reproducibility Project: Computational Science and Engineering
George Washington University, Washington DC
Investigators
Abstract
The progress of science today relies heavily on the use of computers, from lab workstations to national supercomputers. This project confronts the issue of ensuring that scientific knowledge stemming from computational research complies with high standards of reproducibility and rigor. The NSF CISE Directorate "Dear Colleague Letter: Encouraging Reproducibility in Computing and Communications Research" (NSF 17-022, 2016) encourages researchers to embrace completeness and transparency in developing rigorous protocols. Reproducibility concerns often focus on transparency via open data, code, and other research objects. Providing the data and code to run all analyses again is, however, only a minimum standard. In the context of high-performance computing, where research uses multi-million-dollar facilities, an understanding of where the sources of non-reproducibility lie is still lacking. This project uses methodical replication of previous studies in computational fluid dynamics as the model for reaching that understanding and for developing the guiding principles of study design that can guarantee, as much as possible, reproducible findings. The results of this project will serve NSF's mission to promote the progress of science; to advance the national health, prosperity and welfare; and to secure the national defense, by bringing new and necessary understanding about how to achieve rigorous, reproducible computational research. This project conducts methodical replication of published studies in computational fluid dynamics, as a model for the broad field of computational science and engineering. Its aims are to 1) identify and characterize sources of non-reproducibility, and 2) develop guidelines on Design for Reproducibility, i.e., study design guaranteeing reproducible findings.
The project addresses the role of scientific software libraries (such as linear algebra solvers), the influence of hardware architectures, and the role of new technologies for reproducible research (e.g., containers, cloud services). The project also provides guidelines for assessing the success of replication studies in the context of high-performance computing, including when it is or is not reasonable to expect bit-by-bit numerical reproducibility. The research tackles three phases of computational research: research methods, research communication, and research assessment. The project also develops openly licensed training materials on reproducible computational research, including a graduate seminar course and a Responsible Conduct of Research (RCR) module.
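As an illustration of why bit-by-bit numerical reproducibility cannot always be expected, the following minimal Python sketch sums the same three numbers in two different orders, as can happen when a parallel reduction combines partial results in a different sequence from one run to the next. It is not taken from the project; the helper name sum_in_order and the tolerance of 1e-12 are assumptions chosen only for this example.

```python
import math


def sum_in_order(values):
    """Accumulate values left to right in double precision (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

# Same data, two accumulation orders: (0.1 + 0.2) + 0.3 and (0.3 + 0.2) + 0.1.
forward = sum_in_order(values)
backward = sum_in_order(list(reversed(values)))

# Floating-point addition is not associative, so the two results can differ
# in their last bits even though both are valid roundings of the same sum.
bitwise_equal = forward == backward

# A tolerance-based comparison (rel_tol is an assumed, illustrative threshold)
# treats the two results as the same finding.
close_enough = math.isclose(forward, backward, rel_tol=1e-12)

print(f"forward  = {forward!r}")
print(f"backward = {backward!r}")
print(f"bitwise equal: {bitwise_equal}, equal within tolerance: {close_enough}")
```

In this sketch the two sums fail an exact comparison but agree within the tolerance, which is the kind of distinction a replication assessment has to make explicit.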