CSR: Small: Collaborative Research: Tuning Extreme-Scale Storage System Through Deep Learning
Iowa State University, Ames IA
Investigators
Abstract
Many research domains, such as high-energy physics, climate science, astrophysics, combustion science, and computational biology, need to process large amounts of data. These domains rely heavily on high performance computing (HPC) systems to manage and process that data efficiently. Consequently, their applications require highly optimized performance from the HPC storage systems that store, manage, and manipulate the data. This project aims to use deep reinforcement learning to tune HPC storage systems for optimized performance. The research explores the feasibility of this approach by: (a) creating a deep learning based model of the HPC storage stack; (b) remodeling the existing HPC storage stack to support automated configuration and tuning; (c) collecting training datasets and training the storage stack model; and (d) using the model as a responsive and playable virtual environment in which to learn the best policy for tuning parameters. As a collaborative project, this research aims to advance domain knowledge in both HPC storage systems and machine learning. The enhanced performance of the HPC storage stack will in turn benefit scientific discovery and thus society. The investigators will integrate research, education, and outreach efforts during the course of this project, including recruiting and retaining underrepresented students, mentoring graduate and undergraduate students, integrating research findings into the curriculum, and publishing and disseminating results. The data collected to train the storage stack model will be shared at https://discl.cs.ttu.edu/tuningstorage, and the machine learning code at https://github.com/forrestbao/DL4SC. Results and data will be made available by the time of publication. The data will be annotated as appropriate to facilitate interpretation.
The principal investigators will strive to maintain the repositories as long as possible. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
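As an illustration of steps (c) and (d) in the abstract, the following is a minimal sketch of learning a tuning policy against a model of the storage stack instead of the live system. Everything in it is an assumption made for illustration: the abstract names no parameters, so the stripe count and stripe size knobs, the hand-written throughput function (standing in for the trained deep learning model), and the numeric values are all hypothetical. Tabular Q-learning is swapped in for deep reinforcement learning to keep the example small; it is not the project's method or code.

```python
import random

# Hypothetical tunable parameters; the abstract does not name any.
STRIPE_COUNTS = [1, 2, 4, 8, 16]
STRIPE_SIZES_MB = [1, 2, 4, 8]

# Actions: stay put, or move one step along one parameter axis.
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def surrogate_throughput(count_idx, size_idx):
    """Toy stand-in for the learned storage stack model (made-up numbers).

    Peaks at stripe count 8 and stripe size 4 MB.
    """
    count = STRIPE_COUNTS[count_idx]
    size = STRIPE_SIZES_MB[size_idx]
    return 100.0 - 3.0 * abs(count - 8) - 5.0 * abs(size - 4)


def step(state, action):
    """Apply an action in the virtual environment.

    Indices are clamped to the valid range. Returns (next_state, reward),
    where the reward is the surrogate throughput of the new configuration.
    """
    ci = min(max(state[0] + action[0], 0), len(STRIPE_COUNTS) - 1)
    si = min(max(state[1] + action[1], 0), len(STRIPE_SIZES_MB) - 1)
    return (ci, si), surrogate_throughput(ci, si)


def train(episodes=500, steps_per_episode=20, alpha=0.1, gamma=0.9,
          epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # maps (state, action_index) to an estimated return
    for _ in range(episodes):
        state = (rng.randrange(len(STRIPE_COUNTS)),
                 rng.randrange(len(STRIPE_SIZES_MB)))
        for _ in range(steps_per_episode):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)),
                        key=lambda i: q.get((state, i), 0.0))
            next_state, reward = step(state, ACTIONS[a])
            best_next = max(q.get((next_state, i), 0.0)
                            for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def greedy_rollout(q, start, steps=20):
    """Follow the learned policy greedily and return the final state."""
    state = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _ = step(state, ACTIONS[a])
    return state
```

Calling `greedy_rollout(train(), (0, 0))` returns a pair of indices into `STRIPE_COUNTS` and `STRIPE_SIZES_MB`. The design point is that every call to `step` queries the model, not a production file system, so the agent can explore poor configurations cheaply. How close the learned policy gets to the best configuration depends on the training settings and on how faithful the model is.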