MRI Collaborative: Development of a Data-Intensive Scalable Computing Instrument for High Performance Computing
Texas Tech University, Lubbock TX
Investigators
Abstract
Proposal #: 13-38078
PI(s): Chen, Yong; Gropp, William D.; Smith, Philip W.; Sun, Xian-He; Zhuang, Yu
Institution: Texas Tech University
Title: MRI Collab/Dev.: Data Intensive Scalable Computing Instrument for High Performance Computing
Project Proposed: This project develops DISCI, an all-around computing instrument that compensates for the limitations of existing compute-centric HPC instruments when running data-intensive applications. It supports five large research projects in HPC system design, computational chemistry, biotechnology, and atmospheric science. Building on research that introduced the application-aware, decoupled-execution paradigm, the project addresses the large gap between research prototypes and engineering solutions. The instrument is expected to influence the design of future applications, algorithms, and instruments: it could open new research areas in support of data-intensive sciences and possibly reshape the HPC instruments adopted by National Computing Facilities and some institutions.
In addition to conventional HPC compute nodes, DISCI has a set of specially designed data nodes. The data nodes offer in-situ data processing, which reduces data movement and data-access delay, and can be dynamically provisioned as 'fat' compute nodes when necessary. The functionality of the compute nodes remains the same as in conventional instrumentation. The data nodes work in concert with the compute nodes, and together they provide optimal system performance for data-intensive HPC.
Based on a hardware-software co-development principle, the instrument consists of two components: the DISCI system architecture and the DISCI runtime software. The system architecture builds an HPC instrument with a data-centric view. The runtime software extends the MPI (Message Passing Interface) and MPI-IO libraries to support data nodes and their associated in-situ processing.
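As a rough illustration of why in-situ processing on data nodes reduces data movement, the following toy model compares a conventional read-then-compute path with a path that runs the kernel next to the data. It is a minimal sketch, not the DISCI runtime: the `DataNode` class, its methods, and the two helper functions are hypothetical names introduced for this example, and no MPI or MPI-IO calls are used.

```python
# Toy model only. DataNode, read_all, process_in_situ, conventional_sum and
# in_situ_sum are hypothetical names for illustration; they are not DISCI APIs.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class DataNode:
    """Stand-in for a data node that stores one block of a dataset."""
    block: List[float]

    def read_all(self) -> List[float]:
        # Conventional path: the whole block is shipped to the compute node.
        return list(self.block)

    def process_in_situ(self, kernel: Callable[[List[float]], float]) -> float:
        # In-situ path: the kernel runs next to the data; only the result moves.
        return kernel(self.block)


def conventional_sum(nodes: List[DataNode]) -> Tuple[float, int]:
    """Return (global sum, number of values moved) for the conventional path."""
    total, moved = 0.0, 0
    for node in nodes:
        data = node.read_all()
        moved += len(data)
        total += sum(data)
    return total, moved


def in_situ_sum(nodes: List[DataNode]) -> Tuple[float, int]:
    """Return (global sum, number of values moved) for the in-situ path."""
    partials = [node.process_in_situ(sum) for node in nodes]
    return sum(partials), len(partials)


if __name__ == "__main__":
    # Four data nodes, each holding 1000 values (sizes are arbitrary assumptions).
    nodes = [DataNode([float(i)] * 1000) for i in range(4)]
    print(conventional_sum(nodes))
    print(in_situ_sum(nodes))
```

Both paths produce the same global sum, but in this toy setup the conventional path moves every stored value while the in-situ path moves one partial result per data node. Real savings depend on the kernel, the data layout, and the interconnect, none of which this model captures.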
The instrument will enable and foster the research that the PIs and senior personnel conduct in chemical dynamics simulation, simulation of turbulent flows, atmospheric data assimilation and weather forecasting, computational biology, and computer systems.
Broader Impacts: This development project will enable academic departments, cross-disciplinary units, organizations, and multi-organization collaborations to integrate their development, education, and outreach efforts. To attract underrepresented students into DISCI development, the institution will coordinate with institutional projects at the respective organizations. The experience gained will be integrated into undergraduate and graduate courses and summer orientation training so that students become involved in the development. The education plan concentrates on supporting data-intensive HPC and training a broadly inclusive and globally competitive science workforce. The project is expected to provide a pathway to future national HPC instruments that support data-intensive sciences. Furthermore, it could have a direct impact on building exascale HPC instruments.