CSR: Large: VarSys: Managing Variability in High-Performance Computing Systems

$1,189,778 · FY2016 · CSE · NSF

Virginia Polytechnic Institute and State University, Blacksburg, VA

Abstract

The usefulness of the smallest mobile system in a pocket and of the largest and fastest supercomputers in datacenters around the world requires unrelenting advances in systems software design. These advances make computers faster, more reliable, more secure, better able to analyze large data sets, and ultimately essential to the lives of nearly everyone on the planet. Variability can wreak havoc on the performance of large-scale computer systems that support high-performance computing and e-commerce. In high-performance computing, variability threatens U.S. competitiveness and our ability to achieve exascale performance within the cost and energy constraints of supercomputers. In e-commerce (e.g., Amazon and Wall Street trading), variability threatens profit margins by requiring greater capital expenditures to compensate for potential swings in the performance of datacenters and the cloud. System variability also impairs our capacity to separate malware from normal system activity. This project will develop techniques that improve the ability to identify and manage variability in advanced computing systems. Specifically, this project will focus on developing the VarSys software framework to control aspects of variability and ultimately improve the design and operational efficiencies of both high-performance and cloud systems. Furthermore, to highlight the broader impact of VarSys beyond computer systems design, applying variability identification and management to improve malware detection will be an important component of the project. In addition to publishing the results and creating open source software, the project team will host technical meetings to encourage broad participation in the development of variability metrics and benchmarks for advanced computer systems. The intent is to develop an ecosystem of stakeholders dedicated to progress in the emergent area of computer system software variability. In addition to training graduate students, the project will hold hackathons for undergraduate students to educate them on the future impact of variability and encourage their involvement in related research.
At present, computer system variability is often viewed as noise or an unavoidable consequence of complex designs. This project isolates causes of variability and identifies conditions under which variability can be managed. When complete, the resulting VarSys software will enable scientific computer systems research that was previously impracticable. The project will demonstrate the usefulness of these techniques for improving the designs of advanced computing systems, including high-performance supercomputers, cloud datacenters, and malware detection software. These advances ultimately affect the lives and livelihoods of consumers and help ensure U.S. competitiveness in science and e-commerce.
