
SHF: CSR: Small: Toward Smart HPC through Active Learning and Intelligent Scheduling

$498,800 · FY2014 · CSE · NSF

Illinois Institute Of Technology, Chicago IL

Investigators

Abstract

As high performance computing (HPC) continues to grow in scale, energy and resilience become first-class concerns alongside performance. These concerns demand significant changes in many parts of the system stack, including resource management and job scheduling. To harness the potential of extreme-scale systems, this project aims to incorporate intelligence into resource management and job scheduling. More specifically, it will develop a framework named SPEaR (Scheduling for Performance, Energy, and Resilience efficiency) that dynamically optimizes scheduling along three dimensions: performance, energy, and resilience. The research has two thrusts: active learning, which automatically extracts valuable performance, energy, and resilience patterns and tradeoffs from application and system data; and intelligent scheduling, which improves and controls performance, resilience, and energy efficiency in resource management and scheduling. An event-driven scheduling simulator is being developed to comprehensively evaluate scheduling policies and their aggregate effects. The simulator, along with system logs, will be made available to the broad community under an open source license. This project creates critical technologies to promote system productivity and makes important advances toward smart HPC. Additionally, the learning techniques developed in this project are useful for other big data problems of national interest. The education plan enhances the undergraduate and graduate curricula and broadens participation of underrepresented groups.
