Collaborative Research: Personalized Benchmarks for High Performance Computing Applications

$330,000FY2015CSENSF

University Of Illinois At Urbana-Champaign, Urbana IL

Investigators

Abstract

As high-performance computing applications target ever-larger problems, data input and output (I/O) takes up more and more run time. Users, software developers, and platform administrators often find it difficult to understand what an application's I/O code is doing, why it is slow, how it might be improved, or how well it would perform on a different platform. I/O benchmarks help address this problem, but they are expensive to produce and thus are not available for most applications. This project is providing user-friendly personalized I/O benchmarks for all applications, by leveraging existing lightweight I/O profilers that already monitor the behavior of applications on high-performance computing platforms. The resulting personalized benchmarks will help researchers, developers, and purchasers in evaluating potential new storage system architectures, evaluating existing or new versions of storage systems and I/O libraries, planning for purchases, comparing performance of application clusters or workloads across platforms, and improving the performance of parallel I/O libraries and applications. The analytics and benchmark generation software, and example benchmarks, will be publicly released. This project uses two methods to construct personalized I/O benchmarks. First, the project is making existing applications self-benchmarking across all of their runs, by providing analytics and visualization facilities to convey to stakeholders the information already automatically captured by lightweight I/O profilers such as Darshan during each run. Second, the project is creating platform-customized benchmark suites that represent the mix of application-level workloads observed on a given platform. To accomplish this, the project is clustering observed production jobs based on their I/O behavior and using both new and existing I/O kernel generation techniques to generate a compact benchmark for each cluster. The resulting benchmark suite will advance the state of the art by serving as a proxy for real-world, platform-specific production I/O workloads, and by providing previously unavailable insight into how prevalent those workloads are at a given facility.

View original record on NSF Award Search →