SI2-SSE: BONSAI: An Open Software Infrastructure for Parallel Autotuning of Computational Kernels

$499,977FY2016CSENSF

University Of Tennessee Knoxville, Knoxville TN

Investigators

Abstract

Most supercomputers today accelerate the computations by using processors with many cores to solve important problems in science and engineering. Although this reduces the cost of the hardware system, it greatly increases the complexity of writing and optimizing ("tuning") software. This project extends a previously funded NSF project: Benchtesting Environment for Automated Software Tuning (BEAST) program to create a software toolkit that allows for semi-automatic optimization of software, thereby reducing the programming overhead. This project, BEAST OpeN Software Autotuning Infrastructure (BONSAI) will greatly increase the efficiency of scientists and engineers to develop fast and efficient programs to solve their problems. BONSAI has tremendous support from various computer processor manufacturing companies and academic institutions. The BONSAI system will be available as open-source software for academic and commercial use and many students will be trained in using the software. The emergence and growing dominance of hybrid systems that incorporate accelerator processors, such as GPUs and coprocessors, have made it far more difficult to optimize the performance of the different computational kernels that do the majority of the work in most research applications. The BONSAI project aims to create a transformative solution to this problem by developing a software infrastructure that uses parallel hybrid machines to enable large autotuning sweeps on computational kernels for GPU accelerators and many-core coprocessors. The system will go beyond just measuring runtimes, allowing for collection and analysis of non-trivial amount of data from hardware performance counters and power meters. The system will have a modular architecture and rely on standard data formats and interfaces to easily connect with mainstream tools for data analytics and visualization. The BONSAI project will leverage the experiences of the BEAST project, which established a successful autotuning methodology and validated an autotuning workflow. BONSAI will equip the community with a software environment for applying parallel resources to the tuning and performance analysis of computational kernels. Specifically, the work will be organized around the following objectives: (1) Harden and extend the programming language called BeastLang, which was created during prior research as a way of defining the search space that the autotuning infrastructure generates and explores. BeastLang enables users to create parameterized kernel specifications that encode the interplay between the kernels themselves, the compilation tools, and the target hardware. It will be integrated with the other components of BONSAI, have its Python syntax enhanced and extended, its compiler improved, and be supplemented with a runtime that supports it with multi-way parallelism for the autotuning process. (2) Develop and test a benchtesting engine for making large scale parallel tuning sweeps, using large numbers of GPU accelerators or many-core coprocessors. This engine will support both parallel compilation and parallel tests of the resulting kernels, using many distributed memory nodes and multithreading within each node, with dynamic load balancing. It will produce an extensive collection of performance information from hardware counters, and possibly energy meters, as well as collection of information about the saturation of the compiled code with different classes of instructions. (3) Develop and test a software infrastructure for collecting, preprocessing, and analyzing BONSAI performance data. The system will a) simplify the task of instrumenting the kernel and provide a simple interface for selecting the metrics to be collected with sensible defaults; b) simplify the process of collecting hardware counters and performance data from various open source and vendor specific libraries; and c) provide tools that allow the user to quickly and efficiently transform output data to a format that can be easily read and analyzed using mainstream tools such as R and Python. (4) Document and illustrate the process of using BONSAI to tune various different types of kernels. These model case studies will include discussions of how BeastLang was applied to create the parameterized kernel stencil, how the parallel benchtesting engine is invoked to generate and explore the search space, and how the data collected from the operation of the engine can be analyzed and visualized to gain insights that can correct or refine the process for another iteration. The BONSAI project has the potential to fundamentally transform autotuning research by: 1) Making autotuning accessible to a broad audience of developers from a broad range of computing disciplines, as opposed to a few selected individuals with the wizardry to set up a successful experiment within the confines of serial execution. 2) Changing the general perception of autotuning as not just the means of producing fast code, but as a general technique for performance analysis and reasoning about the complex software and hardware interactions, and positioning the technique as one of primary tools for hardware-software co-design. 3) Boosting interest in exploring neglected avenues of computing, such as exploration of unorthodox data layouts, and challenge the status quo of legacy software interfaces. BONSAI has the potential to bring autotuning to the forefront of software development and to help position autotuning as a pillar of software engineering.

View original record on NSF Award Search →