SHF Small: A Compiler-Based Auto-Tuning Framework for Many-Core Code Generation
University of Utah, Salt Lake City, UT
Investigators
Abstract
This project provides a compiler-based foundation for data-parallel optimization and code generation that can yield very high performance across a range of multi-core and many-core architectures while increasing the productivity of application programmers. The project targets two distinct architectures, a conventional quad-core microprocessor and a graphics processing unit (GPU) with 240 cores, as well as a heterogeneous system that combines both. The technology represents a significant departure from the organization of contemporary compilers: it automates as much of the optimization as possible, but opens up the mapping process to give savvy programmers the control they so often desire. The unique features of the system include: (1) transformation recipes as a programmer's or high-level compiler's interface to optimization and code generation; (2) a polyhedral framework to mathematically represent code structures and optimizations for robust code generation; and (3) auto-tuning technology to compactly describe and systematically evaluate a range of possible implementations of a computation. The most novel of these is the transformation recipe, which factors out the commonality of OpenMP, CUDA and OpenCL code generation, thus supporting the heterogeneous platform and facilitating late-binding decisions about which implementation to use. The software system from previous work is being released, and has accelerated both libraries and scientific applications. With the right transformation recipe for the auto-tuning system to evaluate, the resulting automatically generated code yields high performance, in some cases competitive with, or even better than, manually coded implementations. The PIs plan to expand this capability to a broader set of applications and architectural features.
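As a rough illustration of what "systematically evaluate a range of possible implementations" can mean, the sketch below times several variants of one computation and keeps the fastest. It is not the project's software: the functions `matmul_tiled` and `autotune`, the choice of loop tiling as the transformation, and the candidate tile sizes are all assumptions made for this example, and pure Python stands in for the generated OpenMP, CUDA or OpenCL code.

```python
import time


def matmul_tiled(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) using square loop tiling.

    Hypothetical stand-in for one generated code variant; `tile` is the
    tunable parameter.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + tile, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + tile, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def autotune(n, candidate_tiles, repeats=3):
    """Time every candidate tile size and return (best_tile, timings)."""
    # Integer-valued inputs keep the floating-point results exact, so the
    # variants can be compared with plain equality.
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    reference = matmul_tiled(a, b, n, n)  # a single tile, i.e. untiled

    timings = {}
    for tile in candidate_tiles:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            result = matmul_tiled(a, b, n, tile)
            best = min(best, time.perf_counter() - start)
        # A variant is only eligible if it computes the same answer.
        if result != reference:
            raise ValueError(f"tile size {tile} produced a wrong result")
        timings[tile] = best

    best_tile = min(timings, key=timings.get)
    return best_tile, timings


if __name__ == "__main__":
    best_tile, timings = autotune(48, [4, 8, 16, 48])
    for tile, seconds in sorted(timings.items()):
        print(f"tile={tile:3d}  {seconds * 1e3:8.2f} ms")
    print("selected tile size:", best_tile)
```

The selected tile size depends on the machine and on the problem size, which is the reason for measuring rather than fixing the choice in advance.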