SHF: SMALL: Parallelization and Memory System Techniques for Heterogeneous Microprocessors
University Of Maryland, College Park, College Park MD
Investigators
Abstract
Traditional discrete GPUs have demonstrated dramatic performance gains over CPUs, but the improvements have been limited to simpler codes with massive parallelism. Recently, computer manufacturers have been producing heterogeneous microprocessors in which a CPU and GPU are integrated on the same die. Such processors provide shared memory between the CPU and GPU, as well as fast CPU-GPU communication. These features potentially enable acceleration of new types of computations, allowing the GPU to gainfully execute smaller loops and to support more complex codes. This project investigates techniques for exploiting heterogeneous microprocessors and realizing their benefits. If successful, the project will enable most programs, not just those with massive parallelism, to utilize GPUs. The project will also provide valuable training to both graduate and undergraduate students, and improve graduate-level coursework.

The project will pursue the following research directions. First, it will create a suite of benchmarks exhibiting complex loop nests that demonstrate the new capabilities of heterogeneous microprocessors. Second, it will develop novel parallelization schemes for the new benchmark suite. These schemes will map multiple levels of parallelism to CPU and GPU cores simultaneously to fully utilize the compute resources in a heterogeneous microprocessor. Third, new cache coherence protocols will be developed that efficiently support the bulk producer-consumer communication that occurs as finer-grained computations migrate from CPU to GPU and back. Finally, adaptive memory address mapping schemes will be investigated that exploit DRAM page locality for CPU accesses and channel-level parallelism for GPU accesses.
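To make the last research direction concrete, the following minimal sketch contrasts two ways of decoding a physical address into a DRAM location. It is not the project's scheme. The geometry constants, the function names, and the per-region `gpu_dominant` flag are all assumptions introduced for illustration. One mapping keeps consecutive cache lines in the same DRAM row (page locality), and the other spreads them across channels (channel-level parallelism).

```python
# Illustrative DRAM geometry; every constant here is an assumption for the sketch.
LINE_BYTES = 64        # cache-line size in bytes
NUM_CHANNELS = 4       # independent memory channels
LINES_PER_ROW = 32     # cache lines held by one open DRAM row (page)


def map_page_locality(addr):
    """Keep consecutive lines in one row of one channel (favors row-buffer hits)."""
    line = addr // LINE_BYTES
    column = line % LINES_PER_ROW
    channel = (line // LINES_PER_ROW) % NUM_CHANNELS
    row = line // (LINES_PER_ROW * NUM_CHANNELS)
    return channel, row, column


def map_channel_parallel(addr):
    """Spread consecutive lines across channels (favors concurrent transfers)."""
    line = addr // LINE_BYTES
    channel = line % NUM_CHANNELS
    column = (line // NUM_CHANNELS) % LINES_PER_ROW
    row = line // (NUM_CHANNELS * LINES_PER_ROW)
    return channel, row, column


def map_address(addr, gpu_dominant):
    """Hypothetical selector: the mapping is chosen per memory region, so the
    CPU and GPU always decode a given address the same way."""
    return map_channel_parallel(addr) if gpu_dominant else map_page_locality(addr)


if __name__ == "__main__":
    stream = [i * LINE_BYTES for i in range(8)]  # eight consecutive cache lines
    print("page-locality channels:   ", [map_page_locality(a)[0] for a in stream])
    print("channel-parallel channels:", [map_channel_parallel(a)[0] for a in stream])
```

With these assumed parameters, eight consecutive cache lines all land in channel 0, row 0 under the first mapping, and cycle through channels 0 to 3 under the second. An adaptive scheme would need some policy for deciding which regions are CPU-dominant or GPU-dominant; the boolean flag here simply stands in for that decision.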