CAREER: Addressing Scalability Challenges in Designing Next-generation GPU-Based Heterogeneous Architectures
College of William and Mary, Williamsburg, VA
Abstract
Graphics processing units (GPUs) are becoming the default accelerators in many domains, such as high-performance computing (HPC), deep learning, and virtual/augmented reality. Their close integration with high-performance multi-core CPU architectures also enables very efficient heterogeneous computing. Going forward, it is imperative that such GPU-based systems scale in both performance and energy efficiency to meet the exascale (and beyond) computing demands of the future. However, sustained scaling of these systems is challenging, primarily because a) fabricating a single large die results in very low yield, making it prohibitively expensive, b) the memory hierarchy remains a critical performance and energy-efficiency bottleneck, and c) programmability and application scalability are hindered by inefficiencies in shared virtual memory and multi-application support. This project seeks to address these scalability challenges by rethinking the design of future large-scale GPU-based systems. In particular, the research revolves around three major components: a) design space exploration of the cores (including their organization) and the entire memory hierarchy, b) development of data movement optimizations that identify and then exploit cache locality via novel synergistic caching and scheduling techniques, and c) improved utilization of large-scale system resources through enhanced shared virtual memory and multi-application execution support. All three research components will be evaluated on a newly developed, comprehensive evaluation infrastructure. The findings of this research will be incorporated into new and existing undergraduate and graduate courses. The insights resulting from this research are expected to have a long-term positive impact on GPU-based computing, thereby making our daily lives more productive.