
MRI: Acquisition of a Heterogeneous Multi-GPU Cluster to Support Exploration at Scale

$399,700 | FY2020 | CSE | NSF

Northeastern University, Boston MA

Investigators

Abstract

This project aims to acquire a heterogeneous multi-GPU cluster built from state-of-the-art GPU devices, interconnected with emerging NVLink and HDR networks through a smart HDR InfiniBand switch, and equipped with network-attached non-volatile memory (NVM) storage for GPU caching. The cluster will enable, accelerate, explore, and support applications at scale from domains that include:
• Distributed deep neural networks for retinopathy,
• Wireless network forensics,
• Adversarial machine learning,
• Computational social science,
• Mathematical optimization and big data analytics,
• Coastal engineering modeling, and
• Multi-GPU systems (including NVMe technology to support caching in the GPU network and a smart network switch that can offload collective operations).
These features will enable computational scientists to exploit GPU parallelism in new ways by programming the smart network switch and caching selectively to hide memory and interconnect latency.

Currently, graphics processing units (GPUs) provide high computational throughput by launching a large number of threads and overlapping compute and memory operations. Combined with low-overhead thread swapping, this lets GPUs hide long memory operations. But underlying system architectures have not kept up as the size and complexity of GPU applications grow. Multi-GPU solutions are less programmer-friendly and scale less well because their architectural support lags that of multi-CPU systems. Current GPU systems treat GPUs as discrete devices, with limited support for a truly shared-memory programming model. Since multi-GPU interconnect bandwidth has become a limiting factor for scaling multi-GPU systems, it is necessary to explore new network topologies, smarter network elements, and enhanced software layers for caching and prefetching that meet the needs of tomorrow's demanding data applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
