SHF: Small: ML Accelerator Cohort Architecture

$600,000 | FY2022 | CSE | NSF

University of Southern California, Los Angeles, CA

Investigators

Abstract

Machine learning (ML) models play an increasingly crucial role in people's daily lives, in applications ranging from autonomous driving, health care, and data center management to machine translation. To cover this range of uses, model diversity has been growing: from convolutional neural networks, to embedding-table-based recommendation systems, to graph neural networks, to natural language processing models. As models are deployed in sensitive domains, such as speech- and emotion-recognition engines, data privacy is also a key requirement for next-generation ML hardware. As model diversity has grown, compute and communication demands have come to vary widely, both across models and within a model. In addition, new privacy-protecting execution models such as multi-party computation (MPC) demand fundamentally different hardware support. Hence, there is a need to rethink the design of ML hardware accelerators for the new era of privacy-preserving, heterogeneous model usage. To address these concerns, this project focuses on the development of the ML Accelerator Cohort Architecture (MLACA). MLACA is a collection of heterogeneous ML accelerator tiles that work in unison to adapt to dynamically changing ML execution demands. MLACA pursues four research thrusts to achieve its goals. The first thrust focuses on building a heterogeneous MLACA compute fabric that supports a wide range of dense and sparse execution paradigms, including novel support for private computing. The second thrust focuses on MLACA's memory and acceleration fabric, which performs in-memory indexing acceleration for embedding tables and prefetching that exploits perfect future knowledge of the training data. The communication thrust focuses on distributed training acceleration, using techniques such as dynamic tensor decomposition that trade off computation and communication costs. The runtime system thrust manages MLACA fabric allocation across competing ML models to maximize resource utilization and improve power efficiency.
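The computation-versus-communication trade-off named in the communication thrust can be illustrated with a small cost model. This is only a sketch under stated assumptions, not the project's method: the abstract does not say which decomposition MLACA uses, so a rank-r factorization of an m x n gradient matrix stands in for it, and the `Link` type, the function names, and the FLOP estimate are all hypothetical. The model also ignores the accuracy lost at low rank, so `candidate_ranks` should be read as the ranks already judged acceptable.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical description of one worker's resources (not from the abstract)."""
    bandwidth_bytes_per_s: float  # network bandwidth available for gradient exchange
    compute_flops_per_s: float    # sustained throughput available for the factorization


def full_transfer_time(m, n, link, bytes_per_value=4):
    """Time to send an m x n gradient matrix uncompressed."""
    return m * n * bytes_per_value / link.bandwidth_bytes_per_s


def low_rank_transfer_time(m, n, r, link, bytes_per_value=4):
    """Time to factor the gradient into (m x r) and (n x r) pieces and send those.

    Assumed cost model: two dense matrix products of about 2*m*n*r FLOPs each.
    """
    factor_flops = 4 * m * n * r
    factor_values = r * (m + n)
    return (factor_flops / link.compute_flops_per_s
            + factor_values * bytes_per_value / link.bandwidth_bytes_per_s)


def choose_rank(m, n, candidate_ranks, link):
    """Return (rank, time); a rank of None means sending uncompressed is fastest."""
    best = (None, full_transfer_time(m, n, link))
    for r in candidate_ranks:
        t = low_rank_transfer_time(m, n, r, link)
        if t < best[1]:
            best = (r, t)
    return best


if __name__ == "__main__":
    slow_network = Link(bandwidth_bytes_per_s=1e9, compute_flops_per_s=1e13)
    fast_network = Link(bandwidth_bytes_per_s=1e12, compute_flops_per_s=1e11)
    print(choose_rank(4096, 4096, [4, 16], slow_network))
    print(choose_rank(4096, 4096, [4, 16], fast_network))
```

With these made-up numbers, the slow-network worker selects rank 4 because the extra factorization work is cheap next to the saved traffic, while the fast-network worker sends the full matrix. Re-evaluating that choice as conditions change is one plausible reading of "dynamic" in the abstract.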
Technology transition is planned through strong industry collaborations with the USC-Meta center and Intel's Private AI Institute. This research uses NSF's Research Experiences for Undergraduates (REU) funding, together with USC's internal SURE and K-12 SHINE programs, to engage high school students, teachers, and undergraduate students and prepare them for careers in ML systems design. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
