CAREER: Compiler and Runtime Support for Sampled Sparse Computations on Heterogeneous Systems

$269,174FY2024CSENSF

University Of Iowa, Iowa City IA

Investigators

Abstract

Sampling-based algorithms are gaining popularity in data-intensive applications because they help reduce computation costs. However, their efficiency on hardware is limited due to random memory access and computation patterns. This research aims to address this issue by developing a suite of compiler and runtime tools. Unlike previous approaches that were specific to certain applications, the project's novelties include: 1) a high-level programming interface that allows users to specify data management and preprocessing for different types of sampled sparse computation, and 2) techniques that exploit both hardware and randomized algorithm features to improve performance. The impact of the project lies in its ability to simplify the implementation of sampling-based algorithms, improve the performance of many scientific computing and big data applications, and enhance our capacity to solve real-world problems at a large scale. This project will also make several contributions towards education and improving diversity, including: 1) enhancing course projects and experimental platforms for the operating systems and high-performance computing courses at the institution, and 2) providing research opportunities to undergraduate students, including students from the underrepresented groups, with the goal of creating interest in system-related research. There are three main challenges that this work will address. First, with the incorporation of heterogeneous computing devices and interconnects in modern computer systems, it is nontrivial to divide and synchronize data across multiple devices. To address this challenge, the investigator will develop a runtime system that provides a virtualized view of the partitioned data. Additionally, a compiler tool is developed to optimize data placement across devices using a statistical performance model for random data accesses. Second, the constantly changing computation patterns in sampling-based algorithms make existing expensive data preprocessing techniques impractical for accelerating sparse computation on each device. To overcome this, the investigator will explore sampling-based methods to reduce the overhead of data preprocessing. Third, sampling-based algorithms generally expose tradeoffs between accuracy and efficiency. However, it is not clear how to systematically explore these tradeoffs. The investigator will develop a runtime system based on speculative execution to adaptively adjust the sampling strategies and achieve the optimal balance between accuracy and efficiency. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →