SHF: Small: Collaborative Research: Accelerated Data Transformation: A Software-Hardware Stack for Transducers

$266,000 | FY2019 | CSE | NSF

University of Chicago, Chicago, IL

Abstract

Recent years have seen an explosive rise of "big data" and data-intensive computing. Many scientific and data analytics applications that operate on large data sets perform data transformation at their core. For example, many genomics applications translate DNA sequences into protein sequences and must perform this transformation on large volumes of data (petabytes) generated by DNA sequencers. Recent studies have shown that popular data analytics systems spend a significant amount of time performing data transformation operations such as data compression, decompression, serialization, deserialization, and error correction. While application-specific hardware accelerators can be useful, their narrow applicability can significantly limit their impact. On the other hand, accelerating a common computation at the core of many applications can have a broader impact and benefit not only existing but also future applications. This research targets the problem of general acceleration of data transformation. More specifically, to allow breadth of utility, the project aims to provide a software-hardware stack to accelerate the computational abstraction at the core of data transformation, namely, finite-state transducers. Given the societal importance of big data computing, a significant broader impact of this work is the uptake of research ideas and technology into the scientific base, and their resulting impact on a wide range of "big data" applications for science, industry, and society. In addition, this project allows students to experience firsthand how abstract concepts such as finite-state transducers can be applied to practical problems, connecting elements of the theory of computation, algorithm design and optimization, applications, and systems architecture.
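To make the notion of a finite-state transducer concrete, the following is a minimal Python sketch of the DNA-to-protein example mentioned above. It is an illustration only, not the project's software stack or the accelerator's programming model. The names `build_transitions` and `transduce` are hypothetical, and the sketch assumes the standard genetic code, an input over the alphabet T, C, A, G, and a reading frame that starts at the first base. Each state is the codon prefix read so far, and every transition maps a (state, input symbol) pair to a next state and an output string.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in T, C, A, G order
# for each position; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def build_transitions():
    """Build {(state, symbol): (next_state, output)} for codon translation.

    A state is the codon prefix consumed so far: "", one base, or two bases.
    """
    prefixes = [""] + list(BASES) + ["".join(p) for p in product(BASES, repeat=2)]
    transitions = {}
    for state in prefixes:
        for base in BASES:
            if len(state) < 2:
                # Codon incomplete: remember the base, emit nothing.
                transitions[(state, base)] = (state + base, "")
            else:
                # Codon complete: emit its amino acid, return to the start state.
                transitions[(state, base)] = ("", CODON_TABLE[state + base])
    return transitions


TRANSITIONS = build_transitions()


def transduce(dna, transitions=TRANSITIONS, start=""):
    """Run the transducer over the input, concatenating emitted outputs."""
    state, out = start, []
    for symbol in dna:
        state, emitted = transitions[(state, symbol)]
        out.append(emitted)
    return "".join(out)


print(transduce("ATGGCCTAA"))  # prints "MA*"
```

The transition table has 21 states and 4 input symbols, so it holds 84 entries. Because the whole computation is a table lookup per input symbol, the same structure can in principle describe other streaming transformations by swapping in a different table.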
The research investigates the transducer computational model and its efficient implementation, with the goal of providing performance and energy-efficiency gains in data analytics systems, all of which rely on data transformation. In particular, this work aims to reduce transducer theory to practical use by mapping transducer programs onto emerging data processing accelerators. To this end, the work pursues three goals. The first is to design a software stack that maps transducers onto novel hardware accelerators. Here the investigators build on their previous work on the design and implementation of the Unstructured Data Processor, a novel hardware accelerator for data transformation that has been shown to deliver high performance but currently lacks a high-level programming model. Accomplishing this goal requires investigating a set of platform-independent and platform-specific optimizations that minimize code size, minimize memory utilization, and exploit the coarse- and fine-grained parallelism inherent in the computation. The second is to improve and extend the underlying hardware accelerator based on the insights gained in designing the software stack. The third is to extend the transducer model to express the full range of data transformations found in popular data analytics systems. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
