GGrantIndex
← Search

CAREER: Information Theoretic Methods in Data Structures

$306,138FY2019CSENSF

Columbia University, New York NY

Investigators

Abstract

Data structures constitute the backbone of algorithm design and information retrieval, and underlie the performance of countless industrial-scale applications, from internet routing and road navigation to cloud storage, search and compression. As such, understanding what data structures can, and cannot, compute efficiently is a fundamental question in both theory and practice. Motivated by this question and the growing amounts of data to be processed in modern databases, this project is focused on two primary themes: (1) developing new mathematical tools for proving unconditional lower bounds on the operational time and memory consumption of static and dynamic data structures, thereby reducing the (huge) existing gaps from known upper bounds; (2) exploring the interplay between information theory and data structures in large-scale storage applications. In particular, the second part of the project will focus on designing novel data structures for information retrieval (mainly compression and search) over correlated datasets. The project reflects a scientific agenda for conducting interdisciplinary research in the intersection of theoretical computer science and information theory, bridging and impacting research and education in both communities. In recent years, information theory and communication complexity have emerged as powerful mathematical tools for proving unconditional lower bounds on the operational time of data structures and for analyzing the power of pre-processing information. Proving time-space lower bounds in the "cell-probe" model has long been recognized as one of the major challenges of complexity theory, with many implications on theory and practice. The first part of this project will develop new information-theoretic tools for proving such trade-offs, and will explore connections between data-structure lower bounds and other areas in computational complexity and mathematics, such as locally-decodable codes, circuit complexity, streaming, functional analysis, rigidity and algebraic geometry. Apart from its role as an analytic tool for computation, the interplay between information theory and data structures is particularly vital in large-scale storage applications. The increasing size and complexity of data sets being uploaded and processed on remote data centers, pose a real problem of scalability for traditional data compression and database architectures. The second part of this project establishes a new research program for storage and information-retrieval problems that arise over massive correlated data sets, focusing on the (unexplored) problem of locally decodable source coding for correlated files in static and dynamic scenarios. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →
CAREER: Information Theoretic Methods in Data Structures · GrantIndex