GGrantIndex
← Search

DC: Small: Data Streaming through a Complexity-Theoretic Lens

$336,456FY2009CSENSF

Dartmouth College, Hanover NH

Investigators

Abstract

The modern, highly digitized world is replete with human activities that generate copious streams of data on a seemingly-continuous basis. There is knowledge to be gained by suitably analyzing, mining and monitoring the wealth of information in these data streams. However, due to the sheer scale of the data involved, traditional algorithmic thinking is inadequate and one needs data streaming algorithms that can process inputs memory-efficiently in one, or a few, scanning passes, ideally operating in sync with data generation. The area of data streaming algorithms, though dating back to about 1980, has truly come alive only in the last decade or so, with a wealth of algorithms having been developed for, e.g., various statistical analyses, geometric problems and graph-theoretic problems. An area rich with algorithmic ideas deserves sound theoretical underpinnings. Accordingly, this project shall investigate a number of fundamental questions in this area, focusing on the delineation of what can and cannot be achieved in this important computational model. By investigating foundational questions, rather than focusing too much on particular applications, the project's approach shall be that of algorithms-and-complexity theory. Some representative research goals of this project are as follows. (1) Refining our understanding of the space complexity of various statistical measures, especially in the multi-pass setting. (2) Understanding the power of randomness in order-dependent streaming problems. (3) Proving lower bounds in stronger variants (extensions) of the basic stream model. (4) Attacking fundamental questions in communication complexity that lie at the heart of current open questions in data stream complexity. Results obtained in the course of the project will be disseminated at international conferences, workshops and seminars at various institutions. The project's educational component shall consist of the development of a graduate-level course on data stream algorithms, covering the basics of the field and leading up to the most significant recent discoveries.

View original record on NSF Award Search →