Geometric Computing over Distributed and Streaming Data
University Of California-Santa Barbara, Santa Barbara CA
Investigators
Abstract
Computational geometry algorithms traditionally have been designed for centralized data settings, where the data are available to algorithms locally and in a persistent form. Yet, a growing number of emerging applications no longer fit such a conventional data model. For instance, in sensor networks and location-aware mobile computing, data is geographically distributed, and in high-volume data monitoring, such as analysis of Internet traffic or web clicks, data must be processed as a stream, without being stored. Motivated by these technological trends, the investigator develops distributed algorithms for geometric computing. Designing mathematically grounded geometric algorithms for distributed or streaming data is challenging because the algorithms must operate with limited computational resources. In particular, nodes in a sensor network have very limited battery power, memory, and bandwidth, and so collecting data from the network requires nodes to construct approximations of their spatial measurements. Similarly, data stream algorithms must process the data in a single pass and compute synopsis data structures that summarize important features of the data. This research develops novel geometric algorithms and data structures that deal with insufficient resources (bandwidth, memory, power) in a graceful manner, so that the solution quality adapts to the resources available --- the better the resources, higher the solution quality. In particular, the research proposes such resource-adaptive methods to discover epsilon-cuts in sensor networks, compute bounded-memory approximations of sensor observations, discover hierarchical heavy hitters in multi-dimensional data streams, and shape-preserving clustering methods for geometric streams. Many challenges of national importance concern the protection of our physical as well as cyber infrastructures. Being able to monitor these systems remotely and analyze their data with flexible, resource-adaptive, and programmable software tools is critically important. Because many of these systems deal with distributed or streaming data, this research has direct relevance to those applications.
View original record on NSF Award Search →