GGrantIndex
← Search

AF: Small: Collaborative Research: Principles of Robust Cooperative Computing in Dynamic Distributed Systems

$199,727FY2010CSENSF

University Of Colorado At Denver, Aurora CO

Investigators

Abstract

This project investigates computability, algorithmics, and complexity in distributed settings with large numbers of machines cooperating on solving computation problems in the presence of dynamic changes in the computing medium. The computation problems are modeled as collections of tasks, such as the problems frequently occurring in the Internet supercomputing. This research has the potential to impact massive cooperative computing in dynamic settings through the use of rigorously developed algorithms with formally proved efficiency and fault-tolerance guarantees. In the longer term this research will also have impact on the development of broader classes of adaptive distributed systems. This research develops advanced algorithms enabling efficient and fault-tolerant cooperative computing in dynamic distributed settings. The basic problem is formulated in terms of a collection of processors that need to perform a set of tasks, where the tasks must be executed using either at-least-once or at-most-once semantics. The research considers dynamic settings that are representative of situations in Internet supercomputing. The models of cooperation include: (a) oracle model, where a central authority provides information to the participants. (b) master-worker model, where a single master coordinates the workers in performing the tasks, a typical setting in Internet supercomputing. (c) peer-to-peer mode, where fully-distributed solutions will be investigated, enabling much more dynamic and fault-tolerant solutions. The investigation includes the associated resource discovery problem of identifying the nodes in the distributed system that are willing and able to cooperate; here the investigation focuses on more realistic settings than previously considered. This research also studies supporting services that can provide effective implementations of communication and shared-memory primitives in target dynamic platforms. The quality of algorithms is evaluated using corresponding lower bounds and impossibility results.

View original record on NSF Award Search →