
CAREER: Operating System Support For Transactional Memory: Construction and Performance Scalability of Parallel Programs

$400,000 · FY2007 · CSE · NSF

University Of Texas At Austin, Austin TX

Investigators

Abstract

CAREER: Self-Managing Resource Allocation in Unsupervised Distributed Systems

Recent years have seen a growing deployment of distributed computing infrastructures, such as Grids, PlanetLab, @home, and peer-to-peer systems, that run a variety of Web, commercial, and scientific applications. Many of these infrastructures are unsupervised: they consist of a large number of loosely connected nodes that contribute computational and storage resources but are not centrally managed. Such unsupervised infrastructures are characterized by uncertainty in resource availability caused by failures, varying load conditions, and node churn, which places an undue burden on the application writers and system administrators who must deploy and run applications on them.

This project is developing a self-managing resource allocation framework that hides infrastructure uncertainties and dynamics from applications while transparently adapting to changing conditions within the infrastructure. As part of this framework, the project is developing techniques for: (i) predictable resource aggregation, to provide resource guarantees to applications in the presence of dynamic loads and changing resource availability; (ii) reliability-aware resource management, to provide desired levels of reliability and availability; and (iii) system inference and prediction, to enable decentralized inference of global system conditions for proactive response to dynamic infrastructure changes. These techniques rely on cooperation and redundancy among nodes in the infrastructure to provide scalability and decentralization. The proposed research will have significant impact on distributed computing by enabling effective deployment of large-scale scientific and commercial applications on resource-rich but unreliable infrastructures.
