CAREER: Transparent, Interactive Desktop Parallel Computing for Scientific Data Processing

$400,000 | FY2006 | CSE | NSF

North Carolina State University, Raleigh NC

Abstract

Desktop computing remains indispensable in scientific exploration, largely because it provides people with devices for human interaction and environments for interactive job execution. In fact, the proliferation of supercomputers and clusters has been driving the need for more efficient desktop processing to assist high-end computing in tool validation, data analysis, and visualization. However, with rapidly growing data volumes and task complexity, it is increasingly hard for individual workstations to meet the demands of interactive scientific data processing. The increasing cost of such interactive processing is hindering the productivity of end-to-end scientific computing workflows. The project will develop a novel desktop parallel computing framework to speed up scientific data processing tasks routinely executed on desktop machines. It will allow users to preserve the interactiveness and convenience of desktop processing, without making explicit resource requests or waiting for batch execution, while their computation is seamlessly accelerated by aggregating idle computing and storage resources in local-area networks.
This framework comprises several closely coupled innovative techniques to be developed in this research: data processing semantics specification based on relational algebra, on top of which we design application interfaces for automatic and flexible program parallelization; integrated computing resource aggregation and storage resource aggregation, on top of which we develop parallel I/O for desktop parallel computing; asymmetric task scheduling that exploits the central role of the client workstation, for guaranteed interactive execution, better fault tolerance, and diverse self-configuration opportunities; and quantitative and explicit performance impact control, based on impact benchmarking and real-time workload monitoring, that protects the performance of resource donors' native workloads in the presence of aggressive and persistent resource stealing. The proposed framework possesses three key properties that distinguish it significantly from existing parallel execution platforms: interactiveness (immediate execution regardless of the availability of external resources), transparency (hiding the availability of and fluctuation in external resources from both application developers and users), and customized performance impact control (throttling a parallel job's resource consumption according to the resource usage measured from the native workload of individual resource owners). Our research takes the initial steps towards a new parallel computing paradigm suitable for heterogeneous and opportunistic environments. With this new paradigm, users do not have to specify the amount of resources needed, nor do they have to wait for such resources to become available.
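As a rough illustration of how the interactiveness and performance impact control properties might interact, the following minimal Python sketch throttles the work given to each donor according to its measured native load, while the client workstation always takes part so that a job can start even when no donor is available. It is not the project's implementation: the function names `donor_share` and `split_work`, the thresholds `max_total` and `max_guest`, and the proportional split are all hypothetical choices made for this example.

```python
from typing import List


def donor_share(native_load: float,
                max_total: float = 0.9,
                max_guest: float = 0.5) -> float:
    """CPU fraction a guest task may use on one donor machine.

    native_load is the donor's own measured CPU utilization in [0, 1].
    max_total and max_guest are hypothetical policy limits: the guest
    may never push total utilization above max_total, and may never
    take more than max_guest on its own.
    """
    if not 0.0 <= native_load <= 1.0:
        raise ValueError("native_load must be in [0, 1]")
    headroom = max_total - native_load
    return max(0.0, min(max_guest, headroom))


def split_work(n_units: int,
               donor_loads: List[float],
               client_weight: float = 1.0) -> List[int]:
    """Divide n_units of work between the client and its donors.

    Index 0 of the result is the client workstation, which always
    participates (asymmetric role); the remaining entries follow the
    order of donor_loads. Work is split in proportion to each
    machine's allowed share, and rounding leftovers go to the client.
    """
    if n_units < 0:
        raise ValueError("n_units must be non-negative")
    weights = [client_weight] + [donor_share(load) for load in donor_loads]
    total = sum(weights)
    counts = [int(n_units * w / total) for w in weights]
    counts[0] += n_units - sum(counts)  # leftovers run locally
    return counts
```

With no donors, `split_work(100, [])` assigns all 100 units to the client, which mirrors immediate execution regardless of external resources; a donor whose native load exceeds the assumed limit receives no work at all.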