CSR: EAGER: Exploratory Research on Scalable Resiliency Through Shadow Computing and Differential Data Replication

$299,800 | FY2013 | CSE | NSF

University Of Pittsburgh, Pittsburgh PA

Abstract

As our reliance on IT continues to increase, future applications will involve the processing of massive amounts of data and will require an exascale computing infrastructure in which the number of computing, communications and storage elements will increase by several orders of magnitude. Such an infrastructure will inevitably incorporate new classes of high-density, low-latency and low-power non-volatile memory. This, in turn, will increase the rate of failures by orders of magnitude, making resiliency a major concern. This project addresses this resiliency challenge by taking a radical approach to fault tolerance, one that goes beyond the current approach of checkpointing and rollback recovery. It introduces innovative and scalable fault-tolerance mechanisms, namely shadow computing and quality-of-data (QoD) aware replication, as building blocks for a "tunable" resiliency framework that leverages new and emerging memory technology and takes into consideration the nature of the data and the requirements of the underlying application. It is expected that the project will lead to new insights into the multi-faceted and challenging resiliency problem in exascale computing platforms. The expected outcomes of the project are a new fault-tolerance computational model and a suite of QoD-aware replication methods that, when combined with storage-level resiliency, will lead to high availability with minimized access delay in exascale computing environments.

The project seeks to involve graduate and undergraduate students in all its research thrusts. In addition to contributing to the research activities, participating students also take full part in outreach, dissemination and community activities. The project also seeks to leverage existing collaboration with industrial partners to involve students in summer internships and provide them with first-hand exposure to research and development in an industrial setting. A main objective of the recruiting effort is to seek the involvement of students from minorities and under-represented groups in the project.
