
Incorporating Fault Tolerance at the Application Level

$210,607 · FY2001 · CSE · NSF

University of Massachusetts Amherst, Amherst, MA

Abstract

Preliminary work by the PIs has shown that application-level information can be exploited to greatly reduce the redundancy required to handle transient failures, which are by far the most common type of failure. For example, in a radar target-tracking application, our approach, Application-Level Fault Tolerance (ALFT), required only 15% redundancy to provide complete fault tolerance against transient faults. ALFT can also serve as a temporary patch after a permanent processor failure, giving the system more time to execute a recovery algorithm. Because ALFT is orthogonal to other approaches to fault tolerance, it can be used either by itself or in combination with them. For example, a designer might use ALFT to guard against transients and provide a small amount of hardware redundancy, in the form of line-replaceable spares, to deal with permanent failures. The objective of this project is to develop ALFT strategies and investigate their effectiveness for various classes of real-time applications. The main focus will be to generalize the approach to cover as many types of applications as possible, develop the most suitable strategy for each application type, and evaluate the efficiency of each resulting scheme.
