
CSR --- SMA: Reliability Modeling and Evaluation of Fault-Tolerant Hierarchical Computer Systems

$40,000 | FY2006 | CSE | NSF

University of Massachusetts, Dartmouth, North Dartmouth, MA

Investigators

Abstract

Due to ever-increasing system complexity, reliability modeling and analysis are becoming essential components in the design and tuning of fault-tolerant hierarchical computer systems. Tremendous research effort has been expended in this area, but two practical issues have generally been overlooked or not fully considered in existing computer system reliability models: modular imperfect coverage (MIPC), which results from imperfect recovery mechanisms, and common-cause failures (CCF), which arise from a shared root cause. Failure to model either of them accurately results in overstated or understated system reliability, which makes reliability analysis less effective in the design and tuning of computer systems. The primary goals of this project are to develop novel reliability models that fully describe MIPC and CCF, and to explore efficient model evaluation methods leading to more accurate analysis of hierarchical computer system reliability. The project involves three phases: model development, model evaluation, and the development of a reliability analysis software tool that applies the concepts and methods developed through this research. The new models and evaluation methods developed through this work are fundamental contributions to the body of knowledge on computer system reliability. Research results from this project will support the design of reliable computer systems subject to MIPC and CCF. The PI will disseminate information and knowledge to the academic community and industry through seminars, classroom materials, conference and journal publications, and an Internet website for the project.
