CSR: Medium: Collaborative Research: Wizard: Exploiting Disk Performance Signatures for Cost-Effective Management of Large-Scale Storage Systems
Wayne State University, Detroit MI
Investigators
Abstract
Advances in low-cost, high-capacity magnetic hard disk drives, flash-based solid-state drives, and non-volatile memory have been among the key factors supporting big data applications and the computing and storage services on which modern society relies. However, storage drives are reported to be the most commonly replaced hardware components because of failures. These failures cause service downtime and even data loss, costing enterprises trillions of dollars per year. Existing disk failure management approaches are mostly reactive and incur high overheads; they do not provide a cost-effective solution for managing large-scale production storage systems. The goal of this project is to achieve a deep understanding of the reliability of real-world storage systems and to develop a cost-effective data and storage resource management system for reliability enhancement. The investigators' approach to building reliable, large-scale storage systems is designed to support storage health monitoring, modeling, prediction, and proactive recovery in a systematic fashion. In particular, they first categorize and model storage failures to derive disk performance signatures and explore these signatures to forecast disk failures. They then characterize how disk performance degradation depends on I/O workload and integrate the performance signatures of heterogeneous disk devices to reconfigure and manage storage resources effectively. Furthermore, the project will provide easy-to-use APIs that let storage users and developers apply the developed tools and techniques for proactive data rescue and preventive disk reliability enhancement. Finally, the project provides opportunities for training graduate students, especially minority and female students, and for developing new curriculum materials on reliable storage systems.