
MRI: Development of PetaShare: A Distributed Data Archival, Analysis and Visualization System for Data Intensive Collaborative Research

$957,678, FY2006, CSE, NSF

Louisiana State University, Baton Rouge LA

Investigators

Abstract

Review Analysis: Major Research Instrumentation (MRI) Program FY06

Proposal #: CNS 06-19843
PI(s): Kosar, Tevfik; Allen, Gabrielle D.; Seidel, Edward; Twilley, Robert R.; Wischusen, E. William
Institution: Louisiana State University, Baton Rouge, LA 70803-2701
Title: MRI/Dev: Dev. of PetaShare: A Distributed Data Archival, Analysis and Visualization System for Data Intensive Collaborative Research
Ratings: E, V, V, V
Panel Ranking: Competitive (C)
Result: Recommend
Amount Req: $957,678
Amount Rec: $957,678

Project Proposed: This project, developing a distributed data archival, analysis, and visualization instrument (called PetaShare) for data intensive collaborative research, enables transparent handling of underlying data sharing, archival, and retrieval mechanisms and makes data available to the scientist for analysis and visualization on demand. Designed to scale to the petabyte level, the instrument responds to an urgent need of scientists working with large data generation, sharing, and collaboration requirements. Involving five universities in the state (LSU, LaTech, Tulane, ULL, and UNO), the infrastructure consists of three layers of storage distributed at multiple sites: Primary very high speed RAM storage for data visualization; Secondary disk storage for data analysis and processing; and Tertiary tape storage for data archival and long term studies. Unlike existing approaches, PetaShare treats data resources and the tasks related to data access as first class entities just like computational resources and compute tasks, and not simply the side effect of computation. Expected key technologies include data-aware storage systems and data-aware schedulers, which take the responsibility of managing data resources and scheduling data tasks from the user, performing these tasks transparently.
The instrument supports many important data intensive applications from different fields, including coastal and environmental modeling, geospatial analysis, bioinformatics, medical imaging, fluid dynamics, petroleum engineering, numerical relativity, and high energy physics.

Broader Impact: The system complements the high-performance computing resources at the five interconnected campuses in this EPSCoR state, boosting interdisciplinary research among them. In addition to directly servicing and promoting research, PetaShare contributes to the training of hundreds of students. The system exhibits a high potential of increasing the accuracy and efficiency of storm surge models and hurricane tracking predictions, thereby enabling rapid and effective disaster responses that could affect millions of people worldwide.
