CAREER: Data Intensive Grid Computing on Active Storage Clusters
University of Notre Dame, Notre Dame, IN
Investigators
Abstract
Clusters and grids have achieved great successes in executing computation intensive workloads. However, executing data intensive workloads has proven to be much more difficult: data services are a serious bottleneck in large computing systems. To address this problem, cluster and grid computing systems must closely integrate computation with data storage. A variety of new system structures are being developed according to this principle. To provide scalable computation and storage, a cluster constructed from active storage units can serve as a combined archival file system and batch computing system. Distributed data structures built on active storage can guide data and computation placement in a coordinated manner. A new cluster programming language allows distributed data structures to be efficiently harnessed by large workloads. The result of this project will be a dramatic increase in the capacity of clusters and grids to process data intensive scientific workloads. This project will serve as an exciting environment for class projects in operating systems and distributed systems classes, and the software developed will be distributed as "courseware" for use by other educators. This work will have an impact on a wide variety of scientific disciplines, and the project will work particularly closely with active users in astrophysics, bioinformatics, biometrics, and molecular dynamics. The software created by this project will be disseminated in open source form for use by the broader community. In addition, this software will be used as a vehicle for making live astrophysics data accessible to high school students and teachers through the Notre Dame Extended Research Community.