SCI: ETF Grid Infrastructure Group: Providing System Management and Integration for the TeraGrid
University of Chicago, Chicago, IL
Investigators
Abstract
The Extensible Terascale Facility (ETF) is the next stage in the evolution of NSF large-scale cyberinfrastructure for enabling high-end computational research. The ETF enables researchers to address the most challenging computational problems by utilizing the integrated resources, data collections, instruments, and visualization capabilities of nine resource partners. On October 1, 2004, the ETF concluded a three-year construction effort to create this distributed environment, called the TeraGrid (TG), and we are now entering the production operations phase. The TeraGrid resource partners include: the University of Chicago/Argonne National Laboratory, the San Diego Supercomputer Center at UCSD, the Texas Advanced Computing Center at UT-Austin, the National Center for Supercomputing Applications at UIUC, Indiana University, Purdue University, Oak Ridge National Laboratory, and the Pittsburgh Supercomputing Center. A separate proposal was submitted to NSF on October 19, 2004 for the TeraGrid Grid Infrastructure Group (GIG). Under the direction of Charlie Catlett at UC/ANL, the GIG will be generally responsible for coordinating development activities for the TeraGrid, with subcontracts to the partner sites. The resource partners (RPs) will each have independent cooperative agreements with NSF, but will work closely with the GIG to implement the vision of the TeraGrid. This proposal outlines the GIG's plans to participate as a System Management and Integration Group within the TeraGrid team to provide the resource partners and the scientific community with ongoing access to this computational science facility. This proposal covers the period November 1, 2004 through October 31, 2009. TeraGrid, a world-class networking, computing, and storage infrastructure, has been built and deployed.
This initiative now faces the challenge of further engaging the science and engineering community to guide the tailoring of this generic infrastructure to better support their needs, catalyzing new discoveries and broadening the base of computational science. TeraGrid integrates some of the nation's most powerful resources to provide high-capability production services to the scientific community. In addition, NSF supports common software through the NSF Middleware Initiative (NMI) and community-specific infrastructure through its Information Technology Research (ITR) projects. The TeraGrid Grid Infrastructure Group (GIG) will build on these foundations to broaden the community benefiting from cyberinfrastructure and to harden and deepen TeraGrid's unique capabilities. Collaborating with 16 science partners, the GIG has developed infrastructure priorities to simplify research modalities that remain difficult (or infeasible), even with today's cyberinfrastructure. For example, the TeraGrid aims to make routine the following frequently requested but currently difficult tasks: 1. Drive complex workflows with multiple computational and data access steps across TeraGrid and smaller-scale resources in other grids in an integrated and automatic manner. 2. Harness TeraGrid resources in an on-demand mode to provide computational decision support for time-critical events ranging from weather to medical treatment. 3. Optimize turnaround, costs, and utilization by creating resource brokers that present a single point of access for scheduling computational and data management tasks across all TeraGrid resources based on resource availability information. A five-year roadmap has been presented. However, because user needs continue to evolve in response to scientific opportunities, the roadmap will be reevaluated annually based on a widening set of science partner discussions.
TeraGrid will encourage the scientific community to leverage this resource to tackle the most important computational problems in virtually every scientific discipline. The infrastructure and the community-driven grid service bridges and portals, which are called science gateways, will bring increased productivity to large numbers of scientists who have not previously used NSF's high-performance computing resources. The problems targeted by current and planned TeraGrid users are among the most computationally intensive in modern science and represent a class of problems that cannot be addressed effectively by either smaller-scale grid environments or stand-alone supercomputer centers. Leveraging software and infrastructure partners, the TeraGrid will develop the policy for software, security, and resource sharing necessary to underpin international cyberinfrastructure. TeraGrid, NMI, ITR projects, and discipline-specific infrastructure projects will be integrated, thus forming a coherent cyberinfrastructure. This cyberinfrastructure will provide common software components and use the TeraGrid network as a national grid resource backplane, reaching thousands of scientists through science gateways and collaboration with other grids. Working with the software partners, the Grid Infrastructure Group intends to develop a set of policies and software that will be widely used by other grid projects, with an eye toward sustaining infrastructure beyond the end of this decade. The GIG will coordinate education, outreach, and training (EOT) initiatives across the nine TeraGrid resource provider sites to support a broad EOT program for cyberinfrastructure. We have set quantitative objectives for growing the TeraGrid user community by an order of magnitude, to 5,000 users by FY09.
To empower all TeraGrid users, the GIG has addressed heterogeneity and the policy requirements for unique national resources and high-availability production services, developing a coordinated software environment across these heterogeneous resources and a powerful verification and validation system. In a complementary approach to TeraGrid, community-oriented ITR projects such as the Grid Physics Network (GriPhyN) and Linked Environments for Atmospheric Discovery (LEAD) are addressing the scaling and software packaging capabilities necessary for aggregating many departmental-scale resources. Similarly, computational science projects at DOE, via the SciDAC initiative, and at NIH, via the NIH Roadmap, are also important components of the cyberinfrastructure landscape. Moreover, our collaborators in Europe, Asia-Pacific, and elsewhere are building scientific grid infrastructure in projects such as the UK eScience Programme, Enabling Grids for E-Science in Europe (EGEE), and Japan's National Research Grid Initiative (NAREGI). The TeraGrid will partner with these and other grid projects, NSF's core centers program, and software providers such as the NMI GRIDS Center to catalyze an integrated NSF cyberinfrastructure program with cross-agency and international impact.