CC* Integration-Large: An Extensible Internet for Science Applications and Beyond
University of California-Berkeley, Berkeley, CA
Abstract
Modern scientific research often involves huge datasets and many geographically distributed collaborators. The collaborative sharing of data in such research efforts requires rapid, flexible, and automated data transfers between sites. These sites have different computational and networking infrastructures and are managed by different entities. To provide a uniform and extensible set of data sharing tools to such collaborations, the network should serve as a Data Valet for collaborating scientific institutions. The main technical challenge in achieving this goal lies not in each individual piece of data-sharing functionality. Rather, it lies in how one builds an extensible architecture where current data-handling features can synergistically coexist and new ones can be seamlessly incorporated. To meet this challenge, this effort will leverage a design called the Extensible Internet (EI). EI allows network operators to insert new data processing functionality in such a way that (i) any host that is aware of the new functionality can use it and (ii) hosts that are unaware of the new functionality can continue to function. This system will be deployed on ESnet, which is a high-performance network optimized for large-scale science, interconnecting the National Laboratory System in the United States. The National Laboratory System pursues research in a wide array of fields. The work described here will facilitate any National Laboratory effort that involves the sharing of large datasets among geographically distributed collaborators. Potential opportunities for impact range from the biological sciences (e.g., genomic research) to the environmental sciences (e.g., climate research) to the physical sciences (e.g., Large Hadron Collider) to the energy sciences (e.g., National Synchrotron Light Source).
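The two properties attributed to EI above (aware hosts can use newly inserted functionality, unaware hosts continue to function) can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not EI's actual design or API: the names `Packet`, `ServiceNode`, `register`, and `forward`, the per-packet `service` tag, and the `"compress"` service are all hypothetical and chosen purely for illustration.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Packet:
    payload: bytes
    # Hypothetical opt-in tag: None models a host unaware of any new service.
    service: Optional[str] = None


class ServiceNode:
    """Toy stand-in for an operator-managed node (hypothetical, not EI's API)."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[bytes], bytes]] = {}

    def register(self, name: str, handler: Callable[[bytes], bytes]) -> None:
        # The operator inserts new data-processing functionality at runtime.
        self._services[name] = handler

    def forward(self, packet: Packet) -> Packet:
        handler = self._services.get(packet.service) if packet.service else None
        if handler is None:
            # Untagged traffic, or a tag this node does not know:
            # pass the packet through unchanged.
            return packet
        # Aware host: apply the requested service to the payload.
        return Packet(handler(packet.payload), packet.service)


node = ServiceNode()
node.register("compress", zlib.compress)  # assumed example service

data = b"genomic-read " * 100
legacy = node.forward(Packet(data))             # unaware host, untouched
aware = node.forward(Packet(data, "compress"))  # aware host, opts in
```

In this sketch an unaware host's traffic is forwarded as-is, while a host that names the service receives the processed payload. Unknown tags also fall through to plain forwarding, which is one possible way to keep newly added services from disrupting existing traffic.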
This collaborative effort brings together researchers from Mount Holyoke College, New York University, the University of Washington, and the University of California at Berkeley. All source code, data, papers, documentation, and course materials produced as a part of the proposed work will be available from http://extensibleinternet.org/ . All material will remain available for at least five years after the end of this project or five years after publication, whichever is later. All source code will be released under a permissive license. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.