CC*IIE Integration: Collaborative Research: EPSON: Embracing Parallel Networks and Storage for Predictable End-to-End Data Movement
Northern Illinois University, DeKalb, IL
Investigators
Abstract
Geographically distributed scientific communities require increasingly sophisticated data transfer mechanisms that can handle the challenges of sharing large datasets over heterogeneous networks. These challenges include optimizing networks with different configurations and protocols, providing I/O mechanisms that efficiently read from and write to parallel storage, and meeting the varying demands of widely different data transfer workloads. To address these challenges, the EPSON project is developing, implementing, and evaluating application programming interfaces and tools that facilitate end-to-end parallel data transfers. EPSON researchers focus on three areas: (1) enabling parallel network data movement by taking into account the diverse characteristics of parallel networks, both shared networks and infrastructures with dedicated circuits and paths, and by effectively balancing the flows among paths for more predictable performance; (2) developing a GridFTP data storage interface that enables scalable I/O to and from parallel filesystems, which is critical for campus infrastructures that must handle large-scale datasets; and (3) devising mechanisms that overlap network transfers with storage I/O and incorporate data-staging heuristics, matching the impedance between storage and networking capabilities to improve end-to-end data transfers. The project involves close collaboration with application scientists, with the objective of providing advanced networking tools that support the requirements of applications deployed on the University of Chicago and Northern Illinois University campuses.