Elements: Streaming Molecular Dynamics Simulation Trajectories for Direct Analysis: Applications to Sub-Picosecond Dynamics in Microsecond Simulations
Arizona State University, Scottsdale AZ
Investigators
Abstract
Computing capabilities are growing rapidly, and with them the data volume generated in numerical simulations, while data storage and transfer capabilities are not growing at the same pace. This is especially true for all-atom molecular dynamics (MD) simulations, which are used to study the function of biomolecules and novel materials on timescales of microseconds in systems containing hundreds of thousands to millions of atoms. Such simulations evaluate atom positions and velocities for >10^9 distinct times, which corresponds to >2 petabytes of data and exceeds reasonable data storage and transfer capacities. As a result, simulation data is stored only at coarse time intervals, even though this approach loses information on fast molecular processes that is essential to computing many experimental observables. The software infrastructure created in this project avoids such information loss by converging data generation and analysis: instead of being stored for post-processing (the status quo), data is passed directly to a parallel software platform via a streaming protocol. Implementing the streaming protocol in existing software packages with a large user base allows direct integration into established simulation protocols. As a result, molecular dynamics simulations produce more usable information, and the associated computational resources are used more efficiently.
The streaming protocol for MD simulations uses a TCP/IP socket application programming interface (API) to transfer data directly from a running simulation to the analysis software. This approach enables the simultaneous analysis of fast (sub-picosecond) and slow (microsecond) processes in molecular simulations without creating bottlenecks or requiring massive trajectory output files.
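The idea of passing frames from a running simulation to an analysis process over a TCP/IP socket can be illustrated with a minimal sketch. The wire format below (a 12-byte header followed by single-precision positions and velocities) and the names `send_frame`, `recv_frames`, and `run_demo` are assumptions made for illustration only; they are not the protocol or API developed in this project.

```python
import socket
import struct
import threading

import numpy as np

# Hypothetical wire format (an assumption, not the project's protocol):
# little-endian header with step (uint64) and n_atoms (uint32), followed by
# positions and then velocities as float32 arrays of shape (n_atoms, 3).
HEADER = struct.Struct("<QI")


def send_frame(sock, step, positions, velocities):
    """Send one frame: header, then positions, then velocities."""
    payload = (np.asarray(positions, dtype="<f4").tobytes()
               + np.asarray(velocities, dtype="<f4").tobytes())
    sock.sendall(HEADER.pack(step, len(positions)) + payload)


def recv_exact(sock, n_bytes):
    """Read exactly n_bytes; return None if the stream ended before any byte."""
    buf = bytearray()
    while len(buf) < n_bytes:
        chunk = sock.recv(n_bytes - len(buf))
        if not chunk:
            if not buf:
                return None
            raise ConnectionError("stream closed mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frames(sock):
    """Yield (step, positions, velocities) until the sender closes the stream."""
    while True:
        header = recv_exact(sock, HEADER.size)
        if header is None:
            return
        step, n_atoms = HEADER.unpack(header)
        payload = recv_exact(sock, 2 * n_atoms * 3 * 4)
        if payload is None:
            raise ConnectionError("stream closed after header")
        data = np.frombuffer(payload, dtype="<f4").reshape(2, n_atoms, 3)
        yield step, data[0], data[1]


def run_demo(n_frames=5, n_atoms=4):
    """Stream toy frames over localhost and analyze each one on arrival."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def producer():
        # Stand-in for the running simulation.
        with socket.create_connection(("127.0.0.1", port)) as sock:
            for step in range(n_frames):
                pos = np.full((n_atoms, 3), float(step), dtype=np.float32)
                send_frame(sock, step, pos, -pos)

    thread = threading.Thread(target=producer)
    thread.start()
    conn, _ = server.accept()
    results = []
    with conn:
        for step, pos, vel in recv_frames(conn):
            # Per-frame analysis happens here; no trajectory file is written.
            results.append((step, float(np.mean(vel ** 2))))
    thread.join()
    server.close()
    return results
```

With this assumed layout each atom costs 24 bytes per frame, so a system of 10^5 atoms sampled at 10^9 time steps would amount to about 2.4 × 10^15 bytes, which is in line with the >2 petabyte estimate above.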
The analysis is performed with the open-source MDAnalysis platform. Tools that benefit significantly from the streaming interface are implemented as MDAKit plugins, such as 3D-2PT (spatially resolved two-phase thermodynamics) and related tools for the analysis of spatially resolved protein solvation maps. This award by the NSF Office of Advanced Cyberinfrastructure is jointly supported by the Division of Chemistry. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
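Two-phase thermodynamics analyses work from the velocity density of states, which is commonly obtained from the velocity autocorrelation function (VACF). The sketch below shows how a VACF could be accumulated frame by frame from streamed velocities while keeping only a short window of frames in memory. The class `StreamingVACF` is hypothetical and written for illustration; it is not the 3D-2PT MDAKit implementation and does not use the MDAnalysis API.

```python
from collections import deque

import numpy as np


class StreamingVACF:
    """Running velocity autocorrelation function over a sliding window.

    Hypothetical helper for illustration: only the last max_lag + 1 frames
    are held in memory, so the full trajectory is never stored.
    """

    def __init__(self, max_lag):
        self.history = deque(maxlen=max_lag + 1)  # newest frame at index 0
        self.sums = np.zeros(max_lag + 1)
        self.counts = np.zeros(max_lag + 1, dtype=np.int64)

    def update(self, velocities):
        """Add one frame of velocities with shape (n_atoms, 3)."""
        v_new = np.array(velocities, dtype=np.float64)
        self.history.appendleft(v_new)
        for lag, v_old in enumerate(self.history):
            # Dot product per atom, averaged over atoms, for this time origin.
            self.sums[lag] += np.mean(np.sum(v_old * v_new, axis=1))
            self.counts[lag] += 1

    def result(self):
        """Return the VACF averaged over time origins (NaN where no data)."""
        out = np.full(self.sums.shape, np.nan)
        filled = self.counts > 0
        out[filled] = self.sums[filled] / self.counts[filled]
        return out
```

In a streaming setup, `update` would be called once per received frame, and `result` could be read at any point during the run. The choice of `max_lag` sets the longest correlation time that is resolved, and the per-frame cost grows linearly with it.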