Development Of Advanced Computer Hardware And Software

$745,223 | ZIH | FY2023 | HL | NIH

National Heart, Lung, And Blood Institute

Abstract

Constant pressure simulation on GPUs

Most molecular dynamics packages do not handle the non-pairwise contributions to the virial, resulting in non-rigorous isobaric ensemble simulations. Through the derivation of these contributions and their implementation via the Langevin Piston algorithm in our apoCHARMM code, constant pressure (isobaric ensemble) as well as constant surface tension, constant surface area and related ensembles can now be simulated efficiently.

Mixed precision iterative refinement solution for induced dipoles using tensor cores

Solving for the induced dipoles in a self-consistent fashion is one of the most challenging aspects of polarizable force fields. Because it is the most time-consuming component, we have devised an iterative mixed precision solution for this calculation using the GPU hardware units called Tensor Cores. These special units perform GEMM operations on matrices of only 16-bit precision, but they provide high performance: while FP64 throughput is limited to 9.7 TFLOPS, tensor cores provide up to 312 FP16 TFLOPS. By iteratively refining the residual, we improve the accuracy up to 64-bit precision. Overall, we are able to speed up the induced polarization calculation by 3-4x. This work will make polarizable force field simulations more accessible on GPUs.

P21 reciprocal space calculation

Reciprocal space calculation in P21 is challenging because the classical Ewald formulation works only with translational symmetry and does not support rotational symmetry. We have developed the Ewald formulation for long-range electrostatics for systems under P21 periodic boundary conditions. By expressing the contribution from the rotated asymmetric unit in terms of the primary asymmetric unit, we were able to express the reciprocal space potential only in terms of the latter. This reduces the computational work by half.
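The mixed precision iterative refinement described above for the induced dipoles can be illustrated with a minimal NumPy sketch. It is not the apoCHARMM implementation. Several things here are assumptions made for illustration: float32 stands in for the 16-bit tensor-core stage (NumPy's dense solver does not accept float16), a dense direct solve stands in for whatever low-precision kernel the real code uses, and the matrix is a random, well-conditioned stand-in rather than a real dipole interaction matrix. The function name `solve_mixed_precision` is hypothetical.

```python
import numpy as np


def solve_mixed_precision(A, b, tol=1e-12, max_iter=20):
    """Solve A x = b with low-precision solves and float64 residuals.

    Assumption: float32 plays the role of the low-precision hardware path.
    """
    A_low = A.astype(np.float32)
    # Initial guess from a purely low-precision solve.
    x = np.linalg.solve(A_low, b.astype(np.float32)).astype(np.float64)
    n_refine = 0
    for _ in range(max_iter):
        # The residual is evaluated in full float64 precision.
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # The correction is solved in low precision, then accumulated in float64.
        # A production code would reuse one factorization instead of re-solving.
        d = np.linalg.solve(A_low, r.astype(np.float32))
        x = x + d.astype(np.float64)
        n_refine += 1
    return x, n_refine


# Hypothetical test system: symmetric and close to the identity.
rng = np.random.default_rng(0)
n = 100
M = rng.standard_normal((n, n))
A = np.eye(n) + 0.01 * (M + M.T)
b = rng.standard_normal(n)

x, n_refine = solve_mixed_precision(A, b)
x_ref = np.linalg.solve(A, b)  # float64 reference solution
rel_err = np.linalg.norm(x - x_ref) / np.linalg.norm(x_ref)
print(f"refinement steps: {n_refine}, relative error vs float64: {rel_err:.2e}")
```

The design point is that only the residual needs high precision: each low-precision correction shrinks the error by roughly the accuracy of the low-precision solve, so a few cheap steps can recover a float64-quality answer for a well-conditioned system.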
P21 periodic boundary conditions are important for balancing the stress disequilibrium between the lipid layers during the simulation.

Quantifying the Effects of Lossy Compression on Energies Calculated from Molecular Dynamics Trajectories

MD simulations can now be run for increasingly long times and on larger systems (> 100,000 atoms). There is a need to store these trajectories as efficiently as possible without sacrificing too much precision. We have explored how quantization and compression affect the precision of not only atomic positions (as is typically done), but also the energies calculated from such trajectories, and have compared a wide variety of new and existing trajectory formats (21 total). We found that while many geometric properties (distances, RMSD, RDFs, etc.) can be reproduced from existing compressed trajectory formats with low precision (like XTC, 0.01 Å), bond energies involving hydrogen are particularly sensitive to precision loss. As a result, we have developed a new quantization-based compressed format that compresses to about 66% of the size of the original NetCDF trajectory, has a positional accuracy of 5×10⁻⁵ Å, has an energy root-mean-square error of less than 0.1 kcal/mol, and is almost as fast to read as the original uncompressed trajectory.

Adding Automatic Parameter Downloads to a Software Tool for Fast PDB-to-Parameter Generation for Molecular Dynamics Simulations

Setting up molecular dynamics simulations from experimentally determined structures is often complicated by a variety of factors, particularly when the structure to be simulated contains carbohydrates (e.g. the SARS-CoV-2 spike protein), since these have several forms and can be linked in a variety of ways.
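The sensitivity of bond energies to coordinate precision noted in the lossy compression work above can be illustrated with a small sketch of fixed-precision quantization. This is not the trajectory format developed in the project. The harmonic force constant and equilibrium length are illustrative placeholders rather than values from any particular force field, the bond length is quantized directly as a one-dimensional stand-in for quantizing atomic positions, and all function names are hypothetical.

```python
def quantize(coords, precision):
    """Map each coordinate to the nearest integer multiple of `precision`."""
    return [round(c / precision) for c in coords]


def dequantize(ints, precision):
    """Recover approximate coordinates from the stored integers."""
    return [q * precision for q in ints]


def harmonic_bond_energy(r, k=340.0, r0=1.09):
    """E = k (r - r0)^2 in kcal/mol; k and r0 are illustrative assumptions."""
    return k * (r - r0) ** 2


def energy_error(r, precision):
    """Absolute bond-energy error caused by quantizing the bond length."""
    r_q = dequantize(quantize([r], precision), precision)[0]
    return abs(harmonic_bond_energy(r_q) - harmonic_bond_energy(r))


# Positional error is bounded by half the quantization step.
coords = [0.0, 1.08372, -3.14159, 12.34567]
precision = 0.01
restored = dequantize(quantize(coords, precision), precision)
max_pos_err = max(abs(a - b) for a, b in zip(coords, restored))

# The same bond length stored at a coarse and at a fine precision.
r_true = 1.08372
err_coarse = energy_error(r_true, 0.01)
err_fine = energy_error(r_true, 5e-5)
print(f"max positional error at 0.01: {max_pos_err:.5f}")
print(f"bond energy error: coarse {err_coarse:.2e}, fine {err_fine:.2e}")
```

Because a stiff harmonic term multiplies small displacements by a large force constant, a positional error that is harmless for distances or RMSD can still produce a visible energy error, which is the effect the paragraph describes.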
Previously, we developed a tool called prepareforleap, implemented in the widely used and freely available software CPPTRAJ (which now has close to 4k citations), that facilitates the preparation of structures for molecular dynamics simulations with the Amber biomolecular simulation package. This tool is a stand-alone program that requires no internet access and little-to-no user intervention, which differentiates it from existing web-based tools.

Addition of Clustering via Extended Similarity Metrics to CPPTRAJ

Cluster analysis is a data-mining technique that can be applied to a collection of data points to create groups of points according to some measure of similarity. In the context of MD simulations, this typically means identifying important and unique clusters of conformations from trajectories that contain thousands to sometimes millions of structures. As such, cluster analysis is a very important tool in the analysis of MD simulations for guiding future analysis by reducing the dimensionality of extremely large data sets. The calculation of pairwise distances between all points can be a bottleneck and scales poorly (O(n²)). Miranda-Quintana et al. recently introduced a new clustering method based on determining extended similarity indices instead of pairwise distances, which reduces the scaling to O(n). In collaboration with them, we are now implementing this new clustering method in CPPTRAJ. This will enable comparatively rapid cluster analysis of extremely large data sets.

Parallelization of Grid Inhomogeneous Solvation Theory Calculations

The enthalpic portion of the GIST method requires the calculation of water-water and water-solute energy on the grid; this is usually the most time-consuming part of GIST. We are now working to increase the speed of this calculation by parallelizing it with MPI, dividing up the incoming trajectory frames among multiple processors.
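The frame-wise decomposition just described can be sketched without any MPI library. This is a minimal illustration, not the CPPTRAJ code: `rank` and `size` are passed as plain arguments instead of being obtained from an MPI communicator, `frame_energy` is a hypothetical stand-in for the per-frame energy calculation, and the final `sum` over partial results stands in for an MPI reduction.

```python
def frames_for_rank(n_frames, rank, size):
    """Return the contiguous block of frame indices assigned to one rank.

    Any remainder frames are spread over the lowest-numbered ranks.
    """
    base, extra = divmod(n_frames, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def frame_energy(frame_index):
    """Hypothetical placeholder for the energy computed from one frame."""
    return 0.5 * frame_index


def partial_sum(n_frames, rank, size):
    """Work done by one rank: it touches only its own frames."""
    return sum(frame_energy(i) for i in frames_for_rank(n_frames, rank, size))


n_frames, size = 10, 4
assigned = [list(frames_for_rank(n_frames, r, size)) for r in range(size)]
partials = [partial_sum(n_frames, r, size) for r in range(size)]
total = sum(partials)  # stands in for a reduction across ranks

for r in range(size):
    print(f"rank {r}: frames {assigned[r]}, partial energy {partials[r]}")
print(f"total energy: {total}")
```

Each rank needs no data from the others until the final reduction, which is why this kind of decomposition involves so little communication during trajectory processing.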
This is particularly attractive because it requires very little communication between individual processes during trajectory processing, meaning the calculation should scale well to large processor counts. In addition to the energy calculation, we are also parallelizing the entropy calculation via MPI, which must happen after trajectory processing since it requires information from all trajectory frames. This parallelized GIST method is being developed and will be freely available in the CPPTRAJ analysis software.

GPU-parallelization of Time-consuming Calculations in CPPTRAJ

We are porting several methods to run simultaneously on multiple GPUs.

Acquisitions and Hardware Upgrades

Nvidia A100 Systems: Over the past year, strategic investments have been made in our infrastructure. We acquired several systems specifically designed to enhance our high-performance computing capabilities. Among these are two 8-way Nvidia A100 compute nodes and one 4-way Nvidia A100 compute node. These systems are equipped with multiple GPUs connected via NVLink, Nvidia's GPU-to-GPU interconnect technology.

Nvidia H100 System: To keep pace with the latest computational technology, we procured an 8-way Nvidia H100 system in August 2023.

The rationale behind these acquisitions is twofold:

Scalability: Systems with multiple GPUs interconnected via NVLink support both our current workloads and our development of advanced codebases.

Specialization: Molecular biology, especially when approached through computational methods, demands high granularity and computational power. The Nvidia A100 and H100 systems are the best way for us to meet our large computational needs.
They promise enhanced parallel processing capabilities, faster memory access, and improved data transfer rates, all of which are vital for large-scale molecular simulations and computations.

In conclusion, our investments over the past year have been methodical, targeting both current and future demands of computational biology and biophysics.

View original record on NIH RePORTER →