Acquiring a GPU server to accelerate developing deep learning methods to reconstruct protein structures from cryo-EM data
University of Missouri-Columbia, Columbia, MO
Abstract
The goal of this supplement is to acquire a Dell high-performance computing server with 8 Nvidia A100 Graphics Processing Units (GPUs) to accelerate the development of deep learning methods that reconstruct protein structures from cryogenic electron microscopy (cryo-EM) image data accurately and automatically. Cryo-EM can determine the quaternary structure of large protein complexes and assemblies consisting of many chains, which are difficult or even impossible to determine with traditional techniques such as X-ray crystallography or nuclear magnetic resonance (NMR). As cryo-EM has routinely reached high resolution in recent years, it has been revolutionizing the field of structural biology and is now widely used to determine the structures of large protein complexes and assemblies. However, the computational reconstruction of protein structures from cryo-EM image data is still a time-consuming and labor-intensive process. Advanced artificial intelligence (AI) methods such as deep learning hold the key to automating the process and improving reconstruction accuracy.
The parent R01 grant of this supplement aims to develop cutting-edge deep learning models, such as 2D and 3D transformers, to automate the key tasks of reconstructing protein structures from cryo-EM data: (1) picking protein particles in cryo-EM images (micrographs); (2) denoising cryo-EM density maps built from protein particle images; (3) reconstructing protein structures from cryo-EM density maps; and (4) integrating the methods of (1), (2), and (3) into a pipeline that automatically reconstructs high-accuracy protein structures from cryo-EM image data without human intervention. Our substantial progress in the first eight months of this project has demonstrated that the proposed methods are feasible and highly promising.
However, training and testing large deep learning transformer models on big cryo-EM datasets efficiently and effectively requires a large amount of GPU computing power. With the GPU resources currently available to us, it takes a developer about one year to complete the development of one deep learning method. Although this pace can yield significant progress, it is not fast enough to maximize the potential and impact of the cutting-edge deep learning methods of the parent R01 project. This supplement will enable us to acquire a high-performance computing server with 8 Nvidia A100 80GB GPUs to drastically speed up the research in the parent R01 project. This GPU server can reduce the time needed to develop one deep learning model from about one year to less than two months, and will therefore drastically improve the productivity of the developers and greatly accelerate the publication and release of the methods and tools developed in this project. Moreover, the large (80GB) memory of each GPU will enable us to train high-quality deep transformers consisting of millions of parameters to maximize the accuracy of reconstructing protein structures from cryo-EM image data.