Database-centric data analysis of molecular simulations
University of South Florida, Tampa, FL
Abstract
DESCRIPTION (provided by applicant): Molecular simulations (MS) have become an integral part of molecular and structural biology. By providing model descriptions of biochemical and biophysical processes at the nanoscopic scale, MS can provide a fundamental understanding of diseases and aid drug discovery. MS, by their nature, generate large amounts of data. Although many MS software packages are carefully designed to maximize computational performance during simulation, they fall seriously short in storing and handling the large-scale data they produce. The objective of the proposed research is to use database technologies to improve the efficiency, ease of maintenance, and security of MS data analysis. We propose to accomplish this by developing novel data structures and query processing algorithms in the kernel of the database management system (DBMS), in addition to leveraging the advantages of such systems in their current forms. Building on the success of these database-centric techniques, we will also develop automatic feedback control mechanisms in MS to improve the online tuning of simulations that is needed in studying many biochemical processes.

The project has three specific aims:

1. Development of a Database-centric MS (DCMS) data analysis framework that stores simulation data collected from various sources, provides standard application programming interfaces (APIs) for data retrieval, and allows global data access for the research community while enforcing fine-grained data security policies.
2. Augmenting DCMS with novel data structures and algorithms for efficient data retrieval and query processing. We focus on creative indexing and data organization techniques, and on query processing and optimization strategies.
3. Integration of DCMS and steering-based MS programs into one unified simulation framework that can greatly improve the efficiency of the MS process. This framework will be demonstrated as part of efforts to solve real biomedical problems.
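To make the database-centric idea in the first two aims concrete, the following is a minimal sketch, not the DCMS design itself. It assumes a single hypothetical table `atom_position` holding one row per atom per simulation frame, and uses Python's built-in `sqlite3` module as a stand-in for whatever DBMS the project would adopt. The table, index, and function names are all illustrative assumptions.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny synthetic trajectory.

    The schema (one row per atom per frame) is an assumption made for
    illustration; the proposal does not specify a schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE atom_position ("
        " frame INTEGER NOT NULL,"
        " atom_id INTEGER NOT NULL,"
        " x REAL, y REAL, z REAL,"
        " PRIMARY KEY (frame, atom_id))"
    )
    # Secondary index so per-atom trajectory lookups avoid a full scan.
    conn.execute(
        "CREATE INDEX idx_atom_frame ON atom_position (atom_id, frame)"
    )
    # Synthetic data: 3 frames, 4 atoms, each atom drifting along x.
    rows = [
        (frame, atom, float(atom + frame), float(atom), 0.0)
        for frame in range(3)
        for atom in range(4)
    ]
    conn.executemany(
        "INSERT INTO atom_position VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn


def atom_trajectory(conn, atom_id):
    """Return (frame, x, y, z) tuples for one atom, ordered by frame."""
    cur = conn.execute(
        "SELECT frame, x, y, z FROM atom_position"
        " WHERE atom_id = ? ORDER BY frame",
        (atom_id,),
    )
    return cur.fetchall()


def atoms_in_x_range(conn, frame, x_min, x_max):
    """Return ids of atoms whose x coordinate lies in [x_min, x_max]."""
    cur = conn.execute(
        "SELECT atom_id FROM atom_position"
        " WHERE frame = ? AND x BETWEEN ? AND ? ORDER BY atom_id",
        (frame, x_min, x_max),
    )
    return [row[0] for row in cur.fetchall()]
```

In this sketch, analysis questions such as "where was atom 2 over time?" become declarative queries (`atom_trajectory(conn, 2)`) rather than scans over flat trajectory files, which is the general benefit the abstract attributes to a database-centric approach.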
We believe DCMS will provide a revolutionary high-throughput technique for MS researchers and accelerate the discovery process in medical research. Such innovations carry significant intellectual merit and will benefit both the biomedical and database management communities.