Qualitative Analysis of Molecular Dynamical Systems

$1,597,363 | FY2009 | MPS | NSF

Stanford University, Stanford CA

Investigators

Vijay S Pande (contact), Gunnar E Carlsson, Leonidas J Guibas

Abstract

The investigators develop a class of rigorous methods and tools for analyzing sets of molecular trajectories, yielding mechanistic results that are easy to interpret while taking advantage of the large data sets that can now be generated. Molecular simulation engines generate similar data: many (100 to 10,000) trajectories, often on relatively long timescales (100 ns to 10 µs). Each trajectory is a time series of atomic configurations. From this data set, the goal is to understand, at a more macroscopic level, the structure of the paths taken during the simulation. The challenge is that although there are many degrees of freedom, many of them are unimportant and in fact obscure the potentially relevant dynamics. For example, the collection of attracting fixed points is a primary invariant in the study of any dynamical system. Unstable critical points and invariant manifolds, including attracting submanifolds, are also of great interest. Even when the ambient space on which the dynamics takes place is topologically trivial (Euclidean space, for example), it is well understood that fixed sets and invariant sets frequently carry interesting topological structure. Moreover, a powerful set of tools (Conley index theory) often allows one to derive useful information about invariant subsets and the relationships among them. This methodology has been used to develop computational methods that describe the dynamics of low-dimensional systems in an algorithmic and provable way. These tools also permit the tracking of qualitative behavior as the parameters of a parametrized system are changed, so as to produce "databases" of dynamical system behavior of a given physical form. This family of ideas is being developed rapidly and is expected to provide very useful information about many chemical dynamical systems.
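The notions of attracting fixed points and unstable critical points mentioned above can be illustrated with a toy model. The following is a minimal sketch, not the investigators' method: it assumes a hypothetical one-dimensional double-well potential V(x) = (x^2 - 1)^2 with overdamped dynamics dx/dt = -V'(x), integrates a small set of trajectories with forward Euler, and groups their endpoints to recover the attractors. All function names, the potential, and the step sizes are assumptions chosen for illustration.

```python
# Toy illustration only: a 1-D double-well system, not a molecular simulation.
# The potential V(x) = (x**2 - 1)**2 and all names below are assumptions.


def force(x):
    """Right-hand side of dx/dt = -V'(x) for V(x) = (x**2 - 1)**2."""
    return -4.0 * x * (x * x - 1.0)


def force_derivative(x):
    """Derivative of the right-hand side; its sign gives linear stability."""
    return -(12.0 * x * x - 4.0)


def integrate(x0, dt=0.01, steps=5000):
    """Forward-Euler integration of one trajectory starting at x0."""
    x = x0
    for _ in range(steps):
        x += dt * force(x)
    return x


def classify(x_star):
    """Classify a fixed point by the sign of the linearized dynamics."""
    slope = force_derivative(x_star)
    if slope < 0:
        return "attracting"
    if slope > 0:
        return "unstable"
    return "degenerate"


def attractors_from_trajectories(starts, tol=1e-3):
    """Group trajectory endpoints that agree to within tol."""
    found = []
    for x0 in starts:
        end = integrate(x0)
        if not any(abs(end - a) < tol for a in found):
            found.append(end)
    return sorted(found)


# Forty starting points between -1.95 and 1.95, none exactly at x = 0.
starts = [-1.95 + 0.1 * i for i in range(40)]
attractors = attractors_from_trajectories(starts)

if __name__ == "__main__":
    for a in attractors:
        print(f"endpoint cluster near {a:.4f}: {classify(round(a))}")
    print(f"x = 0 is {classify(0.0)}")
```

In this sketch the many trajectories collapse onto two endpoint clusters, the two wells of the potential, while the critical point at x = 0 separates their basins. Real molecular data sets have far more degrees of freedom, which is why the abstract calls for rigorous, scalable tools rather than direct inspection of this kind.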
The investigators will emphasize the challenges that arise from the underlying biology and chemistry. The simulation of biology at the molecular scale has come a long way since its origins decades ago. Now, with powerful individual processors as well as very large distributed clusters of processors, one can routinely generate very large quantities of simulation data for a given phenomenon of interest, often in full atomic detail along many trajectories. A growing challenge is to mine such massive data sets to gain insight into the fundamental phenomena under study. The investigators develop rigorous methods for analyzing these massive data sets, yielding mechanistic results that are easy to interpret while taking advantage of the large data sets that can now be generated. Such simulations generate similar data: many (~100 to 10,000) trajectories, often on long timescales (100 ns to 10 µs). Each trajectory is a time series of atomic configurations. From this data set, the goal is to understand, at a more macroscopic scale, the structure of the paths taken during the simulation. The challenge is that although there are many degrees of freedom, many of them are unimportant and in fact obscure the potentially relevant dynamics.
