NSF-BSF: Computational Methods for Shape Space Analysis in Structural Biology
University of Texas at Austin, Austin, TX
Investigators
Abstract
Modern scientific applications require analyzing massive and complex datasets. A prevalent situation is that each data point is itself geometric (for example, an image), while the dataset as a whole also carries shape structure (for instance, due to underlying motion). Consider structural biology, where the experimental techniques of cryo-electron microscopy and X-ray free electron lasers can capture hundreds of thousands of noisy images of a protein of interest. Scientists then use software tools to reconstruct the three-dimensional shape of the protein and its conformations, which are vital to basic science and drug discovery. In this project, new computational and mathematical methods will be developed for analyzing image sets and volumetric datasets that exhibit continuous variability. The project will provide graduate student training and opportunities for students in the US and Israel to collaborate.
The first part of the research will integrate metrics from the field of optimal transport with machine learning methods for dimensionality reduction and clustering, in a way that is tailored to the analysis of shape space datasets. Graph-based methods will be combined with the Wasserstein metric, with an emphasis on noise robustness, sample and computational efficiency, and interpretability in terms of geometric deformations. The second part of the research will develop an algebraic framework for handling symmetries in shape space datasets, such as rotational symmetries. Group representation theory will be used within principal component analysis and graph-based learning methods to achieve better efficiency than current approaches. This project will produce rigorous mathematical algorithms with broad applicability, and specialized software libraries for pressing scientific applications.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.