CAREER: Statistical methods and algorithms for the analysis of combinatorial mass spectrometry data

$358,241 · FY2019 · BIO · NSF

University of Montana, Missoula, MT

Abstract

Mass spectrometry is a crucial modern research tool that allows analysis of the components of samples at several scales: nuclear, small-molecule, and biomolecular. In biological research, mass spectrometry is used in the analysis of protein ("proteomics") and metabolic ("metabolomics") data, while outside research it is deployed, for example, to detect bomb-associated chemicals in routine airport security screenings. This research addresses three unmet needs in the processing of data from mass spectrometers. The first is statistical identification of proteins in a biological sample; this is important for understanding what makes cells different, e.g., what makes a skin cell different from a blood cell. The second is identification of which biological species are in a sample; this is crucial in applications such as accurate, automated disease diagnostics. The third is finding the "alphabet" of basic molecular ingredients in a sample. This research addresses these aims by developing new algorithmic and statistical methods that can correctly separate the basic elements of a complex mixture. The researchers working on this project create mathematical tools that are implemented as researcher-friendly software for solving the listed problems. To make the ideas more accessible to both scientific and non-scientific audiences, the researchers will create teaching modules and podcast episodes that explain how the algorithms work and what mathematical tricks were developed to break down the complexity of the problem so that it becomes amenable to a useful solution.

Problems with combinatorial dependencies are ubiquitous in mass spectrometry. Symmetries in combinatorial dependencies can be exploited to construct special dynamic programming algorithms: convolution trees, fast numeric max-convolution, and other approaches, all of which were invented and developed by the researchers. The researchers will use and improve these symmetry-exploiting algorithms to implement superior mass spectrometry-based protein identification, species classification, and small molecule analysis. Convolution trees can be used to solve these problems in quasilinear time, and so they can be applied to a very large number of proteins, species, or small molecules (or to a large number of spectra from any of these problems). The researchers will construct a library of software implementations of these algorithms with permissive open source licensing for unrestricted academic and industrial use. As they further develop these combinatorial methods, the researchers will create a combinatorics curriculum for intuitively teaching these concepts to K-12 students and create podcast episodes explaining these ideas in an accessible manner. The fruits of this research will be freely available at https://alg.cs.umt.edu/nsf-career.html.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
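
To give a rough sense of the convolution tree idea named in the abstract, the following is a minimal sketch, not the researchers' implementation. It assumes each input is a probability mass function (PMF) over the counts 0, 1, 2, ... of an independent variable, and it computes only the PMF of their sum by merging PMFs pairwise in a balanced tree. The function names `convolve` and `sum_pmf_tree` and the example data are hypothetical. For clarity the sketch uses direct quadratic-time convolution at each node; reaching quasilinear time would require an FFT-based convolution instead.

```python
from typing import List


def convolve(p: List[float], q: List[float]) -> List[float]:
    """Direct convolution of two PMFs indexed by nonnegative integer outcome."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def sum_pmf_tree(pmfs: List[List[float]]) -> List[float]:
    """PMF of the sum of independent variables, merged pairwise layer by layer."""
    if not pmfs:
        return [1.0]  # an empty sum equals 0 with probability 1
    layer = [list(p) for p in pmfs]
    while len(layer) > 1:
        merged = []
        # Convolve neighbouring pairs; each pass halves the number of nodes.
        for k in range(0, len(layer) - 1, 2):
            merged.append(convolve(layer[k], layer[k + 1]))
        if len(layer) % 2 == 1:
            merged.append(layer[-1])  # odd node out is carried up unchanged
        layer = merged
    return layer[0]


# Hypothetical example: three independent present/absent indicators,
# each present with probability 0.5.
pmfs = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
total = sum_pmf_tree(pmfs)
print(total)
```

Here `total[k]` is the probability that exactly `k` of the three indicators are present. The balanced merge order keeps the intermediate PMFs at similar lengths, which is what lets a fast convolution at each node pay off as the number of inputs grows.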
