Parameter-free Peak Detection Algorithm for Reducing False Positive/Negative Compound Identification from Raw Mass Spectrometry Metabolomics Data.

$136,655 · R03 · FY2017 · CA · NIH

University Of North Carolina Charlotte, Charlotte NC

Abstract

Mass spectrometry (MS) coupled to liquid or gas chromatography (LC or GC) has become an indispensable analytical platform for untargeted metabolomics. With sensitivity, chromatographic resolution, and mass measurement accuracy continuously improving, more and more analytes are now detectable, which has enormous potential to advance our understanding of metabolism. The first step in preprocessing raw LC-MS and GC-MS metabolomics data is the detection and extraction of peaks that represent ions. However, existing peak detection algorithms invariably yield an immense number of false positive and false negative peaks. These incorrect peaks can translate downstream into spurious or missing compound identifications. Furthermore, a large number of parameters must be specified for these algorithms to work. Unfortunately, general users often do not understand how to optimize these parameters, and maximizing one aspect (e.g., sensitivity) often has deleterious effects on another (e.g., specificity). To address these challenges, we propose a paradigm shift in peak detection: simultaneously considering all three dimensions of an ion's information. This will significantly increase the accuracy of peak detection compared with existing algorithms, which carry out peak detection by processing data in three separate 2D planes.

The results of our proposed research will benefit metabolomics research in multiple ways:
(1) The more accurate algorithm will eliminate manual checking of results or reduce it to a minimum.
(2) With the parameter-free design, researchers will not have to go through many rounds of trial and error to determine the best compromise for a set of processing parameters.
(3) The more accurate algorithm will provide greater confidence in the detection of truly unknown compounds and allow prioritization of candidates for more detailed and costly structural analysis.
(4) The more accurate algorithm will benefit biomarker discovery by increasing the accuracy of quantitative metabolite information extracted from the raw metabolomics data.
