Development of Trans Proteomic Pipeline, an Analysis Suite for Mass Spectrometry

$509,505 · R01 · FY2015 · GM · NIH

Institute For Systems Biology, Seattle WA

Linked publications & trials

Paper 39622905 Paper 39514576 Paper 39479990 Paper 39442081 Paper 39314370 Paper 39193565 Paper 38810119 Paper 38538550 Paper 38352447 Paper 38232391 Paper 38104260 Paper 38014076 Paper 37904046 Paper 37833331 Paper 37653270 Paper 37572790 Paper 37398146 Paper 37314965 Paper 37292611 Paper 37092802 Paper 36912639 Paper 36806354 Paper 36693629 Paper 36648445 Paper 36629399 Paper 36626722 Paper 36598107 Paper 36370099 Paper 36318223 Paper 36074795 Paper 35831657 Paper 35640880 Paper 35613471 Paper 35532924 Paper 35418183 Paper 35290070 Paper 34943957 Paper 34724830 Paper 34672606 Paper 34668408 Paper 34648730 Paper 34615866 Paper 34183830 Paper 33757883 Paper 33711481 Paper 33534803 Paper 33529024 Paper 33230158 Paper 33166149 Paper 33067471 Paper 33067450 Paper 32931287 Paper 32864978 Paper 32723790 Paper 32106352 Paper 32074120 Paper 31686107 Paper 31673027 Paper 31599596 Paper 31573204 Paper 31430157 Paper 31335145 Paper 31290668 Paper 31081335 Paper 31049579 Paper 30547015 Paper 30523691 Paper 30270626 Paper 30265558 Paper 30259924 Paper 30230343 Paper 30099871 Paper 30017480 Paper 29977167 Paper 29846577 Paper 29655560 Paper 29631402 Paper 29400476 Paper 29386051 Paper 28985418 Paper 28938075 Paper 28887440 Paper 28853897 Paper 28849660 Paper 28515314 Paper 28166638 Paper 28135259 Paper 27990823 Paper 27924013 Paper 27577934 Paper 27575953 Paper 27490519 Paper 27487407 Paper 27469004 Paper 27453469 Paper 27162549 Paper 27091361 Paper 26744403 Paper 26719571 Paper 26704149

Abstract

DESCRIPTION (provided by applicant): Mass spectrometry-based proteomics is a key technology for the identification, quantification and comparison of proteins and their post-translational modifications across all aspects of biology. One major barrier in proteomics workflows is the paucity of flexible and customizable computational frameworks for generating data analysis software pipelines. Current software pipelines are often static, with limited flexibility beyond the specific analyses for which they were created, and they generally lag behind the field because they receive little or no maintenance. These tools therefore lack broad functionality and improvements in areas such as statistical validation, limiting the potential of the software to mere spectrum matching through black-box tools. To address these barriers, we have spent the last 12 years developing, maintaining and distributing cutting-edge proteomics computational tools and data analysis standards through our program suite, the Trans-Proteomic Pipeline. Concomitant with software development, we ensure wide community adoption through extensive tutoring of all interested users in the proteomics community. Through our development, we have provided new functionality for proteomics data analyses, with both new and improved statistical validation and global qualification. However, as new styles of mass spectrometry instrumentation now attempt to provide comprehensive analysis, software tools must be continually maintained, and new functionality must be developed in an extensible and flexible framework that ensures robust, routine operation, so that the user community, from novices to the most advanced power users, receives trusted results. The Trans-Proteomic Pipeline was the first product to address these requirements and has been the most continually developed and maintained.
Our goal of providing all these tools as a fully open-source and freely available complement of programs ensures wide adoption and permits community input into tool development for the broadest and most needed functionality possible. This continuing program of development and maintenance of the Trans-Proteomic Pipeline builds on the successful approach of robust tool development, with a focus on start-to-end analysis of proteomics data. With data collection rates increasing at a Moore's Law pace, this program will continue to develop tools to analyze these ever larger datasets. We will develop and integrate tools for new styles of proteomic data collection, such as multiplexed data-independent analysis capable of providing near-full proteome quantitation in a single analysis; integration of next-generation RNA-seq genomic analysis for sample-specific databases; post-translational modification statistical analysis for confident site-specific identification; and the implementation of new selected-reaction monitoring capabilities that drive proteomics as the next-generation Western blot. All these efforts are underpinned by a strong background in computation and biochemistry to ensure the tools are well written and have maximum relevance to biology.
