
Bilateral NSF/BIO-BBSRC: Collaborative Research: ABI Development: A Metagenomic Exchange - enriching metagenomics analysis by synergistic harmonisation of MG-RAST and the EBI Metag

$1,243,413 · FY2016 · BIO · NSF

University Of Chicago, Chicago IL


Abstract

Micro-organisms are found in virtually all environments. An imbalance within a microbial community can have severe effects, such as disease in the human gut or the inability of plants to grow efficiently in soil. Understanding the composition of these communities, and the interplay within them, potentially allows us to manipulate them. Metagenomics is the study of these micro-organism communities. It involves isolating the DNA from the organisms within an environmental sample (e.g. water, soil, animal stool), sequencing the DNA, and then analysing it computationally to decode which organisms are present and the functions they might be performing. This computation is complicated: 1) there is a huge amount of data; 2) the sequence data is a jumbled mix of fragments from different organisms; 3) decoding the DNA is hard, as typically >90% of organisms within a sample are not well characterized. This proposal brings together three major resources within the field of metagenomics data archiving and analysis. The most immediate beneficiaries of the outputs from this proposal and associated resources will be the scientists worldwide who are involved in metagenomics research. This diverse, extensive and expanding community includes (but is not limited to) microbiologists interested in understanding microbial community structure and interactions, biochemists aiming to discover new proteins with functional applications, and clinicians seeking to investigate and modulate the microbial communities associated with healthy or diseased states. The European Nucleotide Archive (ENA) is a repository of DNA sequence data. Importantly, ENA also captures metagenomic contextual data, such as where and when the sample was taken and how the DNA was extracted and sequenced. The EBI metagenomics portal (EMG, UK) and MG-RAST (MGR, US) are two metagenomics sequence analysis platforms.
The aim of this project is to initiate a long-term collaboration between the EMG and MGR platforms to build and operate a comprehensive data exchange system, the Metagenomics Exchange (ME). This resource will ensure that metagenomic sequence reads, derived data and associated metadata are permanently preserved and made available for the broadest future use. Crucially, it will ensure that these scientists can discover and mine pre-processed datasets and compare their analyses across platforms. Scientists will no longer need to search ENA, MG-RAST (MGR) and EBI metagenomics (EMG) exhaustively to discover data, nor will they need to submit to both MGR and EMG; cross-submission will happen automatically. Furthermore, as a result of the work to make the MG-RAST and EMG pipelines interoperable, scientists will have access to enhanced analyses that utilize a common set of parameters and allow mixing and matching of pipeline components to suit the data being processed, improving and enriching their analysis results. This will also reduce differences that are caused by non-biological artifacts, enabling greater clarity of results. Unifying the result interfaces will benefit both standard users and power users: MGR and EMG results will be accessible via each other's websites, which helps standard users, while the ability to mine across multiple datasets serves power users.
