Deconvolution and Assembly of Metagenomes Using Chromatin Conformation Capture
Phase Genomics, Inc., Seattle WA
Abstract
The Specific Aim of this Phase II SBIR proposal is to develop an affordable, commercial-quality product for improved metagenomic sequencing and culture-free microbial discovery. High-throughput sequencing technologies have given us the capacity to sequence entire genomes of cultured microorganisms. However, our capacity to sequence pooled genomes (microbiomes) through metagenomic sequencing, which includes the genomes of microorganisms that cannot currently be cultured, remains limited. With conventional protocols, the association between DNA fragments from the same species is lost during DNA preparation (cell lysis, DNA purification, and shearing). It is therefore nearly impossible to assign a specific DNA sequence to its organism of origin without relying on a priori knowledge. To overcome this hurdle, we have adapted the Hi-C proximity-ligation method to join DNA molecules that are physically proximal to one another within an intact cell. Our successful Phase I studies showed the feasibility of developing a commercial product for constructing high-quality Hi-C libraries from fecal, soil, and clinical samples. Deep sequencing of these DNA junctions enables us to reconstruct complete or near-complete genomes and deconvolute mixed strains without any culturing. We have successfully applied this technology to the deconvolution and assembly of artificially mixed populations of microorganisms, including various fungal, bacterial, and archaeal species, as well as a number of real-world metagenomic samples. The method has proven highly accurate and efficient, including in associating multiple chromosomes and plasmids with their host microorganism. Although the approach is feasible, turning it into a commercial product requires developing our laboratory methods into commercial-quality, high-throughput assay kits and finalizing software development.
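The deconvolution idea described above can be illustrated with a minimal sketch. It assumes that Hi-C read pairs have already been mapped to assembled contigs and summarized as link counts between contig pairs. Because proximity ligation joins DNA within the same intact cell, contigs that share many links are treated as belonging to the same genome. The function name `cluster_contigs`, the `min_links` threshold, and the contig names are hypothetical and chosen for illustration only; this is not the proposal's actual algorithm.

```python
from collections import defaultdict


def cluster_contigs(contigs, hic_links, min_links=5):
    """Group contigs into putative genomes from Hi-C link counts.

    contigs:   iterable of contig names
    hic_links: dict mapping (contig_a, contig_b) -> number of Hi-C read
               pairs joining the two contigs
    min_links: hypothetical threshold below which a link is treated as noise
    """
    # Union-find structure: every contig starts in its own cluster.
    parent = {c: c for c in contigs}

    def find(c):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Merge the clusters of any two contigs that share enough Hi-C links.
    for (a, b), count in hic_links.items():
        if count >= min_links:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect cluster members by root.
    clusters = defaultdict(set)
    for c in contigs:
        clusters[find(c)].add(c)
    return sorted(sorted(members) for members in clusters.values())


# Toy example: a plasmid is linked to its host chromosome contigs,
# while a contig from another organism shares only a single noisy link.
contigs = ["chr_A1", "chr_A2", "plasmid_A", "chr_B1"]
links = {
    ("chr_A1", "chr_A2"): 40,
    ("chr_A1", "plasmid_A"): 12,
    ("chr_A2", "chr_B1"): 1,
}
groups = cluster_contigs(contigs, links)
print(groups)
```

A simple threshold with connected components is used here only because it is easy to follow; a production pipeline would likely need to normalize link counts for contig length and coverage before clustering.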
To achieve our Specific Aim, we will carry out the following Tasks: Task 1: Optimize current Hi-C kit protocols; Task 2: Develop, test, and manufacture 96-well metagenomic Hi-C kits; Task 3: Develop algorithms and optimize software for analyzing Hi-C data; Task 4: Produce a customer-facing website for data analysis. Criteria for Success: A successful product must reduce the laboratory workflow to under 24 hours of prep time, be simple enough to allow multiplexing, and bring our cost of goods down to $50 per sample or less. The method must work on a wide array of metagenomic sample types, such as fecal, clinical, and environmental samples containing diverse microbes. Our kit will consist of enzymes and buffers to generate an Illumina-compatible Hi-C library from a raw sample, such as 0.25 grams of soil or 50 µL of fecal material, with a <10% failure rate among users with moderate levels of experience. At least ten clients must independently validate our kits. The accompanying software will be able to assemble known genomes with at least 95% accuracy and generate at least 20 novel genomes per sample with >90% completeness and <5% contamination as measured by the CheckM tool.
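The genome-quality criterion above can be expressed as a simple check. The sketch below assumes that completeness and contamination percentages for each genome bin have already been obtained (for example, from a CheckM run) and loaded into plain dictionaries. The function name, the dictionary keys, and the sample values are hypothetical; the code does not parse CheckM output itself.

```python
def evaluate_sample(bins, min_completeness=90.0, max_contamination=5.0,
                    min_genomes=20):
    """Return the bins that pass the stated thresholds, and whether the
    sample yields at least `min_genomes` such bins.

    bins: list of dicts with hypothetical keys "name", "completeness",
          and "contamination" (both percentages).
    """
    passing = [
        b for b in bins
        if b["completeness"] > min_completeness
        and b["contamination"] < max_contamination
    ]
    return passing, len(passing) >= min_genomes


# Toy example with made-up values: only the first bin meets both thresholds.
example_bins = [
    {"name": "bin_1", "completeness": 97.2, "contamination": 1.1},
    {"name": "bin_2", "completeness": 91.0, "contamination": 6.3},
    {"name": "bin_3", "completeness": 84.5, "contamination": 0.4},
]
passing, sample_ok = evaluate_sample(example_bins)
print([b["name"] for b in passing], sample_ok)
```

The strict inequalities mirror the ">90% completeness and <5% contamination" wording, so a bin at exactly 90% completeness would not count toward the 20-genome target.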