A New Approach to Fragment Assembly in DNA Sequencing
University of California San Diego, La Jolla, CA
Abstract
DESCRIPTION (provided by applicant): For the last twenty years, fragment assembly in DNA sequencing has followed the "overlap-layout-consensus" paradigm, which is used in all currently available assembly tools. Although this approach proved useful in clone-by-clone DNA sequencing, it faces difficulties in genomic shotgun assembly: the existing algorithms make assembly errors and are often unable to resolve repeats, even in prokaryotic genomes. Biologists are well aware of potential assembly errors and are forced to carry out additional experiments to verify the assembled contigs. We abandon the classical "overlap-layout-consensus" paradigm in favor of a new Eulerian Superpath approach to fragment assembly. This allows us to reduce fragment assembly to a variation of the classical Eulerian path problem and, for the first time, to resolve the problem of repeats in fragment assembly. Our new EULER algorithm resolves all repeats except long perfect (100 percent identical) repeats, which are theoretically impossible to resolve without additional experiments. This reduction allows one to generate provably optimal, error-free fragment assemblies. The main goal of this proposal is to scale up EULER for the assembly of eukaryotic genomes.
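To make the reduction to an Eulerian path problem more concrete, here is a minimal sketch of the classical idea in Python. It is not the EULER algorithm or its Eulerian Superpath step. It only illustrates the textbook construction: every k-mer in the reads becomes an edge between its (k-1)-mer prefix and suffix, and an Eulerian path through that graph spells an assembly. The function names (`de_bruijn_graph`, `eulerian_path`, `assemble`), the toy reads, and the choice of k are all illustrative assumptions. The sketch also assumes error-free, single-stranded reads and a graph that actually has an Eulerian path.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a graph with one edge per distinct k-mer: prefix -> suffix."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the result deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return an Eulerian path as a list of nodes (Hierholzer's algorithm).

    Assumes the graph is non-empty and that an Eulerian path exists.
    """
    out_deg = {u: len(vs) for u, vs in graph.items()}
    in_deg = defaultdict(int)
    for vs in graph.values():
        for v in vs:
            in_deg[v] += 1
    nodes = set(out_deg) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (n for n in sorted(nodes) if out_deg.get(n, 0) - in_deg.get(n, 0) == 1),
        min(nodes),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: emit node
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence traced by an Eulerian path through the k-mer graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    # Hypothetical toy reads sampled from the sequence ATGGCGTGCA.
    print(assemble(["ATGGCGT", "GGCGTGC", "CGTGCA"], 4))  # prints ATGGCGTGCA
```

In this toy case every (k-1)-mer occurs once, so the path is unique. When a repeat is longer than k-1, the graph admits several Eulerian paths, which is the ambiguity the abstract refers to when it discusses resolving repeats.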