Continued improvement of genome assemblies and assembly techniques for Next Gener
Univ Of Maryland, College Park, College Park MD
Abstract
DESCRIPTION (provided by applicant): The two widely used Next Generation Sequencing (NGS) technologies are 454 sequencing and Illumina sequencing. We propose to determine the best sequencing strategy, that is, the mix of 454 and Illumina read and mate pair data that produces the best possible assembly at the lowest cost. We propose to continue developing our software for closing gaps and fixing misassemblies with our shooting method. We can extend the method to use additional NGS reads and mate pairs to close gaps in existing assemblies, which increases contiguity, and to find and correct misassemblies. This method can serve as a cheaper alternative to traditional finishing techniques. The final product of any assembly project is a set of chromosome sequence files. We propose to develop improved software capable of producing chromosome sequences from the assembled contigs using mate pair and marker data. Our preliminary version works for assemblies that have large contigs (N50 size >100 Kb). Genomes assembled from NGS data typically have small contigs (N50 size of 10-20 Kb). We propose to extend the software so that it is applicable to genome assemblies built from NGS data. We propose to use the experience gained in the previous project period to re-assemble the genomes of chicken, rat, and possibly other organisms of public health interest from the existing Trace Archive data combined with additional NGS data where available. NGS data is getting cheaper, and many groups are now interested in sequencing various genomes. We therefore propose to produce de novo assemblies of insect genomes, plant genomes, and genomes of other organisms of public health interest in collaboration with the centers that generate the data. Our goal is to serve as an expert genome assembly group that provides its services and techniques to the community.
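The abstract compares assemblies by their N50 contig size. As a reminder of what that statistic measures, here is a minimal sketch in Python using the standard definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions and do not come from the applicants' software.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    Illustrative helper, not part of any assembler: sorts lengths in
    descending order and returns the length at which the cumulative
    sum first reaches half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in kilobases, for illustration only.
example_lengths_kb = [80, 70, 50, 40, 30, 20]
example_n50_kb = n50(example_lengths_kb)
```

For the made-up lengths above the total is 290, so the halfway point of 145 is first reached by the second-longest contig. An assembly with N50 above 100 Kb therefore has at least half of its sequence in contigs longer than 100 Kb, which is the property the preliminary chromosome-building software relies on.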