Mathematically Rigorous Results In Sequence Matching.

$303,689 | ZIA | FY2012 | LM | NIH

National Library Of Medicine

Abstract

The NCBI CoreTools now contain our code, which calculates all parameters of the modified Gumbel distribution (the Gumbel scale parameter λ, the pre-factor k, and the finite-size correction) to practical accuracy in less than 1 second. The BLAST group plans to use these faster calculations to generate the modified Gumbel parameters for several new DNA scoring schemes. The BLAST group has also implemented the new finite-size correction directly in its code, demonstrably improving BLAST sequence retrieval. The implementation even had unexpected benefits, such as improved retrieval from the Conserved Domain Database with rps-BLAST. In addition, biologists find it irritating when an exact match to their query is not the highest-ranked hit in a sequence database search. The new finite-size correction places identical matches at the top of the retrieval list more consistently than the old finite-size correction did. We are now collaborating with Dr. Martin Frith to extend our methods to next-generation sequence matching, including frameshifts in DNA.
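To illustrate how the Gumbel parameters named above enter a significance estimate, here is a minimal sketch of the standard, uncorrected Gumbel approximation, E = k·m·n·exp(−λS) and P = 1 − exp(−E). It is not the project's code: the function names and the numeric values of λ, k, and the sequence lengths are assumptions chosen for illustration, and the finite-size correction is deliberately omitted because the abstract does not state its form.

```python
import math


def gumbel_evalue(score, m, n, lam, k):
    """Expected number of chance local alignments scoring >= score.

    m, n : lengths of the two sequences being compared
    lam  : Gumbel scale parameter (lambda)
    k    : Gumbel pre-factor
    No finite-size correction is applied in this sketch.
    """
    return k * m * n * math.exp(-lam * score)


def gumbel_pvalue(score, m, n, lam, k):
    """Probability of at least one chance alignment scoring >= score."""
    e = gumbel_evalue(score, m, n, lam, k)
    # -expm1(-e) equals 1 - exp(-e) but stays accurate for very small e.
    return -math.expm1(-e)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    lam, k = 0.267, 0.041
    m, n = 250, 1_000_000
    for s in (40, 60, 80):
        e = gumbel_evalue(s, m, n, lam, k)
        p = gumbel_pvalue(s, m, n, lam, k)
        print(f"score={s}  E={e:.3g}  P={p:.3g}")
```

Because E depends on the product m·n, any correction to the effective sequence lengths shifts every E-value and can reorder hits of similar score, which is consistent with the ranking effects described above.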
