
Mathematically Rigorous Results In Sequence Matching.

$156,696 · ZIA · FY2010 · LM · NIH

National Library of Medicine

Abstract

We heuristically derived two new equations for the scale parameter, lambda. These equations estimate lambda efficiently and with high accuracy. In addition, we have proposed several new formulas for the Gumbel pre-factor k, based on a path reversal identity and the Poisson clumping heuristic. These formulas also provide very accurate results. We have also explored edge effects on the statistics. Edge effects are relevant because real sequences have finite lengths, which generates a correction term in an asymptotic expansion of the probability of sequence matching. This edge effect is likely to be more important in the statistics of matching with gaps than in the statistics of matching without gaps, because gapped matches tend to be longer and therefore exhaust the sequences being matched more easily. The NCBI CoreTools now has code that calculates all the modified Gumbel parameters to practical accuracies in less than 1 second. Moreover, the BLAST code now incorporates our improved finite-size correction.
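
For orientation, the roles of lambda and k can be illustrated with the standard textbook Gumbel (Karlin-Altschul) statistics for ungapped local alignment. The sketch below is not the heuristic equations for gapped alignment described in the abstract, nor the NCBI CoreTools or BLAST code, and it omits any finite-size correction. It assumes the classical forms E = k·m·n·exp(-lambda·S) and P = 1 - exp(-E), and it solves the classical ungapped condition sum_ij p_i q_j exp(lambda·s_ij) = 1 by bisection. All function and parameter names are hypothetical, chosen for illustration only.

```python
import math

def gumbel_evalue(score, m, n, lam, k):
    """Expected number of chance local alignments with score >= `score`
    for sequences of lengths m and n (no finite-size correction)."""
    return k * m * n * math.exp(-lam * score)

def gumbel_pvalue(score, m, n, lam, k):
    """Probability of at least one chance alignment with score >= `score`,
    computed as 1 - exp(-E) using expm1 for accuracy at small E."""
    return -math.expm1(-gumbel_evalue(score, m, n, lam, k))

def ungapped_lambda(probs_a, probs_b, scores, tol=1e-12):
    """Positive root of sum_ij p_i q_j exp(lam * s_ij) = 1 (ungapped case).

    Assumes the expected score is negative and at least one score is
    positive, which guarantees a unique positive root.
    """
    pairs = [(p * q, scores[i][j])
             for i, p in enumerate(probs_a)
             for j, q in enumerate(probs_b)]
    if sum(w * s for w, s in pairs) >= 0:
        raise ValueError("expected score must be negative")
    if not any(s > 0 and w > 0 for w, s in pairs):
        raise ValueError("at least one positive score is required")

    def f(lam):
        return sum(w * math.exp(lam * s) for w, s in pairs) - 1.0

    # f is negative just above 0 and positive beyond the root,
    # so double `hi` until the root is bracketed, then bisect.
    lo, hi = 0.0, 1.0
    while f(hi) <= 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative input: uniform DNA letter frequencies, match +1, mismatch -1.
uniform = [0.25] * 4
match_mismatch = [[1 if i == j else -1 for j in range(4)] for i in range(4)]
lam_dna = ungapped_lambda(uniform, uniform, match_mismatch)
```

For this illustrative scoring scheme the condition reduces to 0.25·exp(lambda) + 0.75·exp(-lambda) = 1, whose positive root is lambda = ln 3. In the gapped case no such closed-form condition is available, which is why the abstract turns to heuristic equations for lambda and k.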
