Finding Protein Sequence Motifs--Methods and Application

$0 | Z01 | FY2005 | LM | NIH

National Library of Medicine

Abstract

In the last few years, the rapid accumulation of genome sequences and protein structures has been paralleled by major advances in sequence database search methods. The powerful Position-Specific Iterated BLAST (PSI-BLAST) method developed at the NCBI formed the basis of our work on protein motif analysis. In addition, hidden Markov models (HMMs) and protein structure comparison methods were applied. During the past year, we made further progress in the detailed analysis of the classification, evolution, and functions of several classes of proteins. In particular, the archaeo-eukaryotic primase (AEP) superfamily has been analyzed in detail, its relationships with other proteins containing the PALM domain have been elucidated, and a variety of previously unnoticed primases, especially those from eukaryotic viruses, were detected. The genomes of many eukaryotes, including chordates and plants, encode previously uncharacterized homologs of these predicted viral primases, which might be involved in novel DNA repair pathways. Contextual analysis of multidomain protein architectures and gene neighborhoods in prokaryotes and viruses reveals remarkable parallels between AEPs and the unrelated DnaG-type primases, in particular tight associations with the same repertoire of helicases. These observations point to a functional equivalence of the two classes of primases, which seem to have repeatedly displaced each other in various extrachromosomal replicons. Additionally, we used computational methods to identify a novel cytidine deaminase, APOBEC4, which is likely to be involved in tissue-specific RNA editing in vertebrates. In another study, we showed that the eukaryotic cysteine sulfinic acid reductase, sulfiredoxin (Srx), evolved from the bacterial chromosome partitioning protein ParB.
