Semi-Automated Method for Annotating Repeated Sequences
Northern Illinois University, DeKalb, IL
Abstract
This project develops a graphical method for identifying and annotating repeated sequences in chromosome-length DNA sequence assemblies. Repeated sequences include transposable elements, duplicated genes, organellar insertions, and other structures. Transposable elements in particular are known to insert into genes, causing both germinal mutations and somatic mutations, including those that lead to cancer. The basic method being developed here runs a BLAST search of segments of the chromosome against the entire chromosome sequence. The results are then displayed on a "BLAST dotplot", in which both the x- and y-axes represent sections of the chromosome. The x-axis is displayed at 30 base pairs (bp) per pixel, a resolution at which almost all genes and transposable elements fit on a single computer screen. The y-axis is displayed at 50,000 bp per pixel, which allows the entire chromosome to be displayed on a single screen. Repeated sequences appear as a series of stacked horizontal lines whose positions and colors show the chromosomal location and similarity of each repeat. The coordinates of each repeat can be entered into a database along with additional information derived from other search software. Information about family relatedness and structural similarity can also be hyperlinked to the display.
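The coordinate mapping described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's actual software: it assumes the BLAST hits are available in the standard 12-column tabular output (`-outfmt 6`), that the query is the chromosome segment shown on the x-axis, and that the subject is the whole chromosome shown on the y-axis. The names `HitLine` and `hit_to_line` are hypothetical; only the two resolutions (30 and 50,000 bp per pixel) come from the abstract.

```python
from dataclasses import dataclass

X_BP_PER_PIXEL = 30       # x-axis resolution stated in the abstract
Y_BP_PER_PIXEL = 50_000   # y-axis resolution stated in the abstract


@dataclass
class HitLine:
    """One horizontal line on the dotplot (hypothetical structure)."""
    x_start: int             # leftmost pixel column of the hit
    x_end: int               # rightmost pixel column of the hit
    y: int                   # pixel row of the matching chromosome location
    percent_identity: float  # could drive the line color


def hit_to_line(tabular_line: str) -> HitLine:
    """Convert one line of BLAST tabular output into dotplot pixels.

    Assumed column order (standard outfmt 6): qseqid, sseqid, pident,
    length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    Coordinates are 1-based; query coordinates are taken to be relative
    to the start of the displayed segment.
    """
    fields = tabular_line.rstrip("\n").split("\t")
    pident = float(fields[2])
    qstart, qend = int(fields[6]), int(fields[7])
    sstart, send = int(fields[8]), int(fields[9])

    # Minus-strand hits report start > end, so normalize before scaling.
    q_lo, q_hi = sorted((qstart, qend))
    s_lo = min(sstart, send)

    return HitLine(
        x_start=(q_lo - 1) // X_BP_PER_PIXEL,
        x_end=(q_hi - 1) // X_BP_PER_PIXEL,
        y=(s_lo - 1) // Y_BP_PER_PIXEL,
        percent_identity=pident,
    )
```

Under these assumptions, every copy of a repeat elsewhere on the chromosome shares the same x-range and differs only in its row, which is why repeats show up as stacked horizontal lines.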