Regulation And Function Of Retroelements

$1,145,531 | ZIA | FY2019 | HD | NIH

Eunice Kennedy Shriver National Institute Of Child Health & Human Development

Abstract

Diseases such as AIDS and leukemia caused by retroviruses have intensified the need to understand the mechanisms of retrovirus replication. One of our objectives is to understand how retroviral cDNAs are integrated into the genome of infected cells. Because of their similarities to retroviruses, long terminal repeat (LTR)-retrotransposons are important models for retrovirus replication. The retrotransposon under study in our laboratory is the Tf1 element of the fission yeast Schizosaccharomyces pombe. We are particularly interested in Tf1 because its integration exhibits a strong preference for pol II promoters. This choice of target sites is similar to the strong integration preferences that human immunodeficiency virus 1 (HIV-1) and murine leukemia virus (MLV) have for pol II transcription units. Important questions remain about how these viruses recognize their target sites. We therefore study the integration of Tf1 as a model system with which we hope to uncover mechanisms general to the selection of integration sites. An understanding of the mechanisms responsible for targeted integration could lead to new approaches for antiviral therapies and to improvements in the application of viral vectors in gene therapy.
Retroviruses and LTR-retrotransposons have distinct patterns of integration sites. The unwanted oncogenic events that can be generated by retrovirus-based vectors used in gene therapy depend on the selection of integration sites. The LTR-retrotransposon Tf1 of Schizosaccharomyces pombe is studied as a model because it integrates into the promoters of stress response genes. Although integrases (INs) encoded by retroviruses and LTR-retrotransposons are responsible for catalyzing the insertion of cDNA into the host genome, distinct host factors are required for the specificity of integration site selection. We tested this hypothesis with a genome-wide screen of host factors that promote Tf1 integration.
By combining an assay for transposition with a unique genetic assay that measures cDNA present in the nucleus, we could identify factors that contribute to integration. We utilized this assay to test a collection of 3,004 S. pombe strains with single gene deletions. Using these screens and immunoblot measures of Tf1 proteins, we identified a total of 61 genes that promote integration. The integration factors participate in a range of processes including nuclear transport, transcription, mRNA processing, vesicle transport, chromatin structure and DNA repair. The DNA repair factors are of particular interest because they suggest that the pathways that repair the single-stranded gaps opposite integration sites are the same in diverse eukaryotes. We have continued to study the DNA repair factors to determine their role in integration.
Cells activate expression networks with hundreds of genes that together increase resistance to common environmental insults. However, stress response networks can be insufficient to ensure survival, which raises the question of whether cells possess genetic programs that can promote adaptation to novel forms of stress. We found that transposable element (TE) mobility in Schizosaccharomyces pombe was greatly increased when cells were exposed to unusual forms of stress such as heavy metals, caffeine, and the plasticizer phthalate. By subjecting cells carrying integrations to CoCl2, we found that TE integration provided the major path to resistance. Groups of insertions that provided resistance were linked to TOR regulation and metal response genes. We extended our study of adaptation by analyzing TE positions in 57 genetically distinct wild strains. The genomic positions of 1,048 polymorphic LTRs were strongly associated with a range of stress response genes, indicating that TE integration promotes adaptation in natural conditions.
These data provide strong support for the idea, first proposed by Barbara McClintock, that TEs provide a system to modify the genome in response to stress.
Half of the human genome is composed of TE sequences, including 500,000 copies of L1, a non-LTR retrotransposon that is the only known family of TE in the human genome to possess autonomous transposition activity. While the vast majority of copies carry mutations, each genome contains approximately 100 active L1s with the potential to produce de novo insertions of L1 or catalyze insertion of the non-autonomous TE Alu. L1 activity in germline cells or early in development is responsible for 124 documented cases of disease, and the discovery of somatic transposition of L1 in brain tissues has raised provocative questions about the impact of transposition during neurodevelopment and in neurological disease. A surge of sequence data from the 1000 Genomes Project and of RNA-seq data from human subjects now makes it possible to correlate polymorphic TEs in human populations with disease alleles identified by genome-wide association studies (GWAS). Initial studies indicate that Alus are potential causative variants. The potential for TE insertions to be causative in disease is consistent with their propensity to alter gene expression. Twin studies show that the heritability of mental illness is extremely high, ranging from 74% to 81%. Given the difficulty in identifying genetic variants responsible for disease, and the regulatory capacity of TEs, we tested whether polymorphic TEs were potential causative variants for neurological and psychiatric disorders. The genome sequences of 2,504 human subjects analyzed for structural variation identified 17,000 TE insertions, including 12,784 Alus and 3,048 L1s. We consolidated the results of 593 GWAS that characterized a wide range of neurological diseases and psychiatric disorders (NHGRI catalog).
From these GWAS we selected for further analysis 753 trait-associated SNPs (TASs) with disease associations of p < 10^-6. We identified polymorphic TEs contained within regions of linkage disequilibrium (LD) for each of the TASs based on a maximum distance of 1 Mb. For the 76 polymorphic TEs in LD with disease TASs, we examined their potential to alter regulatory elements as mapped by the panel of epigenomic data from the NIH Roadmap Epigenomics Consortium and the genome-wide signatures of chromatin states determined by hidden Markov models (ChromHMM). While a majority of the polymorphic candidates were located in domains of quiescent chromatin, a number of TEs were located in enhancers, promoters, or at sites extensively bound by transcription factors. For additional evidence of altered gene expression, we queried the Genotype-Tissue Expression (GTEx) project RNA-seq database containing expression data for 449 donors across 43 tissues. GTEx readily identifies changes in tissue-specific gene expression associated with locus-specific genetic variation. TASs in LD were identified as expression quantitative trait loci (eQTLs) if the locus carrying the variant was significantly associated with altered expression of a gene in a specific tissue. Based on these eQTLs and the associations with regulatory chromatin active in tissues of the brain, we obtained a list of 18 causal TE candidates. The results indicate that polymorphic TEs are candidate factors in a wide range of neurological disorders, including ALS and Parkinson's disease. Polymorphic TEs were also potential factors in psychiatric diseases, including six that were in LD with increased risk of schizophrenia. By identifying polymorphic TEs that are tightly linked to risk of neurological and psychiatric diseases, we have provided a valuable list of causal candidates that are structural variants positioned in regulatory chromatin and are linked to eQTLs of altered expression of proximal genes in brain tissue.
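
The first filtering step described above, selecting TASs by p-value and collecting polymorphic TEs within a fixed distance of each one, can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the `Variant` type, the function `tes_near_tass`, and all coordinates and p-values are hypothetical, the 1e-6 cutoff is an assumed reading of the threshold in the text, and a real analysis would also measure LD (for example r²) from genotype data rather than rely on distance alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A SNP or TE insertion at a single genomic coordinate (hypothetical type)."""
    name: str
    chrom: str
    pos: int


def tes_near_tass(tass, tes, p_values, p_max=1e-6, max_dist=1_000_000):
    """Map each TAS with p < p_max to the TEs within max_dist bp on the same chromosome.

    This is only a distance window; it does not compute linkage disequilibrium.
    """
    hits = {}
    for snp in tass:
        if p_values[snp.name] >= p_max:
            continue  # association not strong enough to keep
        nearby = [
            te.name
            for te in tes
            if te.chrom == snp.chrom and abs(te.pos - snp.pos) <= max_dist
        ]
        if nearby:
            hits[snp.name] = nearby
    return hits


# Toy data, invented for illustration only.
tass = [
    Variant("snp_a", "chr1", 5_000_000),
    Variant("snp_b", "chr1", 9_000_000),
    Variant("snp_c", "chr2", 1_000_000),
]
tes = [
    Variant("alu_1", "chr1", 5_400_000),
    Variant("l1_1", "chr1", 7_500_000),
    Variant("alu_2", "chr2", 1_200_000),
]
p_values = {"snp_a": 3e-9, "snp_b": 2e-8, "snp_c": 1e-4}

candidates = tes_near_tass(tass, tes, p_values)
print(candidates)
```

In this toy set only `snp_a` has both a qualifying p-value and a TE inside the 1 Mb window. Candidates from such a window would then be checked against chromatin-state annotations and eQTL data, as the abstract describes.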
