Regulation And Function Of Retroelements

$1,196,104 | ZIA | FY2025 | HD | NIH

Eunice Kennedy Shriver National Institute Of Child Health & Human Development

Abstract

Retrotransposon insertions associated with risk of neurologic and psychiatric diseases

Genetic variation can directly cause or increase susceptibility to neurologic and psychiatric diseases. Genome-wide association studies (GWAS) have been very useful in identifying single nucleotide polymorphisms (SNPs) associated with common and complex human diseases. However, identifying causal genetic variants remains challenging because most disorders are influenced by many loci and SNPs do not typically provide adequate resolution to pinpoint causal genes. Also, GWAS do not directly identify structural variants such as transposable element (TE) insertions, which can have a much greater impact on gene expression than SNPs. TE activity contributes directly to disease, with reports of more than 124 cases of TE insertions that resulted in Mendelian genetic diseases. In addition to disrupting gene sequence, TEs carry regulatory sequences that readily alter expression of adjacent genes. The potential for TEs to alter expression is particularly significant in the brain because this tissue is one of the few with elevated TE expression. We tested the possibility that polymorphic TEs alter the expression of adjacent genes and contribute to increased risk of neurologic and psychiatric diseases. The genome sequences of 2,504 human subjects identified ~17,000 polymorphic TE insertions, which include 12,784 Alu elements and 3,048 L1s. We consolidated the results of 593 GWAS of neurologic and psychiatric disorders (NHGRI catalog). From these GWAS we selected for further analysis 753 trait-associated SNPs (TASs) with disease associations of p < 10⁻⁶. We identified 76 polymorphic TEs in linkage disequilibrium (LD) with disease TASs and examined their potential to alter regulatory elements as mapped by the panel of epigenomic data from the NIH Roadmap Epigenomics Consortium and the genome-wide signatures of chromatin states determined by hidden Markov models (ChromHMM).
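As a rough illustration of the LD step described above, the sketch below computes the squared correlation (r²) between a polymorphic TE insertion and a trait-associated SNP from phased haplotypes. The abstract does not state which LD statistic or threshold was used, so the choice of r², the function name `ld_r2`, and the toy haplotypes are all assumptions made for illustration only.

```python
def ld_r2(hap_a, hap_b):
    """Return r^2 between two biallelic variants.

    hap_a and hap_b are equal-length lists of 0/1 alleles, one entry per
    phased haplotype (hypothetical encoding: 1 = TE insertion present or
    SNP alternate allele, 0 = absent or reference allele).
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                  # allele frequency at variant A
    p_b = sum(hap_b) / n                                  # allele frequency at variant B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                        # a monomorphic variant carries no LD signal
    return d * d / denom


# Toy data (not from the study): insertion and SNP always co-occur.
te_insertion = [1, 1, 0, 0, 1, 0]
tas_snp = [1, 1, 0, 0, 1, 0]
r2 = ld_r2(te_insertion, tas_snp)
```

With real data, a TE would be paired with a TAS only if r² exceeded a chosen cutoff; that cutoff is not given in the abstract.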
A number of TEs are located in enhancers and promoters extensively bound by transcription factors. As an example, one TE, designated LN1, is an AluYa4 inserted in promoter sequence at a site bound by many transcription factors. The promoter highly expresses the tubulin tyrosine ligase TTLL4 in the hippocampus, cerebellum, and caudate. TTLL4 catalyzes polyglutamylation of tubulins, which modulates microtubule-related functions of neurite outgrowth and neurodegeneration. We considered this insertion a causal candidate because of its potential to affect expression in the brain and its linkage to increased risk of amyotrophic lateral sclerosis (ALS). For additional evidence that our candidate TEs altered gene expression, we queried the GTEx project RNA-seq database containing expression data for 449 donors across 43 tissues. GTEx readily identifies changes in tissue-specific gene expression associated with locus-specific genetic variation. Variants can be identified as expression quantitative trait loci (eQTLs) when the genetic loci are significantly associated with altered expression of a gene in a specific tissue. We found that many of the polymorphic TE candidates are eQTLs of adjacent genes, significantly associated with altered expression in brain tissues. For example, LN1 is associated with altered expression of TTLL4 in cerebellum. Based on the TE eQTLs and the associations with regulatory chromatin active in tissues of the brain, we obtained a list of 10 causal TE candidates. The results indicate that polymorphic TEs are candidate factors in a wide range of neurological disorders including ALS, migraine disorder, Parkinson’s disease, and schizophrenia. We tested whether the insert sequences had the potential to influence transcription activity of a reporter gene in human neural stem cells.
Of six candidate Alu insertions evaluated for their impact on promoter activity, we found that five significantly altered expression of luciferase, indicating the candidate TEs have the potential to alter expression of genes in vivo. By identifying polymorphic TEs that are tightly linked to risk of neurologic and psychiatric diseases, we have provided a valuable list of causal candidates that are structural variants on par with other candidate variants for having a causal role in neurologic and psychiatric disorders. We further explored the possibility that the polymorphic AluYa4 in the promoter of TTLL4 has a direct impact on TTLL4 activity. Using CRISPR with iPS cells, we inserted the AluYa4 (designated LN1) into the promoter of TTLL4 in the orientation and position found in the ALS cases. Quantitative RNA expression analysis revealed that LN1 increases TTLL4 expression by 50% in iPS cells. Interestingly, several other unlinked genes also have increases in expression. Some of these genes have functions linked to microtubule dynamics, suggesting the changes in expression of these genes compensate for altered tubulin assembly resulting from increased TTLL4 activity. We are currently testing the impact of LN1 insertion in iPS cells differentiated into lower neurons by examining transcriptomics and neuron morphology.

The chromatin associated activities of LEDGF in transcription and HIV-1 integration

HIV/AIDS remains widespread, with severe disparities among minority groups. Despite the highly effective antiretroviral medications that target activities such as integration, rates of drug resistance are increasing. This places great importance on discovering new aspects of replication to target. The chromatin-associated transcription factor LEDGF/p75 contains a C-terminal integrase binding domain (IBD) that interacts directly with HIV-1 integrase (IN), causing integration to occur across the bodies of actively transcribed sequences.
In previous work we determined 1 million integration sites in HEK293T cells, which, as the largest dataset available, provides a high-resolution map of insertion density across individual genes. Cells expressing LEDGF distribute integration broadly across transcribed sequence. LEDGF-mediated integration across gene sequences is at odds with the original discovery of LEDGF, which showed that in purified systems the factor binds general transcription factors at promoters and functions as a coactivator. To determine whether LEDGF functions at promoters and gene bodies, high-resolution maps of where LEDGF binds chromatin are needed. To understand how LEDGF distributes integration across transcribed sequences, we examined its cellular function in transcription. HEK293T cells CRISPR-edited to lack LEDGF (KO) have reduced expression of 296 genes. We evaluated the role of LEDGF at promoters downregulated in LEDGF KO cells with ChIP-seq by measuring levels of RNA Pol II and H3K4me3, a histone modification closely associated with active promoters. In the LEDGF KO cells the downregulated genes have reduced RNA Pol II enrichment at the transcription start sites (TSSs). To test whether LEDGF directly contributes to promoter activity, we wished to compare amounts of LEDGF to levels of H3K4me3 and RNA Pol II at individual promoters. However, the availability of highly specific ChIP-seq data is limited due to the lack of anti-LEDGF antibodies that are effective in chromatin immunoprecipitation. We overcame this challenge with CRISPR by inserting a biallelic 3XFLAG tag at the C-terminus of LEDGF in its native gene PSIP1 in HEK293T cells. With anti-FLAG antibodies we obtained reproducible, dense ChIP-seq maps showing that LEDGF forms narrow peaks at TSSs. A full 95% and 54% of LEDGF-3XFLAG peaks overlap with H3K4me3 and RNA Pol II enrichment, respectively. Importantly, in cells lacking LEDGF the levels of RNA Pol II at TSSs are significantly reduced.
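The peak-overlap percentages quoted above are the kind of figure that comes from intersecting two sets of genomic intervals. The sketch below shows one simple way such a fraction could be computed; it is not the authors' pipeline. The function name `fraction_overlapping`, the half-open `(chrom, start, end)` tuple format, and the toy coordinates are all hypothetical.

```python
def fraction_overlapping(peaks, regions):
    """Fraction of peaks that overlap at least one region.

    peaks and regions are lists of (chrom, start, end) tuples using
    half-open coordinates, as in BED files. Any overlap of one base
    or more counts as a hit (an assumed criterion).
    """
    if not peaks:
        return 0.0
    # Group regions by chromosome so each peak is only compared
    # against intervals on its own chromosome.
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in peaks:
        candidates = by_chrom.get(chrom, [])
        if any(start < r_end and r_start < end for r_start, r_end in candidates):
            hits += 1
    return hits / len(peaks)


# Toy intervals (not from the study).
ledgf_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
h3k4me3_regions = [("chr1", 150, 300), ("chr2", 200, 300)]
overlap = fraction_overlapping(ledgf_peaks, h3k4me3_regions)
```

The linear scan is adequate for small examples; genome-scale peak sets would normally use a sorted or indexed interval structure.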
In addition, immunoprecipitation of LEDGF revealed an association with RNA Pol II. These data indicate that LEDGF is enriched at the TSSs of transcription units, where it recruits RNA Pol II. In studies to identify how LEDGF is recruited to TSSs, we ectopically expressed segments of LEDGF in cells that lack LEDGF expressed from its endogenous locus (PSIP1). These experiments revealed that the IBD plays a central role in positioning LEDGF at TSSs and that the PWWP domain plays only a minor part in the association of LEDGF across gene bodies. We evaluated candidate factors, including MLL1, for promoting the IBD-dependent association of LEDGF at TSSs. Depleting the expression of MLL1 resulted in reduced LEDGF association at TSSs, and immunoprecipitation experiments confirmed an interaction between MLL1 and LEDGF. The prominent binding of LEDGF at TSSs is unexpected because integration occurs across transcribed sequences and because the enrichment of H3K36me3, which also occurs across transcribed sequences, is thought to position integration by recruiting the N-terminal PWWP domain of LEDGF. We asked whether H3K36me3 is involved in integration across transcribed sequences. In collaboration with the lab of Alan Engelman (Dana-Farber Cancer Institute), we found that HEK293T cells lacking the sole H3K36me3 methyltransferase (SETD2) have integration profiles largely unchanged from WT cells. Together with the finding that the PWWP domain does not contribute to TSS binding of LEDGF, these data indicate that HIV-1 integration is not determined by H3K36me3 but instead results from the recruitment of LEDGF by MLL1 to TSSs. We are currently testing the model that LEDGF at TSSs associates with RNA Pol II in elongation complexes and in this context is recognized by HIV-1 integrase, causing integration across gene bodies.
Removal of long terminal repeat (LTR)-retrotransposons by LTR-LTR recombination impairs stress response

Retrovirus integration has long been known to increase expression of genes, as exemplified by the activation of oncogenes. Recent analyses of endogenous retroviruses in human populations uncovered unexpected levels of polymorphisms due to LTR-LTR recombination that collapses full-length elements into single LTRs. In the lab strain of Schizosaccharomyces pombe we examined the 13 full-length LTR-retrotransposons and found that the frequency of LTR-LTR recombination ranged from 0.9 to 44 per million cells depending on the specific location. By testing the contributions of Rad50, Rad51, and Rad52, we determined that single-strand annealing-mediated DNA repair was responsible for most LTR-LTR recombination. We evaluated the outcome of LTR-LTR recombination by examining growth and global transcription in a strain where all 13 LTR-retrotransposons were replaced with single LTRs (Tf2-KO strain). Relative to WT cells, the growth of Tf2-KO is impaired in the presence of hydrogen peroxide. In various conditions Tf2-KO has altered expression of oxidative stress and TORC1-activated modules of genes. In low-oxygen conditions Tf2-KO has altered expression of 118 genes. Importantly, several of these genes are adjacent to a copy of Tf2. The adjacent genes include global regulators (e.g. pyp2, pib2, and ire1) of oxidative stress and TORC1. These data indicate that LTR-retrotransposons play an integral role in coordinating genome responses to oxidative and anaerobic stress.