Chromatin Remodeling and Gene Activation
Eunice Kennedy Shriver National Institute Of Child Health & Human Development
Abstract
Gene activation involves the recruitment of a set of factors to a promoter in response to appropriate signals, ultimately resulting in the formation of an initiation complex by RNA polymerase II and transcription. These events coincide with the removal of promoter nucleosomes to create a nucleosome-depleted region (NDR). This observation has led to the generally accepted model that promoter nucleosomes physically block transcription initiation, acting as repressors by preventing access to specific transcription factor binding sites. The nucleosome is a very stable structure containing tightly wound DNA that is largely inaccessible to sequence-specific DNA binding proteins. Activation occurs if sequence-specific 'pioneer' transcription factors are present (these proteins bind nucleosomal sites with high affinity), and/or if 'classical' transcription factors, which are normally blocked by nucleosomes, recruit ATP-dependent chromatin remodelers to move or evict promoter nucleosomes, thus facilitating initiation complex formation. The ATP-dependent chromatin remodelers variously move nucleosomes along DNA, remove the histones altogether, or form arrays of regularly spaced nucleosomes. Examples include the SWI/SNF and RSC complexes, which remodel nucleosomes on genes and at promoters, and the CHD and ISWI complexes, some of which are involved in determining nucleosome spacing. The INO80C complex is unusual because it appears to have both remodeling and spacing activities. Human diseases have been linked to chromatin remodeling enzymes. For example, mutations in the hSNF5 subunit of the SWI/SNF complex are strongly linked to pediatric rhabdoid tumors, and the CHD remodelers have been linked to cancer and autism. Therapies and drugs aimed at epigenetic targets are being tested. Thus, a full understanding of chromatin structure and the mechanisms by which it is manipulated is vital.
Although the main effort of the lab is focused on the role of the ATP-dependent chromatin remodeling enzymes in gene regulation, with particular emphasis on how they may control the accessibility of genomic DNA, we are also interested in the central role of sequence-specific transcription factors. Transcription factors usually recognize consensus DNA binding sites, typically 4 to 12 base pairs long, in which some positions may be degenerate (e.g., 'Y' for pyrimidine (C or T) or 'R' for purine (A or G)). The number of matches to such a consensus in the genome is often much higher than the number of binding sites detected empirically using methods such as ChIP-seq. Such consensus sites occur not only in regulatory elements, but also inside genes and elsewhere, where they may or may not be functional. This excess of predicted sites over sites actually bound in vivo has led to the proposal that consensus sites in non-regulatory regions are not bound because they are blocked by chromatin. However, our recent measurements of DNA accessibility in yeast and mouse nuclei imply that all consensus sites are likely to be accessible in some cells within a population. This general but limited accessibility predicts detectable binding at all consensus sites, albeit reduced relative to sites in nucleosome-free DNA. If so, the hypothesis that chromatin prevents transcription factors from binding to consensus sites outside nucleosome-depleted regions is questionable. We have explored an alternative explanation: that consensus site sequences derived from ChIP-seq data may be too degenerate in some cases, such that only a subset of the predicted sites are true sites. We investigated this possibility using the well-studied yeast Gcn4 transcription factor as a model (1). Previously, we published ChIP-seq data for Gcn4 in a collaboration with the Hinnebusch Lab (NICHD).
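To illustrate why a short or degenerate consensus predicts so many sites, the following minimal sketch estimates how often a consensus would match by chance. It is not part of the study: the function names, the 12 Mb genome length, and the assumption of independent, equally frequent bases are all illustrative simplifications (real genomes, including yeast, have biased base composition).

```python
# Number of bases each IUPAC symbol can stand for (only the symbols used here).
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2,  # purine: A or G
    "Y": 2,  # pyrimidine: C or T
    "N": 4,  # any base
}


def site_probability(consensus):
    """Chance that one position matches the consensus.

    Assumes independent, equally likely bases (an illustrative simplification).
    """
    p = 1.0
    for symbol in consensus.upper():
        p *= IUPAC_DEGENERACY[symbol] / 4
    return p


def expected_sites(consensus, genome_length, both_strands=True):
    """Expected number of chance matches in a random sequence of this length.

    Doubling for the second strand is only appropriate when the consensus
    is not its own reverse complement.
    """
    positions = genome_length - len(consensus) + 1
    strands = 2 if both_strands else 1
    return strands * positions * site_probability(consensus)


if __name__ == "__main__":
    GENOME_LENGTH = 12_000_000  # assumed round figure, for illustration only
    for motif in ("TGACTCA", "RTGACTCAY"):
        print(motif, round(expected_sites(motif, GENOME_LENGTH)))
```

Under these assumptions, each additional two-fold degenerate position halves the expected number of matches, so extending a 7-bp core by one R and one Y flank cuts the chance expectation to a quarter.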
In that ChIP-seq study, we derived a consensus sequence for Gcn4 binding that occurs 1,754 times in the yeast genome, yet only 546 of these sites show detectable Gcn4 binding in vivo. To resolve this discrepancy, it is necessary to determine which sites are bound by Gcn4 in the absence of the potential blocking effect of chromatin (i.e., using purified DNA in vitro). Accordingly, we developed a modified SELEX method, which we termed 'G-SELEX', to identify all of the sites bound by Gcn4 in the yeast genome. We used short genomic DNA fragments and purified Gcn4 attached to beads to select DNA fragments containing a Gcn4-bound site. The bound DNA was amplified and incubated with Gcn4 again in a second round of selection; three rounds of selection were performed in total. The final bound product was subjected to paired-end sequencing, and the DNA fragments were mapped to the yeast genome to produce a high-quality coverage map. We identified 2,359 Gcn4-bound sites, but most were bound at very low frequency, corresponding to Gcn4 half-sites. In contrast, the major peaks (high-affinity sites) corresponded to the 7-bp sequence TGACTCA. However, of the 1,078 instances of this sequence in the yeast genome, fewer than half are bound in vitro or in vivo. Further analysis revealed that the bound sites conform to a more extensive consensus, RTGACTCAY: RTGACTCAR and YTGACTCAY sites are bound only weakly, and YTGACTCAR sites are not bound at all. We conclude that the high-affinity site (RTGACTCAY) essentially accounts for Gcn4 binding in vitro and in vivo, irrespective of whether the site is located in a nucleosome-depleted promoter or inside a gene assembled into nucleosomes. More generally, we propose that transcription factor binding sites need to be defined more precisely using quantitative data, which should result in more accurate genome-wide prediction of real binding sites and greater insight into gene regulation.
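The flank rule described above can be sketched as a simple sequence scan. This is an illustrative reading of the text, not the lab's analysis pipeline: the function names and the three category labels are hypothetical, and the scan assumes an unambiguous A/C/G/T sequence.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
CORE = "TGACTCA"  # 7-bp core reported for the major G-SELEX peaks

# Binding categories as summarized in the text (labels are illustrative).
BINDING = {
    "RTGACTCAY": "high affinity",
    "RTGACTCAR": "weak",
    "YTGACTCAY": "weak",
    "YTGACTCAR": "not bound",
}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def flank_class(nine_mer):
    """Reduce the two flanking bases of N-TGACTCA-N to purine (R) or pyrimidine (Y)."""
    left = "R" if nine_mer[0] in "AG" else "Y"
    right = "R" if nine_mer[-1] in "AG" else "Y"
    return f"{left}{CORE}{right}"


def classify_sites(seq):
    """Return (start, strand, class, category) for every flanked core on either strand.

    'start' is the 0-based forward-strand coordinate of the leftmost base
    of the 9-bp window.
    """
    seq = seq.upper()
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead lets the scan report overlapping windows as well.
        for match in re.finditer(rf"(?=([ACGT]{CORE}[ACGT]))", template):
            start = match.start()
            if strand == "-":
                start = len(seq) - start - 9
            cls = flank_class(match.group(1))
            hits.append((start, strand, cls, BINDING[cls]))
    return sorted(hits)
```

For example, `classify_sites("CCATGACTCATCC")` reports a single plus-strand RTGACTCAY site at position 2. Because TGACTCA is not its own reverse complement, both strands are scanned separately.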
Overall, this study, together with our previous studies, suggests that the prevailing model, in which chromatin is a general block to gene expression unless specific transcription activators are present, may be incorrect. Our current studies are aimed at resolving this issue.