Genome-wide analysis of chromatin modifications
National Heart, Lung, And Blood Institute
Abstract
We have previously studied the contribution of histone modifications, DNA methylation, and their regulatory enzymes to transcriptional regulation in a variety of cellular systems. Recent studies have suggested that gene expression is heterogeneous even within the same cell population. The question is whether chromatin states are similarly heterogeneous in apparently "identical" cells. To address this question, we developed the single-cell DNase-seq technique, which can be used to detect chromatin states in single cells or small numbers of primary cells. By applying this technique to NIH3T3 and mouse ES cells, we showed that the heterogeneity of chromatin accessibility underlies the heterogeneity of gene expression across different cells. We also demonstrated its application in identifying potential functional mutations in human cancers. To develop more technologies for analyzing mammalian epigenomes, we developed 3e Hi-C for analyzing the three-dimensional organization of the nucleus (Ren et al., Mol Cell 2017). To characterize genome-wide enhancer-promoter interactions at high resolution, we developed a novel technique, Transposition-mediated Analysis of Chromatin Looping (TrAC-looping) (Nature Methods, in press). To analyze the epigenome at the single-cell level, in addition to the previous single-cell DNase-seq assay (Nature 2015), we have developed single-cell MNase-seq (scMNase-seq) for the analysis of genome-wide nucleosome positions (Nature 2018).
Application of scMNase-seq to NIH3T3 cells, mouse primary naive CD4 T cells, and mouse embryonic stem cells reveals two principles of nucleosome organization. First, nucleosomes in heterochromatin regions, or those that surround the transcription start sites of silent genes, show large variation in positioning across different cells but are highly uniformly spaced along the nucleosome array. Second, nucleosomes that surround the transcription start sites of active genes and DNase I hypersensitive sites show little variation in positioning across different cells but are relatively heterogeneously spaced along the nucleosome array. We found a bimodal distribution of nucleosome spacing at DNase I hypersensitive sites, which corresponds to inaccessible and accessible states and is associated with nucleosome variation and variation in accessibility across cells. Nucleosome variation is smaller within single cells than across cells, and smaller within the same cell type than across cell types. A large fraction of naive CD4 T cells and mouse embryonic stem cells show depleted nucleosome occupancy at the de novo enhancers detected in their respective differentiated lineages, revealing the existence of cells primed for differentiation to specific lineages in undifferentiated cell populations. Furthermore, we have developed single-cell ChIC-seq and ACT-seq for the analysis of histone modifications at the single-cell level. To profile chromatin accessibility in a large number of single cells, we have now developed a novel indexing strategy that resolves single-cell DNase hypersensitivity profiles from bulk cell analysis. This new technique, termed indexing single-cell DNase sequencing (iscDNase-seq), employs the activities of terminal deoxynucleotidyl transferase (TdT) and T4 DNA ligase to add unique cell barcodes to DNase-digested chromatin ends. With a three-layer indexing strategy, it profiles genome-wide DNase I hypersensitive sites (DHSs) in more than 15,000 single cells in a single experiment.
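The logic of a three-layer indexing scheme can be illustrated with a minimal sketch. The abstract does not state how many barcodes are used per layer or how cells are distributed among them, so the 96 barcodes per layer, the uniform random assignment, and all function names below are hypothetical choices made only for illustration. The sketch shows how three barcode layers multiply into a large composite barcode space, how often two cells would be expected to share a composite barcode under those assumptions, and how reads could be grouped back into cells.

```python
import math
from collections import defaultdict


def barcode_space(layer_sizes):
    """Number of distinct composite barcodes given the barcode count per layer."""
    return math.prod(layer_sizes)


def expected_collision_fraction(n_cells, n_barcodes):
    """Expected fraction of cells that share a composite barcode with at least
    one other cell, ASSUMING every cell draws its composite barcode uniformly
    at random (an idealization, not a description of the actual protocol)."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


def demultiplex(reads):
    """Group cut-site positions by composite (layer1, layer2, layer3) barcode.

    Each read is a tuple: (bc1, bc2, bc3, chrom, position).
    """
    cells = defaultdict(list)
    for bc1, bc2, bc3, chrom, pos in reads:
        cells[(bc1, bc2, bc3)].append((chrom, pos))
    return dict(cells)


# Hypothetical layer sizes: 96 barcodes in each of three layers.
layers = (96, 96, 96)
space = barcode_space(layers)
print("composite barcodes:", space)
print("expected collision fraction for 15,000 cells:",
      round(expected_collision_fraction(15000, space), 4))

# Toy reads (made up for illustration): two distinct composite barcodes.
toy_reads = [
    ("A01", "B07", "C12", "chr1", 10500),
    ("A01", "B07", "C12", "chr1", 98211),
    ("A03", "B07", "C12", "chr2", 45120),
]
for barcode, sites in demultiplex(toy_reads).items():
    print(barcode, sites)
```

Under these assumed layer sizes, the composite barcode space is far larger than the number of cells profiled, which is what keeps the expected rate of two cells sharing a barcode low.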
Application of iscDNase-seq to human white blood cells accurately revealed specific cell types and inferred the regulatory transcription factors (TFs) specific to each cell type. We found that iscDNase-seq detected DHSs with specific properties related to gene expression and conservation that were missed by scATAC-seq for the same cell type. We also found that the cell-to-cell variation in accessibility computed from iscDNase-seq data is significantly correlated with the cell-to-cell variation in gene expression. Importantly, this correlation is significantly higher than that between scATAC-seq and scRNA-seq, suggesting that iscDNase-seq data predict cellular heterogeneity in gene expression better than scATAC-seq data. Thus, iscDNase-seq is an attractive alternative method for single-cell epigenomics studies. Similarly, we developed iscChIC-seq to simultaneously profile histone modifications and TF binding in a large number of single cells. Co-occupancy of different epigenetic marks or protein factors at the same genomic locations must often be inferred from multiple independently collected data sets. However, because of cellular heterogeneity, this strategy does not provide direct evidence of co-enrichment in the same cells. To address this issue, we have developed a technique termed ACT2-seq that is capable of concurrently profiling multiple epigenetic marks in a single biological sample. In addition to reducing the number of samples required for experiments, ACT2-seq can map the co-occupancy of epigenetic factors on chromatin. This strategy provides direct evidence of co-enrichment without requiring complex single-molecule, single-cell, or magnetic bead-based approaches. Using ACT2-seq, we identified distinct relationships between co-occupancy of specific histone modifications and gene expression patterns.
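One way to relate cell-to-cell variation in accessibility to cell-to-cell variation in expression is sketched below. The abstract does not say which variation measure or correlation statistic was used, so the coefficient of variation, the Pearson correlation, the function names, and the toy numbers are all assumptions made for illustration and are not results from the study.

```python
import math


def coefficient_of_variation(values):
    """Standard deviation divided by mean across cells (0.0 if the mean is 0)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


def variation_correlation(accessibility, expression):
    """Correlate per-gene variation in accessibility with per-gene variation
    in expression. Both inputs map a gene name to a list of per-cell values;
    the two modalities do not need to come from the same cells."""
    genes = sorted(set(accessibility) & set(expression))
    acc_cv = [coefficient_of_variation(accessibility[g]) for g in genes]
    exp_cv = [coefficient_of_variation(expression[g]) for g in genes]
    return pearson(acc_cv, exp_cv)


# Toy per-cell values, made up purely to exercise the functions.
toy_accessibility = {
    "geneA": [1, 1, 1, 1],
    "geneB": [0, 2, 0, 2],
    "geneC": [0, 0, 0, 4],
}
toy_expression = {
    "geneA": [5, 5, 5, 5],
    "geneB": [2, 8, 2, 8],
    "geneC": [0, 0, 0, 20],
}
print("correlation of variation:",
      round(variation_correlation(toy_accessibility, toy_expression), 3))
```

Running the same calculation with two different accessibility assays against one expression data set would give two correlation values that can then be compared, which is the kind of comparison described above.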
Investigating chromatin interactions between regulatory regions such as enhancer and promoter elements is vital for understanding the regulation of gene expression. Compared to Hi-C and its variants, emerging 3D mapping technologies that focus on enriched signals, such as TrAC-looping, reduce sequencing cost and provide higher interaction resolution for cis-regulatory elements. A robust pipeline is needed for the comprehensive interpretation of these data, especially for loop-centric analysis. Therefore, we have developed a new versatile tool named cLoops2 for the full-stack analysis of these 3D chromatin interaction data. cLoops2 consists of core modules for peak calling, loop calling, differentially enriched loop calling, and loop annotation. It also contains multiple modules for interaction resolution estimation, data similarity estimation, feature quantification, feature aggregation analysis, and visualization. The three-dimensional genomic structure plays a critical role in gene expression, cellular differentiation, and pathological conditions. Elucidating fine-scale chromatin architecture, especially interactions between regulatory elements, is pivotal for understanding the spatiotemporal regulation of gene expression. We developed Hi-TrAC as a proximity ligation-free, robust, and sensitive technique to profile genome-wide chromatin interactions among regulatory elements at high resolution. Hi-TrAC detects chromatin looping among accessible regions at single-nucleosome resolution. With almost half a million identified loops, we reveal a comprehensive interaction network of regulatory elements across the genome. After integrating the chromatin binding profiles of transcription factors, we discovered that the cohesin complex and CTCF are responsible for organizing long-range chromatin loops, which are related to domain formation, whereas ZNF143 and HCFC1 are involved in structuring short-range chromatin loops between regulatory elements, which directly regulate gene expression.
Thus, we introduce a methodology to identify a delicate and comprehensive network of cis-regulatory elements, revealing the complexity and a division of labor of transcription factors in organizing chromatin loops for genome organization and gene expression.