
Deciphering novel binary CTCF code encrypted in Host and Proviral Epigenomes by Distinct Classes of CTCF & BORIS Binding Sites

$643,928 | ZIA | FY2017 | AI | NIH

National Institute Of Allergy And Infectious Diseases

Abstract

CTCF, a highly conserved DNA-binding protein, serves as a global organizer of chromatin architecture. CTCF is involved in the regulation of transcriptional activation and repression, gene imprinting, control of cell proliferation and apoptosis, chromatin compartmentalization, X-chromosome inactivation, prevention of trinucleotide repeat expansions, and other chromatin-resident processes. It took us over 20 years of CTCF studies to persuade others that the multifunctionality of CTCF is indeed based on the ability of its highly conserved 'multivalent' 11-zinc-finger (ZF) DNA-binding domain (DBD) to bind a wide range of diverse DNA sequences, as well as on its intrinsic capacity to interact with a partner protein through the combinatorial usage of DNA-contacting versus protein-contacting fingers. Last year, a similar multivalency was demonstrated for another poly-ZF array, that of the Drosophila Su(Hw) factor. With the advent of next-generation sequencing techniques, CTCF binding sites have been identified across the fly, mouse, and human genomes. Reflecting the multitude of CTCF functions, many thousands of non-homologous CTCF target sequences (CTSes) were found to be associated with genomic regions engaged in long-range chromatin interactions, including enhancers, promoters, and intergenic boundary elements. It remains obscure, however, how the DNA sequences of given CTSes are related to the specific CTCF functions at these sites. This year we have made advances toward understanding the multifunctionality of CTCF/DNA complexes. By mapping simultaneous CTCF and BORIS occupancy genome-wide, we uncovered two classes of CTCF binding regions that are pre-programmed and evolutionarily conserved in DNA sequence. We found that 70% of CTCF-bound regions enclose a single CTCF binding site (1xCTSes), while the other 30% of CTCF-binding regions, although detected by ChIP-seq as single peaks, in fact contain two adjacent CTCF binding sites (binary 2xCTSes).
Occupancy of adjacent CTSes within binary 2xCTS regions constrains two adjacent CTCF proteins to form homodimers in normal somatic cells, or allows CTCF+BORIS heterodimers to assemble, co-bound at the same DNA site, in germ and cancer cells that co-express BORIS in addition to CTCF. The recent breakthrough discovery of 2xCTS regions (unresolvable by standard CTCF ChIP-seq) enabled us, for the first time, to address the long-standing question of how CTCF can serve, within the same nucleus, as a bona fide transcription factor while maintaining a substantial presence at putative insulator/boundary sites that bear no indications of transcriptional activity. Indeed, only 20% of all CTCF binding regions are located in promoter regions in any given cell type, while the remaining CTSes are not associated with transcriptional start sites. The obvious candidates for the determinants of such distinct functional roles would be the DNA sequences themselves and/or the differential identity of chromatin at these two types of sites. In our study we presented genome-wide evidence that the DNA sequences underlying the two types of CTCF target sites are structurally different. The structural difference between the two classes of CTCF binding sites is connected to a functional difference: 2xCTSes are preferentially located at active promoters and enhancers and are associated with retained histones in human and mouse sperm, in stark contrast to genomic regions harboring a single CTCF binding site. A new finding, reported in August 2017, on pathologies in human CTCF+/- subjects strongly suggested that CTCF haploinsufficiency might induce aberrant methylation at CTCF binding sites (altering gene expression, etc.), similar to our previous results in Ctcf+/- mice (Kemp, Lobanenkov, and Filippova).
Hence, it is possible that there is a common underlying patho-mechanism for the disorders caused by CTCF deletions, distinct from the complete loss of function that we first reported in the early 2000s. Similar patho-mechanisms therefore seem to underlie both the human and the mouse genetic disorders caused by insufficient CTCF dosage, exclusive of additional ZF mutations that, even in tumors with 16q22 loss of heterozygosity (LOH), would cause a complete rather than partial loss of DNA interactions with the multivalent CTCF. Next, our studies of the binary 2xCTS code challenge a perception prevalent in the current literature that all CTCF sites are equivalent, with a single CTCF molecule bound at a single CTS, despite the fact that they may contain two adjacent DNase I footprints. Finally, the non-random placement of sperm nucleosomes selectively into protamine-free DNA zones was found to be predetermined by the nucleotide context of the same 2xCTS-containing CTCF elements that are normally co-bound by the two 11-ZF paralogs, CTCF and BORIS, which are co-expressed in adult post-meiotic round spermatids. Taken together, our results provide a global view of chromatin dynamics and a resource for studying long-range control of gene expression in distinct human cell lineages. They also explain why, out of a multitude of transcription factors, only CTCF has been recognized as a universal epigenetic mark that, like modified histones and DNA, is present in all cell types at functionally distinct regions.
