The three-dimensional organization of the genome is closely related to its function. Interactions between genomic regions located at large distances from each other have been detected within the chromosomes of different organisms, which led to the discovery of topologically associated domains (TADs). Methods that reveal such interactions between chromosomal loci rely on the detection of both protein-protein and protein-DNA interactions. Using sequence analysis, we investigated whether direct DNA-DNA interactions may be involved in the structural and functional organization of the Drosophila melanogaster chromosomal locus 87A7, which contains the hsp70Aa and hsp70Ab genes. Our results indicate that the functional organization of the 87A7 locus may involve different elements: chromosomal DNA fragments that attach chromosomes to the nuclear envelope, short polypurine/polypyrimidine tracts, and insulators with their proteins. Combinations of interactions between these elements may give rise to different functional states of the 87A7 locus.
In recent years, great interest has been attracted to interactions between genomic regions of interphase chromosomes that are located at large distances from each other (in terms of the linear dimensions of the chromosomal DNA) [
TADs were identified using various modifications of the 3C-method (Chromosome Conformation Capture). Formaldehyde fixation of protein-protein and protein-DNA interactions is one of the first steps of the method [
However, one cannot exclude the involvement of direct DNA-DNA interactions in the structural and functional organization of chromosomes [
The question that still needs to be elucidated is how the “borders” between the adjacent TADs (including the nucleotide sequences of the chromosome regions) are organized [
Earlier [
TADs are not detected both in mitotic chromosomes [
Despite the identification of TADs in many organisms and many studies in this field, much remains unclear. The data on TADs give mostly a general idea about the interactions between the different sections of DNA. A detailed study of the nucleotide sequences, transcription, epigenetic status as well as higher levels of chromatin organization can help to create a complete picture of the processes taking place in the genome.
This study is devoted to the D. melanogaster heat shock gene (hsp70) locus. These genes encode proteins of the hsp70 family, which protect the cell by preventing denaturation and aggregation of proteins. It is known that heat shock genes are expressed regardless of the surrounding chromatin [
The D. melanogaster chromosomal locus 87A7 contains the hsp70Aa and hsp70Ab genes. The analyzed segment (3R: 11,904,163 .. 12,006,544) is 102,382 bp long (FlyBase, release 6.0).
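The stated length is consistent with an inclusive coordinate convention, in which both end positions are counted. A minimal sketch of that arithmetic follows; the function name `segment_length` is ours and is introduced only for illustration.

```python
def segment_length(start: int, end: int) -> int:
    """Length in bp of a segment given 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Segment of chromosome arm 3R analyzed in this study (coordinates from the text).
LOCUS_START, LOCUS_END = 11_904_163, 12_006_544

print(segment_length(LOCUS_START, LOCUS_END))  # 102382
```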
The search for nucleotide sequences able to form three-stranded DNA structures used the following criterion: polypurine/polypyrimidine tracts had to be potentially able to form T(AT), A(AT), C(GC), and G(GC) triplexes at least 9 bp long. To simplify the modelling, the search for complementary polypurine/polypyrimidine tracts was carried out on the same DNA strand [
The chromosomal segment of D. melanogaster chromosome 3 (3R: 11,904,163 .. 12,006,544) is 102,382 bp in length with 17 genes located within (from CG14731 gene to Ect3 gene) (
The scale and representation of genes are taken from FlyBase. Arrows show the direction of transcription. The functional elements identified in the hsp70 gene locus are marked with symbols: black squares, poly-A/T tracts; triangles, class II insulators (Su(Hw)); circles, class I insulators (CP190, BEAF32, CTCF); asterisks, (AAAGA)2 tracts; rhombi, scs/scs’ elements. The border between physical domains is marked with vertical dashed lines. Diagonal hatching represents the areas of frequent intra-domain interactions. Diagonal cells denote the regions that lie within the loop for the hsp70 genes or within the single loop for the whole locus; black rectangles denote the bases of the respective loop structures.
The insulators of the hsp70 gene locus are localized as follows. The scs element (3R: 11,948,272 .. 11,950,063) overlaps the promoter of the CG31211 gene and includes tracts Y82-Y84 and R108-R110, which have 31 complementary tracts in the analyzed region. Another insulator, the scs’ element, is located at 3R: 11,962,738 .. 11,963,832 and overlaps the promoters of the CG3281 and aurora genes. This region includes tracts R123 and R124, and there is only one polypyrimidine tract complementary to R124 (data not shown). FlyBase (flybase.org) shows two types of recognition sites for insulator proteins at the locus. The first type belongs to class II.mE01 insulators (containing recognition sites for the Su(Hw) protein): insulator_II_2339 and insulator_II_2340. The second type belongs to class I.mE01 insulators (containing recognition sites for two of the three insulator proteins CP190, BEAF32 and CTCF): insulator_I_3071, insulator_I_3072, insulator_I_3073, insulator_I_3074 (
| Category | Element | Coordinates |
|---|---|---|
| Insulators | Insulator class_II.mE01 (flybase.org) | FBsf0000154391 (11,913,012 .. 11,913,022) |
| | | FBsf0000154392 (12,005,260 .. 12,005,270) |
| | Insulator class_I.mE01 (flybase.org) | FBsf0000158034 (11,949,333 .. 11,949,343) |
| | | FBsf0000158035 (11,962,949 .. 11,962,959) |
| | | FBsf0000158036 (11,967,340 .. 11,967,350) |
| | | FBsf0000158037 (11,974,832 .. 11,974,842) |
| | scs | 11,948,272 .. 11,950,063 |
| | scs’ | 11,962,738 .. 11,963,832 |
| Conservative tract of EnvM4 fragment (AAAGA)2 | | 1) 11,938,828 .. 11,938,838 2) 11,998,731 .. 11,998,742 |
| (GAGA)2 tracts | | 1) 11,920,264 .. 11,920,272 2) 11,923,379 .. 11,923,387 3) 11,937,609 .. 11,937,616 4) 11,937,706 .. 11,937,714 5) 11,947,759 .. 11,947,773 6) 11,947,966 .. 11,947,974 7) 11,971,004 .. 11,971,014 8) 11,973,193 .. 11,973,200 9) 11,980,525 .. 11,980,533 10) 11,985,662 .. 11,985,671 |
| Poly (A/T) tracts | | 1) 11,904,451 .. 11,904,462 2) 11,907,560 .. 11,907,569 3) 11,919,007 .. 11,919,020 4) 11,929,000 .. 11,929,017 5) 11,941,394 .. 11,941,420 6) 11,948,318 .. 11,948,326 7) 11,954,049 .. 11,954,058 8) 11,988,599 .. 11,988,615 9) 11,990,094 .. 11,990,098 10) 11,990,436 .. 11,990,454 11) 11,992,431 .. 11,992,443 12) 11,996,508 .. 11,996,516 13) 11,997,711 .. 11,997,719 14) 11,999,479 .. 11,999,489 15) 12,004,298 .. 12,004,306 16) 12,004,671 .. 12,004,680 17) 11,911,373 .. 11,911,382 18) 11,911,858 .. 11,911,869 19) 11,916,814 .. 11,916,826 20) 11,917,018 .. 11,917,027 21) 11,919,252 .. 11,919,261 22) 11,928,709 .. 11,928,721 23) 11,937,563 .. 11,937,572 24) 11,939,459 .. 11,939,473 25) 11,939,860 .. 11,939,871 26) 11,941,989 .. 11,941,999 27) 11,948,092 .. 11,948,104 28) 11,951,658 .. 11,951,667 29) 11,988,284 .. 11,988,294 30) 11,991,175 .. 11,991,183 31) 12,000,454 .. 12,000,469 |
| Sites with the potential to form a single looped structure (with the help of triple-stranded DNA) | | (11,938,828 .. 11,949,416) .. (11,994,767 .. 11,998,654) |
| Sites with the potential to form the hsp70 gene loop (with the help of triple-stranded DNA) | | (11,938,916 .. 11,948,717) .. (11,967,091 .. 11,967,605) |
| | | (11,938,828 .. 11,948,218) .. (11,968,206 .. 11,971,622) |
| TADs localization [ | | |
| Localization of physical domains (modelling) | Domain ID 748 | 11,821,528 .. 11,950,027 (Null) |
| | Domain ID 749 | 11,950,028 .. 12,219,527 (Null) |
| Areas of intra-domain interactions (±5 kb) (experimental data) | | 11,929,278 .. 11,934,278 |
| | | 11,959,278 .. 11,969,278 |
| | | 11,974,278 .. 11,994,278 |
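
The coordinates in the table can be cross-checked with simple interval arithmetic. The sketch below is illustrative only. It assumes 1-based inclusive coordinates, and it assumes that the size of a loop is measured from the start of its left anchor region to the end of its right anchor region. The helper names `contains` and `outer_span` are ours, not part of any published tool.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def outer_span(left: Interval, right: Interval) -> int:
    """Length in bp from the start of the left anchor to the end of the right anchor."""
    return right[1] - left[0] + 1


# Coordinates copied from the table.
SCS: Interval = (11_948_272, 11_950_063)
SCS_PRIME: Interval = (11_962_738, 11_963_832)
CLASS_I_SITES: Dict[str, Interval] = {
    "FBsf0000158034": (11_949_333, 11_949_343),
    "FBsf0000158035": (11_962_949, 11_962_959),
    "FBsf0000158036": (11_967_340, 11_967_350),
    "FBsf0000158037": (11_974_832, 11_974_842),
}
LOOP_ANCHORS: Dict[str, Tuple[Interval, Interval]] = {
    "single loop": ((11_938_828, 11_949_416), (11_994_767, 11_998_654)),
    "hsp70 loop, variant 1": ((11_938_916, 11_948_717), (11_967_091, 11_967_605)),
    "hsp70 loop, variant 2": ((11_938_828, 11_948_218), (11_968_206, 11_971_622)),
}

# Which class I sites fall inside the scs or scs' element?
for site_id, site in CLASS_I_SITES.items():
    print(site_id, "in scs:", contains(SCS, site), "in scs':", contains(SCS_PRIME, site))

# Outer span of each potential loop.
for name, (left, right) in LOOP_ANCHORS.items():
    print(name, outer_span(left, right), "bp")
```

Under these assumptions the outer spans are 59,827 bp for the single loop and 28,690 bp and 32,795 bp for the two hsp70 loop variants, in line with the sizes of about 60 kb and about 30 kb given in the text. The first two class I sites fall inside scs and scs’, respectively.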
Sequence analysis of the region revealed a large number of single GAGA and AAAGA tracts, but only 10 and 2 dimers, respectively (
The formation of the DNA loop structures (bending of the DNA molecule) is facilitated by poly-A and poly-T tracts [
Accordingly, a detailed analysis of possible loop formation was carried out in the region flanked by (AAAGA)2 tracts on both sides. In total, 55 polypyrimidine tracts capable of interacting with 189 polypurine tracts were found in the region.
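The first step of such an analysis, locating polypurine and polypyrimidine runs of at least 9 bp on one strand, can be sketched as follows. This is an illustration and not the program used in this study: the function name `find_tracts`, the `Tract` record and the toy sequence are assumptions, and the pairing of tracts into potential triplexes is not modelled here. The labels R (purine) and Y (pyrimidine) follow the IUPAC nucleotide codes.

```python
import re
from typing import List, NamedTuple


class Tract(NamedTuple):
    kind: str   # "R" = polypurine (A/G), "Y" = polypyrimidine (C/T)
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    seq: str


def find_tracts(sequence: str, min_len: int = 9) -> List[Tract]:
    """Return all maximal polypurine and polypyrimidine runs of at least min_len bp."""
    seq = sequence.upper()
    patterns = {
        "R": f"[AG]{{{min_len},}}",  # e.g. "[AG]{9,}"
        "Y": f"[CT]{{{min_len},}}",  # e.g. "[CT]{9,}"
    }
    tracts: List[Tract] = []
    for kind, pattern in patterns.items():
        for match in re.finditer(pattern, seq):
            # match.start() is 0-based; match.end() is exclusive,
            # so it already equals the 1-based inclusive end position.
            tracts.append(Tract(kind, match.start() + 1, match.end(), match.group()))
    return sorted(tracts, key=lambda t: t.start)


# Toy sequence (hypothetical, not taken from the 87A7 locus).
toy = "ACGT" + "AGAGAGAGAA" + "TCTCTCTCTT" + "ACGT"
for tract in find_tracts(toy):
    print(tract.kind, tract.start, tract.end, tract.seq)
```

Because the regular-expression quantifier is greedy, each reported run is maximal, so a long tract is returned once and not as several overlapping 9 bp windows.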
The first question we were interested in was whether the entire chromosomal region located between the two conservative (AAAGA)2 dimers can form a loop by means of three-stranded DNA structures. This turned out to be potentially possible, with 11 variants of such loops. Four polypyrimidine and 14 polypurine tracts in the region of about 10.5 kb between the CG14731 and CG31211 genes (3R: 11,938,828 .. 11,949,416) have complementary polypurine (4) and polypyrimidine (7) tracts in the region of about 4 kb between the mfas and Ect3 genes (3R: 11,994,767 .. 11,998,654). The size of these potential loop structures is about 60 kb. Interestingly, the sites with the potential to form such looped structures are flanked by conservative (AAAGA)2 tracts and are rich in poly-A and poly-T tracts (
The hsp70Aa and hsp70Ab genes can also be organized in a smaller loop (about 30 kb). Two such looped structures are potentially possible. In the first, 6 tracts located between the CG14731 and CG31211 genes (11,938,916 .. 11,948,717) are complementary to 4 tracts in the CG12213 gene (11,967,091 .. 11,967,605); in this case, the CG31211, hsp70Aa, hsp70Ab, CG3281, and aurora genes lie within the loop. In the second, 11 tracts located between the CG14731 and CG31211 genes (11,938,828 .. 11,948,218) are complementary to 5 tracts in the CG18347 gene (11,968,206 .. 11,971,622); the loop then contains the CG31211, hsp70Aa, hsp70Ab, CG3281, aurora, and CG12213 genes. Since the tracts between the CG14731 and CG31211 genes largely overlap, only one of the two loops can potentially be realized at a time (
Interestingly, polypurine/polypyrimidine tracts located in the 5’-region of the locus also overlap when forming a single loop structure and smaller loops of the analyzed chromosome segment (
It can be assumed that a domain of active hsp70 genes is formed in two steps. In the first step, a loop is formed with the participation of polypurine/polypyrimidine tracts (two types of loops are possible). In the second step, the BEAF32 and Zw5 proteins of the scs/scs’ elements participate (
Step 1. Upon formation of the first loop (see above), the CG31211, hsp70Aa, hsp70Ab, CG3281, and aurora genes lie within the loop. Upon formation of the second loop (see above), the loop contains the CG31211, hsp70Aa, hsp70Ab, CG3281, aurora, and CG12213 genes. Since the tracts between the CG14731 and CG31211 genes largely overlap, only one of the two loops can potentially be realized at a time.
Step 2. When the hinge structure forms at step 1, the scs and scs’ elements come sufficiently close to each other (a small distance apart in the nuclear volume). This may improve the conditions for interaction between the BEAF32 and Zw5 proteins, which has been shown in vitro and in vivo [
To analyze Drosophila TADs we used data by Sexton et al. [
The results of our study show that short polypurine/polypyrimidine tracts may be involved in the loop organization of the D. melanogaster chromosomal locus containing the hsp70 genes. The (AAAGA)2 tracts appear to be the most significant nucleotide sequences determining the domain at the 87A7 locus. These tandem tracts are extremely highly conserved in evolution and are localized mainly in intergenic regions [
The entire locus region located between two (AAAGA)2 tracts can be arranged into loops with the help of complementary polypurine/polypyrimidine tracts (
The comparison of our data (
According to Hi-C [
A large number of differentially expressed genes is localized between conservative (AAAGA)2 tracts of 87A7 locus (
It cannot be excluded that insulators carry out functional domain formation most fully within these chromosomal domains, and that their “work” outside these domains may be impeded. This assumption is supported by our experimental data obtained with a double-gene transgenic system (the Drosophila yellow and white genes). We have shown that neDNA fragments (EnvM4) flanking the two reporter genes are able to protect them from position effect variegation (PEV) in the host chromosomes of D. melanogaster. The maximum protective effect was observed in the presence of the Wari insulator located in the 3’-region of the white gene. When neDNA fragments flank only one of the reporter genes (white + Wari), that gene is protected against PEV to a greater extent than the other gene (yellow), which is not flanked by neDNA fragments [
The results of our analysis of the loop chromatin organization of the 87A7 locus can logically explain some experimental facts. Firstly, scs/scs’ elements are not able to protect a transgene from PEV when integrated into heterochromatin [
Secondly, upon heat shock the scs/scs’ elements are localized not at the puff boundaries but inside the puff [
Using the D. melanogaster chromosomal locus 87A7 as a model, we have shown that various elements could be involved in the structural and functional organization of chromosomes: short polypurine/polypyrimidine tracts, insulators and their proteins, and regions that attach chromosomes to the nuclear envelope. That is, not only DNA-protein but also DNA-DNA interactions may be involved. Combinations of interactions between these elements can determine alternative states of the chromosomal locus and of the genes belonging to it. Moreover, the same structural and functional state of the locus can be achieved through several variants of DNA-DNA interactions, which may underlie self-regulation of the locus and, in a wider sense, the “stability” of the genetic system.
This work was supported by a grant from Subprogramme “Gene pools of wildlife and its preservation” of Presidium RAS Programme for Basic Research “Biodiversity of natural systems”.
Glazkov, M.V. and Shabarina, A.N. (2016) Loop Structures and Barrier Elements from D. melanogaster 87А7 Heat Shock Locus. Computational Molecular Bioscience, 6, 53-65. http://dx.doi.org/10.4236/cmb.2016.64005
TADs: Topologically Associated Domains; neDNA: chromosomal DNA fragments that attach chromosomes to the nuclear envelope; LINE: Long Interspersed Repeat Sequences; SINE: Short Interspersed Repeat Sequences; LADs: Lamina Associated Domains; scs/scs’: specialized chromatin structures; FISH: Fluorescence in situ Hybridization.