We analyzed genome sequences to examine the presence or absence of two types of Z-DNA binding domains in various organisms: 68 archaea, 914 bacteria, and 199 eukaryotes. The RecA protein from Escherichia coli, which promotes homologous recombination, has a Z-DNA binding domain. All the organisms examined had this domain, which indicates that it is essential for all organisms. The RNA editing enzyme adenosine deaminase from human has another type of Z-DNA binding domain. This domain was observed in only some organisms of archaea, bacteria, and eukaryotes. Its presence or absence indicates that the domain has been gained and lost in the course of evolution. The implications of the presence and absence of this domain are discussed in this study.
Double-stranded DNA is in equilibrium between right-handed B-DNA and left-handed Z-DNA. The B-DNA form is dominant, and the Z-DNA form makes only a small contribution to the equilibrium. Z-DNA is reported to be stabilized by cations and anions, dehydrating solvents, numerous covalent modifications of DNA, negative supercoiling, and Z-DNA binding proteins [
We assumed that the presence or absence of Z-DNA binding domains would give clues to the function and evolution of Z-DNA binding proteins. We expected that a survey of genome sequences would reveal the presence or absence of these domains, because a genome sequence encodes all the proteins of an organism. We therefore analyzed genome sequences for two types of Z-DNA binding domains in various organisms from archaea, bacteria, and eukaryotes using the database of genomes to protein structures and functions (GTOP) [17, 18]. GTOP provides protein annotations of 3D structures and functions based on homology searches against the Protein Data Bank [19, 20] and the Structural Classification of Proteins (SCOP) [
The presence or absence of a Z-DNA binding domain in each organism was determined using the GTOP database, which contains protein fold predictions based on homology searches against protein sequences of known structure. If there was a homologous hit for the Z-DNA binding domain with an e-value less than 10⁻¹⁰, the organism was judged to have the domain. If there was no hit, the domain was considered absent from the organism.
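The decision rule above can be sketched in a few lines of Python. This is only an illustration, assuming the GTOP search results have already been collected as (organism, SCOP code, e-value) records. The `Hit` record layout, the function name `has_domain`, and the example values are hypothetical and are not part of GTOP. The sketch also assumes that a hit at or above the cutoff is treated the same as no hit.

```python
from typing import Iterable, NamedTuple

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


class Hit(NamedTuple):
    """One homology hit. This record layout is an assumption, not a GTOP format."""
    organism: str
    scop_code: str
    e_value: float


def has_domain(hits: Iterable[Hit], organism: str, scop_code: str,
               cutoff: float = E_VALUE_CUTOFF) -> bool:
    """Return True if the organism has at least one hit for the domain below the cutoff."""
    return any(
        h.organism == organism and h.scop_code == scop_code and h.e_value < cutoff
        for h in hits
    )


# Made-up records, for illustration only.
example_hits = [
    Hit("organism A", "a.4.5.19", 3e-15),
    Hit("organism B", "a.4.5.19", 2e-4),    # above the cutoff, so not counted
    Hit("organism B", "c.37.1.11", 1e-80),
]

print(has_domain(example_hits, "organism A", "a.4.5.19"))
print(has_domain(example_hits, "organism B", "a.4.5.19"))
```

An organism with no records at all is reported as lacking the domain, which matches the "no hit" case described above.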
The amino acid sequences of Z-DNA binding domain of E. coli recA protein and that of human ADAR protein from GTOP are shown in
GTOP adopts the SCOP classification of protein structures, in which the unit of classification is usually the protein domain. SCOP organizes protein structures according to evolutionary origin and structural similarity. Protein domains are classified on four hierarchical levels: class, fold, superfamily, and family. The 3D structure of the recA protein from E. coli has a domain described as class: alpha and beta protein, fold: P-loop containing nucleoside triphosphate hydrolases, superfamily: P-loop containing nucleoside triphosphate hydrolases, family: RecA protein-like (ATPase-domain). This domain has the SCOP code c.37.1.11, which was used as a keyword in the GTOP search. The other Z-DNA binding domain, in ADAR from human, is described as class: all alpha protein, fold: DNA/RNA-binding 3-helical bundle, superfamily: winged helix DNA-binding protein, and family: Z-DNA binding domain. This domain has the SCOP code a.4.5.19, which was also used as a keyword in the GTOP search. As more genomic sequences become available, surveying proteins becomes difficult without suitable tools. GTOP provides a keyword search tool on the web. For example, when we searched GTOP using c.37.1.11 as the keyword, the homologous proteins in each organism were displayed with their e-values. The presence or absence of the Z-DNA binding domain can therefore be estimated directly from the search results.
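To make the four-level SCOP code concrete, the following minimal Python sketch splits a code such as c.37.1.11 into its class, fold, superfamily, and family identifiers. The names `ScopLevels`, `split_scop_code`, and `same_superfamily` are hypothetical helpers introduced for illustration; they are not part of SCOP or GTOP.

```python
from typing import NamedTuple


class ScopLevels(NamedTuple):
    """Identifiers of the four hierarchical levels encoded in a SCOP code."""
    scop_class: str
    fold: str
    superfamily: str
    family: str


def split_scop_code(code: str) -> ScopLevels:
    """Split a code like 'c.37.1.11' into class, fold, superfamily, and family."""
    parts = code.strip().split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected a four-part SCOP code, got {code!r}")
    return ScopLevels(
        scop_class=parts[0],
        fold=".".join(parts[:2]),
        superfamily=".".join(parts[:3]),
        family=".".join(parts),
    )


def same_superfamily(code_a: str, code_b: str) -> bool:
    """Return True if two SCOP codes share class, fold, and superfamily."""
    return split_scop_code(code_a).superfamily == split_scop_code(code_b).superfamily


print(split_scop_code("c.37.1.11"))
print(same_superfamily("c.37.1.11", "a.4.5.19"))
```

The two keywords used in this study differ already at the class level (c versus a), so the two Z-DNA binding domains belong to unrelated SCOP lineages.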
We used GTOP to search for the two types of Z-DNA binding domains. In GTOP, organisms are classified, based on the annotation of the genome sequence, in a hierarchy of three kingdoms (archaea, bacteria, and eukaryotes), phylum, and section. In GTOP, the 68 archaeal organisms were divided into 5 phyla and 13 sections (
The Z-DNA binding domain of the recA protein from E. coli was observed in all the sections of archaea, bacteria, and eukaryotes in GTOP, so presence and absence do not need to be distinguished for this domain. This result indicates that the domain is essential for all organisms.

| Phylum | Section | Organism |
|---|---|---|
| Crenarchaeota | Thermoprotei | Hyperthermus butylicus |
| Euryarchaeota | Archaeoglobi | Archaeoglobus fulgidus |
| | Halobacteria | |
| | Methanobacteria | |
| | Methanococci | |
| | Methanomicrobia | Methanococcoides burtonii |
| | Methanopyri | |
| | Thermococci | Thermococcus onnurineus |
| | Thermoplasmata | |
| Korarchaeota | Candidatus Korarchaeum | Candidatus Korarchaeum cryptofilum |
| Nanoarchaeota | Nanoarchaeum | |
| Thaumarchaeota | Cenarchaeales | |
| | marine archaeal | Nitrosopumilus maritimus |


| Phylum | Section | Organism |
|---|---|---|
| Acidobacteria | Acidobacteriales | |
| | Candidatus Koribacter. | |
| | Solibacteres | |
| Actinobacteria | Acidimicrobidae | |
| | Actinobacteridae | |
| | Coriobacteridae | |
| | Rubrobacteridae | |
| Aquificae | Aquificales | Persephonella marina |
| Bacteroidetes | Bacteroidia | |
| | Candidatus Amoebophilus | |
| | Flavobacteria | |
| | Sphingobacteria | |
| Chlamydiae | Chlamydiales | |
| Chlorobi | Chlorobia | Chlorobaculum parvum |
| Chloroflexi | Chloroflexales | |
| | Dehalococcoidetes | |
| | Herpetosiphonales | |
| | Thermomicrobiales | |
| Cyanobacteria | Acaryochloris. | |
| | Chroococcales | |
| | Gloeobacteria | |
| | Nostocales | |
| | Oscillatoriales | |
| | Prochlorales | |
| Deinococcus-Thermus | Deinococci | |
| Dictyoglomi | Dictyoglomia | |
| Elusimicrobia | Elusimicrobiales | |
| Firmicutes | Bacilli | Enterococcus faecalis |
| | Clostridia | Thermoanaerobacter tengcongensis |
| Fusobacteria | Fusobacteriales | |
| Gemmatimonadetes | Gemmatimonadales | |
| Nitrospirae | Nitrospirales | |
| Planctomycetes | Planctomycetacia | |
| Proteobacteria | Alphaproteobacteria | |
| | Betaproteobacteria | Burkholderia xenovorans |
| | Deltaproteobacteria | Desulfovibrio vulgaris |
| | Epsilonproteobacteria | Nitratiruptor sp. SB155-2 |
| | Gammaproteobacteria | |
| | Magnetococcus | |
| Spirochaetes | Spirochaetales | |
| Tenericutes | Mollicutes | |
| Thermotogae | Thermotogales | Fervidobacterium nodosum |
| Verrucomicrobia | Methylacidiphilales | |
| | Opitutae | |
| | Verrucomicrobiae | |


| Phylum | Section | Organism |
|---|---|---|
| Alveolata | Apicomplexa | |
| Amoebozoa | Mycetozoa | |
| Choanoflagellida | Codonosigidae | |
| Cryptophyta | Pyrenomonadales | |
| Euglenozoa | Kinetoplastida | |
| Fornicata | Diplomonadida | |
| Fungi | Chytridiomycota | |
| | Dikarya | |
| | Fungi incertae sedis | |
| | Microsporidia | |
| | Ichthyosporea | |
| Haptophyceae | Isochrysidales | |
| Heterolobosea | Schizopyrenida | |
| Metazoa | Eumetazoa | Homo sapiens (human) |
| | Placozoa | |
| Rhodophyta | Bangiophyceae | |
| Viridiplantae | Chlorophyta | |
| | Streptophyta | |
| Stramenopiles | Bacillariophyta | |
| | Oomycetes | |
| | Pelagophyceae | |

The other Z-DNA binding domain, that of ADAR from human, was observed in some organisms of archaea, bacteria, and eukaryotes. A representative organism in the Organism column of Tables 1(a)-(c) indicates that this domain is present in that section. A blank in the Organism column means that the domain is absent.
Comparisons of ribosomal RNA sequences from various organisms are commonly used to deduce phylogenetic trees [
The evolutionary history of bacteria can be inferred by comparing conserved protein sequences of elongation factor-1 alpha/Tu or the 70-kDa heat shock protein [
In eukaryotes, the Z-DNA binding domain was observed only in organisms belonging to the phylum Metazoa, section Eumetazoa (
As mentioned above, Z-DNA binding proteins have been isolated from various organisms on the basis of measured interactions between Z-DNA and its binding proteins. The presence of the Z-DNA binding domain in various organisms is consistent with our finding that the Z-DNA binding domain of the recA protein was observed in all the organisms examined. It is reported that the experiments on Z-DNA binding proteins in E. coli were performed in a recA protein-free strain [
Ideally, taxonomic classification reflects the evolutionary history of an organism, so the presence or absence of the Z-DNA binding domain should follow the classification. If organisms A and B are phylogenetically close, both are expected either to have the domain or to lack it. How well this expectation held varied among organisms. The section Betaproteobacteria contained 71 organisms, 24 of which belonged to Burkholderia species. Only Burkholderia xenovorans showed the presence of the Z-DNA binding domain; the other organisms showed its absence. The section Eumetazoa in eukaryotes contained 49 vertebrates. Of these, 43 showed the presence of the Z-DNA binding domain and 6, including chicken, showed its absence. It is reported that chicken has a Z-DNA binding protein [
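The degree to which closely related organisms agree can be summarized as the fraction of a group that shows the domain. The following short Python sketch computes this fraction from the vertebrate counts given above (43 present, 6 absent). The function name `presence_fraction` is a hypothetical helper introduced for illustration.

```python
def presence_fraction(n_present: int, n_absent: int) -> float:
    """Fraction of organisms in a taxonomic group that have the domain."""
    if n_present < 0 or n_absent < 0:
        raise ValueError("counts must be non-negative")
    total = n_present + n_absent
    if total == 0:
        raise ValueError("the group contains no organisms")
    return n_present / total


# Vertebrates in section Eumetazoa: 43 with the domain, 6 without (counts from the text).
vertebrate_fraction = presence_fraction(43, 6)
print(f"{vertebrate_fraction:.1%}")
```

For the vertebrates this gives about 88%, so the domain is widespread but not universal even within a single section.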
A protein is expected to be conserved if its function is essential. The Z-DNA binding domain of the recA protein from E. coli was conserved in all the organisms, which indicates that the function of this domain is essential. The other type of Z-DNA binding domain, that of ADAR from human, was observed in only some organisms in archaea, bacteria, and eukaryotes. This result suggests that the function of this domain is non-essential, although its biological function is not clearly understood.
Unfortunately, the GTOP database has not been updated since October 6, 2010. Nevertheless, GTOP offers valuable information on Z-DNA binding domains. As far as we could determine, no other database offers a comparably useful keyword search. There are two types of Z-DNA binding domains, and some organisms have both. However, most researchers do not seem to distinguish whether the Z-DNA binding domain they are analyzing is of the E. coli type or the human type. To study the function of this domain, it is necessary to identify which type is being studied.
The author declares no conflicts of interest regarding the publication of this paper.