This article presents a theoretical and experimental analysis of the role of the third nucleotide in codons during protein biosynthesis. That role is considerably larger than currently understood. The third nucleotide functionally and symmetrically divides the codon families into 32 synonyms and 32 SYnonymous-HOMonymous hybrid codons (SYHOMs). The function of the syhoms is to initiate nonlocal ribosomal analysis of mRNA, which represents real context in the language of DNA. Such analysis is a natural necessity for selecting one of two different amino acids, or for choosing between an amino acid and a stop position, when a ribosome interacts with syhom codons, which have dual coding. This was substantiated theoretically earlier [1] [2] [3]. Experimental work [4] confirmed the theory: it demonstrated that two different amino acids, selenocysteine and cysteine, are coded by a single UGA syhom codon in the ciliate (infusorian) Euplotes crassus. This result does not call into question the dogma that the cell's genome encodes amino acids and stop positions unambiguously, but it does require amendments to the existing model of genetic coding. These amendments rest on an enhanced understanding of the special linguistic/semantic role of the third nucleotide in codons and on accepting that protein genes (mRNA) are textual in a real, rather than metaphorical, sense. Understanding the speech-like nature of genes (mRNA), and the role the third nucleotide plays in it, leads to a simple statement about the quasi-consciousness (biocomputing) of the protein-synthesizing system: it can recognize the context (meaning) of mRNA and so make the correct choice of amino acids and stops in a syhom situation, based on the meaning of the gene texts (mRNA).
Much has been written about the hypothesis of F. Crick, including works by the author himself, but most judgments rest on a formulation from F. Crick’s book “What Mad Pursuit” (1988). [
However, some significant additional issues stem from this brief message, and they are the subject of this article. The “standard” genetic code for proteins was obtained by M. Nirenberg’s group from studies of protein synthesis in E. coli. This work produced the table of the standard genetic code. The table presents the functions of protein genes as a static code structure in which all codons UNAMBIGUOUSLY encode amino acids and stop positions. Importantly, according to the Wobble Hypothesis, half of the 64 known codons, i.e. 32, are redundant for the 20 known amino acids. The 21st amino acid, selenocysteine, and its coding are explained later in this article. Redundant codons are synonyms: with varying degrees of repetition, the codons within a group encode the same amino acid or stop position, while different groups encode different ones. These are the main provisions of M. Nirenberg’s model, later followed by F. Crick. This understanding has prevailed for 50 years, since M. Nirenberg received the Nobel Prize for this model in 1968. Theoretical and experimental results have now accumulated that call for amendments to this understanding of protein genetic coding. They are as follows.
The table of the standard code is functionally divided into two symmetric and equal parts. In one part, 32 codons UNAMBIGUOUSLY and REDUNDANTLY encode only amino acids. These codons are synonyms. The other 32 codons (not synonyms), called homonyms [
The table is symmetrically divided into codon-synonyms (in blue) and syhoms (in red).

Red codons: mixed codons, syhoms (synonyms + homonyms). Blue codons: synonyms.

| First nucleotide (rows) / second nucleotide (columns) | C | G | T(U) | A |
|---|---|---|---|---|
| T(U) | TCT Ser, TCC Ser, TCA Ser, TCG Ser | TGT Cys, TGC Cys, TGA Stop, TGG Trp | TTT Phe, TTC Phe, TTA Leu, TTG Leu | TAT Tyr, TAC Tyr, TAA Stop, TAG Stop |
| A | ACT Thr, ACC Thr, ACA Thr, ACG Thr | AGT Ser, AGC Ser, AGA Arg, AGG Arg | ATT Ile, ATC Ile, ATA Ile, ATG Met | AAT Asn, AAC Asn, AAA Lys, AAG Lys |
| C | CCT Pro, CCC Pro, CCA Pro, CCG Pro | CGT Arg, CGC Arg, CGA Arg, CGG Arg | CTT Leu, CTC Leu, CTA Leu, CTG Leu | CAT His, CAC His, CAA Gln, CAG Gln |
| G | GCT Ala, GCC Ala, GCA Ala, GCG Ala | GGT Gly, GGC Gly, GGA Gly, GGG Gly | GTT Val, GTC Val, GTA Val, GTG Val | GAT Asp, GAC Asp, GAA Glu, GAG Glu |
Such a CHOICE is made by the ribosome because it (and/or the whole cell) takes into account the context of the given mRNA. This choice automatically implies quasi-consciousness of the protein-synthesizing system or, more precisely, its biocomputer functions [
Coding by synonyms is UNAMBIGUOUS because, in each of the 8 synonym codon families, ALL TRIPLETS (codons) are DIFFERENT. For this reason, in these triplet families coding is performed by ALL three letters (nucleotides), and all triplets in a family encode only one amino acid. This coding is UNAMBIGUOUS AND REDUNDANT: replacing the third nucleotide of a codon does not change what is encoded.
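The split into synonym families and mixed families can be checked mechanically. The following Python sketch is an illustration only, not part of the original analysis. It assumes the standard code table written with one-letter amino acid symbols, `*` for stop, and the DNA alphabet (T in place of U); the names `CODE` and `classify_families` are hypothetical, chosen for this example. It sorts the 16 doublet families by whether replacing the third nucleotide ever changes the encoded product.

```python
from itertools import product

# Assumption: standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_families(code):
    """Split the 16 doublet families into synonym and mixed families.

    A family is a synonym family when all four third nucleotides
    give the same product; otherwise it is a mixed family.
    """
    synonyms, mixed = [], []
    for first, second in product(BASES, repeat=2):
        doublet = first + second
        products = {code[doublet + third] for third in BASES}
        (synonyms if len(products) == 1 else mixed).append(doublet)
    return synonyms, mixed


synonym_families, mixed_families = classify_families(CODE)
print("synonym families:", synonym_families)
print("mixed families:  ", mixed_families)
print("codons:", 4 * len(synonym_families), "vs", 4 * len(mixed_families))
```

Under these assumptions the sketch reports 8 synonym families and 8 mixed families, i.e. 32 codons in each half, which matches the symmetric division of the table.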
The situation of PRIMARY UNAMBIGUITY of coding by HOMONYM triplets from the outset (before the ribosome reads the mRNA) exists in the half of the codons that are not synonyms (i.e., in fact, homonyms). It arises because the third codon nucleotide, the key participant in the work of the genome-biocomputer of each cell, “does not plan” to take part in the coding while in its static state, before the ribosome reads the mRNA, and can potentially be any of the 4 possible nucleotides. Recall that F. Crick did not comment on such cases of ribosome dynamics. So the coding is done by the first two nucleotides (the doublet). At the same time, in 6 homonym families, pairs of IDENTICAL doublets encode different amino acids. In two families, it happens as follows... The doublet of the TA family encodes tyrosine twice and stop twice. In the TG family, one doublet pair encodes cysteine, and the other doublet pair encodes stop and tryptophan. In general, this means that here, too, there is a homonymy factor, but with important additional characteristics. This phenomenon was discovered by the group of M. Nirenberg and F. Crick on the example of the T(U)T(U) codon family [
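The way one doublet spreads over several products can be listed with a short Python sketch. It is an illustration only, not part of the original analysis, and it assumes the standard code table written with one-letter amino acid symbols, `*` for stop, and the DNA alphabet (T in place of U). The helper name `split_by_third` is hypothetical. For a given doublet it groups the four possible third nucleotides by the product they yield.

```python
from itertools import product

# Assumption: standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_by_third(doublet, code=CODE):
    """Map each product of a doublet family to the third nucleotides that yield it."""
    groups = {}
    for third in BASES:
        groups.setdefault(code[doublet + third], []).append(third)
    return groups


# The TA, TG and TT families discussed in the text.
for doublet in ("TA", "TG", "TT"):
    print(doublet, split_by_third(doublet))
```

With this table, the TA doublet gives tyrosine (Y) for third nucleotides T and C and stop for A and G, while the TG doublet gives cysteine (C) for T and C, stop for A, and tryptophan (W) for G.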