The Mechanism of Translation
1. Initiation
2. Elongation
3. Termination
Initiation
Translation begins with the binding of the small ribosomal subunit to a specific sequence on the
mRNA chain. The small subunit binds via complementary base pairing between its rRNA
component and the ribosome binding site, a sequence of about ten nucleotides on the mRNA
located between 5 and 11 nucleotides upstream of the initiating codon, AUG.
Figure %: Initiation
Once the small subunit has bound, a special tRNA molecule carrying N-formylmethionine (fMet)
recognizes and binds to the initiator codon. Next, the large subunit binds, forming what is known
as the initiation complex. With the formation of the initiation complex, the fMet-tRNA occupies the
P site of the ribosome and the A site is left empty. This entire initiation process is facilitated by
extra proteins, called initiation factors, that help with the binding of ribosomal subunits and tRNA
to the mRNA chain.
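The positional rule for initiation can be sketched in Python. This is a minimal illustration, not a model of real ribosome binding. The motif AGGAGG is an assumed stand-in for the ribosome binding site (the text does not give its sequence), the example mRNA is invented, and the function name find_initiation_sites is hypothetical. The sketch only checks that an AUG lies 5 to 11 nucleotides downstream of the motif.

```python
def find_initiation_sites(mrna, rbs="AGGAGG", min_gap=5, max_gap=11):
    """Return (rbs_index, aug_index) pairs where an AUG starts
    min_gap..max_gap nucleotides after the end of the assumed RBS motif."""
    sites = []
    start = mrna.find(rbs)
    while start != -1:
        rbs_end = start + len(rbs)
        for gap in range(min_gap, max_gap + 1):
            aug = rbs_end + gap
            if mrna[aug:aug + 3] == "AUG":
                sites.append((start, aug))
        # Look for further occurrences of the motif.
        start = mrna.find(rbs, start + 1)
    return sites


# Invented example: the motif, a 7-nucleotide spacer, then AUG.
example = "GGCU" + "AGGAGG" + "UAACCUU" + "AUGGCUUAA"
print(find_initiation_sites(example))
```

An AUG that sits too close to the motif (for example a 2-nucleotide spacer) is not reported, which mirrors the spacing requirement described in the text.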
Elongation
With the formation of the complex containing fMet-tRNA in the peptidyl site, an aminoacyl tRNA
with the complementary anticodon sequence can bind to the mRNA passing through the acceptor
site. This binding is aided by elongation factors that are dependent upon the energy from the
hydrolysis of GTP. Elongation factors go through a cycle to regenerate GTP after its hydrolysis.
Now, with tRNA bearing a chain of amino acids in the P site and tRNA containing a single amino
acid in the A site, the addition of a link to the chain can be made. This addition occurs through the
formation of a peptide bond, the nitrogen-carbon bond that forms between amino acid subunits to
form a polypeptide chain. This bond is catalyzed by the enzyme peptidyl transferase.
Figure %: Peptide Formation
The peptide bond forms between the carboxyl group on the lowest link in the peptide chain
located at the P site and the amine group on the amino acid in the A site. As a result, the
peptide chain shifts over to the A site, with the original amino acid on the A site as the lowest link
in the chain. The tRNA in the A site becomes peptidyl-tRNA, and shifts over to the P site.
Meanwhile, the ribosome engages in a process called translocation: spurred by elongation
factors, the ribosome moves three nucleotides in the 3' direction along the mRNA. In other
words, the ribosome moves so that a new mRNA codon is accessible in the A site.
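Translocation can be pictured as a three-nucleotide window sliding along the mRNA. The sketch below is only an illustration: the function name elongation_windows and the example sequence are hypothetical, and the elongation factors and GTP hydrolysis described above are not modeled.

```python
def elongation_windows(mrna, start):
    """Yield (p_site_codon, a_site_codon) for each position of the ribosome,
    beginning with the start codon in the P site and moving 3 nucleotides
    toward the 3' end at every translocation."""
    p = start
    while p + 6 <= len(mrna):
        yield mrna[p:p + 3], mrna[p + 3:p + 6]
        p += 3  # translocation: one codon in the 3' direction


for p_codon, a_codon in elongation_windows("AUGGCUUAA", 0):
    print("P site:", p_codon, "A site:", a_codon)
```

Each step places the codon that was in the A site into the P site and exposes the next codon in the A site.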
Introduction
Translation is the RNA-directed synthesis of polypeptides. This process requires all three
classes of RNA. Although the chemistry of peptide bond formation is relatively simple, the
processes leading to the ability to form a peptide bond are exceedingly complex. The template for
correct addition of individual amino acids is the mRNA, yet both tRNAs and rRNAs are involved in
the process. The tRNAs carry activated amino acids into the ribosome which is composed of
rRNA and ribosomal proteins. The ribosome is associated with the mRNA ensuring correct
access of activated tRNAs and containing the necessary enzymatic activities to catalyze peptide
bond formation.
Historical Perspectives
1. The co-linearity between the DNA and protein encoded by the DNA. Yanofsky
showed that the order of observed mutations in the E. coli tryptophan synthetase
gene was the same as the corresponding amino acid changes in the protein.
2. Crick and Brenner demonstrated, from a large series of double mutants of the
bacteriophage T4, that the genetic code is read in a sequential manner starting from a
fixed point in the gene, that the code was most likely a triplet, and that most of the 64 possible
combinations of the 4 nucleotides code for amino acids, i.e. the code is degenerate
since there are only 20 amino acids.
The above mentioned experiments only indicated deductive correlations regarding the genetic
code. The precise dictionary of the genetic code was originally determined by the use of in vitro
translation systems derived from E. coli cells. Synthetic polyribonucleotides were added to these
translation systems along with all twenty amino acids. One amino acid at a time was radiolabeled.
The first demonstration of the dictionary of the genetic code was with the use of poly(U). This
synthetic polyribonucleotide encoded the amino acid phenylalanine, i.e. the resulting polypeptide
was poly(F).
The utilization of a variety of repeating di-, tri- and tetranucleotide polyribonucleotides established the
entire genetic code. The results of these experiments confirmed that some amino acids are
encoded by more than one triplet codon, hence the degeneracy of the genetic code. These
experiments also established the identity of translational termination codons.
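The logic of the synthetic polymer experiments can be sketched as follows. A homopolymer such as poly(U) contains only one triplet in every reading frame, while a repeating dinucleotide contains two alternating triplets. The helper name codons_by_frame is hypothetical, and the three-entry lookup is a deliberately partial table: UUU = Phe comes from the text, whereas UCU = Ser and CUC = Leu are taken from the standard genetic code as an added illustration.

```python
# Partial lookup, for illustration only (not the full genetic code).
PARTIAL_CODE = {"UUU": "Phe", "UCU": "Ser", "CUC": "Leu"}


def codons_by_frame(repeat_unit, copies=6):
    """Build a synthetic polymer from repeat_unit and list the triplets
    read in each of the three possible reading frames."""
    polymer = repeat_unit * copies
    frames = {}
    for frame in range(3):
        frames[frame] = [polymer[i:i + 3]
                         for i in range(frame, len(polymer) - 2, 3)]
    return frames


for unit in ("U", "UC"):
    for frame, codons in codons_by_frame(unit).items():
        residues = [PARTIAL_CODE[c] for c in codons]
        print(f"poly({unit}) frame {frame}: {'-'.join(residues)}")
```

Because poly(U) yields the same triplet in every frame, the radiolabeled product identifies the amino acid unambiguously; a repeating dinucleotide yields an alternating copolymer whichever frame is used.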
An additional important point to come from these early experiments was that the 5' end of the
RNA corresponded to the amino terminus of the polypeptide. This was important since previous
labeling experiments had demonstrated that the N-terminus is the beginning of the elongating
polypeptide. Therefore, in vitro translation experiments established that the RNA is read in the 5'
to 3' direction.
Crick first postulated that translation of the genetic code would be carried out through
mediation of adapter molecules. Each adapter was postulated to carry a specific amino acid and
to recognize the corresponding codon. He suggested that the adapters contain RNA because
codon recognition could then occur by complementarity to the sequences of the codons in the
mRNA.
During the course of in vitro protein synthesis and labeling experiments it was shown that the
amino acids became transiently bound to a low molecular weight fraction of RNA. This
fraction of RNAs has been termed transfer RNAs (tRNAs) since they transfer amino acids to the
elongating polypeptide. These results indicate that accurate translation requires two equally
important recognition steps:
1. The correct choice of amino acid needs to be made for attachment to the
correspondingly correct tRNA.
2. Selection of the correct amino acid-charged tRNA by the mRNA. This process is
facilitated by the ribosomes which we will discuss below.
1. The genetic code is read in a sequential manner starting near the 5' end of the
mRNA. This means that translation proceeds along the mRNA in the 5' ——> 3'
direction which corresponds to the N-terminal to C-terminal direction of the amino acid
sequences within proteins.
2. The code is composed of a triplet of nucleotides.
3. Most of the 64 possible combinations of the 4 nucleotides code for amino acids, i.e. the
code is degenerate since there are only 20 amino acids.
The precise dictionary of the genetic code was determined with the use of in vitro translation
systems and polyribonucleotides. The results of these experiments confirmed that some amino
acids are encoded by more than one triplet codon, hence the degeneracy of the genetic code.
These experiments also established the identity of translational termination codons.
Shown below are the triplets that are used for each of the 20 amino acids found in eukaryotic
proteins. The column on the left side indicates the first nucleotide of each triplet and the row across
the top represents the second nucleotide. The wobble position nucleotides are indicated in blue.
The three stop codons are highlighted in red.
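As a sketch, the standard codon table can be rebuilt programmatically and its degeneracy counted. The compact one-letter string below encodes the standard genetic code in U, C, A, G order for each codon position, with "*" marking a termination codon. The variable names are illustrative choices, not part of the original notes.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, third position varying fastest.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"   # first base U
               "LLLLPPPPHHQQRRRR"   # first base C
               "IIIMTTTTNNKKSSRR"   # first base A
               "VVVVAAAADDEEGGGG")  # first base G

CODON_TABLE = {"".join(triplet): aa
               for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

print("total codons:", len(CODON_TABLE))
print("stop codons:", stop_codons)
print("sense codons:", len(sense_codons))
print("distinct amino acids:", len(set(CODON_TABLE.values()) - {"*"}))
```

The counts make the degeneracy concrete: 64 triplets exist, 3 are termination codons, and the remaining 61 are shared among 20 amino acids.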
Characteristics of tRNAs
More than 300 different tRNAs have been sequenced, either directly or from their
corresponding DNA sequences. tRNAs vary in length from 60–95 nucleotides (18–28 kD). The
majority contain 76 nucleotides. Evidence has shown that the role of tRNAs in translation is to
carry activated amino acids to the elongating polypeptide chain. All tRNAs:
Activation of amino acids is carried out by a two-step process catalyzed by aminoacyl-tRNA
synthetases. Each tRNA, and the amino acid it carries, is recognized by an individual
aminoacyl-tRNA synthetase. This means there exist at least 20 different aminoacyl-tRNA synthetases;
there are actually at least 21, since the initiator met-tRNA of both prokaryotes and eukaryotes is
distinct from non-initiator met-tRNAs.
Activation of amino acids requires energy in the form of ATP and occurs in a two-step reaction
catalyzed by the aminoacyl-tRNA synthetases. First the enzyme attaches the amino acid to the
α-phosphate of ATP with the concomitant release of pyrophosphate. The product is termed an
aminoacyl-adenylate intermediate. In the second step the enzyme catalyzes transfer of the amino acid to
either the 2'– or 3'–OH of the ribose portion of the 3'-terminal adenosine residue of the tRNA,
generating the activated aminoacyl-tRNA. Although these reactions are freely reversible, the
forward reaction is favored by the coupled hydrolysis of PPi.
Accurate recognition of the correct amino acid as well as the correct tRNA is different for each
aminoacyl-tRNA synthetase. Since the different amino acids have different R groups, the enzyme
for each amino acid has a different binding pocket for its specific amino acid. It is not the
anticodon that determines the tRNA utilized by the synthetases. Although the exact mechanism is
not known for all synthetases, it is likely to be a combination of the presence of specific modified
bases and the secondary structure of the tRNA that is correctly recognized by the synthetases.
It is absolutely necessary that the discrimination of correct amino acid and correct tRNA be
made by a given synthetase prior to release of the aminoacyl-tRNA from the enzyme. Once the
product is released there is no further way to proof-read whether a given tRNA is coupled to its
corresponding amino acid. Erroneous coupling would lead to the wrong amino acid being incorporated
into the polypeptide since the discrimination of amino acid during protein synthesis comes from
the recognition of the anticodon of a tRNA by the codon of the mRNA and not by recognition of
the amino acid. This was demonstrated by reductive desulfuration of cys-tRNAcys with Raney
nickel generating ala-tRNAcys. Alanine was then incorporated into an elongating polypeptide
where cysteine should have been.
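The Raney nickel result can be restated as a small sketch in which decoding looks only at the anticodon. The class and function names are hypothetical, pairing is restricted to strict Watson-Crick complementarity (wobble is ignored for simplicity), and the anticodon GCA for tRNAcys is an assumption taken from the standard code (it pairs with the cysteine codon UGC).

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


@dataclass
class ChargedTRNA:
    anticodon: str   # written 5'->3'
    amino_acid: str  # whatever the synthetase (or the chemist) attached


def pairs_with(codon, anticodon):
    """Strict antiparallel Watson-Crick pairing; wobble is not modeled."""
    if len(codon) != len(anticodon):
        return False
    return all(COMPLEMENT[c] == a for c, a in zip(codon, reversed(anticodon)))


def decode(codon, pool):
    """Return the amino acid delivered by the first tRNA whose anticodon
    pairs with the codon. The amino acid itself is never inspected."""
    for trna in pool:
        if pairs_with(codon, trna.anticodon):
            return trna.amino_acid
    return None


normal = [ChargedTRNA("GCA", "Cys")]
desulfurated = [ChargedTRNA("GCA", "Ala")]  # ala-tRNAcys after Raney nickel
print(decode("UGC", normal), decode("UGC", desulfurated))
```

The same codon yields cysteine or alanine depending solely on what the tRNA carries, which is why the synthetase must get the pairing right before the aminoacyl-tRNA is released.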
Diagram showing the various modified nucleotides of tRNAs that are found in the wobble
position in the anticodon. The top half shows the wobble nucleotides of the anticodon in blue and
the various nucleotides (in red) of the wobble position of the codon that can be found in non-
Watson-Crick base-pairs. The lower panel illustrates the opposite showing the wobble
nucleotides of the codon in blue and the associated wobble nucleotides of the anticodon in red.
Now that we have charged aminoacyl-tRNAs and the mRNAs to convert nucleotide sequences
to amino acid sequences we need to bring the two together accurately and efficiently. This is the
job of the ribosomes. Ribosomes are composed of proteins and rRNAs.
All living organisms need to synthesize proteins, and all cells of an organism need to synthesize
proteins; therefore, it is not hard to imagine that ribosomes are a major constituent of all cells of
all organisms. The makeup of the ribosomes, both rRNA and associated proteins, is slightly
different between prokaryotes and eukaryotes.
The ability to begin to identify the roles of the various ribosomal proteins in the processes of
ribosome assembly and translation was aided by the discovery that the ribosomal subunits will
self assemble in vitro from their constituent parts.
Following assembly of both the small and large subunits onto the mRNA, and given the
presence of charged tRNAs, protein synthesis can take place. To reiterate the process of protein
synthesis:
Translation proceeds in an ordered process. First accurate and efficient initiation occurs, then
chain elongation and finally accurate and efficient termination must occur. All three of these
processes require specific proteins, some of which are ribosome associated and some of which
are separate from the ribosome, but may be temporarily associated with it.
Initiation
Initiation of translation in both prokaryotes and eukaryotes requires a specific initiator tRNA,
tRNAmeti, that is used to incorporate the initial methionine residue into all proteins. In E. coli a
specific version of tRNAmeti is required to initiate translation, [tRNAfmeti]. The methionine attached
to this initiator tRNA is formylated. Formylation requires N10-formyl-THF and is carried out after the
methionine is attached to the tRNA. The fmet-tRNAfmeti still recognizes the same codon, AUG, as
regular tRNAmet. Although tRNAmeti is specific for initiation in eukaryotes it is not a formylated
tRNAmet.
The specific non-ribosomally associated proteins required for accurate translational initiation
are termed initiation factors. In E. coli they are IFs; in eukaryotes they are eIFs. Numerous eIFs
have been identified:
Activities of eIF-3
The eIF-3 complex is composed of 13 different subunits whose sizes, nomenclature and
functions are described in the Table below. The importance of the eIF-3 complex in translation
initiation is demonstrated by the fact that assembly of the eIF-2-GTP-met-tRNAmeti (the ternary
complex), binding of the ternary complex and other components of the 43S pre-initiation complex
(PIC) to the ribosome 40S subunit, recruitment of the mRNA to the 43S PIC, and scanning of the
mRNA for recognition of the initiator AUG codon are all dependent on eIF-3 complex activity.
Therefore, the primary function of the components of eIF-3 is to act as a scaffold for the assembly of
the PIC, and this assembled complex is referred to as the multi-initiation factor complex (MFC).
Human subunit designation    Nomenclature    Function(s)
eIF3D                        p66
eIF3E                        p48
eIF3H                        p40
eIF3I                        p36
eIF3K                        p28
eIF3L                        p67
eIF3M                        GA17
The initiation factors eIF-1 and eIF-3 bind to the 40S ribosomal subunit, which prevents its
association with the 60S subunit. The prevention of subunit reassociation allows the preinitiation
complex to form.
The first step in the formation of the preinitiation complex is the binding of GTP to eIF-2 to
form a binary complex. eIF-2 is composed of three subunits, α, β and γ. The binary complex then
binds to the activated initiator tRNA, met-tRNAmet, forming a ternary complex that then binds to the
40S subunit, forming the 43S preinitiation complex. The preinitiation complex is stabilized by the
earlier association of eIF-3 and eIF-1 to the 40S subunit.
The cap structure of eukaryotic mRNAs is bound by specific eIFs prior to association with the
preinitiation complex. Cap binding is accomplished by the initiation factor eIF-4F. This factor is
actually a complex of 3 proteins: eIF-4E, eIF-4A and eIF-4G. The protein eIF-4E is a 24 kDa protein which
physically recognizes and binds to the cap structure. eIF-4A is a 46 kDa protein which binds and
hydrolyzes ATP and exhibits RNA helicase activity. Unwinding of mRNA secondary structure is
necessary to allow access of the ribosomal subunits. eIF-4G aids in binding of the mRNA to the
43S preinitiation complex.
Once the mRNA is properly aligned onto the preinitiation complex and the initiator met-tRNAmet
is bound to the initiator AUG codon (a process facilitated by eIF-1) the 60S subunit associates
with the complex. The association of the 60S subunit requires the activity of eIF-5 which has first
bound to the preinitiation complex. The energy needed to stimulate the formation of the 80S
initiation complex comes from the hydrolysis of the GTP bound to eIF-2. The GDP bound form of
eIF-2 then binds to eIF-2B which stimulates the exchange of GTP for GDP on eIF-2. When GTP
is exchanged eIF-2B dissociates from eIF-2. This is termed the eIF-2 cycle (see diagram below).
This cycle is absolutely required in order for eukaryotic translational initiation to occur. The GTP
exchange reaction can be affected by phosphorylation of the α-subunit of eIF-2.
At this stage the initiator met-tRNAmet is bound to the mRNA within a site of the ribosome
termed the P-site, for peptide site. The other site within the ribosome to which incoming charged
tRNAs bind is termed the A-site, for amino acid site.
The eIF-2 cycle involves the regeneration of GTP-bound eIF-2 following the hydrolysis of GTP
during translational initiation. When the 43S preinitiation complex is engaged with the 60S
subunit to form the 80S initiation complex, the GTP bound to eIF-2 is hydrolyzed, providing
energy for the process. In order for additional rounds of translational initiation to occur, the GDP
bound to eIF-2 must be exchanged for GTP. This is the function of eIF-2B, which is also called
guanine nucleotide exchange factor (GEF).
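The eIF-2 cycle can be summarized as a two-state machine. The sketch below is a deliberately simplified illustration with hypothetical class and method names; it tracks only which nucleotide is bound and ignores the ternary complex, the ribosomal subunits, and the kinetics of exchange.

```python
class EIF2:
    """Toy model of eIF-2 that records only the bound guanine nucleotide."""

    def __init__(self):
        self.bound = "GTP"

    @property
    def ready_for_initiation(self):
        # Only the GTP-bound form can take part in a new round of initiation.
        return self.bound == "GTP"

    def hydrolyze_on_80s_formation(self):
        """GTP is hydrolyzed when the 60S subunit joins."""
        if not self.ready_for_initiation:
            raise RuntimeError("eIF-2 must be GTP-bound to support initiation")
        self.bound = "GDP"

    def exchange_by_eif2b(self):
        """eIF-2B (GEF) swaps the bound GDP for GTP."""
        self.bound = "GTP"


factor = EIF2()
factor.hydrolyze_on_80s_formation()
print(factor.bound, factor.ready_for_initiation)
factor.exchange_by_eif2b()
print(factor.bound, factor.ready_for_initiation)
```

Without the exchange step the factor stays GDP-bound and a second call to hydrolyze_on_80s_formation raises an error, which is the toy equivalent of initiation stalling when GTP exchange is blocked.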
Elongation
The process of elongation, like that of initiation, requires specific non-ribosomal proteins. In E.
coli these are EFs and in eukaryotes eEFs. Elongation of polypeptides occurs in a cyclic manner such that at
the end of one complete round of amino acid addition the A site will be empty and ready to accept
the incoming aminoacyl-tRNA dictated by the next codon of the mRNA. This means that not only
does the incoming amino acid need to be attached to the peptide chain but the ribosome must
move down the mRNA to the next codon. Each incoming aminoacyl-tRNA is brought to the
ribosome by an eEF-1α-GTP complex. When the correct tRNA is deposited into the A site the
GTP is hydrolyzed and the eEF-1α-GDP complex dissociates. In order for additional elongation
events to occur the GDP must be exchanged for GTP. This is carried out by eEF-1βγ similarly to the GTP
exchange that occurs with eIF-2 catalyzed by eIF-2B.
The peptide attached to the tRNA in the P site is transferred to the amino group of the
aminoacyl-tRNA in the A site. This reaction is catalyzed by peptidyltransferase. This process is
termed transpeptidation. The elongated peptide now resides on a tRNA in the A site. The A site
needs to be freed in order to accept the next aminoacyl-tRNA. The process of moving the
peptidyl-tRNA from the A site to the P site is termed translocation. Translocation is catalyzed by
eEF-2 coupled to GTP hydrolysis. In the process of translocation the ribosome is moved along
the mRNA such that the next codon of the mRNA resides under the A site. Following
translocation eEF-2 is released from the ribosome. The cycle can now begin again. The ability of
eEF-2 to carry out translocation is regulated by the state of phosphorylation of the enzyme; when
phosphorylated the enzyme is inhibited. Phosphorylation of eEF-2 is catalyzed by the enzyme
eEF2 kinase (eEF2K). Regulation of eEF2K activity is normally under the control of insulin and
Ca2+ fluxes. The Ca2+-mediated effects are the result of calmodulin interaction with eEF2K.
Activation of eEF2K in skeletal muscle by Ca2+ is important to reduce consumption of ATP in the
process of protein synthesis during periods of exertion which will lead to release of intracellular
Ca2+ stores. eEF2K itself is also regulated by phosphorylation and one of the kinases that
phosphorylates the enzyme is regulated by mTOR (see Regulation of eIF-4E below). In addition,
the master metabolic regulatory kinase, AMP-activated protein kinase (AMPK) will phosphorylate
and activate eEF2K leading to inhibition of eEF-2 activity.
Termination
Like initiation and elongation, translational termination requires specific protein factors
identified as releasing factors, RFs in E. coli and eRFs in eukaryotes. There are 2 RFs in E. coli
and one in eukaryotes. The signals for termination are the same in both prokaryotes and
eukaryotes. These signals are termination codons present in the mRNA. There are 3 termination
codons, UAG, UAA and UGA.
In E. coli the termination codons UAA and UAG are recognized by RF-1, whereas RF-2
recognizes the termination codons UAA and UGA. The eRF binds to the A site of the ribosome in
conjunction with GTP. The binding of eRF to the ribosome stimulates the peptidyltransferase
activity to transfer the peptidyl group to water instead of an aminoacyl-tRNA. The resulting
uncharged tRNA left in the P site is expelled with concomitant hydrolysis of GTP. The inactive
ribosome then releases its mRNA and the 80S complex dissociates into the 40S and 60S
subunits ready for another round of translation.
Selenoproteins
The cellular levels of eIF-4E are the lowest of all eukaryotic initiation factors which makes this
factor a prime target for regulation. Indeed, at least 3 distinct mechanisms are known to exist that
regulate the level and activity of eIF-4E. These include regulation of the level of transcription of
the eIF-4E gene, post-translational modification via phosphorylation and inhibition by interaction
with binding proteins.
Although the exact mechanisms used to upregulate the transcription of the eIF-4E gene are
not yet well understood, it is known that exposure of cells to growth factors as well as activation of
T cells leads to increased expression of eIF-4E. The proto-oncogene MYC is believed to play a
role in the transcriptional activation of eIF-4E as 2 functional MYC-binding sites have been found
in the promoter region of the eIF-4E gene. Of significant note is the finding that cells that are
stably over-expressing the MYC gene also have enhanced levels of eIF-4E. Quite strikingly it has
been shown that promiscuous elevation in the levels of eIF-4E leads to tumorigenesis, placing this
translation factor in the category of proto-oncogene.
Numerous extracellular stimuli (e.g. insulin, EGF, angiotensin II and gastrin) that exert a
portion of their effects at the level of enhanced translation do so by affecting the state of eIF-4E
phosphorylation. However, it should be noted that not all signals that lead to increased eIF-4E
phosphorylation lead to increased rates of translation. Changes in eIF-4E phosphorylation
correlate well with progression through the cell cycle. In resting (G0) cells eIF-4E phosphorylation
is low; it increases during G1 and S phase and then declines again in M phase. Phosphorylation
of eIF-4E occurs at one major site which is Ser209 (in the human and mouse proteins).
The primary signal transduction pathway leading to eIF-4E phosphorylation is that involving
the RAS gene. Many growth factors stimulate activation of RAS in response to binding their
cognate receptors. Subsequently, RAS activation leads to the phosphorylation and activation of
MAP-interacting kinase-1 (Mnk1) which in turn phosphorylates eIF-4E. Although the exact effect
of eIF-4E phosphorylation is not clearly defined, it may be necessary to increase affinity of eIF-4E
for the mRNA cap structure and for eIF-4G.
The principal mechanism utilized in the regulation of eIF-4E activity is through its interaction
with a family of binding/repressor proteins termed 4EBPs (4E binding proteins) which are widely
distributed in numerous vertebrate and invertebrate organisms. In mammalian cells 3 related
4EBPs have been found where 4EBP1 and 4EBP2 are also identified as PHAS-I and PHAS-II
(PHAS refers to properties of heat and acid stability).
Binding of 4E-BPs to eIF-4E does not alter the affinity of eIF-4E for the cap structure but
prevents the interaction of eIF-4E with eIF-4G which in turn suppresses the formation of the eIF-
4F complex (see Table of Initiation Factors above). The ability of 4EBPs to interact with eIF-4E is
controlled via the phosphorylation of specific Ser and Thr residues in 4EBP. When
hypophosphorylated, 4EBPs bind with high efficiency to eIF-4E but lose their binding capacity
when phosphorylated. Numerous growth and signal transduction stimulating effectors lead to
phosphorylation of 4E-BPs just as these same responses can lead to phosphorylation of eIF-4E.
There are several signal transduction pathways whose activations lead to phosphorylation of
4E-BPs. These include pathways that lead to activation of phosphatidylinositol 3-kinase (PI3K),
the Akt Ser/Thr kinase which is also called protein kinase B (PKB) and the FKBP12-rapamycin-
associated protein/mammalian target of rapamycin (FRAP/mTOR) family of proteins. Akt was
originally identified as a virally encoded oncogene and there are now at least three members of
the PKB/Akt family identified as Akt1, Akt2, and Akt3. The mammalian TOR proteins are
homologs of the yeast TOR proteins that were identified in a screen for yeast mutants resistant to
rapamycin. Rapamycin is an immunosuppressant used primarily in the prevention of tissue
rejection following organ transplantation. Rapamycin functions within cells by binding the
immunophilin FK506-binding protein 12 (FKBP12). Immunophilins are intracellular proteins that
bind to immunosuppressive drugs such as FK506 and rapamycin. When rapamycin inhibits the
kinase activity of FRAP/mTOR, the kinase can no longer phosphorylate 4EBPs. One of the major effects of
insulin is increased protein synthesis and this effect is elicited, in part, via activation of mTOR
function. For more information on the regulation of protein synthesis by insulin see the Insulin
Action page.
Targets for mTOR regulation of translational initiation and elongation. AMPK = AMP-activated
kinase. TSC1 and TSC2 = Tuberous sclerosis tumor suppressors 1 (hamartin) and 2 (tuberin);
Rheb = Ras homolog enriched in brain; PKB/Akt = protein kinase B; 4EBP1 = eIF-4E binding
protein; p70S6K = 70kDa ribosomal protein S6 kinase, also called S6K; eEF2K = eukaryotic
elongation factor 2 kinase.
Regulation of mTOR activity is effected via several mechanisms. Activation of AMPK results in
phosphorylation and activation of the TSC1/TSC2 complex which results in inhibition of mTOR.
AMPK can also phosphorylate and inhibit mTOR. Conversely, activation of PKB (as in the case of
insulin receptor activation) leads to activation of mTOR either by inhibition of the TSC1/TSC2
complex or by phosphorylation and activation of mTOR directly. Activation of mTOR leads to
phosphorylation of p70S6K and 4EBP1. The net effect of phosphorylation of 4EBP1 is that it is
released from eIF-4E allowing eIF-4E to actively bind eIF-4G and recognize the cap structure of
mRNAs. Activated p70S6K phosphorylates and inhibits eEF2K. If eEF2K does not phosphorylate
eEF2 then translation elongation proceeds uninhibited.
The phosphorylation of eIF-2 is the result of an activity called heme-controlled inhibitor (HCI)
which functions as diagrammed below. HCI is generated in the absence of heme, a mitochondrial
product. Removal of phosphate is catalyzed by a specific eIF-2 phosphatase which is unaffected
by heme. The presence of HCI was first seen in an in vitro translation system derived from lysates of
reticulocytes. Reticulocytes synthesize almost exclusively hemoglobin at an extremely high rate.
In an intact reticulocyte eIF-2 is protected from phosphorylation by a specific 67 kDa protein.
The regulation of translation by heme controlled inhibitor (HCI). Control of translation by heme
is clinically important only in erythrocytes. Erythrocytes are enucleate and contain primarily globin
mRNA. When the level of heme (required for the synthesis of biologically active hemoglobin) is
low it would be inefficient for erythrocytes to synthesize globin protein. As the level of heme falls
the activity of HCI increases. HCI is a kinase which phosphorylates eIF-2. When phosphorylated,
eIF-2 still hydrolyzes bound GTP to GDP and still interacts with eIF-2B (GEF). However, the rate
of eIF-2B-mediated GTP exchange is greatly reduced. This renders eIF-2 incapable of being
used to form a new ternary initiation complex and translational initiation is reduced. When the
level of heme again rises the activity of HCI is reduced and translational initiation is once again
active.
Regulation of translation can also be induced in virally infected cells. It would benefit a virally
infected cell to turn off protein synthesis to prevent propagation of the viruses. This is
accomplished by the induced synthesis of interferons (IFs). There are 3 classes of IFs: the
leukocyte or α-IFs, the fibroblast or β-IFs and the lymphocyte or γ-IFs. IFs are induced by
dsRNAs and themselves induce a specific kinase termed RNA-dependent protein kinase (PKR)
that phosphorylates eIF-2 thereby shutting off translation in a similar manner to that of heme
control of translation. Additionally, IFs induce the synthesis of 2'-5'-oligoadenylate, pppA(2'p5'A)n,
that activates a pre-existing ribonuclease, RNase L. RNase L degrades all classes of mRNAs
thereby shutting off translation.
Regulation of the translation of certain mRNAs occurs through the action of specific RNA-
binding proteins. Proteins of this class have been identified that bind to sequences in either the 5'
non-translated region (5'-UTR) or 3'-UTR. Two particularly interesting and important regulatory
schemes related to iron metabolism encompass RNA binding proteins that bind to either the 5'-
UTR of one mRNA or the 3'-UTR of another.
The transferrin receptor is a protein located in the plasma membrane that binds the protein
transferrin. Transferrin is the major iron transport protein in the plasma. When iron levels are low
the level of the transferrin receptor mRNA increases so that cells can take up more
iron. This regulation occurs through the action of an iron response element binding protein (IRBP)
that binds to specific iron response elements (IREs) in the 3'-UTR of the transferrin receptor
mRNA. These IREs form hair-pin loop structures that are recognized by IRBP. This IRBP is an
iron-deficient form of aconitase, the iron-requiring enzyme of the TCA cycle. When iron levels are
low, IRBP is free of iron and can therefore interact with the IREs in the 3'-UTR of the transferrin
receptor mRNA. Transferrin receptor mRNA with IRBP bound is stabilized against degradation.
Conversely, when iron levels are high, IRBP binds iron and then cannot interact with the IREs in the
transferrin receptor mRNA. The effect is an increase in degradation of the transferrin receptor
mRNA.
A related, but opposite, phenomenon controls the translation of the ferritin mRNA. Ferritin is
an iron-binding protein that prevents toxic levels of ionized iron (Fe2+) from building up in cells.
The ferritin mRNA has an IRE in its 5'-UTR. As with the transferrin receptor story, when iron
levels are high, IRBP cannot bind to the IRE in the 5'-UTR of the ferritin mRNA. This allows the
ferritin mRNA to be translated. Conversely, when iron levels are low, the IRBP binds to the IRE in
the ferritin mRNA preventing its translation.
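The opposite outcomes for the two mRNAs follow from a single switch: whether IRBP is free to bind an IRE. A minimal sketch, with a hypothetical function name and a simple high/low view of iron status that ignores intermediate levels:

```python
def iron_response(iron_is_high):
    """Summarize the IRE/IRBP logic described above for one iron state."""
    # IRBP binds IREs only when it is free of iron, i.e. when iron is low.
    irbp_binds_ire = not iron_is_high
    return {
        # IRBP on the 3'-UTR IREs protects the mRNA from degradation.
        "transferrin_receptor_mrna_stabilized": irbp_binds_ire,
        # IRBP on the 5'-UTR IRE blocks translation of the ferritin mRNA.
        "ferritin_mrna_translated": not irbp_binds_ire,
    }


for state in (True, False):
    print("iron high" if state else "iron low", iron_response(state))
```

The same binding event therefore raises iron uptake capacity and lowers iron storage capacity when iron is scarce, and does the reverse when iron is plentiful.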
Many of the antibiotics utilized for the treatment of bacterial infections as well as certain toxins
function through the inhibition of translation. Inhibition can be effected at all stages of translation
from initiation to elongation to termination.
Inhibitor Comments
Tetracycline
Cycloheximide
Figure %: Translocation
With the A site open again, the next appropriate aminoacyl tRNA can bind there and the same
reaction takes place, yielding a three-amino acid peptide chain. This process repeats, creating a
polypeptide chain in the P site of the ribosome. A single ribosome can translate 60 nucleotides
per second. This speed can be vastly augmented when ribosomes link up to form polyribosomes.
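The quoted rate can be turned into a rough time estimate. The arithmetic below assumes the 60 nucleotides per second figure given in the text and three nucleotides per codon; the 300-residue protein is an invented example, and the function name is hypothetical.

```python
NUCLEOTIDES_PER_SECOND = 60  # rate quoted in the text
NUCLEOTIDES_PER_CODON = 3


def seconds_to_translate(n_residues):
    """Approximate time for one ribosome to add n_residues amino acids."""
    residues_per_second = NUCLEOTIDES_PER_SECOND / NUCLEOTIDES_PER_CODON
    return n_residues / residues_per_second


print(seconds_to_translate(300))  # 300 residues at 20 residues per second
```

At 20 residues per second a 300-residue chain takes about 15 seconds per ribosome. Several ribosomes working on the same mRNA do not shorten that time for any one chain, but they raise the number of chains completed per unit time.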
Termination
Translation ends when one of three stop codons, UAA, UAG, or UGA, enters the A site of
the ribosome. There are no aminoacyl tRNA molecules that recognize these sequences. Instead,
release factors bind to the A site, catalyzing the release of the completed polypeptide chain and
separating the ribosome into its original small and large subunits.
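The three stages can be tied together in a short sketch that starts at the first AUG, reads one codon at a time, and stops at a termination codon. It uses the standard genetic code in one-letter form; the function name translate and the example sequence are illustrative, and the ribosome, tRNAs, and protein factors are not modeled.

```python
from itertools import product

# Standard genetic code, U/C/A/G order at each position, "*" = stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(t): aa for t, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")        # initiation
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODE[mrna[i:i + 3]]
        if residue == "*":          # termination: UAA, UAG or UGA
            break
        peptide.append(residue)     # elongation: one residue per codon
    return "".join(peptide)


print(translate("GGAUGGCUGGAUAAUUU"))
```

In the example the reading frame is set by the AUG at the third nucleotide, so the codons read are AUG, GCU, GGA and then UAA, and nothing after the stop codon is translated.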
Codon Sheet
To initiate translation, a 30S ribosomal subunit binds to a short nucleotide sequence on the mRNA
called the ribosome binding site. However, translation doesn't usually begin until the 30S
ribosomal subunit reaches the first AUG sequence in the mRNA. For this reason, AUG is known
as the start codon. At this point, an initiation complex composed of the 30S subunit, a tRNA
having the anticodon UAC and carrying an altered form of the amino acid methionine (N-
formylmethionine or f-Met), and proteins called initiation factors is formed.
Fig. 4: Translation of mRNA by tRNA: 50S Ribosomal Subunit Attaches to the Initiation
Complex.
A 50S ribosomal subunit then attaches to the initiation complex and the initiation factors leave.
This forms the 70S ribosome.
Now an aminoacyl-tRNA with an anticodon complementary to the third codon, GGA, comes into
the "A" site of the ribosome.
Translation of mRNA by tRNA.
Once the anticodon of the tRNA at the "A" site forms hydrogen bonds with the second codon along the
mRNA, the amino acid being held by the tRNA at the "P" site of the ribosome is enzymatically removed
and forms a peptide bond with the amino acid carried by the tRNA at the "A" site.
Releasing factors (eRF) are capable of recognizing termination signal residues in the A site. The releasing
factor, in conjunction with GTP and peptidyl transferase, promotes the hydrolysis of the bond between
the peptide and the tRNA occupying the P site. The ribosome dissociates into 40S and 60S subunits.
Prokaryotic translation
Initiation
The process of initiation of translation in prokaryotes.
The ribosome has three sites: the A site, the P site, and the E site. The A site is the point of
entry for the aminoacyl tRNA (except for the first aminoacyl tRNA, fMet-tRNAfMet, which enters at
the P site). The P site is where the peptidyl tRNA is formed in the ribosome. The E site is
the exit site for the now uncharged tRNA after it gives its amino acid to the growing peptide
chain.
Elongation
Elongation of the polypeptide chain involves addition of amino acids to the carboxyl end of the
growing chain. The growing protein exits the ribosome through the polypeptide exit tunnel in the
large subunit[2].
Elongation starts when the fMet-tRNA enters the P site, causing a conformational change
which opens the A site for the new aminoacyl-tRNA to bind. This binding is facilitated by
elongation factor-Tu (EF-Tu), a small GTPase. Now the P site contains the beginning of the
peptide chain of the protein to be encoded and the A site has the next amino acid to be added to
the peptide chain. The growing polypeptide is detached from the tRNA in the P site, and a
peptide bond is formed between the last amino acid of the polypeptide and the amino acid still
attached to the tRNA in the A site. This process, known as
peptide bond formation, is catalyzed by a ribozyme (the 23S ribosomal RNA in the 50S ribosomal
subunit). Now, the A site has the newly formed peptide, while the P site has an uncharged tRNA
(tRNA with no amino acids). In the final stage of elongation, translocation, the ribosome moves 3
nucleotides towards the 3' end of the mRNA. Since tRNAs are linked to mRNA by codon-anticodon
base-pairing, tRNAs move relative to the ribosome taking the nascent polypeptide from the A site
to the P site and moving the uncharged tRNA to the E exit site. This process is catalyzed by
elongation factor G (EF-G).
The ribosome continues to translate the remaining codons on the mRNA as more aminoacyl-tRNAs
bind to the A site, until the ribosome reaches a stop codon on the mRNA (UAA, UGA, or UAG).
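The elongation cycle described above can be traced as a sequence of site occupancies. The sketch below is a simplification under stated assumptions: the function name `elongation_cycles` and the string labels are hypothetical, the E-site tRNA is assumed to have left before the next peptidyl transfer, and EF-Tu, EF-G and GTP hydrolysis are not modeled.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def elongation_cycles(codons):
    """Record E/P/A site contents after peptidyl transfer and after translocation.

    codons[0] is the start codon, already paired with fMet-tRNA in the P site.
    """
    chain = ["fMet"]
    trace = []
    for codon in codons[1:]:
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # stop codon: no aminoacyl-tRNA binds, elongation ends
        chain = chain + [residue]
        # After peptidyl transfer the chain sits on the A-site tRNA.
        trace.append({"E": None, "P": "uncharged tRNA", "A": "-".join(chain)})
        # After translocation the chain is in the P site and the A site is free.
        trace.append({"E": "uncharged tRNA", "P": "-".join(chain), "A": None})
    return trace


for step in elongation_cycles(["AUG", "GGA", "AAA", "UAA"]):
    print(step)
```

Each codon contributes two entries to the trace, which mirrors the two stages in the text: peptide bond formation followed by translocation.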
Termination
Termination occurs when one of the three termination codons moves into the A site. These
codons are not recognized by any tRNAs. Instead, they are recognized by proteins called release
factors, namely RF1 (recognizing the UAA and UAG stop codons) or RF2 (recognizing the UAA
and UGA stop codons). These factors trigger the hydrolysis of the ester bond in peptidyl-tRNA
and the release of the newly synthesized protein from the ribosome. A third release factor, RF3,
catalyzes the release of RF1 and RF2 at the end of the termination process.
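The stop codon specificities of the two release factors can be written as a small lookup table. The dictionary and the helper name `factors_for` are illustrative only; the assignments themselves are the ones given in the paragraph above.

```python
# Stop codons recognized by each prokaryotic release factor (from the text).
RELEASE_FACTORS = {
    "RF1": {"UAA", "UAG"},
    "RF2": {"UAA", "UGA"},
}


def factors_for(codon):
    """Return the release factors that recognize the codon in the A site."""
    return sorted(name for name, codons in RELEASE_FACTORS.items() if codon in codons)


print(factors_for("UAA"))  # ['RF1', 'RF2']
print(factors_for("UGA"))  # ['RF2']
```

UAA is the only stop codon read by both factors, while a sense codon such as AUG returns an empty list.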
Polysomes
Translation is carried out by more than one ribosome simultaneously. Because of the relatively
large size of ribosomes, they can only attach to sites on mRNA 35 nucleotides apart. The
complex of one mRNA and a number of ribosomes is called a polysome or polyribosome.
Effect of antibiotics
Several antibiotics exert their action by targeting the translation process
in bacteria. They exploit the differences between prokaryotic and eukaryotic
translation mechanisms to selectively inhibit protein synthesis in bacteria without
affecting the host.
Proteins that are membrane bound or are destined for secretion are synthesized by ribosomes
associated with the membranes of the endoplasmic reticulum (ER). The ER associated with
ribosomes is termed rough ER (RER). This class of proteins all contain an N-terminus termed a
signal sequence or signal peptide. The signal peptide is usually 13-36 predominantly
hydrophobic residues. The signal peptide is recognized by a multi-protein complex termed the
signal recognition particle (SRP). This signal peptide is removed following passage through the
endoplasmic reticulum membrane. The removal of the signal peptide is catalyzed by signal
peptidase. Proteins that contain a signal peptide are called preproteins to distinguish them from
proproteins. However, some proteins that are destined for secretion are also further proteolyzed
following secretion and therefore contain pro sequences. This class of proteins is termed
preproproteins.
Mechanism of synthesis of membrane bound or secreted proteins. Ribosomes engage the ER
membrane through interaction of the signal recognition particle (SRP) bound to the ribosome with the
SRP receptor in the ER membrane. As the protein is synthesized the signal sequence is passed
through the ER membrane into the lumen of the ER. After sufficient synthesis the signal peptide
is removed by the action of signal peptidase. Synthesis will continue and if the protein is secreted
it will end up completely in the lumen of the ER. If the protein is membrane associated a stop
transfer motif in the protein will stop the transfer of the protein through the ER membrane. This
will become the membrane spanning domain of the protein.
Proteolytic Cleavage
Most proteins undergo proteolytic cleavage following translation. The simplest form of this is
the removal of the initiation methionine. Many proteins are synthesized as inactive precursors that
are activated under proper physiological conditions by limited proteolysis. Pancreatic enzymes
and enzymes involved in clotting are examples of the latter. Inactive precursor proteins that are
activated by removal of polypeptides are termed proproteins.
Another example of a preproprotein is insulin. Since insulin is secreted from the pancreas it
has a prepeptide. Following cleavage of the 24 amino acid signal peptide the protein folds into
proinsulin. Proinsulin is further cleaved, yielding active insulin, which is composed of two peptide
chains linked together through disulfide bonds.
Still other proteins (of the enzyme class) are synthesized as inactive precursors called
zymogens. Zymogens are activated by proteolytic cleavage such as is the situation for several
proteins of the blood clotting cascade.
Acylation
Many proteins are modified at their N-termini following synthesis. In most cases the initiator
methionine is hydrolyzed and an acetyl group is added to the new N-terminal amino acid. Acetyl-
CoA is the acetyl donor for these reactions. Some proteins have the 14 carbon myristoyl group
added to their N-termini. The donor for this modification is myristoyl-CoA. This latter modification
allows association of the modified protein with membranes. The catalytic subunit of cyclic AMP-dependent
protein kinase (PKA) is myristoylated.
Methylation
Additional nitrogen methylations are found on the imidazole ring of histidine, the guanidino
moiety of arginine and the R-group amides of glutamine and asparagine. Methylation of the oxygen
of the R-group carboxylates of glutamate and aspartate also takes place and forms methyl esters.
Proteins can also be methylated on the thiol R-group of cysteine.
As indicated below, many proteins are modified at their C-terminus by prenylation near a
cysteine residue in the consensus CAAX. Following the prenylation reaction the protein is cleaved
at the peptide bond of the cysteine and the carboxylate residue is methylated by a prenylated
protein methyltransferase. One such protein that undergoes this type of modification is the proto-
oncogene RAS.
Phosphorylation
Physiologically relevant examples are the phosphorylations that occur in glycogen synthase
and glycogen phosphorylase in hepatocytes in response to glucagon release from the pancreas.
Phosphorylation of synthase inhibits its activity, whereas the activity of phosphorylase is
increased. These two events lead to increased hepatic glucose delivery to the blood.
The enzymes that phosphorylate proteins are termed kinases and those that remove
phosphates are termed phosphatases. Protein kinases catalyze reactions of the following type:
ATP + protein → phosphoprotein + ADP
In animal cells serine, threonine and tyrosine are the amino acids subject to phosphorylation.
The largest group of kinases are those that phosphorylate either serines or threonines and as
such are termed serine/threonine kinases. The ratio of phosphorylation of the three different
amino acids is approximately 1000/100/1 for serine/threonine/tyrosine.
Sulfation
Sulfate modification of proteins occurs at tyrosine residues such as in fibrinogen and in some
secreted proteins (e.g., gastrin). The universal sulfate donor is 3'-phosphoadenosyl-5'-
phosphosulphate (PAPS).
Since sulfate is added permanently it is necessary for the biological activity and not used as a
regulatory modification like that of tyrosine phosphorylation.
Prenylation
Prenylation refers to the addition of the 15 carbon farnesyl group or the 20 carbon
geranylgeranyl group to acceptor proteins, both of which are isoprenoid compounds derived from
the cholesterol biosynthetic pathway. The isoprenoid groups are attached to cysteine residues at
the carboxy terminus of proteins in a thioether linkage (C-S-C). A common consensus sequence
at the C-terminus of prenylated proteins has been identified and is composed of CAAX, where C
is cysteine, A is any aliphatic amino acid (except alanine) and X is the C-terminal amino acid. In
order for the prenylation reaction to occur the three C-terminal amino acids (AAX) are first
removed. Following attachment of the prenyl group the carboxylate of the cysteine is methylated
in a reaction utilizing S-adenosylmethionine as the methyl donor.
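The CAAX consensus can be checked with a short pattern match. This is only a sketch: the text does not list which residues count as aliphatic, so the set V, L, I used here is an assumption, and the function name `process_caax` and the example sequences are made up for illustration.

```python
import re

# C, then two aliphatic residues (assumed here to be V, L or I; alanine excluded
# as stated in the text), then any C-terminal residue X.
CAAX = re.compile(r"C[VLI]{2}[A-Z]$")


def process_caax(protein):
    """Return the sequence after removal of AAX, or None if there is no CAAX box.

    After AAX removal the cysteine is the new C-terminus; per the text it is
    the residue that carries the prenyl group and is then methylated.
    """
    if CAAX.search(protein) is None:
        return None
    return protein[:-3]


print(process_caax("MTEYKCVLS"))  # MTEYKC
print(process_caax("MTEYKCAAS"))  # None, alanine is excluded at the A positions
```

A stricter or looser residue set can be substituted in the character class without changing the rest of the function.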
In addition to numerous prenylated proteins that contain the CAAX consensus, prenylation is
known to occur on proteins of the RAB family of RAS-related G-proteins. There are at least 60
proteins in this family that are prenylated at either a CC or CXC element in their C-termini. The
RAB family of proteins are involved in signaling pathways that control intracellular membrane
trafficking.
Some of the most important proteins whose functions depend upon prenylation are those that
modulate immune responses. These include proteins involved in leukocyte motility, activation,
and proliferation and endothelial cell immune functions. It is these immune modulatory roles of
many prenylated proteins that are the basis for a portion of the anti-inflammatory actions of the
statin class of cholesterol synthesis-inhibiting drugs due to a reduction in the synthesis of
farnesylpyrophosphate and geranylpyrophosphate and thus reduced extent of inflammatory
events. Other important examples of prenylated proteins include the oncogenic GTP-binding and
hydrolyzing protein RAS and the γ-subunit of the visual protein transducin, both of which are
farnesylated. In addition, numerous GTP-binding and hydrolyzing proteins (termed G-proteins) of
signal transduction cascades have γ-subunits modified by geranylgeranylation.
Genetic code
From Wikipedia, the free encyclopedia
The genetic code is the set of rules by which information encoded in genetic
material (DNA or mRNA sequences) is translated into proteins (amino acid
sequences) by living cells. The code defines a mapping between tri-nucleotide
sequences, called codons, and amino acids. With some exceptions,[1] a triplet
codon in a nucleic acid sequence specifies a single amino acid. Because the vast
majority of genes are encoded with exactly the same code (see the RNA codon
table), this particular code is often referred to as the canonical or standard
genetic code, or simply the genetic code, though in fact there are many variant
codes. For example, protein synthesis in human mitochondria relies on a genetic
code that differs from the standard genetic code.
Not all genetic information is stored using the genetic code. All organisms'
DNA contains regulatory sequences, intergenic segments, and chromosomal
structural areas that can contribute greatly to phenotype. Those elements
operate under sets of rules that are distinct from the codon-to-amino acid
paradigm underlying the genetic code.
The fact that codons consist of three DNA bases was first demonstrated in the
Crick, Brenner et al. experiment. The first elucidation of a codon was done by
Marshall Nirenberg and Heinrich J. Matthaei in 1961 at the National Institutes of
Health. They used a cell-free system to translate a poly-uracil RNA sequence
(i.e., UUUUU...) and discovered that the polypeptide that they had synthesized
consisted of only the amino acid phenylalanine. They thereby deduced that the
codon UUU specified the amino acid phenylalanine. This was followed by
experiments in the laboratory of Severo Ochoa demonstrating that the poly-adenine
RNA sequence (AAAAA...) coded for the polypeptide poly-lysine[3] and that
the poly-cytosine RNA sequence (CCCCC...) coded for the polypeptide
poly-proline.[4] Therefore the codon AAA specified the amino acid lysine, and the
codon CCC specified the amino acid proline. Using different copolymers most of
the remaining codons were then determined. Extending this work, Nirenberg and
Philip Leder revealed the triplet nature of the genetic code and allowed the
codons of the standard genetic code to be deciphered. In these experiments
various combinations of mRNA were passed through a filter which contained
ribosomes, the components of cells that translate RNA into protein. Unique
triplets promoted the binding of specific tRNAs to the ribosome. Leder and
Nirenberg were able to determine the sequences of 54 out of 64 codons in their
experiments.[5]
Subsequent work by Har Gobind Khorana identified the rest of the genetic
code. Shortly thereafter, Robert W. Holley determined the structure of transfer
RNA (tRNA), the adapter molecule that facilitates the process of translating RNA
into protein. This work was based upon earlier studies by Severo Ochoa, who
received the Nobel prize in 1959 for his work on the enzymology of RNA
synthesis.[6] In 1968, Khorana, Holley and Nirenberg received the Nobel Prize in
Physiology or Medicine for their work.[7]
The standard genetic code is shown in the following tables. Table 1 shows
what amino acid each of the 64 codons specifies. Table 2 shows what codons
specify each of the 20 standard amino acids involved in translation. These are
called forward and reverse codon tables, respectively. For example, the codon
AAU represents the amino acid asparagine, and UGU and UGC represent
cysteine (standard three-letter designations, Asn and Cys, respectively).[8]:522
The DNA codon table is essentially identical to that for RNA, but with U
replaced by T.
Salient features
A codon is defined by the initial nucleotide from which translation starts. For
example, the string GGGAAACCC, if read from the first position, contains the
codons GGG, AAA and CCC; and, if read from the second position, it contains
the codons GGA and AAC; if read starting from the third position, GAA and ACC.
Every sequence can thus be read in three reading frames, each of which will
produce a different amino acid sequence (in the given example, Gly-Lys-Pro,
Gly-Asn, or Glu-Thr, respectively). With double-stranded DNA there are six
possible reading frames, three in the forward orientation on one strand and three
reverse on the opposite strand.[10]:330 The actual frame in which a protein
sequence is translated is defined by a start codon, usually the first AUG codon in
the mRNA sequence.
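The three forward reading frames of the example string can be reproduced with a short sketch. The function names `reading_frames` and `frame_peptides` are chosen for this illustration, and the codon table is the standard code in one-letter amino acid notation.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reading_frames(seq):
    """Split a sequence into complete codons for each of the three offsets."""
    return [
        [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        for offset in range(3)
    ]


def frame_peptides(seq):
    """Translate each reading frame into one-letter amino acid codes."""
    return ["".join(CODON_TABLE[c] for c in frame) for frame in reading_frames(seq)]


print(reading_frames("GGGAAACCC"))
# [['GGG', 'AAA', 'CCC'], ['GGA', 'AAC'], ['GAA', 'ACC']]
print(frame_peptides("GGGAAACCC"))
# ['GKP', 'GN', 'ET']
```

The one-letter results correspond to Gly-Lys-Pro, Gly-Asn and Glu-Thr. The three frames of the opposite DNA strand would be obtained by applying the same function to the reverse complement.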
Start/stop codons
Translation starts with a chain initiation codon (start codon). Unlike stop
codons, the codon alone is not sufficient to begin the process. Nearby sequences
(such as the Shine-Dalgarno sequence in E. coli) and initiation factors are also
required to start translation. The most common start codon is AUG which is read
as methionine or, in bacteria, as formylmethionine. Alternative start codons
(depending on the organism), include "GUG" or "UUG", which normally code for
valine or leucine, respectively. However, when used as a start codon, these
alternative start codons are translated as methionine or formylmethionine.[11]
The three stop codons have been given names: UAG is amber, UGA is opal
(sometimes also called umber), and UAA is ochre. "Amber" was named by
discoverers Richard Epstein and Charles Steinberg after their friend Harris
Bernstein, whose last name means "amber" in German. The other two stop
codons were named "ochre" and "opal" in order to keep the "color names" theme.
Stop codons are also called "termination" or "nonsense" codons and they signal
release of the nascent polypeptide from the ribosome due to binding of release
factors in the absence of cognate tRNAs with anticodons complementary to
these stop signals.[12]
Effect of mutations
The genetic code has redundancy but no ambiguity (see the codon tables
above for the full correlation). For example, although codons GAA and GAG both
specify glutamic acid (redundancy), neither of them specifies any other amino
acid (no ambiguity). The codons encoding one amino acid may differ in any of
their three positions. For example the amino acid glutamic acid is specified by
GAA and GAG codons (difference in the third position), the amino acid leucine is
specified by UUA, UUG, CUU, CUC, CUA, CUG codons (difference in the first or
third position), while the amino acid serine is specified by UCA, UCG, UCC,
UCU, AGU, AGC (difference in the first, second or third position).[8]:521–522
A position of a codon is said to be a fourfold degenerate site if any nucleotide
at this position specifies the same amino acid. For example, the third position of
the glycine codons (GGA, GGG, GGC, GGU) is a fourfold degenerate site,
because all nucleotide substitutions at this site are synonymous; i.e., they do not
change the amino acid. Only the third positions of some codons may be fourfold
degenerate.[8]:521–522 A position of a codon is said to be a twofold degenerate site if
only two of four possible nucleotides at this position specify the same amino acid.
For example, the third position of the glutamic acid codons (GAA, GAG) is a
twofold degenerate site. In twofold degenerate sites, the equivalent nucleotides
are always either two purines (A/G) or two pyrimidines (C/U), so only
transversional substitutions (purine to pyrimidine or pyrimidine to purine) in
twofold degenerate sites are nonsynonymous.[8]:521–522 A position of a codon is
said to be a non-degenerate site if any mutation at this position results in amino
acid substitution. There is only one threefold degenerate site where changing to
three of the four nucleotides may have no effect on the amino acid (depending on
what it is changed to), while changing to the fourth possible nucleotide always
results in an amino acid substitution. This is the third position of an isoleucine
codon: AUU, AUC, or AUA all encode isoleucine, but AUG encodes methionine.
In computation this position is often treated as a twofold degenerate site.[8]:521–522
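The degeneracy classes defined above can be computed directly from the codon table. In this sketch the helper `degeneracy` (a name chosen here) counts how many of the four bases at a given position leave the amino acid unchanged, so a value of 1 marks a non-degenerate site.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(codon, pos):
    """Count bases at `pos` (0, 1 or 2) that encode the same amino acid."""
    amino = CODON_TABLE[codon]
    return sum(
        CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == amino
        for base in BASES
    )


print(degeneracy("GGA", 2))  # 4, glycine third position
print(degeneracy("GAA", 2))  # 2, glutamic acid third position
print(degeneracy("AUU", 2))  # 3, isoleucine third position
print(degeneracy("AUG", 2))  # 1, methionine is non-degenerate
```

Scanning all 64 codons with this function confirms two statements from the text: fourfold degeneracy occurs only at third positions, and the only threefold degenerate site is the third position of the isoleucine codons.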
There are three amino acids encoded by six different codons: serine, leucine,
and arginine. Only two amino acids are specified by a single codon. One of these
is the amino-acid methionine, specified by the codon AUG, which also specifies
the start of translation; the other is tryptophan, specified by the codon UGG. The
degeneracy of the genetic code is what accounts for the existence of
synonymous mutations.[8]:Chp 15
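These counts can be verified by tallying the standard codon table. The variable names below are illustrative; amino acids are given in one-letter codes, with `*` standing for a stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Number of codons assigned to each amino acid (and to stop, '*').
codon_counts = Counter(CODON_TABLE.values())

six_codon = sorted(a for a, n in codon_counts.items() if n == 6)
single_codon = sorted(a for a, n in codon_counts.items() if n == 1)

print(six_codon)     # ['L', 'R', 'S'] = leucine, arginine, serine
print(single_codon)  # ['M', 'W'] = methionine, tryptophan
```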
Degeneracy results because there are more codons than encodable amino
acids. For example, if there were two bases per codon, then only 16 amino acids
could be coded for (4²=16). Because at least 21 codes are required (20 amino
acids plus stop), and the next largest number of bases is three, then 4³ gives 64
possible codons, meaning that some degeneracy must exist.[8]:521–522
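The counting argument can be checked numerically. The helper name `min_codon_length` is an assumption of this sketch; it returns the smallest codon length whose number of combinations covers the required number of meanings.

```python
def min_codon_length(meanings, alphabet_size=4):
    """Smallest n such that alphabet_size ** n >= meanings."""
    n = 1
    while alphabet_size ** n < meanings:
        n += 1
    return n


print(4 ** 2, 4 ** 3)        # 16 64
print(min_codon_length(21))  # 3
```

With 64 codons available for 21 meanings, at least some meanings must be assigned more than one codon.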
These properties of the genetic code make it more fault-tolerant for point
mutations. For example, in theory, fourfold degenerate codons can tolerate any
point mutation at the third position, although codon usage bias restricts this in
practice in many organisms; twofold degenerate codons can tolerate one out of
the three possible point mutations at the third position. Since transition mutations
(purine to purine or pyrimidine to pyrimidine mutations) are more likely than
transversion (purine to pyrimidine or vice-versa) mutations, the equivalence of
purines or that of pyrimidines at twofold degenerate sites adds a further fault-
tolerance.[8]:531–532
Grouping of codons by amino acid residue molar volume and hydropathy.
Even so, single point mutations can still cause dysfunctional proteins. For
example, a mutated hemoglobin gene causes sickle-cell disease. In the mutant
hemoglobin a hydrophilic glutamate (Glu) is substituted by the hydrophobic valine
(Val), that is, GAA or GAG becomes GUA or GUG. The substitution of glutamate
by valine reduces the solubility of β-globin which causes hemoglobin to form
linear polymers linked by the hydrophobic interaction between the valine groups
causing sickle-cell deformation of erythrocytes. Sickle-cell disease is generally
not caused by a de novo mutation. Rather it is selected for in malarial regions (in
a way similar to thalassemia), as heterozygous people have some resistance to
the malarial Plasmodium parasite (heterozygote advantage).[29]
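The effect of a single base substitution can be classified by translating the codon before and after the change. This is a minimal sketch; the function name `effect` and the tuple it returns are choices made for this example.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def effect(codon, pos, base):
    """Return (mutant codon, old amino acid, new amino acid, classification)."""
    mutant = codon[:pos] + base + codon[pos + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    kind = "synonymous" if before == after else "nonsynonymous"
    return mutant, before, after, kind


print(effect("GAG", 1, "U"))  # ('GUG', 'E', 'V', 'nonsynonymous')
print(effect("GAG", 2, "A"))  # ('GAA', 'E', 'E', 'synonymous')
```

The first call is the sickle-cell substitution of glutamate (E) by valine (V); the second shows a third-position change in the same codon that leaves the protein unaltered.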
These variable codes for amino acids are possible because of modified bases
at the first position of the anticodon of the tRNA; the base pair formed is called
a wobble base pair. Wobble pairings include those formed by the modified base inosine
and the non-Watson-Crick U-G base pair.[30]
While slight variations on the standard code had been predicted earlier,[31]
none were discovered until 1979, when researchers studying human
mitochondrial genes discovered they used an alternative code. Many slight
variants have been discovered since,[32] including various alternative
mitochondrial codes,[33] as well as small variants such as Mycoplasma translating
the codon UGA as tryptophan and Candida species translating CUG as a serine
rather than a leucine.[34][35] In bacteria and archaea, GUG and UUG are common
start codons. However, in rare cases, certain specific proteins may use
alternative initiation (start) codons not normally used by that species.[32]
In certain proteins, non-standard amino acids are substituted for standard stop
codons, depending upon associated signal sequences in the messenger RNA:
UGA can code for selenocysteine and UAG can code for pyrrolysine as
discussed in the relevant articles. Selenocysteine is now viewed as the 21st
amino acid, and pyrrolysine is viewed as the 22nd.[32]
Since 2001, 40 non-natural amino acids have been added into protein by
creating a unique codon (recoding) and a corresponding transfer-RNA:aminoacyl-tRNA-synthetase
pair to encode each one. These amino acids have diverse physicochemical and biological
properties and are used as tools for exploring protein structure and function
or for creating novel or enhanced proteins.[36][37]
Despite the minor variations that exist, the genetic code used by all known
forms of life is nearly universal. However, there are a huge number of possible
genetic codes. If amino acids are randomly associated with triplet codons, there
will be 1.5 × 10⁸⁴ possible genetic codes.[38]
There are four themes running through the many theories that seek to explain
the evolution of the genetic code (and hence the origin of these patterns):[44]
• Chemical principles govern specific RNA interaction with amino acids. Aptamer
experiments showed that some amino acids have a selective chemical affinity for
the base triplets that code for them.[45] Recent experiments show that of the 8
amino acids tested, 6 show some RNA triplet-amino acid association.[46][47] This
has been called the stereochemical code. The stereochemical code could have
created an ancient core of assignments. The current complex translation
mechanism involving tRNA and associated enzymes may be a later development;
originally, protein sequences may have been directly templated on base sequences.
• Biosynthetic expansion. The standard modern genetic code grew from a simpler
earlier code through a process of "biosynthetic expansion". Here the idea is that
primordial life "discovered" new amino acids (e.g., as by-products of metabolism)
and later back-incorporated some of these into the machinery of genetic coding.
Although much circumstantial evidence has been found to suggest that fewer
different amino acids were used in the past than today,[48] precise and detailed
hypotheses about exactly which amino acids entered the code in exactly what
order have proved far more controversial.[49][50]
• Natural selection has led to codon assignments of the genetic code that minimize
the effects of mutations.[51] A recent hypothesis[52] suggests that the triplet code
was derived from codes that used longer than triplet codons. Longer than triplet
decoding has higher degree of codon redundancy and is more error resistant than
the triplet decoding. This feature could allow accurate decoding in the absence of
highly complex translational machinery such as the ribosome.
• Information channels: Information-theoretic approaches see the genetic code as
an error-prone information channel [53]. The inherent noise (i.e. errors) in the
channel poses the organism with a fundamental question: how to construct a
genetic code that can withstand the impact of noise [54] while accurately and
efficiently translating information? These “rate-distortion” models [55] suggest that
the genetic code originated as a result of the interplay of the three conflicting
evolutionary forces: the needs for diverse amino-acids [56], for error-tolerance [51]
and for minimal cost of resources. The code emerges at a coding transition when
the mapping of codons to amino-acids becomes nonrandom. The emergence of
the code is governed by the topology defined by the probable errors and is related
to the map coloring problem.
On the basis of their studies with the lac system, and results such as the
PaJaMo experiment, François Jacob and Jacques Monod proposed the
Operon Model of Gene Expression in bacteria.
[Lod11-5]
The following are the important features of the model:
Structural Genes
these genes code for protein and RNA molecules that are required for normal
enzymatic or structural functions in the cell.
Regulator Genes
these genes code for protein and RNA molecules whose function is to regulate
the expression of other genes. Because these gene products act at another site,
they are trans-acting factors.
If the regulatory protein is a repressor, the site on the DNA to which it binds is
called an OPERATOR.
In the absence of a repressor, RNA polymerase can bind to the promoter and
initiate transcription of the operon:
In the presence of a repressor, RNA polymerase is unable to transcribe the
operon:
The exact details whereby repressor interferes with RNA polymerase and its
ability to transcribe need to be described on an operon-by-operon basis.
Structural Genes
[MVH26-17]
The lacY gene is 1251 bp in length and codes for a 30 KDal monomeric
membrane protein of 417 amino acids.
Regulator Genes
The lac operon has a regulator gene: lacI which codes for the regulatory
protein, lactose repressor. The lacI gene is 1080 bp in length; the repressor
functions as a tetramer.
Gene Organization
The three structural genes are organized as a unit -- lacZ-lacY-lacA and are
expressed as a unit from lacZ through lacA.
Regulator genes code for diffusible molecules which can potentially act at
many other locations in the genome.
In the absence of an inducer, the lactose repressor binds to its operator and
blocks RNA polymerase from transcribing the structural genes of the operon:
[27-9a] [MVH26-18]
Note:
• the lacI gene is expressed from its own promoter which is a very weak promoter
so the amount of mRNA transcribed, and, hence, the amount of protein made, will
be low.
[27-9b] [MVH26-18]
Note:
• Lactose is not per se the true effector. Allolactose is the true inducer of the
operon. Isopropyl-thio-galactoside (IPTG) is an artificial inducer of the operon,
one that is commonly used in research laboratories.
[allolactose] [IPTG]
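The repressor logic described in these notes reduces to a simple truth table. The sketch below is a deliberately coarse model: the function name `lac_transcribed` and its parameters are hypothetical, it treats repressor binding as all-or-nothing, and it ignores any other level of control of the operon.

```python
def lac_transcribed(inducer_present, repressor_functional=True):
    """True if RNA polymerase can transcribe lacZ-lacY-lacA.

    The repressor sits on the operator only when it is functional and no
    inducer (allolactose or IPTG) is bound to it.
    """
    repressor_on_operator = repressor_functional and not inducer_present
    return not repressor_on_operator


print(lac_transcribed(inducer_present=False))  # False, operon repressed
print(lac_transcribed(inducer_present=True))   # True, operon induced
print(lac_transcribed(False, repressor_functional=False))  # True, no active repressor
```

The third call corresponds to a cell that makes no functional repressor, in which the structural genes are expressed whether or not inducer is present.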
As we will see, a key aspect of the lactose repressor and its function are its
dual properties:
Protein synthesis can be divided into the same three phases as any of the
other polymerization reactions we have discussed in this course, but it also
contains an explicit fourth phase:
Initiation
Elongation
whereby the correct amino acid is brought to the ribosome, is joined to the
nascent polypeptide chain, and the entire assembly moves one position along
the mRNA.
Termination
Disassembly
whereby a special factor binds to the ribosome so that it can release the
mRNA and tRNA that is still bound to it and so that it can be recycled in another
round of protein synthesis.
There are two rules about protein synthesis to keep in mind:
Initiation
The following ingredients are needed for this phase of protein synthesis:
• The mRNA
[26-27] [MVH27-20]
IF3 promotes the dissociation of the ribosome into its two component subunits.
The presence of IF3 permits the assembly of the initiation complex and prevents
binding of the 50S subunit prematurely.
IF1 assists IF3 in some way, perhaps by increasing the dissociation rate of the
30S and 50S subunits of the ribosome.
Binding of the mRNA and the fMet-tRNAfMet
IF3 assists the mRNA to bind with the 30S subunit of the ribosome so that the
start codon is correctly positioned at the peptidyl site of the ribosome. The
mRNA is positioned by means of base-pairing between the 3' end of the 16S
rRNA with the Shine-Dalgarno sequence immediately upstream of the start
codon.
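Start codon selection by Shine-Dalgarno positioning can be sketched as a string search. Several details here are assumptions rather than statements from these notes: the consensus `AGGAGG` is used as the Shine-Dalgarno motif, an exact match is required instead of partial base-pairing with the 16S rRNA, and the 5 to 11 nucleotide spacing is taken from the description of the ribosome binding site earlier in the document. The function name `find_start` and the test message are made up.

```python
SHINE_DALGARNO = "AGGAGG"  # assumed consensus, complementary to the 16S rRNA 3' end


def find_start(mrna, min_gap=5, max_gap=11):
    """Return the index of the first AUG lying min_gap..max_gap nt after an SD motif."""
    sd = mrna.find(SHINE_DALGARNO)
    while sd != -1:
        sd_end = sd + len(SHINE_DALGARNO)
        for gap in range(min_gap, max_gap + 1):
            i = sd_end + gap
            if mrna[i:i + 3] == "AUG":
                return i
        sd = mrna.find(SHINE_DALGARNO, sd + 1)  # try the next SD-like motif
    return None  # no correctly spaced start codon


print(find_start("UUAGGAGGUAACAUAUGGCU"))  # 14
print(find_start("AUGAAA"))                # None
```

In the first message the motif ends six nucleotides before the AUG, so that AUG is selected; an AUG with no upstream motif is not treated as a start site in this model.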
IF2(GTP) assists the fMet-tRNAfMet to bind to the 30S subunit in the correct site
- the P site.
It is not clear whether the mRNA or fMet-tRNAfMet binds first. It may be that
either can bind first.
At this stage of assembly, the 30S initiation complex is complete and IF3 can
dissociate.
The following diagram illustrates the relative rates of these events during
initiation:
Diagram from:
Late events of translation initiation in bacteria: a
kinetic analysis
J. Tomic, L.A. Vitali, T. Daviter, A. Savelsbergh, R.
Spurio, P. Striebeck, W. Wintermeyer, M.V. Rodnina
and C.O. Gualerzi
The EMBO Journal, Vol. 19, No. 9 pp. 2127-2136,
2000
Elongation
Three special Elongation Factors are required for this phase of protein
synthesis: EF-Tu (GTP), EF-Ts and EF-G (GTP).
[Image]
[26-31]
A new codon is now positioned at the A site and awaits a new aminoacyl-
tRNA.
[26-28] [MVH27-22]
The elongation factor, EF-Tu (GTP) binds with an aminoacyl-tRNA and brings
it to the ribosome. Once the correct aminoacyl-tRNA is positioned in the
ribosome, GTP is hydrolyzed, EF-Tu (GDP) undergoes a conformational change
and then dissociates away from the ribosome.
There are two ways that EF-Tu functions to ensure that the correct aminoacyl-
tRNA is in place:
• EF-Tu prevents the aminoacyl end of the charged tRNA from entering the A site
on the ribosome. This ensures that codon-anticodon pairing is checked first before
the charged tRNA is irreversibly bound in the A site and a new, potentially
incorrect, peptide bond is made.
• GTP hydrolysis is SLOW and EF-Tu cannot dissociate from the ribosome until it
occurs. The amount of time prior to GTP hydrolysis allows the final fidelity
check to take place. Hydrolysis is associated with a conformational change in EF-Tu.
[26-29]
Experiments with GTP analogues established these results:
o If a GTP analogue such as GTP-γ-S, which is hydrolyzed very slowly,
is used, then protein synthesis slows down because of the slow rate of
hydrolysis, but it also becomes more accurate because there is more time to
check that the correct aminoacyl-tRNA is in place.
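The timing argument above can be illustrated with a toy competition model. This is only a sketch, not a description of measured kinetics: it assumes that a bound aminoacyl-tRNA either stays until GTP is hydrolyzed (and is accepted) or dissociates first, and every rate constant below is a hypothetical value chosen for illustration.

```python
def acceptance_probability(k_hydrolysis: float, k_dissociation: float) -> float:
    """Chance that GTP hydrolysis happens before the tRNA dissociates."""
    return k_hydrolysis / (k_hydrolysis + k_dissociation)


def error_ratio(k_hydrolysis: float, k_off_correct: float, k_off_wrong: float) -> float:
    """Acceptance of a mismatched tRNA relative to a matched one (lower = more accurate)."""
    wrong = acceptance_probability(k_hydrolysis, k_off_wrong)
    right = acceptance_probability(k_hydrolysis, k_off_correct)
    return wrong / right


# Hypothetical rate constants (per second), for illustration only.
K_OFF_CORRECT = 0.1   # a matched codon-anticodon pair dissociates slowly
K_OFF_WRONG = 10.0    # a mismatched pair dissociates quickly

fast = error_ratio(10.0, K_OFF_CORRECT, K_OFF_WRONG)  # rapidly hydrolyzed nucleotide
slow = error_ratio(0.1, K_OFF_CORRECT, K_OFF_WRONG)   # slowly hydrolyzed analogue

print(f"error ratio with fast hydrolysis: {fast:.3f}")
print(f"error ratio with slow hydrolysis: {slow:.3f}")
```

In this toy model, slowing hydrolysis lowers the error ratio but also lowers the acceptance probability for the correct tRNA, which mirrors the slower but more accurate synthesis seen with GTP-γ-S.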
The following diagram illustrates the relative rates of events during EF-Tu
dependent tRNA binding:
Diagram from:
Late events of translation initiation in bacteria: a kinetic analysis
J. Tomic, L.A. Vitali, T. Daviter, A. Savelsbergh, R. Spurio, P. Striebeck, W.
Wintermeyer, M.V. Rodnina and C.O. Gualerzi
The EMBO Journal, Vol. 19, No. 9 pp. 2127-2136, 2000
Kanamycin Causes misreading of the code by interfering with the wobble base
pairing.
Streptomycin This antibiotic was the first aminoglycoside characterized. It inhibits
prokaryotic ribosomes in a couple of ways. It causes misreading by
interfering with the normal pairing between codon and anticodon. It can
also prevent initiation. Streptomycin-resistant bacteria carry an altered
ribosomal protein S12.
[Box26-4-2]
Tetracycline Inhibits aminoacyl-tRNA binding to the A site on the ribosome.
Kirromycin Blocks dissociation of GDP from EF-Tu after hydrolysis. This prevents
dissociation of EF-Tu from the ribosome and effectively stalls protein
synthesis.
EF-Tu is the most abundant protein in the E. coli cell. There are approximately
70,000-100,000 molecules/cell, which is 5% of the total cell protein. There are also
approximately 70,000-100,000 tRNA molecules/cell. Nearly all of the aminoacyl-tRNA
in the cell is bound by EF-Tu.
EF-Tu cannot bind with tRNAfMet. This tRNA has a slight difference in its
structure compared with that of tRNAMet which means that it is not bound by EF-
Tu.
EF-Tu (GDP) is inactive and cannot bind aminoacylated tRNAs. However,
EF-Tu has a higher affinity for GDP (Kd = 10^-8 M) than for GTP (Kd = 10^-6 M).
In order to recycle EF-Tu, the elongation factor EF-Ts binds to the EF-Tu
(GDP) complex to displace the GDP. GTP then, in turn, displaces EF-Ts. Many
other G-proteins require a guanine nucleotide release protein (GNRP) to
release GDP; EF-Ts is the GNRP for EF-Tu.
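The size of the affinity difference quoted above can be put into numbers. The sketch below reads the two quoted constants as dissociation constants (they carry units of M, and the smaller value is described as the higher affinity). The temperature of 298.15 K is an assumption used only to convert the ratio into a free-energy difference.

```python
import math

R = 8.314        # gas constant, J/(mol*K)
T = 298.15       # assumed temperature, K (not given in the text)

KD_GDP = 1e-8    # M, value quoted in the text for GDP
KD_GTP = 1e-6    # M, value quoted in the text for GTP

# How many times more tightly GDP is bound than GTP.
fold_preference = KD_GTP / KD_GDP

# Difference in binding free energy corresponding to that ratio, in kJ/mol.
delta_delta_g = R * T * math.log(fold_preference) / 1000

print(f"GDP is bound about {fold_preference:.0f}-fold more tightly than GTP")
print(f"difference in binding free energy: about {delta_delta_g:.1f} kJ/mol")
```

A preference of this size for GDP is why spontaneous nucleotide exchange is unfavorable and an exchange factor such as EF-Ts is needed to recycle EF-Tu.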
[MVH27-23]
Formation of the new peptide bond (Transpeptidation)
[26-23]
Adenine 2451 (in the E. coli 23S rRNA) is located in a microenvironment such
that the pKa is shifted by 4 units to a value of 7.6. This permits it to act as a
general acid/base for catalysis as shown above. This adenine is universally
conserved in all known 23S rRNAs.
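The significance of the pKa shift can be illustrated with the Henderson-Hasselbalch relationship. In the sketch below, the pH of 7.0 is an assumed, roughly physiological value, and the unshifted pKa of 3.6 is simply the quoted 7.6 minus the quoted shift of 4 units.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of the group in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


ASSUMED_PH = 7.0                   # assumption, not stated in the text
SHIFTED_PKA = 7.6                  # value quoted in the text
UNSHIFTED_PKA = SHIFTED_PKA - 4    # pKa before the 4-unit shift

shifted = fraction_protonated(SHIFTED_PKA, ASSUMED_PH)
unshifted = fraction_protonated(UNSHIFTED_PKA, ASSUMED_PH)

print(f"fraction protonated with pKa {SHIFTED_PKA}: {shifted:.2f}")
print(f"fraction protonated with pKa {UNSHIFTED_PKA:.1f}: {unshifted:.5f}")
```

With a pKa near neutral pH, substantial amounts of both the protonated and unprotonated forms are present, which is the condition a general acid/base catalyst needs. With the unshifted pKa the group would be almost entirely unprotonated.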
[Box26-4-1]
Finally, the ribosome translocates along the mRNA thereby moving the new
peptidyl-tRNA to the P site and the old (now uncharged) tRNA, which has just
lost its peptidyl chain, to the E site. This step requires the elongation factor, EF-
G(GTP). There are 20,000 molecules/cell of EF-G which is the same as the
number of ribosomes.
EF-G blocks the binding of aminoacyl tRNAs to the A site as well as blocking
the binding of Release Factors. It effectively makes sure that translocation must
take place before the cycle continues.
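The movement of tRNAs during translocation can be pictured with a very small model of the three ribosomal sites. This is only a bookkeeping sketch: the dictionary layout and the function name `translocate` are hypothetical, and nothing here models EF-G or GTP hydrolysis.

```python
from typing import Dict, Optional

Sites = Dict[str, Optional[str]]


def translocate(sites: Sites) -> Sites:
    """Shift each tRNA by one site: A -> P and P -> E, leaving the A site empty."""
    return {"E": sites["P"], "P": sites["A"], "A": None}


# State just after peptide bond formation, as described in the text.
before: Sites = {"E": None, "P": "uncharged tRNA", "A": "peptidyl-tRNA"}
after = translocate(before)

for site in ("E", "P", "A"):
    print(site, "->", after[site])
```

After the shift the A site is empty, so a new aminoacyl-tRNA can bind and the cycle can repeat.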
EF-G and the tRNA-EF-Tu complex are mutually exclusive. The structures of
these two are remarkably similar and demonstrate very nicely why these two
cannot bind to the ribosome simultaneously:
Figure panels: Phe-tRNA-EF-Tu and EF-G
[26-30]
The following figure compares the binding of tRNA-EF-Tu and EF-G with the
ribosome. Notice the similarity in the manner in which both the structures can fit
into the anticodon binding part of the A site. Notice also that there are differences
in the manner in which EF-Tu and EF-G interact with the ribosome.
Image adapted from:
Note that as a new protein is being synthesized, it must leave the ribosome.
Structural studies show that there is an exit tunnel, but it is quite narrow, so
it is unlikely that any significant protein folding could occur within the
ribosome. The following image is a cross-section through the ribosome showing
the rRNA (grey), the ribosomal proteins (green), the peptidyl transferase
centre (PT), and the nascent polypeptide (white).
Termination
The final phase of protein synthesis requires that the finished polypeptide
chain be detached from a tRNA. This can only happen in response to the signal
that a stop codon has been reached.
[26-32] [MVH27-26]
There are no tRNAs that recognize the stop codons (except the tRNAs for
selenocysteine and pyrrolysine as well as the suppressor tRNAs). Rather, stop
codons are recognized by release factor RF1 (which recognizes the UAA and
UAG stop codons) or RF2 (which recognizes the UAA and UGA stop codons).
These release factors act at the A site of the ribosome. A third release factor,
RF3 (GTP), stimulates the binding of RF1 and RF2.
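The codon specificity of the two release factors described above can be written as a simple lookup. The names `RELEASE_FACTORS` and `release_factors_for` are hypothetical and used only for this sketch.

```python
from typing import Tuple

# Stop codon -> release factors that recognize it, as stated in the text.
RELEASE_FACTORS = {
    "UAA": ("RF1", "RF2"),
    "UAG": ("RF1",),
    "UGA": ("RF2",),
}


def release_factors_for(codon: str) -> Tuple[str, ...]:
    """Return the release factors for a codon, or an empty tuple for a sense codon."""
    return RELEASE_FACTORS.get(codon.upper(), ())


for codon in ("UAA", "UAG", "UGA", "AUG"):
    factors = release_factors_for(codon)
    print(codon, "->", ", ".join(factors) if factors else "not a stop codon")
```

UAA is the only stop codon recognized by both factors.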
Disassembly
There is one final step in the overall cycle of protein synthesis, namely,
disassembly of the ribosome. In bacteria this requires the participation of the
ribosome recycling factor (RRF).
Following the action of the release factors, the ribosome complex contains a
70S ribosome, a bound mRNA, an empty A-site, and a deacylated tRNA in the
P-site. RRF along with EF-G(GTP) disassembles the complex.
RRF is a small protein containing 185 amino acids. Structurally, it contains two
domains. Overall, the shape of the molecule mimics that of tRNA. Much like EF-
Tu and EF-G, this mimicry may underlie the explanation of how this protein
functions.
Image adapted from:
M. Selmer, S. Al-Karadaghi, G. Hirokawa, A. Kaji, A. Liljas (1999) Crystal
Structure of Thermotoga maritima Ribosome Recycling Factor: A tRNA
Mimic. Science 286: 2349-2352.
EF-G and EF-Tu are shown in the left two images, respectively. The third
image shows a superposition of RRF with a tRNA molecule. Note, however, that
RRF does not have any structural elements that correspond with the acceptor
arm of the tRNA.
It is thought that RRF could bind to the A-site of the ribosome. Selmer et al.
propose that EF-G then binds and that translocation may occur. This would move
the empty tRNA that is still bound in the P-site to the E-site, thereby releasing it.
The ribosome would then dissociate releasing the mRNA as well as RRF and
EF-G.
The following figure shows that molecular mimicry extends among the release
factors: RF2, eRF and RRF. Notice that RF2 has the structure that can fit into the
anticodon binding part of the A site as does eRF.
Many antibiotics and toxins function by blocking certain steps during protein
synthesis. As well as their utility in treating infections, antibiotics have been
useful in dissecting many of the molecular details of the steps and reactions of
protein synthesis. The following will give you a feel for this important topic.
[MVH27-28]
Kanamycin Causes misreading of the code by interfering with the wobble base
pairing.
Kirromycin Blocks dissociation of GDP from EF-Tu after hydrolysis. This prevents
dissociation of EF-Tu from the ribosome and effectively stalls protein
synthesis.
Puromycin Causes premature chain termination. Its structure resembles that of
the 3' end of a tyrosyl-tRNA and it participates as a substrate in a
peptidyl transferase reaction.
[Box26-4-1]
[Box26-4-2]
Tetracycline Inhibits aminoacyl-tRNA binding to the A site on the ribosome.
• It can be used as an mRNA which codes for a 10 amino acid long
oligopeptide: ANDENYALAA.
Image of Escherichia coli tmRNA from the tmRNA Database.
The mechanism of action of the ssrA RNA is shown in the following figure:
Diagram from:
SsrA-mediated peptide tagging caused by rare codons and tRNA
scarcity
E.D.Roche and R.T.Sauer
The EMBO Journal, Vol. 18 (16) pp. 4579-4589, 1999
When a ribosome stalls, the ssrA RNA charged with alanine is brought to the
A-site of the ribosome by the SmpB protein. Peptidyl transferase activity transfers
the nascent polypeptide to the alanine attached to ssrA.
The mRNA template is also displaced by the ssrA RNA. Further protein
synthesis now uses ssrA as a template and ten further amino acids
(ANDENYALAA) are added to the C-terminal end of the polypeptide.
However, the final two amino acids that are added (AA) mark the new protein
for proteolysis by the two proteases ClpAP and ClpXP.
Thus any proteins that are only partially synthesized by stalled ribosomes can
be rapidly destroyed and turned over.
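The tagging steps described above can be sketched as a string operation on one-letter amino acid sequences. The function name and the example peptide "MKT" are hypothetical. The tag sequence and the alanine carried by the ssrA RNA come from the text.

```python
SSRA_CHARGED_RESIDUE = "A"        # alanine carried by the charged ssrA RNA
SSRA_ENCODED_TAG = "ANDENYALAA"   # ten residues encoded by the ssrA RNA itself


def tag_stalled_peptide(nascent_peptide: str) -> str:
    """Append the ssrA alanine and the encoded tag to an incomplete polypeptide."""
    return nascent_peptide + SSRA_CHARGED_RESIDUE + SSRA_ENCODED_TAG


def is_marked_for_proteolysis(peptide: str) -> bool:
    """True if the C-terminus ends with the full ssrA-encoded tag."""
    return peptide.endswith(SSRA_ENCODED_TAG)


stalled = "MKT"                   # hypothetical truncated product
tagged = tag_stalled_peptide(stalled)
print(tagged)
print("marked for proteolysis:", is_marked_for_proteolysis(tagged))
```

The tagged product always ends in AA, the two residues the text identifies as the signal for ClpAP and ClpXP.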
The Genetic Code
Soon after the structure of DNA was proposed, Francis Crick turned his
thoughts to the Genetic Code. At first he realised that any code that used only 2
bases at a time did not have enough information capacity to specify all of the
amino acids found in proteins. He also thought that a code that used 3 bases at a
time had too much capacity.
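The capacity argument can be checked with a few lines of arithmetic. This sketch assumes four bases and uses the figure of 20 standard amino acids, which, as noted in the next paragraph, was not yet established at the time.

```python
BASES = 4          # A, C, G and U
AMINO_ACIDS = 20   # the standard repertoire


def codeword_count(length: int, bases: int = BASES) -> int:
    """Number of distinct codewords of a given length."""
    return bases ** length


for length in (1, 2, 3):
    count = codeword_count(length)
    verdict = "enough" if count >= AMINO_ACIDS else "too few"
    print(f"{length} base(s) per codeword: {count} codewords ({verdict})")

# Shortest codeword length that can cover the whole repertoire.
minimum_length = next(n for n in range(1, 10) if codeword_count(n) >= AMINO_ACIDS)
print("minimum codeword length:", minimum_length)
```

Two bases give only 16 codewords, while three give 64, which is the apparent excess capacity that troubled Crick.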
In fact, the idea that there are 20 standard amino acids was not clear at that
time. The search to unravel the Genetic Code was partly instrumental in leading
to that conclusion as well.
Crick and Sidney Brenner, along with their many colleagues, spent a lot of
time thinking about the Code and how it might be interpreted. Once it was
accepted that there was a standard repertoire of 20 amino acids, the triplet
nature of the code followed.
What did not follow was how these triplets might be arranged. For a time, they
considered an overlapping arrangement of codons (a word coined by Seymour
Benzer) but they were able to dismiss this on the basis of protein sequence
analysis.
Once they felt that the code was non-overlapping, the question became one
of knowing where each triplet began. Proof that the code was indeed a triplet as
well as the determination of the meaning of each triplet came from that old
standby: experimentation.
[26-2]
[S5-15]
[Lod4-28]
above pictures from Nobel web site
Francis Crick (in What Mad Pursuit) describes how he heard about
Nirenberg's results while on a visit to the Biochemical Congress in Moscow in
1961:
The use of poly(A) and poly(C) as templates similarly showed that AAA was
a codon for lysine and that CCC was a codon for proline. However, poly(G) did
not work at all in the system.
This use of homopolymers is clearly quite limited. The use of random mixed
copolymers helped to extend the utility of the system and the information
obtained from it.
CODON   FREQUENCY   RELATIVE FREQUENCY
AAA     0.579       100
AAU     0.116       20
AUA     0.116       20
UAA     0.116       20
AUU     0.023       4
UAU     0.023       4
UUA     0.023       4
UUU     0.00463     1
By measuring the ratios of the different amino acids that are incorporated into
protein using random copolymer templates, it is possible to narrow down the
range of codons that correspond to particular amino acids.
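The frequencies in the table above are consistent with a random copolymer made from A and U in a 5:1 ratio. That ratio is inferred from the numbers and is not stated in the text, so the sketch below treats it as an assumption. Each codon's expected frequency is the product of the probabilities of its three bases.

```python
from itertools import product

# Assumed base composition of the random copolymer (inferred 5:1 ratio of A to U).
BASE_PROBABILITY = {"A": 5 / 6, "U": 1 / 6}


def codon_frequency(codon: str) -> float:
    """Expected frequency of a codon when bases are incorporated independently."""
    frequency = 1.0
    for base in codon:
        frequency *= BASE_PROBABILITY[base]
    return frequency


frequencies = {
    "".join(bases): codon_frequency("".join(bases))
    for bases in product("AU", repeat=3)
}

# Express each frequency relative to the most common codon, scaled to 100.
most_common = max(frequencies.values())
for codon, freq in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(f"{codon}  {freq:.5f}  {100 * freq / most_common:5.1f}")
```

If an amino acid is incorporated at about one fifth the level of lysine, for example, its codon probably contains two A residues and one U.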
This method did not yield all of the codon assignments. That required the
chemical synthesis of short oligonucleotides with defined sequences. These were
used in two ways:
Nirenberg and Phil Leder showed that aminoacylated tRNAs could be bound
to ribosomes if the ribosomes contained trinucleotides acting as mRNA.
[Lod4-30] [S5-16]
[MVH27-2] [Lod4-29]
 U          C          A          G
UUU Phe    UCU Ser    UAU Tyr    UGU Cys
UUC Phe    UCC Ser    UAC Tyr    UGC Cys
UUA Leu    UCA Ser    UAA Stop   UGA Stop
UUG Leu    UCG Ser    UAG Stop   UGG Trp
CUU Leu    CCU Pro    CAU His    CGU Arg
CUC Leu    CCC Pro    CAC His    CGC Arg
CUA Leu    CCA Pro    CAA Gln    CGA Arg
CUG Leu    CCG Pro    CAG Gln    CGG Arg
AUU Ile    ACU Thr    AAU Asn    AGU Ser
AUC Ile    ACC Thr    AAC Asn    AGC Ser
AUA Ile    ACA Thr    AAA Lys    AGA Arg
AUG Met    ACG Thr    AAG Lys    AGG Arg
GUU Val    GCU Ala    GAU Asp    GGU Gly
GUC Val    GCC Ala    GAC Asp    GGC Gly
GUA Val    GCA Ala    GAA Glu    GGA Gly
GUG Val    GCG Ala    GAG Glu    GGG Gly
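The table above can be turned into a lookup that translates an mRNA and counts how many codons specify each amino acid. This is a minimal sketch of the standard code only. It uses one-letter amino acid codes with "*" for Stop rather than the three-letter codes in the table, and it ignores the special cases of selenocysteine, pyrrolysine and alternative start codons. The example mRNA is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Amino acids in table order: first base, then second base, then third base.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


codons_per_residue = Counter(CODON_TABLE.values())
single_codon = sorted(
    aa for aa, n in codons_per_residue.items() if n == 1 and aa != "*"
)

print(translate("GGAUGUUUAAAUAAGG"))   # hypothetical mRNA
print("amino acids with a single codon:", single_codon)
```

Counting the table entries this way shows which amino acids have only one codon and how many codons are stop signals.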
For a simpler view of this table go to
http://esg-www.mit.edu:8001/esgbio/dogma/images/code.gif.
[T26-1]
[MVH27-1]
• The code is degenerate. Most amino acids are specified by more than
one codon. In fact, only Met and Trp are specified by a single codon.
In general, no codon specifies more than one amino acid. The exceptions so far
are AUG, UGA and UAG. In the first case, AUG specifies both methionine and
N-formyl-methionine, which is used to initiate protein synthesis in bacteria. In
the second case, UGA specifies the twenty-first amino acid, selenocysteine, as
well as being a stop codon. And, in the last case, UAG specifies the
twenty-second amino acid (the most recent to be added to the list), pyrrolysine.
• There is one start codon: AUG. However, note that GUG and UUG are
occasionally found as start codons.