The Evolutionary Consequences of Erroneous Protein Synthesis
*FAS Center for Systems Biology, Harvard University, 431 Northwest Laboratory, 52 Oxford Street, Cambridge, Massachusetts 02138, USA.
‡Center for Computational Biology and Bioinformatics, Institute for Cell and Molecular Biology and Section of Integrative Biology, The University of Texas at Austin, 1 University Station C0930, Austin, Texas 78712, USA.
Correspondence to C.O.W. e-mail: [email protected]
doi:10.1038/nrg2662

The synthesis of a functional protein from genetic information is strikingly error prone. For example, amino acid misincorporations during translation are estimated to occur once in every 1,000 to 10,000 codons translated1,2. At this error rate, 15% of all average-length protein molecules will contain at least one misincorporated amino acid. Polypeptide errors can induce protein misfolding, aggregation and cell death (for example, REF. 3). Misfolded proteins underlie a broad array of neurodegenerative diseases, and misincorporation of amino acids during translation may be a causative factor in the pathology of multiple sclerosis and amyotrophic lateral sclerosis (ALS)4,5. Conversely, global defects in protein synthesis produce tissue-specific neurodegeneration that is linked to the production of misfolded proteins3,6.

We define erroneous protein synthesis as any disruption in the conversion of a coding sequence into a functional protein. In addition to amino acid misincorporations, the following types of error can arise: transcription errors, aberrant splicing, premature termination, faulty post-translational modifications and kinetic misfolding (FIG. 1). This definition explicitly includes correctly synthesized polypeptides that fail to fold into a functional protein.

We have previously suggested that major patterns of coding sequence evolution, which are conserved from bacteria to humans, arise from the selective pressure to minimize the cost of erroneous protein synthesis, including the failure of properly synthesized polypeptides to fold5. Such selection would act most strongly on highly expressed genes and, in animals, on genes that are expressed in neural tissues. Mathematical modelling and computer simulations predict biophysical adaptations that reduce this cost5,7–9, and several of these predictions have been verified in a recent experimental evolution study10.

Together, these studies indicate that there is a pathway that leads from the fidelity of protein production through cellular dysfunction and organismal fitness defects, exemplified by neurodegeneration, to adaptations whose imprints can be seen in the evolution of coding sequences across taxa.

Here, we first review what is known about the frequencies of errors in the production of functional proteins, from transcription to protein folding. We do not attempt a comprehensive review of all measurements. Instead, we aim to provide perspective and to motivate much-needed future studies by highlighting the diverse set of approaches taken. We then review the many ways in which organisms may have evolved to cope with errors in protein synthesis, either by selectively reducing error rates or by evolving tolerance to errors. Next, we examine how organisms exploit errors in protein synthesis to achieve biological and evolutionary outcomes that are inaccessible when synthesis is error free. We conclude with a discussion of implications for future research.

Kinetic misfolding
The failure of an error-free protein to assume its proper ground-state conformation or spontaneous loss of the ground-state conformation.
Figure 1 | Sources of errors in eukaryotic protein synthesis. Errors arise at many stages, from the transcription of genetic information to the folding and post-translational modification of the finished polypeptide.
Panel labels: transcription errors (nucleotide misincorporation; polymerase slippage); errors in post-translational modifications (incorrect proteolytic cleavage; erroneous ubiquitylation, glycosylation and phosphorylation; other errors in post-translational functionalization).
Nature Reviews | Genetics
Erroneous protein synthesis
Errors arise at all steps of protein synthesis, from transcription to protein folding, and have widespread phenotypic consequences. However, surprisingly little is known about the range of outcomes and frequencies of errors (subsequently referred to as error spectra).

Error rates in protein synthesis. The science of measuring error rates associated with protein synthesis remains in its infancy, even though the first attempts were made over 45 years ago (for example, REF. 11). As a case in point, the literature contains experimental measurements for the frequency of less than 5% of the 1,216 (64 × 19) possible codon-to-amino-acid errors in translation, and only a few estimates have been made in the same species. Recent studies have made substantial progress in measuring error rates in specific cases (for example, REF. 12), but ongoing technological advances are likely to lead to the first comprehensive view of translation error frequencies in normal cells in the near future (BOX 1).

TABLE 1 provides estimates of error rates in the stages of protein synthesis from transcription through to protein folding, emphasizing the heterogeneous experimental approaches used and the patchy knowledge that has resulted. The central observation is that protein-synthesis errors are orders of magnitude more frequent than DNA-replication errors. The Escherichia coli genome is 4.6 × 10^6 bp long, so at the typical mutation rate of approximately 10^-9 per bp, 1 bacterium in 200 will bear a mutation in its genome. By contrast, the average E. coli coding sequence is 335 codons long, and at a canonical per-codon missense error rate of 5 × 10^-4, 15% of protein molecules will contain at least one error. At the bacterial scale, perfectly replicated genomes are commonplace, but perfectly synthesized proteomes never occur. The available evidence suggests that eukaryotes are no more or less accurate at protein synthesis than are prokaryotes13. All else being equal, longer proteins necessarily accumulate more errors, which leads to astonishing predictions: if canonical missense error rates hold, each molecule of the giant human muscle protein titin, which consists of 34,350 amino acids, would contain an average of 17 missense errors, and an average human sarcomere would contain no error-free titin molecules at all.

Errors in post-translational modification are likely to be important, but their frequency and effects remain largely unknown. One of the most common modifications, glycosylation, occurs on more than 50% of the proteins in a human cell14. Glycosylation is not template driven and shows remarkable heterogeneity15. Oligosaccharides attached to glycosylation sites tend to vary from copy to copy of the same protein and the occupancy rates of glycosylation sites also vary, which makes it unclear to what extent heterogeneity in glycosylation should be considered erroneous. That not all heterogeneity is functionally normal is shown by the often highly deleterious effects of glycosylation-altering mutations, which usually affect the efficiency of glycosylation or the composition of glycans without disrupting glycosylation altogether16. The extent and importance of misphosphorylation also remain poorly understood despite its potentially major consequences. For example, misphosphorylation of the microtubule-binding protein tau is a pathological signature of all cases of Alzheimer’s disease and apparently contributes to tau misfolding and aggregation17.
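The arithmetic behind the comparison of DNA-replication and protein-synthesis error rates can be checked with a short calculation. The sketch below uses only the numbers quoted in the text and assumes that errors occur independently at each site with a uniform rate, which is a simplification; the function and variable names are illustrative.

```python
import math

# Numbers quoted in the text.
GENOME_BP = 4.6e6             # E. coli genome length in bp
MUTATION_RATE_PER_BP = 1e-9   # approximate mutation rate per bp
MISSENSE_RATE = 5e-4          # canonical per-codon missense error rate
ECOLI_CDS_CODONS = 335        # average E. coli coding-sequence length
TITIN_RESIDUES = 34_350       # length of human titin in amino acids


def p_at_least_one_error(n_sites: float, rate_per_site: float) -> float:
    """P(>= 1 error) over n independent sites: 1 - (1 - rate)^n."""
    # log1p/expm1 keep precision when the per-site rate is tiny.
    return -math.expm1(n_sites * math.log1p(-rate_per_site))


def expected_errors(n_sites: float, rate_per_site: float) -> float:
    """Mean number of errors over n independent sites."""
    return n_sites * rate_per_site


# Fraction of bacteria carrying at least one new mutation.
p_genome = p_at_least_one_error(GENOME_BP, MUTATION_RATE_PER_BP)
# Fraction of average-length proteins with at least one missense error.
p_protein = p_at_least_one_error(ECOLI_CDS_CODONS, MISSENSE_RATE)
# Mean missense errors per titin molecule, and chance of an error-free copy.
titin_mean = expected_errors(TITIN_RESIDUES, MISSENSE_RATE)
p_titin_error_free = math.exp(TITIN_RESIDUES * math.log1p(-MISSENSE_RATE))
# Size of the codon-to-amino-acid missense error matrix.
n_matrix_entries = 64 * 19

print(f"bacteria with a new mutation: about 1 in {1 / p_genome:.0f}")
print(f"proteins with >= 1 missense error: {p_protein:.1%}")
print(f"mean missense errors per titin molecule: {titin_mean:.1f}")
print(f"probability of an error-free titin molecule: {p_titin_error_free:.1e}")
print(f"entries in the missense error matrix: {n_matrix_entries}")
```

Under these assumptions the calculation reproduces the figures given in the text: roughly 1 bacterium in 200 with a new mutation, about 15% of proteins with at least one missense error, and an average of about 17 missense errors per titin molecule.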
Perhaps surprisingly, we know even less about the error rates in the production of functional proteins. An early study reported that up to 30% of newly synthesized proteins were rapidly degraded, and most of the degraded proteins were believed to be defective ribosomal products18. But a later study using similar techniques found that most newly synthesized proteins were largely protected from degradation, even when they were unable to fold correctly owing to misincorporation errors, and “at most a few percent” of newly synthesized proteins were rapidly degraded19. Therefore, the failure rate of functional-protein production, which is the ultimate expression of failures in protein synthesis, remains essentially unknown. Correspondingly, the proportion of those failures due to upstream synthesis errors versus errors in folding of correctly synthesized proteins also remains unclear.

Box 1 | Measuring translational error rates
Translation is the most error-prone step of protein synthesis. Therefore, accurate measurements of amino acid misincorporation rates are crucial for a thorough understanding of synthesis errors. We can write all possible missense errors in the form of a 64 × 19 matrix with 1,216 independent entries. To date, only a small percentage of these entries has been measured in a handful of organisms.
The challenge in measuring missense error rates is that in a given sample, the abundance of error-free molecules is several orders of magnitude higher than that of any species of error-containing molecules; this causes most unbiased detection methods to become overloaded and forces investigators to use clever, but strongly biased, schemes to obtain any result at all.
Historical methods used to measure translational error rates fall into three broad categories. First, some groups have measured the amount of a specific amino acid in a protein that should not contain this amino acid. For example, Edelmann and Gallant measured the amount of cysteine in the normally cysteine-free protein flagellin of Escherichia coli91. Second, some groups have measured the change in the isoelectric point of a protein due to amino acid misincorporation92,93. Both of these approaches share the drawback that they average over many different elements in the ribosomal error matrix. A third approach builds on special reporter systems that produce a signal when a specific codon is mistranslated. For example, Kramer and Farabaugh studied the misincorporation of lysine at various codons using the fused luciferases F-luc and R-luc, the luminescence of which can be determined independently and with high accuracy. In F-luc, they replaced the codon for the essential lysine at position 529 with all near-cognate codons and several other codons12,94. With these constructs, they measured the frequency of mistranslation of specific codons into lysine by assaying the F-luc activity relative to the R-luc activity.
Could an estimate of the entire 64 × 19 error matrix be obtained in a single experiment? In principle, yes. Massive gains in the sensitivity of quantitative tandem mass spectrometry (MS/MS) (for example, REF. 90) offer the tantalizing potential for detecting low-frequency errors against a background of wild-type molecules. Deep quantitative MS/MS probing of peptides generated from a purified target protein or proteins, using a detection database that includes all possible single amino acid substitutions as well as the DNA-encoded sequence, could in principle detect both the type and position of amino acid substitutions introduced by mistranscription and mistranslation. By encoding the target protein(s) with multiple instances of all 64 codons, the error spectrum of each codon could be estimated in multiple contexts. Single-molecule RNA sequencing of the transcripts of the target gene could then be used to assess the frequency and position of transcription errors, which would allow translation errors due to misacylation and misreading to be disentangled. Although such an experiment is technically demanding, it is within the reach of present-day methods and would, at a stroke, provide the first comprehensive view of the translation error spectrum in any organism.

Deleterious effects of synthesis errors. Plentiful evidence shows that errors in protein synthesis reduce organism fitness. The disruption of translational fidelity with common antibiotics, such as streptomycin and kanamycin, kills bacteria. Also, cells with impaired translational proofreading ability display altered morphologies20 and suffer severe fitness defects21, as do cells with increased rates of transcription errors in an essential gene10. Finally, defects in translational fidelity and in protein folding cause disease phenotypes in mouse models3,6.

A single amino acid substitution in the editing domain of an alanyl-tRNA synthetase causes misacylation, subsequent widespread translation errors and protein misfolding. In turn this causes degeneration of Purkinje cells in the mouse cerebellum, ataxia and death3. This result supports the possibility that disease conditions that involve tissue-specific dysfunction arise from global errors in protein synthesis20. Neurons may be unusually sensitive to synthesis errors because of their long lifetimes, their large surface-area-to-volume ratios (and correspondingly abundant sites for membrane-induced aggregation)22, their branched morphologies (which impede transport and damage responses)23, their fluctuating cell polarization and because their protein quality control systems are more likely to be overloaded by misfolded proteins17 (compare with REFS 24,25).

Fitness costs can arise by multiple different mechanisms. Protein-synthesis errors will often lead to the loss of function of the protein. A recent study showed that disruption of folding and function of the antibiotic-resistance protein β-lactamase by transcriptional errors reduced cellular fitness but could be compensated for by increased expression of the protein and by stabilizing mutations in the protein sequence10.

Protein-synthesis errors may also produce polypeptides that display a gain of toxic function. In rare cases, the error may confer an alternative or pathological function on an otherwise normal folded protein. More often, errors disrupt folding, and the misfolded molecule may be toxic. In this context, ‘toxic’ simply means harmful and does not specify the modality or severity of the harm. Misfolded proteins may destabilize membranes26, steal quality-control bandwidth from essential proteins24,25 and induce chronic stress. The toxic effects of aminoglycoside antibiotics, which act on ribosomes and lead to the production of misfolded proteins, have been traced in part to misfolded-protein-induced signalling through the membrane receptor CpxA. The ultimate consequence is increased free radical formation, membrane depolarization and cell death27. Misfolded protein cytotoxicity has been studied extensively as a contributor to neurodegenerative disease. It has become increasingly clear that at the molecular level, misfolding-associated disease phenotypes often reflect gains of toxic function rather than losses of function3,17,22,23,25,26,28.

In another scenario, the synthesis and degradation of a non-functional protein may be apparently harmless but may incur clean-up costs (for example, REF. 29). Ribosomal throughput dedicated to a polypeptide that will ultimately fail to function is a cost, particularly for fast-growing organisms30. The expression of quality control systems, such as chaperones, to assist, rescue or degrade polypeptides is a further fitness cost acting in trans. Toxicity and clean-up costs may coexist: even if quality control systems ultimately detect and either degrade or refold all misfolded proteins, the proteins may still wreak substantial toxic havoc, just as crime does not cease to be a problem even if all criminals are eventually caught.

Errors in the proteins responsible for the reproduction of genetic and non-genetic material, particularly in translation and replication, may lead to reduced fidelity and subsequent dysfunction in succeeding generations. Such an effect, originally conceived as an error catastrophe by Orgel31, has been shown in bacteria, in which heritable mutations can arise from an editing defect in translation21,32.

Gain of toxic function
Any event that causes a protein to generate a deleterious effect on the cell that expresses it. For example, a mutation that causes a protein to aggregate and become cytotoxic would be called a gain-of-toxic-function mutation.

Clean-up costs
Any fitness costs related to the production and degradation of non-functional protein.

Effect of gene expression level. In scenarios in which protein-synthesis errors produce harmful molecular species or waste valuable cellular resources, the severity of the resulting phenotypic effects will depend on the expression level of that gene. The production of erroneously synthesized proteins increases with the expression level of a gene, and so, as a result, does the influence of these proteins on the phenotype of an organism. For example, the clean-up costs due to the synthesis of non-functional protein will be proportional to the amount of protein produced. Many forms of misfolded-protein toxicity, such as aggregation and interference with membranes, increase with absolute protein concentration and therefore with gene expression level. Note that if synthesis errors primarily act by reducing protein function, an effect from the gene expression level is not expected. Errors in a protein, such as a DNA polymerase
or transcription factor, with low expression but a crucial function, are likely to have an equally deleterious effect on organism fitness as errors that disrupt the function of a highly abundant enzyme or structural molecule5,7.

Adaptations for cost minimization
Faced with costly protein-synthesis errors, organisms may evolve two high-level cost-reduction strategies: the reduction of error frequencies (increased accuracy) and the reduction of the costs of the remaining errors (increased tolerance or robustness). Because costs tend to increase with gene expression level, selection for cost reduction is often visible in differences between genes of low and high expression levels.

Reduction of error frequencies. The primary source of missense substitutions during protein synthesis is the misincorporation of non-cognate tRNAs during translation. Codons corresponding to low-abundance tRNAs tend to be more error prone than other codons12. Consequently, codon usage affects translation error frequencies. Selection pressure to use codons with low error rates is commonly referred to as translational-accuracy selection33. Accuracy selection should not, however, cause the uniform usage of accurate codons along the gene. Instead, it should disproportionately affect those sites at which translation errors would have particularly severe effects on protein folding or function33. A common test for translational-accuracy selection therefore assesses whether preferred codons associate with evolutionarily conserved sites5,33,34 or with sites that are known to be important for protein structure or function33,35. In general, these analyses show a moderate but highly significant tendency for preferred codons to coincide with sites at which translation errors are expected to be important, which is consistent with weak selection for increased translational accuracy.

Ribosomes can also be made more accurate. However, whereas changes in codon usage to improve accuracy come at little cost, increased ribosomal accuracy often comes at the cost of translation speed and energy efficiency36. This is due in part to the intrinsic physical implementation of increased accuracy through increased energy-dependent rejection of tRNAs. Consequently, organisms may evolve to balance ribosome speed, ribosome accuracy and energetic costs.

A second codon-level selection pressure penalizes codons that have a high probability of being mistranslated into radically different amino acids. We refer to this selection pressure as selection for error mitigation. Although it does not lead to a reduction of error frequencies per se, this selection pressure reduces the frequency of the most costly errors at the expense of a larger number of more benign errors. Several bioinformatics studies have found evidence for selection for error mitigation37–39. The genetic code also has error-mitigating properties40 and may have evolved specifically to minimize the effects of translation errors41.

It is likely that selection limits error frequencies at all steps of protein synthesis. Some simple predictions have yet to be tested, such as whether high-expression genes have lower transcriptional error rates. But aside from accuracy selection and error mitigation, little is known about the signatures that would indicate such selection pressures. One exception is the efficiency of splicing in fission yeast, as estimated by the proportion of intron–exon junctions retained in cellular mRNAs. It increases markedly with gene expression level42, presumably because mis-splicing becomes more costly when incorrectly spliced mRNAs are abundant.

Translational-accuracy selection
A selection pressure that causes genes or specific sites in genes to be encoded by high-fidelity codons, that is, codons that correspond to abundant tRNAs.

Selection for error mitigation
A selection pressure that causes genes or specific sites in genes to be encoded by codons that, when mistranslated, lead to the substitution of amino acids with limited deleterious effects.

Translational-robustness selection
A selection pressure that causes proteins to be tolerant to missense errors under translation. Translationally robust proteins fold and function even when mistranslated.

Figure 2 | Alternative strategies for reducing protein misfolding. a | In this scenario, the proteins are poor folders and misfold readily. However, a highly accurate translational apparatus produces few proteins with translation errors and therefore limits the total amount of misfolded proteins. b | In this scenario, an error-prone translation system produces many proteins with errors. But the proteins fold readily and tend not to misfold even when mistranslated.
Panel labels (a and b): ribosome, mRNA, polypeptide, translational error, functional protein, misfolded protein.

Increased tolerance or robustness. Errors need not be eliminated entirely if organisms can tolerate a certain amount of errors without paying a substantial fitness cost (FIG. 2). Some tolerance is inherent in protein biochemistry. In vitro, proteins can be robust to many individual or multiple mutations43–48, although most mutations tend to reduce protein stability. Robustness can itself be modulated by mutations in the protein47,48. These observations suggest that proteins can evolve robustness to typical errors arising under translation, which is termed translational-robustness selection7,8. Proteins that possess translational robustness can fold and function properly even if they are mistranslated.
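The two cost-reduction strategies (increased accuracy versus increased robustness) and the dependence of cost on expression level can be illustrated with a toy calculation. This is a minimal sketch, not a model from the literature: the misfolding probabilities, the tenfold accuracy gain and the function names are hypothetical values chosen for illustration, and only the 335-codon length and the 5 × 10^-4 error rate come from the text.

```python
def p_any_error(n_codons: int, error_rate: float) -> float:
    """Probability that a protein contains at least one missense error."""
    return 1.0 - (1.0 - error_rate) ** n_codons


def misfolded_fraction(n_codons: int, error_rate: float,
                       p_misfold_clean: float,
                       p_misfold_erroneous: float) -> float:
    """Fraction of molecules that misfold, mixing error-free and erroneous copies."""
    p_err = p_any_error(n_codons, error_rate)
    return (1.0 - p_err) * p_misfold_clean + p_err * p_misfold_erroneous


def misfolded_molecules(expression_level: float, fraction: float) -> float:
    """Absolute number of misfolded molecules scales with expression level."""
    return expression_level * fraction


N_CODONS = 335            # average E. coli coding-sequence length (from the text)
BASE_ERROR_RATE = 5e-4    # canonical per-codon missense rate (from the text)
P_MISFOLD_CLEAN = 0.01    # hypothetical misfolding probability without errors

# Hypothetical baseline: error-prone translation and a poorly folding protein.
baseline = misfolded_fraction(N_CODONS, BASE_ERROR_RATE, P_MISFOLD_CLEAN, 0.5)
# Strategy a: tenfold more accurate translation, same poor folder.
accurate = misfolded_fraction(N_CODONS, BASE_ERROR_RATE / 10, P_MISFOLD_CLEAN, 0.5)
# Strategy b: same error rate, but the protein tolerates most errors.
robust = misfolded_fraction(N_CODONS, BASE_ERROR_RATE, P_MISFOLD_CLEAN, 0.05)

for name, frac in [("baseline", baseline), ("accurate", accurate), ("robust", robust)]:
    low = misfolded_molecules(1_000, frac)
    high = misfolded_molecules(100_000, frac)
    print(f"{name:9s} misfolded fraction {frac:.4f}; "
          f"molecules at low/high expression: {low:.0f} / {high:.0f}")
```

With these assumed values both strategies lower the misfolded fraction relative to the baseline, and in every case the absolute number of misfolded molecules grows in proportion to expression level, which is why cost-reducing selection is expected to be strongest on highly expressed genes.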
Mathematical and computational modelling predicts that this selection pressure will cause proteins to be more thermostable and to also be more tolerant to genetic mutations5,7–9. Recent experimental results confirm these predictions10.

But even if a gene is translated without any errors, the resulting protein may misfold because of interactions with other proteins (such as misfolded or aggregated proteins) or because of properties of the protein itself. Key protein properties include thermodynamic stability, which is measured by the free energy of unfolding, and kinetic folding, which is measured by the rate of folding or unfolding. In most cases, thermodynamics dictate whether a protein can attain a stable folded state and kinetics determine how likely a thermodynamically stable protein is to complete folding before other processes, such as aggregation and degradation, derail it. Rapid folding and high stability tend to be correlated. We have previously suggested that selection reduces the propensity of proteins to misfold even when translated without errors5,7, but this hypothesis has not yet been tested experimentally. Because of the close relationship between thermodynamic stability and tolerance to mutations47–50, more translationally robust proteins may also be more kinetically stable and vice versa. A key difficulty at present is distinguishing stochastic misfolding from mistranslation-induced misfolding, as translation errors remain difficult to detect. Consistent with either translational-robustness selection or selection against stochastic misfolding is the observation that highly expressed genes are less aggregation prone than genes of low expression level51–53.

Other adaptations besides robust protein folding may reduce the cost of synthesis errors. One is the efficient detection and degradation of mis-spliced products. The nonsense-mediated decay pathway degrades mRNAs that contain premature stop codons54. The introns of eukaryotes tend to either contain stop codons or alter the translational reading frame to reveal a downstream stop codon, which leads to the degradation of mis-spliced transcripts by the nonsense-mediated decay pathway55.

Genome-wide signatures of cost reduction. Broad patterns of coding-sequence evolution, such as the tendency for highly expressed proteins to evolve slowly, may reflect selection to reduce the costs of protein misfolding5. Genome-wide analyses of evolutionary rates have consistently found that expression level is a major predictor of both synonymous and non-synonymous divergence in bacteria, fungi, plants and animals5,56. Multivariate analyses find that quantities related to translation frequency make stronger contributions to evolutionary rate than do quantities linked primarily to gene function57–61. We have suggested that selection against protein misfolding, including misfolding of error-free polypeptides, imposes a strong constraint on coding-sequence evolution5. Many genomic patterns (covariation among evolutionary rates, expression level, codon-usage bias and the transition–transversion ratio, as well as an association between optimal codons and evolutionarily conserved sites) can be reproduced in a model that involves only selection against mistranslation-induced misfolding5. New genome-wide signals are needed to disentangle the selection pressures against the costs of error-free protein misfolding from those against error-induced protein misfolding.

In all extant models, we do not expect a one-to-one relationship between gene expression level and evolutionary conservation. Selection acts only on the deleterious outcomes of erroneous synthesis, which may vary from protein to protein. For example, protein alleles that are less likely to become toxic, or are more rapidly detected and shuttled towards degradation or refolding, should experience less evolutionary constraint. This pressure, and the resulting constraints on sequence change, should intensify with increasing expression level or for genes that are expressed in sensitive tissue types. Likewise, if a particular protein fold is highly tolerant to synthesis errors, then genes encoding proteins of this fold will experience little selection pressure to reduce costs, even if they are expressed at a relatively high level. By contrast, sensitive folds will experience much stronger selection pressure at comparable expression levels. Consistent with this reasoning, the biophysical properties of the protein fold also influence the rate of sequence divergence62–65, and the relative contributions of expression level and protein structure to evolutionary conservation seem to be of a similar magnitude66.

Stochastic misfolding
Misfolding of error-free polypeptides. Also see ‘kinetic misfolding’.

Programmed frameshifting
Frameshifting that is required for the proper expression of a specific functional protein. The frequency with which ribosomes change the reading frame at programmed-frameshift sites is often tightly regulated.

Beneficial synthesis errors
Even though errors in protein synthesis tend to be deleterious on average, in numerous cases they can have direct benefits for organism fitness.

Error-dependent protein expression. A wide range of organisms, from viruses to mammals, has evolved certain genes that depend on errors in protein synthesis. The best-known example of such a situation is programmed frameshifting, in which the elongating ribosome shifts forward (+1) or back (−1) by a single nucleotide to enter a new reading frame67. Escherichia coli DNA polymerase III subunits τ and γ and eukaryotic ornithine decarboxylase antizyme (OAZ) depend on frameshifting for proper protein expression68,69.

Programmed frameshifts can control gene expression70 (FIG. 3a). OAZ non-competitively inhibits ornithine decarboxylase (ODC), an enzyme that catalyses the first step in polyamine synthesis. Polyamines such as spermidine stimulate +1 frameshifting. In eukaryotes from fission yeast to mammals, the gene that encodes OAZ normally terminates at an early stop codon and yields only a short peptide with no inhibitory activity, but a +1 frameshift yields a full-length antizyme that inhibits ODC. At low polyamine levels, frameshifting occurs infrequently and little antizyme is produced, so more ODC is active and more polyamines accumulate70. As polyamine levels rise, frameshifting is stimulated and more full-length antizyme is produced, which inhibits ODC and reduces polyamine production70. Therefore, the polyamine-controlled frequency of a translation ‘error’ has evolved to implement the feedback regulation of polyamine levels.
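The negative-feedback logic of the antizyme circuit described above can be made concrete with a toy simulation. This is a minimal sketch under loudly stated assumptions: the saturating frameshift response, the inhibition formula, every rate constant and all function names are hypothetical and are not taken from the cited studies; only the qualitative wiring (polyamines stimulate +1 frameshifting, frameshifting yields full-length antizyme, antizyme inhibits ODC) follows the text.

```python
def frameshift_frequency(polyamine: float, f_max: float = 0.5,
                         k_half: float = 5.0) -> float:
    """Hypothetical saturating response: more polyamine, more +1 frameshifting."""
    return f_max * polyamine / (k_half + polyamine)


def active_odc(antizyme: float, odc_total: float = 1.0,
               k_inhibit: float = 0.05) -> float:
    """Hypothetical inhibition: full-length antizyme reduces active ODC."""
    return odc_total / (1.0 + antizyme / k_inhibit)


def simulate(polyamine_start: float, steps: int = 5000, dt: float = 0.1,
             k_syn: float = 1.0, k_deg: float = 0.1,
             feedback: bool = True) -> float:
    """Euler integration of d[polyamine]/dt = k_syn * active ODC - k_deg * polyamine."""
    polyamine = polyamine_start
    for _ in range(steps):
        # Antizyme level is taken to be proportional to frameshift frequency.
        antizyme = frameshift_frequency(polyamine) if feedback else 0.0
        rate = k_syn * active_odc(antizyme) - k_deg * polyamine
        polyamine += dt * rate
    return polyamine


from_low = simulate(0.0)
from_high = simulate(20.0)
no_feedback = simulate(0.0, feedback=False)

print(f"steady state with feedback, starting low:  {from_low:.3f}")
print(f"steady state with feedback, starting high: {from_high:.3f}")
print(f"steady state without frameshift feedback:  {no_feedback:.3f}")
```

In this toy model the polyamine level settles to the same value whether it starts low or high, and that value is lower than the level reached when frameshifting is switched off, which is the behaviour expected of a homeostatic feedback loop driven by a regulated translation 'error'.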
[Figure 3a, panel labels only: OAZ ORF 1 (STOP); OAZ ORF 2 (STOP′); +1 frameshift; short, non-functional peptide; OAZ–ODC complex; degradation of ODC; ODC (ornithine to putrescine); polyamine level (low to high).]

been uncovered and exploited by viruses, is now being co-opted by human biological engineers74.

Such nonsense suppression had been long studied in bacterial and cell culture systems75. Recently, it has taken on increased importance as a therapy for genetic diseases in which a premature stop-codon mutation causes a disease phenotype, such as in cystic fibrosis and Duchenne muscular dystrophy76.

Drugs that interfere with translational fidelity in bacteria are commonly used as antibiotics. Bacterial mutants that depend on streptomycin for viability are readily isolated77 and tend to have hyperaccurate but slow ribosomes36. Streptomycin independence is often regained by mutations that decrease ribosomal fidelity78.
Box 2 | Open questions
• What are the exact error rates for transcription, splicing, translation and post-translational modifications? How do these rates vary between genes and covary with key variables, such as gene expression levels?
• What is the failure rate of protein folding? What proportion of folding failures results from upstream synthesis errors, stochastic factors or trans-acting factors?
• What is the genome-wide distribution of post-translational modifications, such as glycosylation and phosphorylation? What proportion of post-translational modification events is deleterious under normal conditions?
• What are the main mechanisms by which protein-synthesis errors produce fitness costs?
• How large are the fitness costs associated with erroneous protein synthesis compared with other unrelated fitness costs?
• Why are error rates so high even though the associated fitness costs seem substantial?
• How important are synthesis errors for the proper functioning of the cellular machinery?
• How else have organisms evolved to exploit the molecular diversity present in their cells due to errors in protein synthesis?
• What do signatures of natural selection against the consequences of protein-synthesis errors reveal about human neurodegenerative diseases, particularly their prevalence, severity and cellular manifestations?

Implications for future research
Our understanding of the fidelity of transcription, translation and protein folding remains far from complete (BOX 2). No comprehensive, or even representative, inventories of error spectra exist for cells under normal physiological conditions. Technological innovations, such as single-molecule nucleic acid sequencing, have given us a surprising portrait of rampant splicing errors in eukaryotic genomes42, and this technology, in combination with deep-coverage quantitative mass spectrometry90, may soon provide a similar breakthrough in our understanding of the rates of transcriptional and translational errors (BOX 1). However, the frequency and types of errors in common post-translational modifications, such as glycosylation and phosphorylation, remain almost completely unknown, as do the consequences of these errors for protein folding and function. Moreover, the relative fitness costs of loss of protein function, quality control and gain of toxic function remain unknown, and considerable effort will be required to determine them (BOX 2). Whatever the results of such studies, the existing evidence shows that protein synthesis is surprisingly error prone and that erroneous protein synthesis can differentially affect specific tissue types, impose substantial cellular fitness costs and modulate the evolution of whole genomes.
In stark contrast with the rarity of DNA replication errors, the extraordinary frequency of protein-synthesis errors in normal cells urges a different, perhaps unfamiliar, view of cellular operations. Cells are inherently noisy statistical ensembles, and the genotype is best understood as encoding the frequency of different outcomes rather than a single so-called ‘correct’ state that is disrupted by errors. Notions of correct and erroneous may be subsumed by the more useful notions of beneficial and deleterious, with the important difference that supposed errors may be beneficial, and even essential. For example, programmed +1 frameshifts and translational hops seem to have evolved through the amplification of low-frequency translation errors67.
Recent single-molecule studies underscore the need to embrace the extraordinary molecular diversity that arises from a single genotype. In fission yeast, the frequency of retained introns seems to exceed 90% for the majority of transcripts42. Are all these retained introns technical artefacts, errors that have deleterious effects that are too small to be eliminated by natural selection, errors in transcripts that are destined for degradation by nonsense-mediated decay55 or uneasy compromises that have resulted from energetic or kinetic costs associated with increased splicing fidelity? Or do some of these retained introns confer important benefits on the organism that would be suppressed by higher-fidelity splicing? Similarly, for some high-expression proteins, certain mistranslation-generated, biochemically similar molecular species are expected to exist at cellular abundances of 10–100 molecules per cell, which is sufficient for their function as regulatory proteins. It seems unlikely that nature always fails to exploit the existence of these molecular subspecies, but they are difficult to hunt down; perhaps high-expression genes that change expression markedly in cells with hyperaccurate ribosomes may point to autoregulatory systems that are maintained by mistranslation. We believe that rather than being a negligible nuisance, erroneous synthesis, with its attendant modifiers and resulting adaptations, will play a central part in our understanding of molecular evolution.
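As a rough illustration of how abundances of 10–100 variant molecules per cell could arise, the sketch below multiplies an assumed protein copy number by an assumed per-codon misincorporation rate. The copy numbers (10^5 to 10^6 molecules per cell), the function name and the `substitution_fraction` parameter are illustrative assumptions and are not taken from the studies cited here; the per-codon error rates follow the range of one error per 1,000 to 10,000 codons quoted in the introduction.

```python
def expected_variant_copies(copies_per_cell, per_codon_error_rate,
                            substitution_fraction=1.0):
    """Expected molecules per cell carrying one specific substitution
    at one specific codon.

    copies_per_cell       -- assumed abundance of the protein (hypothetical)
    per_codon_error_rate  -- probability that a given codon is misread
    substitution_fraction -- assumed share of errors at that codon that
                             yield the particular amino acid of interest
    """
    return copies_per_cell * per_codon_error_rate * substitution_fraction


if __name__ == "__main__":
    # Assumed abundances for a highly expressed protein.
    for copies in (1e5, 1e6):
        # Error rates of 1 in 10,000 and 1 in 1,000 codons.
        for rate in (1e-4, 1e-3):
            n = expected_variant_copies(copies, rate)
            print(f"{copies:.0e} copies/cell, error rate {rate:.0e}: "
                  f"about {n:.0f} variant molecules per cell")
```

Under these assumptions, a protein present at 10^5 to 10^6 copies per cell and misread at a given codon once per 10,000 translations would yield on the order of 10 to 100 copies of that single-site variant, which matches the order of magnitude stated in the text. Whether this is how the estimate was actually derived is not stated in the source.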
1. Parker, J. Errors and alternatives in reading the universal genetic code. Microbiol. Rev. 53, 273–298 (1989).
2. Ogle, J. M. & Ramakrishnan, V. Structural insights into translational fidelity. Annu. Rev. Biochem. 74, 129–177 (2005).
3. Lee, J. W. et al. Editing-defective tRNA synthetase causes protein misfolding and neurodegeneration. Nature 443, 50–55 (2006).
    Global mistranslation-induced protein misfolding leads to cell type-specific neurodegeneration in mice. An exciting study that links the fidelity of translation to a disease phenotype.
4. Rubinstein, E. Misincorporation of the proline analog azetidine-2-carboxylic acid in the pathogenesis of multiple sclerosis: a hypothesis. J. Neuropathol. Exp. Neurol. 67, 1032–1034 (2008).
5. Drummond, D. A. & Wilke, C. O. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 134, 341–352 (2008).
    A bioinformatics and modelling study showing that genome-wide patterns of molecular evolution are shared from bacteria to mammals. The study also showed that in a simple model of protein evolution, selection against mistranslation-induced protein misfolding is sufficient to reproduce these patterns.
6. Zhao, L., Longo-Guess, C., Harris, B. S., Lee, J. W. & Ackerman, S. L. Protein accumulation and neurodegeneration in the woozy mutant mouse is caused by disruption of SIL1, a cochaperone of BiP. Nature Genet. 37, 974–979 (2005).
7. Drummond, D. A., Bloom, J. D., Adami, C., Wilke, C. O. & Arnold, F. H. Why highly expressed proteins evolve slowly. Proc. Natl Acad. Sci. USA 102, 14338–14343 (2005).
8. Wilke, C. O. & Drummond, D. A. Population genetics of translational robustness. Genetics 173, 473–481 (2006).
9. Willensdorfer, M., Bürger, R. & Nowak, M. A. Phenotypic mutation rates and the abundance of abnormal proteins in yeast. PLoS Comput. Biol. 3, e203 (2007).
10. Goldsmith, M. & Tawfik, D. S. Potential role of phenotypic mutations in the evolution of protein expression and stability. Proc. Natl Acad. Sci. USA 106, 6197–6202 (2009).
    One of the first studies to investigate the effect of transcription errors on protein evolution in an experimental system. TEM1 β-lactamase expressed using an error-prone RNA polymerase evolved an increased level of gene expression, increased thermostability and increased mutational robustness.
11. Loftfield, R. B. The frequency of errors in protein synthesis. Biochem. J. 89, 82–92 (1963).
12. Kramer, E. B. & Farabaugh, P. J. The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA 13, 87–96 (2007).
    A highly accurate measurement of specific amino acid misincorporation frequencies under translation.
13. Stansfield, I., Jones, K. M., Herbert, P., Shaw, A. L. W. V. & Tuite, M. F. Missense translation errors in Saccharomyces cerevisiae. J. Mol. Biol. 282, 13–24 (1998).
14. Wong, C.-H. Protein glycosylation: new challenges and opportunities. J. Org. Chem. 70, 4219–4225 (2005).
15. Mahal, L. K. Glycomics: towards bioinformatic approaches to understanding glycosylation. Anticancer Agents Med. Chem. 8, 37–51 (2008).
16. Freeze, H. H. Genetic defects in the human glycome. Nature Rev. Genet. 7, 537–551 (2006).
17. Winklhofer, K., Tatzelt, J. & Haass, C. The two faces of protein misfolding: gain- and loss-of-function in neurodegenerative diseases. EMBO J. 27, 336–349 (2008).
18. Schubert, U. et al. Rapid degradation of a large fraction of newly synthesized proteins by proteasomes. Nature 404, 770–774 (2000).
19. Vabulas, R. & Hartl, F. Protein synthesis upon acute nutrient restriction relies on proteasome function. Science 310, 1960–1963 (2005).
20. Nangle, L., Motta, C. & Schimmel, P. Global effects of mistranslation from an editing defect in mammalian cells. Chem. Biol. 13, 1091–1100 (2006).
21. Bacher, J. M., de Crécy-Lagard, V. & Schimmel, P. R. Inhibited cell growth and protein functional changes from an editing-defective tRNA synthetase. Proc. Natl Acad. Sci. USA 102, 1697–1701 (2005).
22. Stefani, M. Generic cell dysfunction in neurodegenerative disorders: role of surfaces in early protein misfolding, aggregation, and aggregate cytotoxicity. Neuroscientist 13, 519–531 (2007).
23. Malgaroli, A., Vallar, L. & Zimarino, V. Protein homeostasis in neurons and its pathological alterations. Curr. Opin. Neurobiol. 16, 270–274 (2006).
24. Gidalevitz, T., Ben-Zvi, A., Ho, K., Brignull, H. & Morimoto, R. I. Progressive disruption of cellular protein folding in models of polyglutamine diseases. Science 311, 1471–1474 (2006).
    This study provides insights into the nature of cellular costs due to protein misfolding. It shows how aggregation-prone proteins can induce cytotoxic effects by destabilizing marginally stable essential proteins.
25. Gidalevitz, T., Krupinski, T., Garcia, S. & Morimoto, R. I. Destabilizing protein polymorphisms in the genetic background direct phenotypic expression of mutant SOD1 toxicity. PLoS Genet. 5, e1000399 (2009).
26. Stefani, M. & Dobson, C. M. Protein aggregation and aggregate toxicity: new insights into protein folding, misfolding diseases and biological evolution. J. Mol. Med. 81, 678–699 (2003).
27. Kohanski, M. A., Dwyer, D. J., Wierzbowski, J., Cottarel, G. & Collins, J. J. Mistranslation of membrane proteins and two-component system activation trigger antibiotic-mediated cell death. Cell 135, 679–690 (2008).
28. Bucciantini, M. et al. Inherent toxicity of aggregates implies a common mechanism for protein misfolding diseases. Nature 416, 507–511 (2002).
    One of the first demonstrations that misfolded proteins can be cytotoxic. Aggregates of the N-terminal domain of the E. coli HypF protein reduce the viability of mouse fibroblasts in a concentration-dependent manner.
29. Bürger, R., Willensdorfer, M. & Nowak, M. A. Why are phenotypic mutation rates much higher than genotypic mutation rates? Genetics 172, 197–206 (2006).
30. Stoebel, D., Dean, A. & Dykhuizen, D. The cost of gene expression of E. coli lac operon proteins is in the process, not in the products. Genetics 178, 1653–1660 (2008).
31. Orgel, L. The maintenance of the accuracy of protein synthesis and its relevance to ageing. Proc. Natl Acad. Sci. USA 49, 517–521 (1963).
32. Bacher, J. M. & Schimmel, P. An editing-defective aminoacyl-tRNA synthetase is mutagenic in aging bacteria via the SOS response. Proc. Natl Acad. Sci. USA 104, 1907–1912 (2007).
33. Akashi, H. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136, 927–935 (1994).
34. Stoletzki, N. & Eyre-Walker, A. Synonymous codon usage in Escherichia coli: selection for translational accuracy. Mol. Biol. Evol. 24, 374–381 (2007).
35. Zhou, T., Weems, M. & Wilke, C. O. Translationally optimal codons associate with structurally sensitive sites in proteins. Mol. Biol. Evol. 26, 1571–1580 (2009).
36. Ruusala, T., Andersson, D., Ehrenberg, M. & Kurland, C. G. Hyper-accurate ribosomes inhibit growth. EMBO J. 3, 2575–2580 (1984).
37. Archetti, M. Selection on codon usage for error minimization at the protein level. J. Mol. Evol. 59, 400–415 (2004).
38. Archetti, M. Genetic robustness and selection at the protein level for synonymous codons. J. Evol. Biol. 19, 353–365 (2006).
39. Higgs, P. G., Hao, W. & Golding, G. B. Identification of conflicting selective effects on highly expressed genes. Evol. Bioinform. Online 3, 1–13 (2007).
40. Freeland, S. J. & Hurst, L. D. The genetic code is one in a million. J. Mol. Evol. 47, 238–248 (1998).
41. Woese, C. R. On the evolution of the genetic code. Proc. Natl Acad. Sci. USA 54, 1546–1552 (1965).
42. Wilhelm, B. T. et al. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453, 1239–1243 (2008).
    A detailed study of the transcriptome of Schizosaccharomyces pombe under multiple conditions using both high-throughput sequencing and tiling arrays. The study found widespread transcription of non-coding regions and frequent intron retention in mRNAs.
43. Loeb, D. D. et al. Complete mutagenesis of the HIV-1 protease. Nature 340, 397–400 (1989).
44. Shafikhani, S., Siegel, R. A., Ferrari, E. & Schnellenberger, V. Generation of large libraries of random mutants in Bacillus subtilis by PCR-based plasmid multimerization. Biotechniques 23, 304–310 (1997).
45. Daugherty, P. S., Chen, G., Iverson, B. L. & Georgiou, G. Quantitative analysis of the effect of the mutation frequency on the affinity maturation of single chain Fv antibodies. Proc. Natl Acad. Sci. USA 97, 2029–2034 (1999).
46. Guo, H. H., Choe, J. & Loeb, L. A. Protein tolerance to random amino acid change. Proc. Natl Acad. Sci. USA 101, 9205–9210 (2004).
47. Bloom, J. D. et al. Thermodynamic prediction of protein neutrality. Proc. Natl Acad. Sci. USA 102, 606–611 (2005).
48. Bershtein, S., Segal, M., Bekerman, R., Tokuriki, N. & Tawfik, D. S. Robustness–epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444, 929–932 (2006).
49. Taverna, D. M. & Goldstein, R. A. Why are proteins so robust to site mutations? J. Mol. Biol. 315, 479–484 (2002).
50. Wilke, C. O., Bloom, J. D., Drummond, D. A. & Raval, A. Predicting the tolerance of proteins to random amino acid substitution. Biophys. J. 89, 3714–3720 (2005).
51. Tartaglia, G. G., Pechmann, S., Dobson, C. M. & Vendruscolo, M. Life on the edge: a link between gene expression levels and aggregation rates of human proteins. Trends Biochem. Sci. 32, 204–206 (2007).
52. Vendruscolo, M. & Tartaglia, G. G. Towards quantitative predictions in cell biology using chemical properties of proteins. Mol. Biosyst. 4, 1170–1175 (2008).
53. Tartaglia, G. G., Pechmann, S., Dobson, C. M. & Vendruscolo, M. A relationship between mRNA expression levels and protein solubility in E. coli. J. Mol. Biol. 388, 381–389 (2009).
54. McGlincy, N. J. & Smith, C. W. J. Alternative splicing resulting in nonsense-mediated mRNA decay: what is the meaning of nonsense? Trends Biochem. Sci. 33, 385–393 (2008).
55. Jaillon, O. et al. Translational control of intron splicing in eukaryotes. Nature 451, 359–362 (2008).
    A study of the importance of the nonsense-mediated decay pathway in limiting the amount of mis-spliced introns in eukaryotes. Many introns in eukaryotes contain stop codons that trigger the nonsense-mediated decay pathway in cases of mis-splicing.
56. Pál, C., Papp, B. & Lercher, M. An integrated view of protein evolution. Nature Rev. Genet. 7, 337–348 (2006).
57. Rocha, E. P. C. & Danchin, A. An analysis of determinants of amino acids substitution rates in bacterial proteins. Mol. Biol. Evol. 21, 108–116 (2004).
58. Agrafioti, I. et al. Comparative analysis of the Saccharomyces cerevisiae and Caenorhabditis elegans protein interaction networks. BMC Evol. Biol. 5, 23 (2005).
59. Wolf, Y. I., Carmel, L. & Koonin, E. V. Unifying measures of gene function and evolution. Proc. Biol. Sci. 273, 1507–1515 (2006).
60. Drummond, D. A., Raval, A. & Wilke, C. O. A single determinant dominates the rate of yeast protein evolution. Mol. Biol. Evol. 23, 327–337 (2006).
61. Xia, Y., Franzosa, E. A. & Gerstein, M. B. Integrated assessment of genomic correlates of protein evolutionary rate. PLoS Comput. Biol. 5, e1000413 (2009).
62. Bloom, J. D., Drummond, D. A., Arnold, F. H. & Wilke, C. O. Structural determinants of the rate of protein evolution in yeast. Mol. Biol. Evol. 23, 1751–1761 (2006).
63. Hartling, J. & Kim, J. Mutational robustness and geometrical form in protein structures. J. Exp. Zoolog. B Mol. Dev. Evol. 310, 216–226 (2007).
64. Choi, S. C., Hobolth, A., Robinson, D. M., Kishino, H. & Thorne, J. L. Quantifying the impact of protein tertiary structure on molecular evolution. Mol. Biol. Evol. 24, 1769–1782 (2007).
65. Zhou, T., Drummond, D. A. & Wilke, C. O. Contact density affects protein evolutionary rate from bacteria to animals. J. Mol. Evol. 66, 395–404 (2008).
66. Wolf, M. Y., Wolf, Y. I. & Koonin, E. V. Comparable contributions of structural-functional constraints and expression level to the rate of protein sequence evolution. Biol. Direct 3, 40 (2008).
67. Farabaugh, P. J. Programmed translational frameshifting. Annu. Rev. Genet. 30, 507–528 (1996).
68. Blinkowa, A. L. & Walker, J. R. Programmed ribosomal frameshifting generates the Escherichia coli DNA polymerase III γ subunit from within the τ subunit reading frame. Nucleic Acids Res. 18, 1725–1729 (1990).
69. Matsufuji, S. et al. Autoregulatory frameshifting in decoding mammalian ornithine decarboxylase antizyme. Cell 80, 51–60 (1995).
70. Ivanov, I. P., Matsufuji, S., Murakami, Y., Gesteland, R. F. & Atkin, J. F. Conservation of polyamine regulation by translational frameshifting from yeast to mammals. EMBO J. 19, 1907–1917 (2000).
71. Pleiss, J. A., Whitworth, G. B., Bergkessel, M. & Guthrie, C. Rapid, transcript-specific changes in splicing in response to environmental stress. Mol. Cell 27, 928–937 (2007).
72. Engelberg-Kulka, H., Dekel, L., Israeli-Reches, M. & Belfort, M. The requirement of nonsense suppression for the development of several phages. Mol. Gen. Genet. 170, 155–159 (1979).
73. Donnelly, M. L. L. et al. Analysis of the aphthovirus 2A/2B polyprotein ‘cleavage’ mechanism indicates not a proteolytic reaction, but a novel translational effect: a putative ribosomal ‘skip’. J. Gen. Virol. 82, 1013–1025 (2001).
74. Funston, G. M., Kallioinen, S. E., de Felipe, P., Ryan, M. D. & Iggo, R. D. Expression of heterologous genes in oncolytic adenoviruses using picornaviral 2A sequences that trigger ribosome skipping. J. Gen. Virol. 89, 389–396 (2008).
75. Gorini, L. Informational suppression. Annu. Rev. Genet. 4, 107–134 (1970).
76. Welch, E. M. et al. PTC124 targets genetic disorders caused by nonsense mutations. Nature 447, 87–91 (2007).
77. Bjare, U. & Gorini, L. Drug dependence reversed by a ribosomal ambiguity mutation, ram, in Escherichia coli. J. Mol. Biol. 57, 423–435 (1971).
78. Björkman, J., Samuelsson, P., Andersson, D. & Hughes, D. Novel ribosomal mutations affecting translational accuracy, antibiotic resistance and virulence of Salmonella typhimurium. Mol. Microbiol. 31, 53–58 (1999).
79. Masel, J. Cryptic genetic variation is enriched for potential adaptations. Genetics 172, 1985–1991 (2006).
80. Whitehead, D. J., Wilke, C. O., Vernazobres, D. & Bornberg-Bauer, E. The look-ahead effect of phenotypic mutations. Biol. Direct 3, 18 (2008).
81. Wickner, R. B., Masison, D. C. & Edskes, H. K. [PSI] and [URE3] as yeast prions. Yeast 11, 1671–1685 (1995).
82. Chernoff, Y. O., Newnam, G. P., Kumar, J., Allen, K. & Zink, A. D. Evidence for a protein mutator in yeast: role of the Hsp70-related chaperone Ssb in formation, stability, and toxicity of the [PSI] prion. Mol. Cell. Biol. 19, 8103–8112 (1999).
83. True, H. L. & Lindquist, S. L. A yeast prion provides a mechanism for genetic variation and phenotypic diversity. Nature 407, 477–483 (2000).
    An influential paper showing that the yeast prion [PSI+] can uncover hidden genetic variation and produce new heritable phenotypes.
84. True, H. L., Berlin, I. & Lindquist, S. L. Epigenetic regulation of translation reveals hidden genetic variation to produce complex traits. Nature 431, 184–187 (2004).
85. Jensen, M. A., True, H. L., Chernoff, Y. O. & Lindquist, S. Molecular population genetics and evolution of a prion-like protein in Saccharomyces cerevisiae. Genetics 159, 527–535 (2001).
86. King, O. D. & Masel, J. The evolution of bet-hedging adaptations to rare scenarios. Theor. Popul. Biol. 72, 560–575 (2007).
87. Griswold, C. K. & Masel, J. Complex adaptations can drive the evolution of the capacitor [PSI], even with realistic rates of yeast sex. PLoS Genet. 5, e1000517 (2009).
88. Cairns, J., Overbaugh, J. & Miller, S. The origin of mutants. Nature 335, 142–145 (1988).
89. Andersson, D. I., Slechta, E. S. & Roth, J. R. Evidence that gene amplification underlies adaptive mutability of the bacterial lac operon. Science 282, 1133–1135 (1998).
90. de Godoy, L. M. F. et al. Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast. Nature 455, 1251–1254 (2008).
91. Edelmann, P. & Gallant, J. Mistranslation in E. coli. Cell 10, 131–137 (1977).
92. Parker, J. & Friesen, J. D. ‘Two out of three’ codon reading leading to mistranslation in vivo. Mol. Gen. Genet. 177, 439–445 (1980).
93. Ellis, N. & Gallant, J. An estimate of the global error frequency in translation. Mol. Gen. Genet. 188, 169–172 (1982).
94. Toth, M. J., Murgola, E. J. & Schimmel, P. Evidence for a unique first position codon-anticodon mismatch in vivo. J. Mol. Biol. 201, 451–454 (1988).
95. Kireeva, M. L. et al. Transient reversal of RNA polymerase II active site closing controls fidelity of transcription elongation. Mol. Cell 30, 557–566 (2008).
96. Fox-Walsh, K. L. & Hertel, K. J. Splice-site pairing is an intrinsically high fidelity process. Proc. Natl Acad. Sci. USA 106, 1766–1771 (2009).
97. Daviter, T., Gromadski, K. & Rodnina, M. The ribosome’s response to codon-anticodon mismatches. Biochimie 88, 1001–1011 (2006).
98. Curran, J. & Yarus, M. Base substitutions in the tRNA anticodon arm do not degrade the accuracy of reading frame maintenance. Proc. Natl Acad. Sci. USA 83, 6538–6542 (1986).
99. Jorgensen, F. & Kurland, C. G. Processivity errors of gene expression in Escherichia coli. J. Mol. Biol. 215, 511–521 (1990).
100. Arava, Y., Boas, F., Brown, P. & Herschlag, D. Dissecting eukaryotic translation and its control by ribosome density mapping. Nucleic Acids Res. 33, 2421–2432 (2005).

Acknowledgements
This work was funded in part by the National Institutes of Health grants P50 GM068763 and R01 GM088344.

FURTHER INFORMATION
D. Allan Drummond’s homepage: http://drummond.openwetware.org
Claus O. Wilke’s homepage: http://wlab.ccbb.utexas.edu