Genetic Analysis: An Integrated Approach
2 Transmission Genetics 30
20 Population Genetics and Evolution at the Population, Species, and Molecular Levels 725
Second Position
First Position   U             C             A             G             Third Position
U                UUU Phe (F)   UCU Ser (S)   UAU Tyr (Y)   UGU Cys (C)   U
                 UUC Phe (F)   UCC Ser (S)   UAC Tyr (Y)   UGC Cys (C)   C
                 UUA Leu (L)   UCA Ser (S)   UAA stop      UGA stop      A
                 UUG Leu (L)   UCG Ser (S)   UAG stop      UGG Trp (W)   G
Mark F. Sanders
University of California at Davis
John L. Bowman
Monash University,
Melbourne, Australia
University of California at Davis
Copyright © 2019, 2015 Pearson Education, Inc. All Rights Reserved. Printed in the United States of America.
This publication is protected by copyright, and permission should be obtained from the publisher prior to any
prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise. For information regarding permissions, request forms and
the appropriate contacts within the Pearson Education Global Rights & Permissions department, please visit
www.pearsoned.com/permissions/.
Acknowledgements of third party content appear on page C-1, which constitutes an extension of this copyright
page.
PEARSON, ALWAYS LEARNING, and Mastering™ Genetics are exclusive trademarks in the U.S. and/or other
countries owned by Pearson Education, Inc. or its affiliates.
Unless otherwise indicated herein, any third-party trademarks that may appear in this work are the property of
their respective owners and any references to third-party trademarks, logos or other trade dress are for
demonstrative or descriptive purposes only. Such references are not intended to imply any sponsorship, endorsement,
authorization, or promotion of Pearson’s products by the owners of such marks, or any relationship between the
owner and Pearson Education, Inc. or its affiliates, authors, licensees or distributors.
ISBN-10: 0134605179
ISBN-13: 9780134605173
www.pearsonhighered.com
Table of Contents
1 The Molecular Basis of Heredity, Variation, and Evolution 1
1.1 Modern Genetics Is in Its Second Century 2
  The Development of Modern Genetics 2
  The Four Phases of Modern Genetics 3
  Genetics: Central to Modern Biology 5
1.2 The Structure of DNA Suggests a Mechanism for Replication 6
  The Discovery of DNA Structure 6
  DNA Nucleotides 7
  DNA Replication 9
  Genetic Analysis 1.1 10
  Experimental Insight 1.1 10
1.3 DNA Transcription and Messenger RNA Translation Express Genes 11
  Transcription 12
  Translation 13
  Genetic Analysis 1.2 14
1.4 Genetic Variation Can Be Detected by Examining DNA, RNA, and Proteins 15
  Gel Electrophoresis 15
  Stains, Blots, and Probes 16
  DNA Sequencing and Genomics 18
  Proteomics and Other “-omic” Analyses 18

2 Transmission Genetics 30
2.1 Gregor Mendel Discovered the Basic Principles of Genetic Transmission 31
  Mendel’s Modern Experimental Approach 31
  Five Critical Experimental Innovations 33
2.2 Monohybrid Crosses Reveal the Segregation of Alleles 34
  Identifying Dominant and Recessive Traits 34
  Evidence of Particulate Inheritance and Rejection of the Blending Theory 36
  Segregation of Alleles 36
  Hypothesis Testing by Test-Cross Analysis 37
  Hypothesis Testing by F2 Self-Fertilization 38
  Genetic Analysis 2.1 39
2.3 Dihybrid and Trihybrid Crosses Reveal the Independent Assortment of Alleles 40
  Dihybrid-Cross Analysis of Two Genes 40
  Experimental Insight 2.1 41
  Testing Independent Assortment by Test-Cross Analysis 43
  Genetic Analysis 2.2 44
  Testing Independent Assortment by Trihybrid-Cross Analysis 45
  The Rediscovery of Mendel’s Work 46
  Experimental Insight 2.2 46
2.4 Probability Theory Predicts Mendelian Ratios 47
  The Product Rule 47
2.6 Autosomal Inheritance and Molecular Genetics Parallel the Predictions of Mendel’s Hereditary Principles 51
  Autosomal Dominant Inheritance 52
  Autosomal Recessive Inheritance 53
  Prospective and Retrospective Predictions in Human Genetics 53
  Molecular Genetics of Mendel’s Traits 54
  Genetic Analysis 2.3 55
  Case Study OMIM, Gene Mutations, and Human Hereditary Disease 57
  Summary 59 • Preparing for Problem Solving 60 • Problems 60

3 Cell Division and Chromosome Heredity 67
3.1 Mitosis Divides Somatic Cells 68

3.5 Human Sex-Linked Transmission Follows Distinct Patterns 91
  Expression of X-Linked Recessive Traits 92
  X-Linked Dominant Trait Transmission 93
  Y-Linked Inheritance 93
  Genetic Analysis 3.3 95
3.6 Dosage Compensation Equalizes the Expression of Sex-Linked Genes 96
  Case Study The (Degenerative) Evolution of the Mammalian Y Chromosome 97
  Summary 99 • Preparing for Problem Solving 99 • Problems 100

4 Gene Interaction 105
4.1 Interactions between Alleles Produce Dominance Relationships 106
  The Molecular Basis of Dominance 106
5 Solving 177 • Problems 177
7 DNA Structure and Replication 235
7.1 DNA Is the Hereditary Molecule of Life 236
  Chromosomes Contain DNA 236
  A Transformation Factor Responsible for Heredity 236
  DNA Is the Transformation Factor 238
  DNA Is the Hereditary Molecule 238
7.2 The DNA Double Helix Consists of Two Complementary and Antiparallel Strands 240
  DNA Nucleotides 240
  The DNA Duplex 241
  Genetic Analysis 7.1 244

8 Molecular Biology of Transcription and RNA Processing 275
8.1 RNA Transcripts Carry the Messages of Genes 276
  RNA Nucleotides and Structure 276
  Experimental Discovery of Messenger RNA 277
  Categories of RNA 278
8.2 Bacterial Transcription Is a Four-Stage Process 279
  Bacterial RNA Polymerase 280
  Bacterial Promoters 280
  Alternative Patterns of RNA Transcription and Alternative RNA Splicing 301
  Self-Splicing Introns 302
  Genetic Analysis 8.2 303
  Ribosomal RNA Processing 304
  Transfer RNA Processing 304
  RNA Editing 307
  Case Study Sexy Splicing: Alternative mRNA Splicing and Sex Determination in Drosophila 307
  Summary 308 • Preparing for Problem Solving 309 • Problems 309

9.1 Polypeptides Are Amino Acid Chains That Are Assembled at Ribosomes 315
  Amino Acid Structure 315
  Polypeptide and Transcript Structure 315

  No Gaps in the Genetic Code 336
  Deciphering the Genetic Code 337
  Genetic Analysis 9.3 339
  Case Study Antibiotics and Translation Interference 340
  Summary 340 • Preparing for Problem Solving 341 • Problems 342

APPLICATION B Human Genetic Screening
B.2 Newborn Genetic Screening 349
  Phenylketonuria and the First Newborn Genetic Test 349
  Living with PKU 350
  The Recommended Uniform Screening Panel 351
B.3 Genetic Testing to Identify Carriers 353
  Testing Blood Proteins 353
  DNA-Based Carrier Screening and Diagnostic Verification 353
  Carrier Screening Criteria 353
  Pharmacogenetic Screening 354
B.4 Prenatal Genetic Testing 354
  Invasive Screening Using Amniocentesis or Chorionic Villus Sampling 354
  Noninvasive Prenatal Testing 356
  Maternal Serum Screening 356
  Preimplantation Genetic Screening 356
B.5 Direct-to-Consumer Genetic Testing 357
B.6 Opportunities and Choices 359
  Problems 359

10.4 Chromosome Breakage Causes Mutation by Loss, Gain, and Rearrangement of Chromosomes 375
  Partial Chromosome Deletion 375
  Unequal Crossover 376
  Detecting Duplication and Deletion 377
  Deletion Mapping 377
10.5 Chromosome Breakage Leads to Inversion and Translocation of Chromosomes 378
  Chromosome Inversion 378
  Genetic Analysis 10.3 379
  Experimental Insight 10.1 382
  Chromosome Translocation 383
10.6 Eukaryotic Chromosomes Are Organized into Chromatin 385
11.5 Proteins Control Translesion DNA Synthesis and the Repair of Double-Strand Breaks 420
  Translesion DNA Synthesis 420
  Double-Strand Break Repair 420

12.3 Mutational Analysis Deciphers Genetic Regulation of the lac Operon 447
  Analysis of Structural Gene Mutations 447
  lac Operon Regulatory Mutations 448
  Molecular Analysis of the lac Operon 451
  Genetic Analysis 12.1 452

12.7 Antiterminators and Repressors Control Lambda Phage Infection of E. coli 464
  The Lambda Phage Genome 465
  Early Gene Transcription 465
  Cro Protein and the Lytic Cycle 466
  The λ Repressor Protein and Lysogeny 468
  Resumption of the Lytic Cycle following Lysogeny Induction 468
  Case Study Vibrio cholerae: Stress Response Leads to Serious Infection Through Positive Control of Transcription 469
  Summary 470 • Preparing for Problem Solving 471 • Problems 471

13 Regulation of Gene Expression in Eukaryotes 476
13.1 Cis-Acting Regulatory Sequences Bind Trans-Acting Regulatory Proteins to Control Eukaryotic Transcription 478

13.3 RNA-Mediated Mechanisms Control Gene Expression 498
  Gene Silencing by Double-Stranded RNA 499
  Constitutive Heterochromatin Maintenance 501
  The Evolution and Applications of RNAi 502
  Case Study Environmental Epigenetics 502
  Summary 503 • Preparing for Problem Solving 504 • Problems 504

14 Analysis of Gene Function by Forward Genetics and Reverse Genetics 507
14.1 Forward Genetic Screens Identify Genes by Their Mutant Phenotypes 509
  General Design of Forward Genetic Screens 509
  Specific Strategies of Forward Genetic Screens 509
  Analysis of Mutageneses 513
  Identifying Interacting and Redundant Genes Using Modifier Screens 514
C.1 Cancer Is a Somatic Genetic Disease that Is Only Occasionally Inherited 540
C.2 What Is Cancer and What Are the Characteristics of Cancer? 540
  Progression of Abnormalities 540
  The Hallmarks of Cancer Cells and Malignant Tumors 541
C.3 The Genetic Basis of Cancer 543
  Single Gene Mutations and Cancer Development 543
  The Genetic Progression of Cancer Development and Cancer Predisposition 546
  Breast and Ovarian Cancer and the Inheritance of Cancer Susceptibility 548
C.4 Cancer Cell Genome Sequencing and Improvements in Therapy 549
  The Cancer Genome Atlas 549
  Epigenetic Irregularities 549
  Targeted Cancer Therapy 550
  Problems 550

15.4 Cloning of Plants and Animals Produces Genetically Identical Individuals 583
  Case Study Gene Drive Alleles Can Rapidly Spread Through Populations 585
  Summary 587 • Preparing for Problem Solving 588 • Problems 588

16 Genomics: Genetics from a Whole-Genome Perspective 593
16.1 Structural Genomics Provides a Catalog of Genes in a Genome 594
  Whole-Genome Shotgun Sequencing 596
  Reference Genomes and Resequencing 599
  Metagenomics 600
  Experimental Insight 16.1 601
  Other “-omes” and “-omics” 621
  Use of Yeast Mutants to Categorize Genes 624
  Genetic Networks 625
  Case Study Genomic Analysis of Insect Guts May Fuel the World 627
  Summary 628 • Preparing for Problem Solving 628 • Problems 629

17 Organellar Inheritance and the Evolution of Organellar Genomes 632
17.1 Organellar Inheritance Transmits Genes Carried on Organellar Chromosomes 633
  The Discovery of Organellar Inheritance 633
  Homoplasmy and Heteroplasmy 634
  Genome Replication in Organelles 635
  Replicative Segregation of Organelle Genomes 635
17.2 Modes of Organellar Inheritance Depend on the Organism 636
  Mitochondrial Inheritance in Mammals 637
  Genetic Analysis 17.1 639
  Mating Type and Chloroplast Segregation in Chlamydomonas 640
  Biparental Inheritance in Saccharomyces cerevisiae 641
  Genetic Analysis 17.2 643
  Summary of Organellar Inheritance 644
17.3 Mitochondria Are the Energy Factories of Eukaryotic Cells 644
  Mitochondrial Genome Structure and Gene Content 645
  Mitochondrial Transcription and Translation 646
17.4 Chloroplasts Are the Sites of Photosynthesis 648
  Chloroplast Genome Structure and Gene Content 648
  Chloroplast Transcription and Translation 649
  Editing of Chloroplast mRNA 650
17.5 The Endosymbiosis Theory Explains Mitochondrial and Chloroplast Evolution 651
  Separate Evolution of Mitochondria and Chloroplasts 651
  Experimental Insight 17.1 652
  Continual DNA Transfer from Organelles 654
  Encoding of Organellar Proteins 655
  The Origin of the Eukaryotic Lineage 656
  Secondary and Tertiary Endosymbioses 656
  Case Study Ototoxic Deafness: A Mitochondrial Gene–Environment Interaction 658
  Summary 659 • Preparing for Problem Solving 660 • Problems 660

18 Developmental Genetics 663
18.1 Development Is the Building of a Multicellular Organism 664
  Cell Differentiation 665
  Pattern Formation 665
18.2 Drosophila Development Is a Paradigm for Animal Development 666
  The Developmental Toolkit of Drosophila 667
  Maternal Effects on Pattern Formation 669
  Coordinate Gene Patterning of the Anterior–Posterior Axis 669
  Domains of Gap Gene Expression 670
  Regulation of Pair-Rule Genes 671
  Specification of Parasegments by Hox Genes 673
  Downstream Targets of Hox Genes 675
  Hox Genes throughout Metazoans 676
  Genetic Analysis 18.1 677
  Stabilization of Cellular Memory by Chromatin Architecture 678
18.3 Cellular Interactions Specify Cell Fate 679
  Inductive Signaling between Cells 679
  Lateral Inhibition 682
  Cell Death During Development 682
18.4 “Evolution Behaves Like a Tinkerer” 683
  Evolution through Co-option 683
  Constraints on Co-option 685
18.5 Plants Represent an Independent Experiment in Multicellular Evolution 685
  Development at Meristems 685
  Combinatorial Homeotic Activity in Floral-Organ Identity 686
20
Genetic Analysis 18.2 689
19.2 Quantitative Trait Analysis Is Statistical 706 Natural Selection Favoring Heterozygotes 735
20.7 New Species Evolve by Reproductive Isolation 743
  Genetic Analysis 20.3 744
  Processes of Speciation 744
  Reproductive Isolation and Speciation 746
  The Molecular Genetics of Evolution in Darwin’s Finches 748
20.8 Molecular Evolution Changes Genes and Genomes through Time 748
  Vertebrate Steroid Receptor Evolution 749
  Case Study Sickle Cell Disease Evolution and Natural Selection in Humans 750
  Summary 751 • Preparing for Problem Solving 752 • Problems 753

APPLICATION D Human Evolutionary Genetics
D.1 Genome Sequences Reveal Extent of Human Genetic Diversity 759
  SNP Variation in Humans 760
  Variation in CNVs 761
D.2 Diversity of Extant Humans Suggests an African Origin 761
  Mitochondrial Eve 762
  Y Chromosome Phylogeny 762
  Autosomal Loci 763
D.3 Comparisons between Great Apes Identify Human-Specific Traits 763
  Revelations of Great Ape Genomes 763
  Comparing the Human and Chimpanzee Genomes 765
D.4 Ancient DNA Reveals the Recent History of Our Species 766
  Neandertals 768
  Denisovans 769
  Finding Genes that Make Us Human 770
D.5 Human Migrations around the Globe 770
  Europe 770
  Australia 771
D.6 Genetic Evidence for Adaptation to New Environments 772
  Lactose Tolerance 772
  Skin Pigmentation 774
  High Altitude 774
D.7 Domestication of Plants and Animals: Maize 775
D.8 The Future 776
  Problems 777

APPLICATION E Forensic Genetics
E.1 CODIS and Forensic Genetic Analysis 780
  CODIS History and Markers 780
  Electrophoretic Analysis 781
  Forensic Analysis Using CODIS 782
  Paternity Testing 784
  Individual Identification 785
  Remains Identified following the 9-11 Attack 785
  Identification of the Disappeared in Argentina 786
E.2 DNA Analysis for Genealogy, Genetic Ancestry, and Genetic Health Risk Assessment 786
  Assessing Genealogical Relationships 786
  Assessing Genetic Ancestry 787
  Genetic Health Risk Assessment 788
  Late-Onset Alzheimer Disease 788
  Celiac Disease 789
  One Side of the Equation 789
  Problems 789

References and Additional Reading R-1
Appendix: Answers A-1
Glossary G-1
Credits C-1
Index I-1
About the Authors
Mark F. Sanders has been a faculty member in the Department of Molecular and Cellular Biology at the
University of California, Davis, since 1985. In that time, he has taught more than 150 genetics courses to
nearly 35,000 undergraduate students. Although he specializes in teaching the genetics course for which this
book is written, his genetics teaching experience also includes a genetics laboratory course, an advanced
human genetics course for biology majors, and a human heredity course for nonscience majors, as well as
introductory biology and courses in population genetics and evolution. He has also served as an advisor to
undergraduate students and in undergraduate education administration, and he has directed several
undergraduate education programs.

Dr. Sanders received his B.A. degree in Anthropology from San Francisco State University, his M.A. and Ph.D.
degrees in Biological Anthropology from the University of California, Los Angeles, and 4 years of training as
a postdoctoral researcher studying inherited susceptibility to human breast and ovarian cancer at the
University of California, Berkeley.

John L. Bowman is a professor in the School of Biological Sciences at Monash University in Melbourne,
Australia, and an adjunct professor in the Department of Plant Biology at the University of California, Davis,
in the United States. He received a B.S. in Biochemistry at the University of Illinois at Urbana-Champaign in
1986 and a Ph.D. in Biology from the California Institute of Technology in Pasadena, California. His Ph.D.
research focused on how the identities of floral organs are specified in Arabidopsis (described in Chapter
18), and he conducted postdoctoral research at Monash University on the regulation of floral development. From
1996 to 2006, his laboratory at UC Davis investigated developmental genetics of plants, focusing on how leaves
are patterned. From 2006 to 2011, he was a Federation Fellow at Monash University, where his laboratory is
studying land plant evolution using a developmental genetics approach. He was elected a Fellow of the
Australian Academy of Science in 2014. At UC Davis he taught genetics, “from Mendel to cancer,” to
undergraduate students, and he continues to teach genetics courses at Monash University.
Dedication
To my extraordinary wife and partner Ita. She is a treasure whose support, patience, and encouragement
throughout this ongoing project make me very fortunate. To my wonderful children Jana and Nick, to their
spouses John and Molly, to my grandson Lincoln, and to all my students, from whom I have learned as much as I
have taught.
Mark F. Sanders

For my parents, Lois and Noel, who taught me to love and revere nature, and Tizita, my partner in our personal
genetics experiments. And to all my genetics students who have inspired me over the years, I hope that the
inspiration was mutual.
John L. Bowman
We dedicate this third edition of Genetic Analysis: An Integrated Approach to our friend and
colleague Mel Green, who passed away in October 2017 at the age of 101. Mel was a stellar
geneticist and was engaged in genetics until the end. Over his long career, he made numerous
important contributions to genetics, inspiring scores of geneticists including the authors of this
textbook.
Preface
We are now almost two decades into the second century of modern genetics, and the expansion of knowledge in
this rapidly progressing field continues at a dizzying pace. Topics that seemed impenetrable just a few years
ago are coming into focus. Novel approaches to old problems are providing profound insights into the genomics,
development, and evolution of organisms in all three domains of life. CRISPR–Cas9, which was discovered in
basic research on bacterial immunity, has been developed into a genome-editing system that has revolutionized
the manipulation of genomic sequences in living cells. Advancements in genomics, proteomics, transcriptomics,
and other enterprises of the “omic” world have opened avenues for research that were unimaginable in years
past. And the resulting advancements in knowledge are quickly being turned into new applications. These are
great times to be a geneticist or a student studying genetics!

In keeping with these exciting times of revolutionary change in our field, our textbooks too must undergo
change. This third edition of Genetic Analysis: An Integrated Approach contains some significant changes that
have been made with students foremost in our minds. As authors and instructors of genetics, we have had front
row seats in the discipline and in the classroom. Between the two of us, we have more than 50 years’
experience and experimentation in teaching genetics. We have used that experience to produce this new edition.
We hope that it conveys the excitement we feel about genetics and the dynamism at work in the field, and that
it offers students new and interesting examples of and insights into our favorite scientific discipline. As
teachers and student mentors, our highest goal is to see students succeed. To accomplish this we seek to
motivate students to pursue and explore genetics more fully and to incorporate what they learn into their
thinking and plans for their future. We hope teachers and students alike will find motivation and
encouragement in the subject matter and examples in this book.

Our Integrated Approach
This third edition, like its predecessors, carries the unique subtitle An Integrated Approach. The phrase
embodies our pedagogical approach, consisting of three principles: (1) to integrate problem solving throughout
the text (not relegating it to the ends of chapters) and consistently to model a powerful, three-step
problem-solving approach (Evaluate, Deduce, and Solve) in every worked example; (2) to integrate an
evolutionary perspective throughout the book; and (3) to integrate descriptions of Mendelian genetics with
molecular genetics and genomics so as to demonstrate the value of each of these different approaches for
investigating the same basic sets of observations. In this edition, we adhere to and strengthen the
integration that has resonated strongly with instructors and students.

New to This Edition
As was the case in our previous editions, our aim above all is to assist the student by making the learning of
genetics easier, more interesting, and more effective. Thus, three specific goals have driven this revision,
and each is supported by new features that help accomplish it. Goal 1 is to provide more interesting,
real-world applications of genetics. We have addressed this goal by writing five “Application Chapters” that
each highlight a particular applied topic in human genetics. Goal 2 is to make the job of learning the details
of genetics easier. We have addressed this goal by writing “Caption Queries” to accompany chapter figures and
by providing a new feature, titled “Preparing for Problem Solving,” at the end of each chapter. Goal 3 is to
facilitate group work and discussion of genetics problems and concepts among classmates. We have addressed
this goal in part through the Caption Queries and in part by providing a new category of chapter problems,
called “Collaboration and Discussion,” that are specifically designed to be tackled in groups. Along with
these important pedagogical changes, this revision is also important for incorporating new genetic information
that is defining the future of the field. The following descriptions highlight key new features and
information designed to accomplish our revision goals.

Application Chapters
Many students come to genetics curious about human heredity and about how genetic principles are applied in
real-world activities. This edition, like the previous ones, features numerous human examples to help
illustrate the operation of genetic principles, and it features five new Application Chapters: short chapters
focused on specific applied topics in human genetics and evolutionary genetics. The Application Chapters are
written to give students information on topics of particular interest and to illustrate some of the practical
uses of genetics and genetic analysis. Each of these special chapters is about half the length of a typical
textbook chapter, and each has a specific applied focus. They are spaced periodically throughout the book in
such a way that each of them comes just after the key prerequisite material has been presented. Importantly,
these new Application Chapters do not add to the length of the book. We have made reorganization and revision
decisions that have maintained the depth of
coverage while allowing for the addition of the Application Chapters in a space-neutral way.

Every Application Chapter opens with a story that exemplifies why the topic of the chapter is important, and
each contains several end-of-chapter problems to guide student learning and discussion. The five Application
Chapters are:

❚❚ Application Chapter A – Human Hereditary Disease and Genetic Counseling This chapter describes the role of
   genetic counselors and the genetic information and analysis they employ in medical decision-making. Students
   interested in human hereditary transmission, as well as those potentially interested in careers in medical
   genetics or genetic counseling, will find satisfying discussions of these topics in this chapter.
❚❚ Application Chapter B – Human Genetic Screening Numerous invasive and non-invasive methods of screening for
   inherited conditions are described in this chapter, and their results are discussed. Topics include carrier
   screening; pre-natal, newborn, and pre-symptomatic genetic testing; and amniocentesis and chorionic villus
   sampling.
❚❚ Application Chapter C – The Genetics of Cancer This chapter discusses cancer from two perspectives. The
   first is an overview of the major hallmarks of cancer that have been articulated over the last decade or so.
   The second is a discussion of cancers that have a simpler genetic basis and cancers for which inherited
   susceptibility has been identified. New, immune system–based approaches to cancer treatment are also
   discussed.
❚❚ Application Chapter D – Human Evolutionary Genetics This chapter presents the current interpretation of
   human evolution from a genomic perspective and describes the relationship of modern humans to their archaic
   predecessors. The discussion includes up-to-date information on Neandertal and Denisovan genome sequencing,
   along with recent evidence on interbreeding among archaic human populations.
❚❚ Application Chapter E – Forensic Genetics This chapter focuses on the uses and analysis of DNA in the
   contexts of crime scene analysis, paternity testing, and direct-to-consumer genealogy, genetic ancestry
   testing, and genetic health risk assessment. Examples of genetic analysis using the Combined DNA Index
   System (CODIS) and of genetic analysis to determine the paternity index and combined paternity index are
   given. Descriptions of the direct-to-consumer genetic analyses provided by AncestryDNA and 23andMe are part
   of the chapter as well.

Caption Queries
Textbook figures are an integral part of the pedagogical apparatus of a textbook, but they are only effective
if the reader takes the time to look at and understand them. How does one help students examine a figure
attentively enough to derive the critical content and meaning? One way is by asking questions about the
figure. In this revision, we have written Caption Queries for virtually every figure in the book to help
students dissect the illustrated content and more fully understand its meaning and importance. Several Caption
Queries have been printed below their corresponding figure in the chapter itself, and all Caption Queries are
available as clicker questions for classroom use and in Mastering Genetics as assignable homework. Some
Caption Queries require the student to solve a problem using information from the figure, some require an
explanation be provided, and others ask students to expand on the information or idea in the figure. All
Caption Queries, whatever their form, will help students focus on the figures and derive a better
understanding of their content.

Caption Queries serve a second purpose as well. Genetics instructors are becoming increasingly interested in
the pedagogical approach known as “flipping the classroom.” This approach has students do their textbook
reading and review of lecture, PowerPoint®, and other course materials outside of class, leaving class time
open for discussion, problem solving, and inquiry-based learning. In our own classrooms, we have found that
asking questions about chapter figures is an effective way to stimulate discussion and jump-start problem
solving and inquiry-based learning. The clicker versions of Caption Queries can be the first line of
interactive questions in this approach.

Preparing for Problem Solving
Building on the strong problem-solving guidance of our Genetic Analysis worked examples (the three-step
problem-solving approach described momentarily), we have added a new chapter feature titled Preparing for
Problem Solving, located between the Chapter Summary and the end-of-chapter problems. This feature is a list
identifying the specific knowledge and skills required to answer chapter problems. The listed items draw
students’ attention back to the major ideas described in the chapter and to the practical skills that were
modeled there, before the students begin working on end-of-chapter problems.

Collaboration and Discussion Problems
Having students work in groups to solve problems is an increasingly popular and productive way to encourage
participation in, and to enhance, active learning. In this revision, each end-of-chapter problem set has been
expanded to include several new problems in a section titled Collaboration and Discussion. As the name
implies, these problems are designed to be evaluated and solved by small groups of students working together.
Whether assigned as homework or as part of flipped classroom activities, these exercises offer an array of
opportunities for comprehensive and hands-on problem solving.
Redesigned Chapter Content
The content and coverage of all chapters have been reworked in this revision to keep up with changes in the
field and keep all discussions timely. Several chapter revisions reflect changes in approaches to genetic
analysis. In Chapter 5 (“Genetic Linkage and Mapping in Eukaryotes”), for example, the discussion of mapping of
molecular genetic markers has been substantially expanded. To make way for this expansion, discussion of
tetrad analysis in yeast has been dropped. Chapter 13 (“Regulation of Gene Expression in Eukaryotes”) has
undergone revision to feature more discussion of epigenetic regulation and the roles of epigenetic readers,
writers, and erasers. Chapters 14 (“Analysis of Gene Function by Forward Genetics and Reverse Genetics”) and
15 (“Recombinant DNA Technology and Its Application”) have greatly expanded descriptions of the CRISPR–Cas9
system and its applications in gene editing and gene drive systems. Chapter 16 (“Genomics: Genetics from a
Whole-Genome Perspective”) has undergone substantial revision to feature new genomic approaches.

Several chapters include important new information that became available just as writing was being completed.
Among numerous examples are the discussion in Chapter 7 (“DNA Structure and Replication”) of the apparently
stochastic pattern of DNA replication initiation in E. coli that was described in mid-2017; and the
description in Application Chapter C (Genetics of Cancer) of the CAR-T cell method for treating certain cancers
that was recommended for approval by a panel of the U.S. Food and Drug Administration in mid-2017.

A chapter from the first two editions, “The Integration of Genetic Approaches: Understanding Sickle Cell
Disease,” has been removed in this edition to help make room for the inclusion of the Application Chapters. We
know many professors are fond of this chapter, and they can access it in Mastering Genetics or in custom
versions of this text.

Maintaining What Works
While making numerous pedagogical and content changes in this third edition of Genetic Analysis: An Integrated
Approach, we have maintained all of the features that made previous editions of the book so popular and
effective. These include the systematic problem-solving approach, the pervasive evolutionary perspective, and
the consistent cross connections drawn throughout between transmission and

Genetic Analysis teaches how to start thinking about a problem, what the end goal is, and what kind of
analysis is required to get there. The three steps of this problem-solving framework are Evaluate, Deduce, and
Solve.

Evaluate: Students learn to identify the topic of the problem, specify the nature or format of the requested
answer, and identify critical information given in the problem.
Deduce: Students learn how to use conceptual knowledge to analyze data, make connections, and infer additional
information or next steps.
Solve: Students learn how to accurately apply analytical tools and to execute their plan to solve a given
problem.

Irrespective of the type of problem presented to them, this framework guides students through the stages of
solving it and gives them the confidence to undertake new problems.

Each Genetic Analysis worked example is laid out in a two-column format to help students easily follow the
steps of the Solution Strategy that are enumerated in the left-hand column and executed in the right-hand
column. “Break It Down” comments point to key elements in the problem statement of each example, as an aid to
students, who often struggle to identify the concepts and information that are critical to starting the
problem-solving process. We also include problem-solving Tips to help with critical steps, as well as warnings
of common Pitfalls to avoid; these suggestions and admonitions are gathered from our teaching experience. It
is also important to note that the Genetic Analysis examples are integrated into the chapters, right after
discussions of important content, to help students immediately apply the concepts they are learning. Each
chapter includes two or three Genetic Analysis problems, and the book contains nearly 50 in all.

Complementing the Genetic Analysis problems are strong end-of-chapter problems that are divided into three
groups. Chapter Concept problems come first and review the critical information, principles, and analytical
tools discussed in the chapter. These are followed by Application and Integration problems that are more
challenging and broader in scope. Last come the chapter’s Collaboration and Discussion questions, a new
addition described above. All solutions to the end-of-chapter problems in the Study Guide and Solutions Manual
use the evaluate–deduce–solve model to reinforce the book’s problem-solving approach.
molecular genetics.
An Evolutionary Perspective
A Problem-Solving Approach Geneticists are acutely aware of evolutionary relation-
To help train students to become more effective problem ships between genes, genomes, and organisms. Evolution-
solvers, we employ a unique problem-solving feature called ary processes at the organismal level, discovered through
Genetic Analysis that gives students a consistent, repeatable comparative biology, can shed light on the function of
method to help them learn and practice problem solving. genes and organization of genomes at the molecular level.
❚❚ Caption Queries: Questions that help students dissect the illustrated content of book figures and more fully understand their meaning and importance.
❚❚ Experimental Insights: Discussions of critical or illustrative experiments, including the observed results of the experiments and the conclusions drawn from their analysis.
❚❚ Research Techniques: Explorations of important research methods, illustrating the results and interpretations.
❚❚ Case Studies: Short, real-world examples, at the end of every chapter, that highlight central ideas or concepts of the chapter while reminding students of some practical applications of genetics.
❚❚ Preparing for Problem Solving: Immediately preceding the end-of-chapter problems, this list of approaches and suggestions briefly highlights the tools and concepts students will use most often in answering chapter problems.

extends your options for assigning challenging problems. Each problem includes specific wrong-answer feedback to help students learn from their mistakes and to guide them toward the correct answer.
❚❚ Inclusion of nearly 90% of the end-of-chapter questions among the assignment possibilities in the item library. The broad range of answer types the questions require, in addition to multiple choice, includes sorting, labeling, numerical, and ranking.
❚❚ Learning Catalytics is a “bring your own device” (smartphone, tablet, or laptop) assessment and active classroom system that expands the possibilities for student engagement. Instructors can create their own questions, draw from community content shared by colleagues, or access Pearson’s library of question clusters that explore challenging topics through two- to five-question series that focus on a single scenario or data set, build in difficulty, and require higher-level thinking.

example modeled after the Genetic Analysis feature of the main textbook. The solutions provided in the third section of the manual also reflect the evaluate–deduce–solve strategy of the Genetic Analysis feature. Finally, for more practice, we’ve included five to ten Test Yourself problems and accompanying solutions for each chapter in the textbook.

❚❚ PowerPoint® presentations containing clicker-based Caption Query questions for all figures in the text.
❚❚ In Word and PDF files, a complete set of the assessment materials and study questions and answers from the test bank. Files are also available in TestGen format.
Nevada, Reno; Christopher Halweg at North Carolina State University; and Nancy Staub at Gonzaga University for their more than generous expert advice.

Reviewers
Jade Atallah, University of Toronto
Michelle Boissiere, Xavier University of Louisiana
Sarah Chavez, Washington University
Claire Cronmiller, University of Virginia
Robert Dotson, Tulane University
Steven Finkel, University of Southern California
Benjamin Harrison, University of Alaska Anchorage
Laura Hill, University of Vermont
Adam Hrincevich, Louisiana State University
Steven Karpowicz, University of Central Oklahoma
Kirkwood Land, University of the Pacific
Craig Miller, University of California at Berkeley
Jessica Muhlin, Maine Maritime Academy
Anna Newman, University of Houston
Joanne Odden, Pacific University Oregon
Matthew Skerritt, Corning Community College
Nancy Staub, Gonzaga University
David Waddell, University of North Florida
Cynthia Wagner, University of Maryland Baltimore County
Rahul Warrior, University of California at Irvine

Supplements and Media Contributors
Laura Hill Bermingham, University of Vermont
Pat Calie, Eastern Kentucky University
Christy Fillman, University of Colorado–Boulder
Kathleen Fitzpatrick, Simon Fraser University
Michelle Gaudette, Tufts University
Christopher Halweg, North Carolina State University
Jutta Heller, Loyola University
Steven Karpowicz, University of Central Oklahoma
David Kass, Eastern Michigan University
Fordyce Lux III, Metropolitan State College
Peter Mirabito, University of Kentucky
Pam Osenkowski, Loyola University
Jennifer Osterhage, University of Kentucky
Louise Paquin, McDaniel College
Fiona Rawle, University of Toronto Mississauga
Pamela Sandstrom, University of Nevada, Reno
Tara Stoulig, Southeastern Louisiana State
Kevin Thornton, University of California at Irvine
Douglas Thrower, University of California, Santa Barbara
Sarah Van Vickle-Chavez, Washington University in St. Louis
Dennis Venema, Trinity Western University
Andrew J. Wood, Southern Illinois University
Unparalleled Problem-Solving Support
Genetic Analysis expertly guides students through the core ideas of genetics while introducing them to real-world applications and supporting them with unparalleled problem-solving guidance.

Evaluate
1. Strategy: Identify the topic this problem addresses, and the nature of the required answer.
   Solution: The question concerns a DNA sequence. It asks for the sequence and polarity of the complementary strand and the number of phosphodiester and hydrogen bonds present in the fragment.
2. Strategy: Identify the critical information given in the problem.
   Solution: The sequence and polarity are given for one strand of the DNA fragment.

Deduce
3. Strategy: Review the general structure of a DNA duplex and the complementarity of specific nucleotides.
   Solution: DNA is a double helix composed of single strands that contain complementary base pairs (A pairs with T, and G with C). The complementary strands are antiparallel (i.e., one strand is 5′ to 3′, and its complement is 3′ to 5′).
4. Strategy: Review the patterns of phosphodiester bond and hydrogen bond formation in DNA.
   Solution: One phosphodiester bond forms between adjacent nucleotides on each strand of DNA. A-T base pairs (joining the two strands) contain 2 hydrogen bonds, and G-C base pairs contain 3 hydrogen bonds.

Solve
5. Strategy: Identify the sequence of the complementary strand.
   Solution: The complementary sequence is TGCTGCGAT.
6. Strategy: Give the polarity of the complementary strand.
   Solution: The polarity of the complementary strand is 3′-TGCTGCGAT-5′.
7. Strategy: Count the number of phosphodiester bonds in this DNA fragment.
   Solution: Between the adjacent nucleotides of this fragment there are eight phosphodiester bonds per strand, for a total of 16 phosphodiester bonds.
8. Strategy: Count the number of hydrogen bonds between the two strands of this DNA fragment.
   Solution: There are four A-T base pairs containing 2 hydrogen bonds each, and five G-C base pairs containing 3 hydrogen bonds each, for a total of 8 + 15 = 23 hydrogen bonds in this DNA fragment.

For more practice, see Problems 5, 8, 9, 16, and 17. Visit the Study Area to access study tools. Mastering Genetics
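The counting in the Solve steps above can be checked with a few lines of Python. This is only an illustrative sketch, not part of the book's method: the problem statement is not reproduced on this sample page, so the given strand 5′-ACGACGCTA-3′ is an assumption inferred from the stated complement, and the function names are hypothetical.

```python
# Base-pairing partners and hydrogen bonds contributed by each base pair.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement_3_to_5(strand_5_to_3):
    """Complement written 3'->5', so it lines up base by base under the input."""
    return "".join(PARTNER[base] for base in strand_5_to_3)


def phosphodiester_bonds(strand_5_to_3):
    """A linear strand of n nucleotides has n - 1 bonds; a duplex has two strands."""
    return 2 * (len(strand_5_to_3) - 1)


def hydrogen_bonds(strand_5_to_3):
    """Sum 2 bonds per A-T pair and 3 bonds per G-C pair across the duplex."""
    return sum(H_BONDS[base] for base in strand_5_to_3)


# Assumption: the given strand (5'->3'), inferred from the stated answer.
given = "ACGACGCTA"
print("3'-" + complement_3_to_5(given) + "-5'")  # 3'-TGCTGCGAT-5'
print(phosphodiester_bonds(given))               # 16
print(hydrogen_bonds(given))                     # 23
```

Writing the complement in the 3′-to-5′ direction keeps the two strands aligned position by position, which is why no string reversal is needed here.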
Applications of Genetics
[Sample pages: Application A and Application B]
blood from a newborn infant. The blood is used to screen for disorders on the Recommended
Uniform Screening Panel (RUSP) list of human hereditary diseases, as discussed in this chapter.
Genetic counseling, a central activity in medical genetics, seeks to provide individuals, cou-
ples, and families with medical and genetic information they can use to make informed deci-
sions about genetic testing and medical treatment, in person-to-person meetings involving
physicians, genetic counselors, and consultands.
New types of questions help engage students while they read the book and
when they are in the classroom. Questions related to key figures and problems
for group work help support instructor efforts to build students’ critical thinking
and problem-solving skills.
Learn Genetics Concepts and Problem Solving
Activities feature personalized wrong-answer feedback and hints that emulate the office-hour experience to guide student learning. New tutorials include coverage of topics like CRISPR-Cas.
140 Practice Problems offer more opportunities to develop problem-solving skills. These questions appear only in Mastering Genetics and include targeted wrong-answer feedback to guide students to the correct answer.
Access the text anytime, anywhere
with Pearson eText
NEW! Preparing for Problem Solving feature in every chapter identifies specific knowledge and skills students need to answer end of chapter problems.

to form, the negative charge of an oxygen or nitrogen must occur opposite the positive charge of a hydrogen. This occurs when complementary base pairs align in antiparallel strands. If a purine and a pyrimidine were aligned in parallel strands, positively charged hydrogens would be opposite one another, as would negatively charged nitrogens and oxygens. These repelling forces would prevent hydrogen bond formation.
Review Genetic Analysis 7.1 to explore complementary base pairing and the formation of bonds creating single and double strands of DNA.

7.3 DNA Replication Is Semiconservative and Bidirectional
Given the role of DNA as an information repository and an information transmitter, the integrity of the nucleotide sequence of DNA is of paramount importance. Each time DNA is copied, the new version must be a precise duplicate of the original version. The high fidelity of DNA replication is essential to reproduction and to the normal development of biological structures and functions. Without faithful DNA replication, the information of life would become hopelessly garbled by rapidly accumulating mutations that would threaten survival.
Considering the importance of DNA throughout the biological world, it was no surprise to discover that the general mechanism of DNA replication is the same in all organisms. This universal process evolved in the earliest life-forms and has been retained for billions of years. As organisms diverged and became more complex, however, an array of differences did develop among DNA replication proteins and enzymes. Despite the diversification of these specific components of DNA replication, three attributes of DNA replication are shared by all organisms:
1. Each strand of the parental DNA molecule remains intact during replication.
2. Each parental strand serves as a template directing the synthesis of a complementary, antiparallel daughter strand.
NEW! Caption Queries accompany many figures in the book, helping students focus on the illustrations and more fully understand the content. Some questions ask students to solve a problem using information from the figure, some require an explanation, and others ask students to expand on the information or idea in the figure. As an instructor resource, we provide Caption Queries for all book figures as clicker questions for in-class use.

NEW! Collaboration and Discussion Problems have been added to every end of chapter question set to facilitate group work and hands-on problem solving in class.

heterozygous, then there will be a roughly 1:1 ratio of progeny with the dominant phenotype to progeny with the recessive phenotype.
One of Mendel’s test crosses of F1 plants to recessive plants is shown in Figure 2.6. Based on his segregation hypothesis, Mendel predicted that test-cross progeny phenotypes would be 50% dominant and 50% recessive. Figure 2.6 illustrates Mendel’s test cross between an F1 plant producing round seeds (and suspected to have a heterozygous genotype) and a pure-breeding wrinkled-seed plant, known to be homozygous rr. In the test cross, the wrinkled-seed plant, being homozygous rr, produces only r-containing gametes. If the F1 plant is indeed heterozygous, it should produce reproductive cells with R and r genotypes at a frequency of 1/2 each. Consequently, the progeny of the cross should be 1/2 Rr and 1/2 rr, resulting in a 1:1 ratio of round : wrinkled. As the figure indicates, Mendel performed this cross and observed 193 round peas and 192 wrinkled peas, or a 1:1 ratio, in test-cross progeny. Mendel reported test-cross results for five of his traits and observed a 1:1 ratio in each case (Table 2.2). These results verify the prediction that the F1 progeny of pure-breeding crosses are heterozygous. If the F1 were homozygous dominant instead of heterozygous, the test-cross progeny would all have the dominant phenotype instead of the observed 1:1 ratio.

[Figure 2.6 diagram. P: pure-breeding RR × pure-breeding rr (cross-fertilization). F1: heterozygous Rr × pure-breeding rr (test cross of dominant F1 plant to a recessive plant to determine if the F1 is heterozygous; test-cross fertilization). Punnett square: F1 gametes 1/2 R and 1/2 r combine with r gametes to give Rr and rr progeny. If the F1 is heterozygous, the ratio of its gametes will be 1:1. In Mendel’s test-cross experiment, he found 193 round and 192 wrinkled test-cross progeny, a 1.01:1 ratio.]

Figure 2.6 Test-cross analysis of F1 plants. A test cross between an F1 plant and one that is homozygous recessive produces progeny with a 1:1 ratio of the dominant to the recessive phenotype if the F1 plant is heterozygous.
Q If a test-cross experiment identical to the one shown here produces 826 progeny plants, how many plants are expected in each phenotype category?
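The test-cross reasoning in this excerpt (random union of gametes from an Rr parent and an rr parent) can be sketched in Python. This is a minimal illustration under stated assumptions, not the book's procedure: the function names `cross` and `expected_counts` are hypothetical, genotypes are written as two-letter strings, and the 826-progeny figure is taken from the caption query.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Genotype proportions from random union of one gamete from each parent."""
    # Each letter of a genotype string is one equally likely gamete.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def expected_counts(proportions, n_progeny):
    """Scale genotype proportions to an expected number of progeny."""
    return {genotype: p * n_progeny for genotype, p in proportions.items()}


test_cross = cross("Rr", "rr")           # heterozygous F1 x homozygous recessive
print(test_cross)                        # Rr and rr at 1/2 each
print(expected_counts(test_cross, 826))  # 413 expected in each category
print(round(193 / 192, 2))               # Mendel's observed ratio, 1.01
```

Using `Fraction` keeps the 1/2 proportions exact, and the same `cross` function applied to two Rr parents gives the 1:2:1 genotypic ratio expected in the F2.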
CHAPTER 1 The Molecular Basis of Heredity, Variation, and Evolution

The Helix Bridge is a 280-meter pedestrian bridge spanning the marina in downtown Singapore. The bridge design is inspired by the structure of DNA and features two twisting helices with colored lights representing the A-T and G-C base pairs.

ESSENTIAL IDEAS
❚❚ Modern genetics developed during the 20th century and is a prominent discipline of the biological sciences.
❚❚ DNA replication produces exact copies of the original molecule.
❚❚ The “central dogma of biology” describing the relationship between DNA, RNA, and protein is a foundation of molecular biology.
❚❚ Gene expression is a two-step process that first produces an RNA transcript of a gene and then synthesizes an amino acid string by translation of RNA.
❚❚ Inherited variation can be detected by laboratory methods that examine DNA, RNA, and proteins.
❚❚ Evolution is a foundation of modern genetics that occurs through four processes.

Life is astounding, both in the richness of its history and in its diversity. From the single-celled organisms that evolved billions of years ago have descended millions of species of microorganisms, plants, and animals. These species are connected by a shared evolutionary past that is revealed by the study of genetics, the science that explores genome composition and organization and the transmission, expression, variation, and evolution of hereditary characteristics of organisms.
Genetics is a dynamic discipline that finds applications everywhere humans interact with one another and with other organisms. In research laboratories, on farms, in grocery stores, in medical offices, in courtrooms, and in other settings, genetics plays a prominent and expanding role in our lives. Modern genetics is an increasingly genome- and gene-based discipline; that is, it is increasingly focused on the entirety of the hereditary information carried by organisms and on the molecular processes that control and regulate the expression of genes. Despite its increasingly gene-focused emphasis, however, genetics retains a strong interest in traditional areas of inquiry and investigation: heredity, variation, and evolution. The fascinating discipline of genetics explores the basis of life, past and present, and its study will provide you with an exciting and rewarding journey.
In this chapter, we survey the scope of modern genetics and reacquaint you with some basic information about deoxyribonucleic acid (DNA), the carrier of genetic information. We begin with a brief overview of the origins and contemporary range of genetic science. Next we retrace some of the fundamentals of DNA replication, and of transcription and translation (the two main components of gene expression), by reviewing what you learned about these processes in previous biology courses. We also look at some research techniques that are indispensable for studying genetic variation in the laboratory; and we meet the most prominent of the modern-day “-omic” avenues of research and investigation in genetics. The chapter’s final section describes the central position of evolution in genetics and discusses the roles of heredity and variation in evolution.

1.1 Modern Genetics Is in Its Second Century
Humans have been implicitly aware of genetics for more than 10,000 years (Figure 1.1). From the time of the domestication of rice in Asia, maize in Central America, and wheat in the Middle East, humans have recognized that desirable traits found in plants and animals can be reproduced and enhanced in succeeding generations through selective mating. On the other hand, explicit exploration and understanding of the hereditary principles of genetics (what we might think of as the science of modern genetics) is a much more recent development.

The Development of Modern Genetics
In a sense, modern genetics can trace its early roots back to the invention of the compound microscope in the 1590s by a father and son team of Dutch eyeglass makers, Hans and Zacharias Jansen. The genesis of ideas about cells (their origins, structure, contents) was made possible by the Jansens’ invention, and by numerous improvements in microscope technology over the centuries. Collectively, these developments paved the way for theories like the cell theory and the germ plasm theory that are foundational to modern genetics.
In 1665, Robert Hooke first described cells he observed in thin sections of cork. In the 1670s and 1680s, Anton van Leeuwenhoek, often called the father of microbiology, described the abundance of tiny single-celled organisms in pond water and made numerous observations of bacteria. In the 1830s, Matthias Schleiden and Theodor Schwann described cells in plants and in animals, respectively, and are credited with proposing the cell theory that states all life is composed of cells and that cells are the basic building blocks of organisms. Rudolph Virchow expanded and extended the ideas of the cell theory in 1855, declaring that “every cell stems from another cell.” Virchow’s contribution was important for giving the cell theory an evolutionary basis. In 1831, Robert Brown provided the first description of the nucleus of a cell; and after descriptions by others of the contents of the nucleus, including chromosomes, Walther Flemming, Theodor Boveri, and Walter Sutton in the 1880s described chromosome separation during cell division, cementing the importance of the cell theory and giving rise to the germ plasm theory.
It was August Weismann who proposed the germ plasm theory, in 1889, bringing together multiple threads of evidence linking chromosomes and heredity. The germ plasm theory posits that reproductive organs (ovaries and testes, for example) carry full sets of genetic information and that the sperm and egg cells they produce carry the genetic information brought together in fertilization. This was followed by the proposal of Edmund Beecher Wilson in 1895 that DNA, known at the time as “nuclein,” was the hereditary molecule and a component of chromosomes (whose separation during cell division was observed, as noted above, by Flemming, Boveri, and Sutton). Just a few years later, a British physician-scientist named Archibald Garrod identified the first human hereditary condition, an autosomal recessive disorder called alkaptonuria, by examining several generations of British families with the condition.
The ideas embodied in the cell theory, the germ plasm theory, and Wilson’s proposal that DNA was the hereditary molecule took shape against a backdrop of other developments in 19th century biology. The most important of these was Charles Darwin’s theory of evolution by natural selection in 1859. Darwin recognized the importance of heredity in his theory of evolution, but despite his attempts to decipher a mechanism, he was never able to describe how organisms transmitted their hereditary traits. Little did Darwin know that the explanation for hereditary transmission was already available. In 1866, Gregor Mendel published the descriptions and analysis of his experiments on the inheritance of seven traits in pea plants. Although Mendel’s work would lie in obscurity for nearly 35 years, until more than a decade after his death, his experiments and analysis form the foundation of modern genetics.
The Four Phases of Modern Genetics

In 1900, three botanists working independently of one another (Carl Correns in Germany, Hugo de Vries in Holland, and Erich von Tschermak in Austria) reached strikingly similar conclusions about the pattern of transmission of hereditary traits in plants. Each reported that his results mirrored those published in 1866 by an obscure amateur botanist and Augustinian monk named Gregor Mendel. (Mendel’s work is discussed in Chapter 2.) Although Correns, de Vries, and Tschermak had actually rediscovered an explanation of hereditary transmission that Mendel had published 34 years earlier, their announcement of the identification of principles of hereditary transmission gave modern genetics its start.

Biologists immediately began testing, verifying, and expanding on the newly appreciated explanation of heredity. In 1901, during a train ride from Cambridge to London, William Bateson read the publication by Archibald Garrod describing the pattern of occurrence of alkaptonuria and immediately realized that Garrod’s description depicted “exactly the conditions most likely to enable a rare, usually recessive character to show itself.” According to his own retelling, Bateson was converted into a firm believer in Mendelism during that train ride. Garrod, with Bateson’s interpretive assistance, having produced the first documented example of a human hereditary disorder, continued to study alkaptonuria for decades, eventually devising the designation “inborn error of metabolism,” a phrase still used today to describe many recessive genetic conditions.

From that starting point in the first years of the 20th century, modern genetics has moved through four phases that we discuss below and then explore in greater detail as we advance through the book. The first phase was the identification of the cellular and chromosomal basis of heredity. The second phase was the identification of DNA as the hereditary material. Phase three was the description of the informational and regulatory processes of heredity, that is, the encoding of information in genes and the processes of transcription and translation. The current and fourth phase of modern genetics can be described as the genomic era. This phase began in the 1980s with the completion of the first genome sequences, but it reached popular recognition in 2001 when the complete human genome was produced.

Location of the Genetic Material

Flemming, Sutton, and Boveri independently used microscopy to observe chromosome movement during cell division in reproductive cells. They each noted that the patterns of chromosome movement mirrored the transmission of the newly rediscovered Mendelian hereditary units. This finding implied that the hereditary units, or genes, posited by Mendel are located on chromosomes. We now know that genes (the physical units of heredity) are composed of defined DNA sequences that collectively control gene transcription (described later in the chapter) and contain the information to produce RNA molecules, one category of which is called messenger RNA, or mRNA, and is used to produce proteins by translation (described later in the chapter). Chromosomes consist of single long molecules of double-stranded DNA that in plants and animals are bound by many different kinds of protein that give chromosomes their structure and can affect the transcription of genes the chromosomes carry. The chromosomes of sexually reproducing organisms typically occur in pairs known as homologous pairs, or, more simply, as homologs. Each chromosome carries many genes, and homologs carry genes for the same traits in the same order on each member of the pair.

Bacteria and archaea are single-celled organisms that do not have a true nucleus. In almost all cases, species of bacteria and archaea have a single, usually circular chromosome. As a consequence, in the genome of these organisms, there is just one copy of each gene, a condition described as haploid. Bacterial and archaeal chromosomes are bound by a
relatively small amount of protein. Limited amounts of other proteins help localize bacterial chromosomes to a region of the cell known as the nucleoid. Some archaeal species have chromosomes and associated proteins that in appearance resemble those in bacteria, but other species appear to have a more eukaryote-like chromosome organization.

In contrast to bacteria and archaea, the cells of eukaryotes (a classification that includes all single-celled and multicellular plants and animals) contain a true nucleus holding multiple sets of chromosomes. Almost all eukaryotes have haploid and diploid stages in their life cycles. For example, sperm and eggs produced in animals are haploid, having one copy of each chromosome pair in the genome. In the diploid state, the eukaryotic genome contains two copies (a homologous pair) of each gene. (Even in a diploid cell, genes located on eukaryotic sex chromosomes might not be present in two copies, as we see in Chapter 4.) Numerous eukaryotic genomes, particularly those of plants, contain more than two copies of each chromosome, a genome composition known as polyploidy.

In addition to the chromosomes carried in their nuclei (the so-called nuclear chromosomes), plant and animal cells also contain genetic material in specialized organelles called mitochondria (singular: mitochondrion), and plant cells contain a third type of gene-containing organelle called chloroplasts. Many of these organelles are present by the dozens in each cell, and each mitochondrion or chloroplast carries one or more copies of its own chromosome. Mitochondrial and chloroplast genes produce proteins that work with proteins produced by nuclear genes to perform essential functions in cells: mitochondria are essential for the production of adenosine triphosphate (ATP) that is the principal source of cellular energy, and chloroplasts are necessary for photosynthesis. Mitochondria and chloroplasts are transmitted in the cytoplasm during cell division, and the term cytoplasmic inheritance is used to refer to the random distribution of mitochondria and chloroplasts among daughter cells.

Mitochondria and chloroplasts have an evolutionary history, having descended from ancient parasitic bacterial invasion of eukaryotic cells. Since the time of their acquisition by eukaryotes, mitochondria and chloroplasts have evolved an endosymbiotic relationship with their eukaryotic hosts, and the precise genetic content of mitochondria and chloroplasts varies by eukaryotic host species (see Chapter 17).

A complete set of nuclear chromosomes is transmitted during the cell-division process called mitosis, to produce genetically identical daughter cells. In contrast, sexual reproduction to produce offspring occurs by the cell-division process called meiosis, which produces reproductive or sex cells, often identified as gametes: sperm and egg in animals and pollen and egg in plants. The gametes of a diploid species are haploid and contain one chromosome from each of the homologous pairs of chromosomes in the genome. The union of haploid gametes at fertilization produces a diploid fertilized egg that begins mitotic division to produce the zygote.

Predictable patterns of gene transmission during sexual reproduction are a focus of later chapters that discuss hereditary transmission and the analysis of transmission ratios (Chapter 2), cell division and chromosome heredity (Chapter 3), gene action and interaction of genes in producing variation of physical characteristics (Chapter 4), and analysis of genetic linkage between genes (Chapter 5).

Genetic experiments taking place in roughly the first half of the 20th century developed the concept of the gene as the physical unit of heredity and revealed the relationship between phenotype, meaning the observable traits of an organism, and genotype, meaning the genetic constitution of an organism. Biologists also described how hereditary variation is attributable to alternative forms of a gene, called alleles. The alleles of a gene have differences in DNA sequence that alter the product of the gene.

During the early decades of the 20th century, the study of gene transmission was established as a central focus of genetics. The concepts of gene action and gene interaction in producing phenotype variation were described, as was the concept of mapping genes along chromosomes. It was also during this period that evolutionary biologists developed gene-based models of evolution. These, too, are integral to genetic analysis, and their use continues to the present day.

Identifying the Genetic Material

An experiment conducted in 1944 by Oswald Avery, Colin MacLeod, and Maclyn McCarty identified deoxyribonucleic acid (DNA) as the hereditary material and is commonly credited with inaugurating the “molecular era” in genetics (see Chapter 7). This new era, which spanned the second half of the 20th century and continues to the present day, began with an effort to discover the molecular structure of DNA. Molecular genetic research reached a milestone in 1953, when the experimental work of many biologists, including, most famously, James Watson, Francis Crick, Maurice Wilkins, and Rosalind Franklin, led to the identification of the double-helical structure of DNA. A few years later, in 1958, the general mechanism of DNA replication was ascertained. We examine details of this work in Chapter 7.

Describing the Nature and Processing of Genetic Information

By the mid-1960s, the basic mechanisms of DNA transcription and messenger RNA (mRNA) translation were laid out, and the genetic code by which mRNA is translated into proteins was deciphered. This period also saw the first descriptions of mechanisms that regulate transcription in cells of different types or in response to a wide variety of stimuli from outside and inside cells. Chapters 8 and 9 are devoted to discussions of transcription and translation, and Chapters 12 and 13 describe processes that regulate gene expression in bacteria and in eukaryotes.

The Genomics Era

Gene cloning and recombinant DNA technologies developed and progressed rapidly during the 1970s. By the early 1980s, biologists realized that to properly understand the unity and complexity
of life, they would have to study and compare the genomes of species: the complete sets of DNA sequences, including all genes and regions controlling genes. This realization launched the “genomics era” in genetics, which continues to expand rapidly today.

Since the inception of genome sequencing, biologists have deciphered thousands of genomes that range in size from a few tens of thousands of DNA base pairs in the simplest viral genomes to tens of billions of base pairs in the largest plant and animal genomes. Fittingly, in 2001, a century after Garrod and Bateson’s historic identification of alkaptonuria as a human hereditary disease, collaborative scientific groups from around the world published the completed “first draft” of the human genome. Collective efforts like the Human Genome Project and the other genome sequencing projects that have been and will be undertaken promise to provide databases that will make the second century of genetics every bit as remarkable as its first century. Chapters 14, 15, and 16 are primarily devoted to descriptions of the analysis and functions of genomes.

Genetics: Central to Modern Biology

One of the foundations of modern biology is the demonstration that all life on Earth shares a common origin in the form of the “last universal common ancestor,” or LUCA (Figure 1.2). All life is descended from this common ancestor and is most commonly divided into three major domains. These three domains of life are Eukarya, Bacteria, and Archaea.

The three-domain model of life is originally derived from the research of Carl Woese and colleagues in the mid-1970s. In contrast to earlier models, which were based on morphology alone, Woese used molecular sequences to determine phylogenetic relationships between existing organisms and thus to trace the evolution of life. Woese used the sequence of ribosomal RNA (rRNA), a small molecule produced directly from DNA in all organisms, as his basis for comparison. His premise was simple: evolutionary theory predicts that closely related species will have more similarity in their rRNA sequences than will species that are less closely related. Furthermore, species that are members of the same evolutionary lineage will share certain rRNA sequence changes that are not shared with species outside the lineage. Since Woese’s work, many researchers have used other molecules to refine and propose additional details to the three-domain model. The tree of life remains a work in progress, but the three-domain model is well established. We use this model in subsequent chapters to compare and contrast molecular features, activities, and processes that shed additional light on the evolutionary relationships between the three domains.

A second foundation of biology is the recognition that the hereditary material (the molecular substance that conveys and stores genetic information) is deoxyribonucleic acid (DNA) in all organisms. Certain viruses use ribonucleic acid (RNA) as their hereditary material. Most biologists argue that viruses are not alive. Rather, they are obligate intracellular parasites that are noncellular and must invade host cells to reproduce, at the expense of the host cell. In living organisms, DNA has a double-stranded structure described as a DNA double helix, or as a DNA duplex, consisting of two strands joined together in accordance with specific biochemical rules. Certain viral genomes consist of a small single-stranded DNA molecule that replicates to form a DNA duplex in a host cell.

Eukarya, Bacteria, and Archaea share general mechanisms of DNA replication, the process that precisely duplicates the DNA duplex prior to cell division, and they also share general mechanisms of gene expression, the processes through which the genetic information guides development and functioning of an organism. All organisms express their genetic information by a two-step process that begins with transcription, a process in which one strand of DNA is used to direct the synthesis of a single strand of RNA. Transcription produces various forms of RNA, including messenger RNA (mRNA), which in all organisms undergoes translation to produce proteins at structures called ribosomes.

As the biological discipline devoted to the examination of all aspects of heredity and variation, between generations and through evolutionary time, genetics is central to modern biology. Modern genetics has three major branches. Transmission genetics, also known as Mendelian genetics, is the study of the transmission of traits and characteristics in successive generations. Evolutionary genetics studies the origins of and genetic relationships between organisms and examines the evolution of genes and genomes. Molecular genetics studies inheritance and variation in nucleic acids (DNA and RNA), proteins, and genomes and tries to connect them to inherited variation and evolution in organisms.

These branches of genetics are not rigidly differentiated. There is substantial cross-communication among them, and it is rare to find a geneticist today who doesn’t use analytical approaches from all three. Similarly, not only are most biological scientists, to a greater or lesser extent, also geneticists, but in addition many of the methods and techniques of genetic experimentation and analysis are shared by all biological scientists. After all, genetic analysis interprets the common language of life by integrating information from all three branches.

1.2 The Structure of DNA Suggests a Mechanism for Replication

At its core, hereditary transmission is the process of dispersing genetic information from parents to offspring. In sexually reproducing organisms, this process is accomplished by the generation of reproductive sex cells in males (the sperm or pollen) and females (the egg), followed by the union of egg and sperm (animals) or pollen (plants) or spores (yeast) at fertilization, with the subsequent development of an organism. DNA is the hereditary molecule in reproductive cells. Similarly, in somatic (body) cells of plants and animals and in organisms that reproduce by asexual processes, DNA is the hereditary molecule that ensures that successive generations of cells are identical. Clearly, then, discovering the molecular structure of DNA would be the key that opened the door to two fundamental areas of inquiry: (1) how DNA could carry the diverse array of genetic information present in the various genomes of animals and plants; and (2) how the molecule replicated. In this section, we review basic concepts of DNA structure and DNA replication. The molecular details of DNA structure and replication are provided in Chapter 7.

The Discovery of DNA Structure

In the early 1950s, James Watson, an American in his mid-20s who had recently completed a doctoral degree, and Francis Crick, a British biochemist in his mid-30s, began working together at the University of Cambridge, England, to solve the puzzle of DNA structure. Their now-legendary collaboration culminated in a 1953 publication that ignited the molecular era in genetics.

Watson and Crick’s paper accurately described the molecular structure of DNA as a double helix composed of two strands of DNA, with an invariant sugar-phosphate backbone on the outside and nucleotide bases (adenine, thymine, guanine, and cytosine) forming complementary base pairs within the center of the molecule. This discovery was of enormous importance, because with the structure of DNA unveiled, the “gene” had a known physical form and was no longer just a conceptual entity. This physical form of a gene could be examined and sequenced, compared with other genes in the genome, and compared with similar genes in other species.

Watson and Crick’s description of DNA structure was not the product of their work exclusively. In fact, unlike others who made significant contributions to the discovery of DNA structure, Watson and Crick were not actively engaged in laboratory research. Outside of their salaries, they had very little financial support available to conduct research. In lieu of laboratory research, Watson and Crick put their efforts into DNA-model building, basing their interpretations on experimental data gathered by others.

Rosalind Franklin, a biophysicist working in a laboratory with Maurice Wilkins at King’s College in London, was one of the principal sources of information used by Watson and Crick (Figure 1.3). Franklin used an early form of X-ray diffraction imagery to examine the crystal structure of DNA. In Franklin’s method, X-rays bombarding crystalline preparations of DNA were diffracted as they encountered the atoms in the crystals. The pattern of diffracted X-rays was recorded on X-ray film, and the structure of the molecules in the crystal was deduced from that pattern. Franklin’s most famous X-ray diffraction photograph, Photo 51, clearly showed (to the well-trained eye) that DNA is a duplex, consisting of two strands
The nucleotide bases are hydrophobic (water-avoiding) and naturally orient toward the water-free interior of the duplex. The bases can occur in any order along one strand of the molecule, but DNA is most stable as a duplex of two strands that have complementary base sequences, so that an A on one strand faces a T on the second strand and a G on one strand faces a C on the other. This complementary base pairing is the basis of Chargaff’s rule and produces equal percentages of A and T and of C and G in double-stranded DNA molecules. Hydrogen bonds, noncovalent bonds consisting of weak electrostatic attractions, form between complementary base pairs to join the two DNA strands into a double helix. Two hydrogen bonds form in each A-T base pair and three hydrogen bonds form in each G-C base pair. Each strand of DNA has a 5′ end and a 3′ end. These designations refer to the phosphate group (5′) and hydroxyl group (3′) at the opposite ends of each strand of DNA and establish strand polarity, that is, the 5′-to-3′ orientation of each strand. The differences at each end of a strand allow the ends to be readily distinguished from one another. Complementary strands of DNA are antiparallel, meaning that the polarities of the complementary strands run in opposite directions: one strand is oriented 5′ to 3′ and the complementary strand is oriented 3′ to 5′. Genetic Analysis 1.1 guides you through a problem that tests your understanding of base-pair complementation and complementary strand polarity.
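The pairing rules just described can be checked with a short calculation. The Python sketch below is only an illustration: the function names and the example sequence are hypothetical and do not come from the text. It builds the antiparallel partner of a strand, confirms that the duplex contains equal percentages of A and T and of G and C, and counts hydrogen bonds at two per A-T pair and three per G-C pair.

```python
# Watson-Crick pairing rules and hydrogen bonds per base pair, as stated in the text.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # keyed by either base of the pair


def partner_strand(strand_5_to_3):
    """Return the complementary strand, also written 5' to 3' (antiparallel)."""
    return "".join(PAIR[base] for base in reversed(strand_5_to_3))


def duplex_composition(strand_5_to_3):
    """Percentage of each base, counted over both strands of the duplex."""
    both = strand_5_to_3 + partner_strand(strand_5_to_3)
    return {base: 100 * both.count(base) / len(both) for base in "ATGC"}


def hydrogen_bonds(strand_5_to_3):
    """Total number of hydrogen bonds joining the two strands."""
    return sum(H_BONDS[base] for base in strand_5_to_3)


example = "GGGCAGGT"  # hypothetical sequence, chosen only for illustration
composition = duplex_composition(example)
print(partner_strand(example))
print(composition)
print(hydrogen_bonds(example))
```

The single strand in this example is G-rich, yet the percentages computed over the whole duplex come out equal for A and T and for G and C, which is the pattern Chargaff’s rule describes.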
Figure 1.5 James Watson and Francis Crick’s metal-and-wire model of DNA constructed in 1953.
Q Notice that the A-T base pairs and the G-C base pairs in this model are each connected by two wires. If the wires represent hydrogen bonds, what is wrong with the model? (See also Figure 1.6.)

If you are like many biology students, you have probably wondered from time to time what DNA actually looks like, both on the macroscopic and microscopic level. Even today’s best microscopes have difficulty capturing high-resolution images of DNA, although computer-aided techniques for analyzing molecular structure can produce an interpretation of its microscopic appearance, as you’ll see, for example, in Chapters 7, 8, and 9. However, you do not need sophisticated instrumentation to produce a sample of DNA that you can hold in your hand. Experimental Insight 1.1 presents a
simple recipe for DNA isolation you can do at home with common and safe household compounds.

Figure 1.6 DNA composition and structure. DNA nucleotides contain a deoxyribose sugar, a phosphate group, and a nucleotide base (A, T, G, or C). Phosphodiester bonds join adjacent nucleotides in each strand, and hydrogen bonds join complementary nucleotides of strands that have antiparallel orientation.

DNA Replication

The identification of the double-helical structure of DNA established a starting point for a new set of questions about heredity. The first of these questions concerned how DNA replicates. After correctly describing DNA structure in their 1953 paper, Watson and Crick closed with a directive for future research on the question of DNA replication: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

Indeed, as a consequence of the A-T and G-C complementary base-pairing rules, it was evident that each single strand of DNA contains the information necessary to generate the second strand of DNA and that DNA replication generates two identical DNA duplexes from the original parental duplex during each replication cycle. At the time Watson and Crick described the structure of DNA, however, the mechanism of replication was not known. It would take another 5 years for Matthew Meselson and Franklin Stahl, in an ingenious experiment of simple design, to prove that DNA replicates by a semiconservative mechanism (see Chapter 7).

In semiconservative replication, the mechanism by which DNA usually replicates, the two complementary strands of original DNA separate from one another, and each strand acts as a template to direct the synthesis of a new, complementary strand of DNA with antiparallel polarity. The mechanism is termed “semiconservative” because after the completion of DNA replication, each new duplex is composed of one parental strand (conserved from the original DNA) and one newly synthesized daughter strand (Figure 1.7).

DNA replication begins at an origin of replication, with the breaking of hydrogen bonds that hold the strands together. (This process is much like what happens when a zipper comes undone.) DNA polymerases are the enzymes active in DNA replication. Using each parental DNA strand as a template, these enzymes identify the nucleotide that is complementary to the first unpaired nucleotide on the parental strand and then catalyze formation of a phosphodiester bond to join the new nucleotide to the previous nucleotide in the nascent (growing) daughter strand.

The biochemistry of nucleic acids and DNA polymerases dictates that DNA strands elongate only in the 5′-to-3′ direction. In other words, nucleotides are added exclusively to the 3′ end of the nascent strand, leading to 5′-to-3′ growth. Like the parental duplex, each new DNA duplex contains antiparallel strands. Each parental strand–daughter strand combination forms a new double helix of DNA that is an exact replica of the original parental duplex.
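The logic of semiconservative replication can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not a description of the enzymology: the function names and the example duplex are hypothetical, strands are plain strings written 5′ to 3′, and origins of replication and the polymerase machinery are ignored. Each daughter strand is built one nucleotide at a time by appending to its 3′ end while the template is read in the opposite direction.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_daughter(template_5_to_3):
    """Build a daughter strand by adding nucleotides to its 3' end.

    The template is read from its 3' end toward its 5' end, so the
    daughter grows 5' to 3' and is antiparallel to the template.
    """
    daughter = ""
    for base in reversed(template_5_to_3):  # walk the template 3' -> 5'
        daughter += PAIR[base]              # append at the daughter's 3' end
    return daughter


def replicate(duplex):
    """Semiconservative replication of a (strand_1, strand_2) duplex.

    Both strands are written 5' to 3'. Each returned duplex keeps one
    parental strand and pairs it with one newly synthesized strand.
    """
    parental_1, parental_2 = duplex
    new_partner_of_1 = synthesize_daughter(parental_1)
    new_partner_of_2 = synthesize_daughter(parental_2)
    return (parental_1, new_partner_of_1), (new_partner_of_2, parental_2)


# Hypothetical parental duplex, chosen only for illustration.
strand_1 = "ATGGCATTC"
parent = (strand_1, synthesize_daughter(strand_1))
first, second = replicate(parent)
print(first)
print(second)
```

Both returned duplexes have the same sequence as the parental duplex, and each contains exactly one of the two original strands, which is what “semiconservative” means.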
GENETIC ANALYSIS 1.1

PROBLEM Determine the sequence and polarity of the DNA strand complementary to the following strand:

3′-...ACGGATCCTCCCTAGTGCGTAATACG...-5′

BREAK IT DOWN: A DNA sequence is a string of A, G, T, and C nucleotides that is 5′ on one end and 3′ on the other (p. 7).
BREAK IT DOWN: Complementarity of DNA nucleotides pairs A with T and G with C (p. 8).

Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns nucleotide complementarity in a DNA duplex and the polarity of complementary strands. The answer should contain the nucleotide sequence and polarity of a strand complementary to the given one.
2. Identify the critical information given in the problem.
   The problem provides the nucleotide sequence and polarity of one strand of a DNA duplex.
   PITFALL: Always check the polarity of a strand you are given; don’t assume it’s written with either the 5′ or 3′ end facing a certain way.

Deduce
3. Recall the base-pairing relationships of DNA nucleotides in complementary strands.
   In complementary DNA strands, base pairing joins adenine with thymine and guanine with cytosine to form a DNA duplex.
4. Recall the polarity relationship of complementary DNA strands.
   The second strand of this duplex will be oriented with its 5′ end to the left and its 3′ end to the right.
   TIP: Complementary DNA strands are antiparallel, with one strand 3′ → 5′ and the other 5′ → 3′.

Solve
5. Give the sequence and polarity of the complementary DNA strand.
   By the rules of complementary base pairing and antiparallel strand orientation, the second DNA strand is
   5′-TGCCTAGGAGGGATCACGCATTATGC-3′

For more practice, see Problems 12, 15, and 16.
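The answer to Genetic Analysis 1.1 can be reproduced with a short Python sketch. The function names below are hypothetical, and the end labels are passed as plain strings; the sequence is the one given in the problem. The first function pairs each base with its partner in place and swaps the end label, since antiparallel strands have opposite ends at the left.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(sequence, left_end):
    """Return (sequence, left_end) for the complementary strand, written so
    that each base sits directly opposite its partner in the input.

    left_end is "5'" or "3'" and names the end written at the left of
    `sequence`. Because the strands are antiparallel, the complement has
    the opposite end at the left.
    """
    if left_end not in ("5'", "3'"):
        raise ValueError("left_end must be 5' or 3'")
    partner = "".join(PAIR[base] for base in sequence)
    partner_left = "5'" if left_end == "3'" else "3'"
    return partner, partner_left


def as_5_to_3(sequence, left_end):
    """Rewrite a strand so that its 5' end is on the left."""
    return sequence if left_end == "5'" else sequence[::-1]


given = "ACGGATCCTCCCTAGTGCGTAATACG"  # written 3' to 5' in the problem
partner, partner_left = complementary_strand(given, "3'")
print(partner_left, partner)
```

Tracking the end label explicitly guards against the pitfall noted in the problem: a strand written 3′ to 5′ needs no reversal to line up with its partner, whereas a strand written 5′ to 3′ yields a partner that reads 3′ to 5′ from the left and must be reversed (for example with `as_5_to_3`) before it is reported in the conventional orientation.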
4. Add 1 tablespoon meat tenderizer to mixture, stir gently, and let stand at least 10 to 15 minutes (longer is fine). The papain will digest much of the protein released by the ruptured cells and also the proteins attached to DNA.
5. Place 2 to 3 layers cheesecloth loosely over the opening of the glass container, allowing the cloth to form a small “bowl” inside the opening. Use the rubber band to hold the cheesecloth in place. Pour the slurry mixture through the cheesecloth, scooping out the onion or strawberry debris as it fills the cheesecloth bowl. Approximately 8 to 12 ounces of “juice” will collect at the bottom of the container. Discard the cheesecloth and its contents.
6. Pour the alcohol into the juice and stir very briefly. Let the juice mixture stand for at least 5 to 10 minutes. As the juice settles, the alcohol rises to the top, and the large mass of floating cottony material in it is DNA.
7. When the alcohol has completely separated from the juice, you can “spool” the DNA onto a chopstick by slowly twirling the stick in the cottony DNA.
Figure 1.7 Semiconservative DNA replication. Each parental DNA strand serves as the template for synthesis of its daughter strand. DNA polymerase synthesizes daughter strands one nucleotide at a time.

1.3 DNA Transcription and Messenger RNA Translation Express Genes

The central dogma of biology is a statement describing the flow of hereditary information. It summarizes the critical relationships between DNA, RNA, and protein; the functional role that DNA plays in maintaining, directing, and regulating the expression of genetic information; and the roles played by RNA and proteins in gene function. Francis Crick proposed the original version of the central dogma, shown in Figure 1.8a, in 1956 to encapsulate the role DNA plays in directing transcription of RNA and, in turn, the role messenger RNA plays in translation of proteins. As Crick told the story years later, he wrote this concept as “DNA → RNA → protein” (spoken as “DNA to RNA to protein”) on a slip of paper and taped it to the wall above his desk to remind himself of the direction of information transfer during the expression of genetic information. The most important idea it conveys is that DNA does not code directly for protein. Rather, DNA makes up the genome of an organism and is a permanent repository of genetic information in each cell, directing gene expression by the transcription of DNA to RNA and, ultimately, the production of proteins.

Over the decades since Crick first introduced the central dogma, biologists have developed a clear understanding of the role of DNA in maintaining and expressing genetic information. Most of the details of the two-stage process by which genetic information in sequences of DNA is transcribed to RNA and then translated to protein are known, as described in later chapters (transcription in Chapter 8 and translation in Chapter 9). For example, biologists now know that several forms of RNA are found in cells, and all these RNA molecules are transcribed from DNA and play a variety of roles in cells, but only mRNA is translated.

Two important categories of RNA that are not translated but nonetheless play critical roles in translation are ribosomal RNA and transfer RNA. Ribosomal RNA (rRNA) forms part of the ribosomes, the plentiful cellular structures where protein assembly takes place. Transfer RNA (tRNA)
12 CHAPTER 1 The Molecular Basis of Heredity, Variation, and Evolution
carries amino acids, the building blocks of proteins, to ribosomes. An updated central dogma of biology is shown in Figure 1.8b. In addition to mRNA, rRNA, and tRNA, the figure identifies reverse transcription, a form of information flow in which an enzyme called reverse transcriptase synthesizes DNA from an RNA template that comes from RNA-containing viruses (retroviruses). The figure also identifies micro-RNA (miRNA), the focus of a rapidly emerging new area of RNA investigation that studies the role of these small RNA molecules in the regulation of gene expression in plants and animals (see Chapter 13).

Transcription

Transcription is the process by which information in a DNA sequence is converted into an RNA sequence. Transcription uses one strand of the DNA making up a gene to direct synthesis of a single-stranded RNA transcript. The DNA strand from which the transcript is synthesized is called the template strand. The RNA-synthesizing enzyme RNA polymerase pairs template-strand nucleotides with complementary RNA nucleotides to synthesize new transcript in the 5′-to-3′ direction; the transcript is antiparallel to the DNA template strand (Figure 1.9).

The complementary partner of the DNA template strand is known as the coding strand. In the past, the coding strand has also been identified as the “nontemplate strand,” but that term is rarely used anymore. Because the coding strand is both complementary and antiparallel to the DNA template strand, it has the same 5′ → 3′ polarity as the RNA transcript synthesized from the template strand; moreover, the RNA transcript and the DNA coding strand are identical in nucleotide sequence, except for the appearance of U in the place of T (see discussion below). Our descriptions in this book will refer to this DNA strand as the “coding strand,” but it is also correct to identify the strand as the nontemplate strand.

RNA is composed of four nucleotides that are chemically very similar to DNA. RNA nucleotides consist of a ribose sugar (as opposed to deoxyribose found in DNA), a phosphate group, and one of four nitrogenous bases. Three of the RNA nucleotide bases are adenine, cytosine, and guanine. They are identical to the same nucleotide bases found in DNA. The fourth RNA base is uracil (U). It is chemically closely related to thymine; thus, in DNA–RNA and in RNA–RNA complementary base pairing, uracil pairs with adenine. All other complementary base-pair arrangements are as we described them previously.

[Figure 1.9 diagram labels: direction of transcription, RNA polymerase, DNA coding strand (nontemplate strand) 5′ → 3′, DNA template strand 3′ → 5′]
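The relationship between the template strand, the coding strand, and the transcript described above can be sketched in a few lines of Python. This is only an illustration: the function names and the six-base-pair duplex are hypothetical, and the sketch assumes the template strand is written 3′ → 5′ from left to right so that the transcript comes out 5′ → 3′.

```python
# Base-pairing partners used when an RNA strand is built on a DNA template:
# A pairs with U (not T) in the transcript, as described in the text.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the transcript (5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_to_rna(coding_5_to_3: str) -> str:
    """Return the transcript by putting U in place of T in the coding strand."""
    return coding_5_to_3.upper().replace("T", "U")


# Hypothetical duplex, invented for illustration only.
template = "TACCAC"  # written 3'->5'
coding = "ATGGTG"    # written 5'->3', complementary to the template

print(transcribe(template))    # AUGGUG
print(coding_to_rna(coding))   # AUGGUG
```

Both routes give the same transcript, which is the point of the coding-strand convention: the transcript can be read directly off the coding strand by substituting U for T.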
polymerase to a nearby gene. Promoters themselves are regulatory sequences and are not transcribed. Instead, the transcription of a gene begins near the promoter at the start of transcription, the DNA location where transcription of a sequence begins. Transcription ends at the termination sequence, where another DNA sequence facilitates the cessation of transcription (Figure 1.10a). In bacteria and archaea, protein-producing genes are transcribed into mRNA that is quickly translated to produce the protein.

Eukaryotic genes have a different structure than do bacterial and most archaeal genes. Nearly all eukaryotic genes are subdivided into exons, which contain the coding information that will be used during translation, and introns, which intervene between exons and are removed from the transcript before translation (Figure 1.10b). Bacterial genes do not contain introns, and only a tiny number of archaeal genes are suspected to contain introns. The removal of introns from eukaryotic mRNA and other modifications before translation occur in the nucleus (see Chapter 8).

Translation

Translation converts the genetic message of mRNA into sequences of amino acids using the genetic code. The amino acids are joined to one another by a covalent bond called a peptide bond. The resulting string of amino acids is a polypeptide, which upon folding makes up all or part of a protein.

Translation of mRNA occurs at ribosomes, where sets of three consecutive nucleotides in the mRNA, each set called a codon, specify the amino acid at each position of a polypeptide. Each mRNA codon is a triplet of RNA nucleotides coded by three complementary DNA nucleotides on the template strand. The DNA nucleotides complementary to codon nucleotides are known as the DNA triplet (Figure 1.11a). Translation begins with mRNA attaching to a ribosome in a manner that places the start codon, the codon specifying the first amino acid of a polypeptide, in the necessary location (Figure 1.11b). The start codon is most commonly 5′-AUG-3′ and is the codon at which translation begins. The ribosome reads the start codon and then each subsequent codon as it moves 5′ → 3′ along the mRNA to assemble the amino acid string.

Amino acids are transported to ribosomes by transfer RNAs (tRNAs). At each codon, complementary base pairing occurs between codon nucleotides and a three-nucleotide sequence of tRNA called an anticodon. This interaction assembles amino acids in the order dictated by the mRNA sequence. The ribosome advances continuously along the mRNA and catalyzes peptide bond formation in the growing polypeptide chain. Translation continues until the ribosome encounters a stop codon, which brings translation to a halt.

The genetic code, through which mRNA codons specify amino acids, was deciphered by a series of experiments that took place during the early 1960s. The experiments revealed that the genetic code contains 64 codons; every codon consists of three positions that are each filled by one of the four RNA nucleotides. An mRNA codon is read in the 5′-to-3′ direction: The first base of the codon is at its 5′ end, the third base is at its 3′ end, and the second base is in the middle. A total of 61 of the 64 codons specify amino acids, and the other 3 are the stop codons. The 64 codons and their amino acids are displayed in Table A (inside the book front cover) using the three-letter and one-letter abbreviations for the amino acids. Table B (also inside the book front cover) lists the names and abbreviations of each amino acid, along with their codons. The genetic code is redundant, with individual amino acids encoded by as many as six codons and as few as one codon.

Genetic Analysis 1.2 allows you to work through the transcription and translation of the DNA sequence assessed in Genetic Analysis 1.1.
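The codon counts quoted above (64 codons, 61 specifying amino acids, 3 stop codons, and between one and six codons per amino acid) can be checked with a short Python sketch. The compact string encoding of the standard genetic code and the variable names are choices made here for illustration; they are not taken from Table A.

```python
from collections import Counter
from itertools import product

# Standard genetic code in one-letter amino acid codes ("*" marks a stop),
# listed with each codon position cycling through U, C, A, G in that order.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = {c: aa for c, aa in CODON_TABLE.items() if aa != "*"}
codons_per_amino_acid = Counter(sense_codons.values())

print(len(CODON_TABLE))                     # 64 (4 bases ** 3 positions)
print(len(sense_codons), stop_codons)       # 61 ['UAA', 'UAG', 'UGA']
print(max(codons_per_amino_acid.values()))  # 6
print(min(codons_per_amino_acid.values()))  # 1
```

The 64 comes from four possible bases at each of three positions; the maximum and minimum of the per-amino-acid counts illustrate the redundancy of the code.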
GENETIC ANALYSIS 1.2

PROBLEM The DNA duplex identified in Genetic Analysis 1.1 is

3′-...ACGGATCCTCCCTAGTGCGTAATACG...-5′
5′-...TGCCTAGGAGGGATCACGCATTATGC...-3′

One strand of the double-stranded DNA sequence serves as the coding strand and the other as the template strand that is transcribed to produce an mRNA. The mRNA is translated into a polypeptide containing five amino acids, the first of which is methionine (Met), encoded by the start codon AUG. The mRNA also contains a stop codon.

a. Identify the DNA coding strand and the nucleotides corresponding to the start codon, amino acid codons, and the stop codon.
b. Write the sequence and polarity of the mRNA transcript, showing the codons for the five amino acids and the stop codon.
c. Write the amino acid sequence of the polypeptide produced, using both the three-letter and one-letter codes for the sequence. (See the genetic code tables inside the front cover.)

BREAK IT DOWN: The coding strand has the same 5′ → 3′ polarity as the mRNA and also the same base sequence except for the presence of uracil (U) instead of thymine (T) (p. 12).
BREAK IT DOWN: Translation uses mRNA codons (three consecutive mRNA nucleotides) to direct the assembly of polypeptides (strings of amino acids) (p. 13).
BREAK IT DOWN: The start codon is AUG, and it is followed by four more codons and then a stop codon (p. 13).
BREAK IT DOWN: Messenger RNA codons are written and translated 5′ to 3′ using the genetic code, which contains three stop codons, UAA, UAG, and UGA (inside front cover).

Evaluate

1. Identify the topic this problem addresses and the nature of the required answer.
   The problem concerns identification of the coding strand of DNA and the sequence of mRNA encoding five amino acids in a polypeptide and the stop codon. The amino acid sequence is also required.

2. Identify the critical information given in the problem.
   The double-stranded DNA sequence is given. It contains a sequence corresponding to the start codon (AUG), encodes five amino acids, and contains a stop codon.

Deduce

3. Scan the double-stranded DNA sequence to identify possible DNA coding-strand triplets that might be a start codon.
   The double-stranded DNA sequence contains two possible triplets corresponding to start codons (5′-ATG-3′), one on each strand. Each is marked here in brackets; the upper-strand triplet reads 5′-ATG-3′ from right to left:
   3′-ACGGATCCTCCCTAGTGC[GTA]ATACG-5′
   5′-TGCCTAGGAGGGATCACGCATT[ATG]C-3′
   PITFALL: Don’t simply read left to right. Instead, identify strand polarity and read 5′ → 3′.
   TIP: The start codon in mRNA is 5′-AUG-3′ (methionine), corresponding to the DNA coding-strand triplet 5′-ATG-3′.

4. Scan the double-stranded DNA to identify possible DNA coding-strand triplets corresponding to possible stop codons.
   Four DNA triplets potentially correspond to a stop codon. Each is marked here in brackets; the upper-strand triplets read 5′ → 3′ from right to left:
   3′-ACG[GAT]CCTCCCT[AGT]GCGT[AAT]ACG-5′
   5′-TGCC[TAG]GAGGGATCACGCATTATGC-3′
   TIP: There are three stop codons, UAA, UAG, and UGA, corresponding to DNA coding-strand triplets TAA, TAG, and TGA, respectively.

Solve

Answer a
5. Determine which 5′-ATG-3′ DNA triplet is followed by four additional codons (12 nucleotides) encoding amino acids and then by a stop codon and therefore corresponds to the authentic start codon.
   The potential start codon in the upper strand (5′-ATG-3′) corresponds to the authentic start codon (AUG), so the upper strand is the coding strand. The following 12 nucleotides correspond to the four remaining amino acid codons, and they are followed by the stop codon (5′-TAG-3′, which corresponds to the UAG stop codon of mRNA).
   TIP: The total length of this region would be 18 nucleotides.

Answer b
6. Determine the mRNA sequence and polarity, showing the codons.
   The mRNA sequence is
   5′-AUG CGU GAU CCC UCC UAG-3′
   (AUG is the start codon; UAG is the stop codon.)

Answer c
7. Determine the amino acid sequence of the polypeptide encoded by this mRNA.
   The polypeptide encoded by this mRNA is Met-Arg-Asp-Pro-Ser, or M-R-D-P-S.

For more practice, see Problems 19, 20, and 29.
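The search carried out by hand in Genetic Analysis 1.2 can also be expressed as a short Python sketch. It rewrites each strand 5′ → 3′, converts it to the corresponding mRNA, and looks for AUG followed by four amino acid codons and a stop codon. The helper names (`reverse_complement`, `find_orf`) are hypothetical, and the codon table uses a compact string encoding of the standard genetic code chosen here for illustration.

```python
from itertools import product

# Standard genetic code, one-letter amino acid codes, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5'->3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orf(coding_5_to_3: str, n_amino_acids: int):
    """Find AUG + (n_amino_acids - 1) sense codons + a stop codon.

    Returns (codons, one-letter peptide) for the first match, or None.
    """
    mrna = coding_5_to_3.replace("T", "U")
    span = 3 * (n_amino_acids + 1)  # amino acid codons plus the stop codon
    for start in range(len(mrna) - span + 1):
        codons = [mrna[i:i + 3] for i in range(start, start + span, 3)]
        residues = [CODON_TABLE[c] for c in codons]
        if (codons[0] == "AUG" and "*" not in residues[:-1]
                and residues[-1] == "*"):
            return codons, "".join(residues[:-1])
    return None


lower = "TGCCTAGGAGGGATCACGCATTATGC"  # lower strand as printed, 5'->3'
upper = reverse_complement(lower)     # upper strand rewritten 5'->3'

print(find_orf(upper, 5))
# (['AUG', 'CGU', 'GAU', 'CCC', 'UCC', 'UAG'], 'MRDPS')
print(find_orf(lower, 5))
# None
```

Only the upper strand yields a start codon followed by four amino acid codons and a stop codon, which agrees with Answers a through c.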
[Figure 1.11 diagram labels: start codon, direction of translation, stop codon]
Figure 1.11 Overview of transcription and translation. (a) Messenger RNA codons are complementary and antiparallel to DNA triplets of the template strand. (b) Ribosomes initiate translation of mRNA at the start codon and move along the mRNA in the 3′ direction, adding each new amino acid to the nascent polypeptide by reading each codon. Transfer RNA molecules carry amino acids to ribosomes, where the tRNA anticodon sequences interact with codon sequences of mRNA. Translation terminates when the ribosome encounters a stop codon.

1.4 Genetic Variation Can Be Detected by Examining DNA, RNA, and Proteins

Many experimental techniques are used to identify variation in DNA, RNA, and proteins. A few of these are described in later chapters when knowing the details of a technique is necessary for understanding the analysis of experimental results. But one technical approach to the assessment of nucleic acid and protein variation, gel electrophoresis, forms the basis for several other techniques and is worth presenting in advance.

gel. After biological samples are loaded into the wells, an electrical current is applied, and the samples migrate through the gel.

Most proteins, as well as DNA and RNA, are negatively charged at physiological pH (about 7.0). As a result, during an electrophoresis run, the molecules in a lane move toward the positively charged end of the gel at a rate determined by one or more distinguishing characteristics of the molecules. These molecular characteristics are (1) the molecular weight, related to the number of nucleotides or amino acids that make up the molecule; (2) the molecular charge, meaning the degree of negative charge the molecule carries; and (3) the molecular shape, or molecular conformation. The movement of protein in electrophoresis is usually influenced by all three of these molecular parameters. The movement of DNA or RNA is often a matter of molecular weight alone (i.e., how many nucleotides the molecules contain), particularly if all the nucleic acid molecules in the samples are linear.

After a sufficient period is allowed for migration, the electrical current is turned off and the results of molecular separation can be observed. The final position of a particular molecule of protein, DNA, or RNA is identified as the electrophoretic mobility of the molecule. The electrophoretic mobilities of the experimental molecules in a gel can be compared with one another, compared between gels, and compared with molecular weight or size marker standards (molecules with known electrophoretic mobilities) to ascertain information about variation.
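Comparison with size marker standards can be illustrated with a small Python sketch. It assumes, as a common laboratory approximation that is not stated in the text, that the migration distance of linear DNA is roughly linear in the logarithm of fragment size between neighboring markers. The function name and the migration distances are hypothetical; only the marker sizes (3000, 1000, and 500 bp) echo the labels in Figure 1.14.

```python
import math


def estimate_size_bp(distance_mm, markers):
    """Estimate fragment size by interpolating log10(size) against distance.

    markers: (distance_mm, size_bp) pairs for the size standards. The
    log-linear relationship between neighboring markers is an assumption.
    """
    points = sorted(markers)  # shortest migration (largest fragment) first
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            fraction = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + fraction * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the range spanned by the markers")


# Hypothetical marker lane: the distances are invented for illustration.
markers = [(10.0, 3000), (25.0, 1000), (35.0, 500)]

print(round(estimate_size_bp(25.0, markers)))  # 1000
print(round(estimate_size_bp(30.0, markers)))  # 707
```

A band that stops midway between the 1000 bp and 500 bp markers is estimated at about 707 bp rather than 750 bp, because the interpolation is done on the logarithm of size.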
[Figure 1.12 diagram: (1) Pour agarose gel into plastic casting tray. (2) Allow gel to solidify. (3) Remove comb; wells are left in the gel. Labels: comb, buffered solution, negative and positive electrodes. The well is the origin of migration; samples migrate through the gel toward the positive charge.]
Figure 1.12 Gel electrophoresis, an essential laboratory technique in biological science research.
The first use of gel electrophoresis was in 1949, when Linus Pauling used it to determine that inherited variation of the red blood cell protein hemoglobin was responsible for the hereditary anemia known as sickle cell disease (SCD). The hemoglobin protein is composed of two different globin molecules, and one of these globins, called β-globin, is inherited in a variant form to produce SCD. The wild-type β-globin protein is designated β^A and the mutant β-globin protein is designated β^S. People in Pauling’s study had one of three genotypes. Those who were β^Sβ^S had SCD, and those who were either β^Aβ^A or β^Aβ^S did not have the disease. Pauling sought to distinguish these three hemoglobin genotypes from one another by detecting the different type or types of β-globin protein each contained. Pauling’s electrophoretic analysis revealed that the protein band seen in the β^Sβ^S lane of Figure 1.13 had lower electrophoretic mobility (smaller distance migrated from the origin) than the protein band detected in the β^Aβ^A lane. A single band is detected in each of these lanes, suggesting that all the protein in the lane is identical. In contrast, when an electrophoresis lane contained protein from a β^Aβ^S individual, the protein in that lane separated into two bands, each corresponding to the electrophoretic mobility of a different one of the protein bands in the other lanes.

Stains, Blots, and Probes

In Pauling’s electrophoretic analysis of hemoglobin, the protein under study had already been isolated from other substances in his samples, so the staining revealed either one or two “bands” in the gel, each consisting of a protein with a distinct electrophoretic mobility, and nothing else. Typically, however, gel electrophoresis of proteins, DNA, or RNA contains many different molecules that can be stained to make their positions known for analysis. The bands can be stained in such a way that all separated substances are visualized, or they can be stained in such a way that only a specific protein or a specific sequence of DNA or RNA will show up. General stains or dyes are those that label all of the different proteins or all the nucleic acid bands in a gel. Specific labels, on the other hand, bind to just a single kind of protein or a particular nucleic acid sequence.

When an investigator wants to see all of the molecules present in a DNA or RNA electrophoretic gel, a general labeling compound called ethidium bromide (EtBr) can be used as a chemical tag. EtBr attaches to all DNA or RNA in a gel by inserting (intercalating) between the stacked bases of the nucleic acid. The exposure of gels containing EtBr-stained nucleic acids to ultraviolet light excites the EtBr and causes it to emit
[Figure 1.14 gel images: (a) lanes 1 through 8, with size markers labeled 3000 bp, 1000 bp, and 500 bp; (b) lanes 1 through 4]
Figure 1.14 Visualization of nucleic acids and proteins in electrophoresis gels. (a) The nucleic acids DNA and RNA are visualized using the compound ethidium bromide (EtBr) that binds to nucleic acid molecules and emits fluorescent light when excited by ultraviolet light. Molecular weight size markers in lanes 1 and 8 (bp = base pairs) aid in determining the size of molecules in the different bands in the experimental lanes 2 through 7. (b) General protein stains bind to proteins in electrophoresis gels to reveal bands. Protein standards in lane 1 (kDa = kilodaltons) aid in determining the sizes of proteins in experimental lanes 2 through 4.
specific string of letters in response to a “find” command, biologists use molecular probes to locate target nucleic acid sequences or target proteins dispersed by electrophoresis.

DNA Sequencing and Genomics

Genomics is the field that focuses on the sequencing, interpretation, and comparison of genomes of different organisms. Genomic data collection and analysis involve an array of molecular techniques and analytical strategies that aid in identification and examination of the totality of the DNA in a cell, nucleus, or organelle (mitochondria and chloroplasts) carried by a species. Indeed, genomics has made critical contributions to many areas of biological investigation. From medicine to the study of hereditary variation to the study of evolution, genomic data are proving critically important.

Much has changed in DNA sequencing since it began in the 1970s. Genome sequencing is accomplished today by automated high-throughput methods, so-called next-generation sequencing that is thousands of times faster, and far cheaper, than the original genome sequencing methods (see Chapters 7 and 18 for details and applications).

To date, thousands of genome sequences have been compiled. Among the smallest genomes are those of viruses, mitochondria, and chloroplasts, which generally contain tens of thousands to a few hundred thousand base pairs. In contrast, the largest sequenced genomes are those of some plant species that carry multiple sets of chromosomes from their progenitors and have billions of base pairs. Genome sizes are usually reported in megabases (Mb), with 1 Mb equal to 1 million base pairs.

Certain selected species known as “model organisms” are commonly used in genetics and genomics experiments. They are selected because their biology is well known, they are easy to work with and propagate, and they can be investigated through multiple experiments and thus be seen from a more complete perspective. A reference table inside the book back cover provides genomic and other critical information about nine model organisms, including the bacterium E. coli, the small flowering plant Arabidopsis thaliana, the yeast Saccharomyces cerevisiae, the fruit fly Drosophila melanogaster, and humans (Homo sapiens).

Genomics has a seemingly limitless array of applications. For example, genomic techniques and analyses can be used to identify specific genes, to identify allelic variants producing hereditary diseases, to map genes, to identify regions of genomes that increase or decrease the likelihood of an organism expressing a particular trait, to compare gene sequences within and among species, to trace the evolution of genes, and to identify the evolutionary relationships between related organisms.

The Human Genome Project, completed in 2003, was a landmark achievement that, by producing the nucleotide sequence of an entire representative human genome, set a new course for the genetic investigation of humans. In so doing, it made some striking discoveries. For example, 45% of the human genome consists of transposable genetic elements. These are mobile DNA sequences that can move throughout the genome (see Section 11.7). It also showed that almost 26% of the genome consists of noncoding introns, and only 1.5% of the genome consists of protein-coding exons. Section 16.1 provides additional details of the content and genetic annotation of the human genome.

Genome sequencing and analysis are not limited to living species. Several extinct species have recently had their genomes sequenced for comparison with those of living relatives. These species include the mastodon (for comparison to the elephant), the quagga (for comparison to the zebra), and two extinct lineages of early humans, Neandertals and Denisovans (for comparison to the modern human genome). We look at the interesting results of the Neandertal–Denisovan–Homo sapiens genome comparisons in the Case Study that concludes the chapter.

Proteomics and Other “-omic” Analyses

On the heels of genomic sequencing, additional arenas of “-omic” investigations and analyses have developed.

Proteomics, the study of the proteome, the complete set of proteins encoded in a genome, examines the functions of proteins, their localization, their regulation, and their interactions in a comprehensive way. In other words, rather than analyzing the structure and function of individual proteins and looking one by one for interacting partners, proteomics is a methodology for examining large numbers of proteins at once. Multiple techniques are used to collect and analyze the proteomes of organisms. Among the numerous applications for proteomics is the use of proteomic analyses to decipher complex networks of protein–protein interaction in cells to find the number and types of such interactions there (see Section 11.1).

Transcriptomics, the study of the transcriptome, the complete set of genes that undergo transcription in a given cell, allows researchers to investigate and compare different cell types to identify differences in the genes that are transcribed there, to characterize changes in the levels of gene transcription within a single cell type, or to see how biological changes affect transcription. Such studies can make important contributions to the understanding of biological abnormalities in cancer by identifying the genes whose transcription is either increased or decreased in cancer cells versus normal cells. Along the same lines, metabolomics, the study of chemical processes involving metabolites, examines metabolic processes and outcomes in specific cells, tissues, organs, and organisms. Metabolomic comparisons of related organisms tie directly to genomics through shared genetic ancestry. Metabolomics can also reveal new genetic adaptations that have altered metabolism in organisms.

Each of these “-omic” approaches has its own goals, but collectively they also share a common goal: to contribute to the comprehensive understanding of complex biological systems. Systems biology, a comprehensive, systems-oriented approach to understanding biological complexity, has become possible through the development and integration of genomics, proteomics, transcriptomics, and metabolomics.
One overarching goal of the biological sciences, to which genetics is a principal contributing discipline, is to achieve an all-inclusive understanding of the normal and abnormal biology of organisms through systems biology.

Applied to humans, for example, systems biology aims to understand how cells work in health and disease, to explain the details of how a single cell develops into a complete organism, and even to explain phenomena as complex as learning, memory, personality, and the development of personality disorders. These enormously complex attributes of organisms result in part from networks of interactions between genes, proteins, metabolites, and environmental influences. They are the most challenging objects of study in modern biology, requiring both the understanding of genetic principles and analysis and the use and application of new tools and technologies for data collection and assessment. This is the exciting and dynamic world in which modern genetics operates.

1.5 Evolution Has a Genetic Basis

As biologists survey varieties of life, assess the genetic similarities and differences between species, and explore the relationships of modern organisms to one another and to their extinct ancestors, it becomes apparent that all life is connected through DNA. Richard Dawkins, a biologist and author of several books on evolution, made note of this molecular connection, observing that life “is a river of DNA, flowing and branching through geologic time.” This shared DNA connecting all organisms throughout time is a basis for identifying and studying relationships between organisms and tracing their evolutionary histories.

Life is not static or uniform, of course; it evolves as DNA diverges into separate “branches” whose metaphorical forking leads to new species. The Dawkins quote suggests that for heredity to maintain genetic continuity across generations and for variation to develop between organisms and evolve new species, the biochemical processes that replicate DNA and express the genetic information must also be universal. From this perspective the universality of DNA as the hereditary molecule of life, the shared processes of DNA replication and transcription, and the use of the same genetic code by all life are consistent with the idea of a single origin of life that has evolved into the millions of species inhabiting Earth today as well as other millions that preceded them but are now extinct.

Life on Earth originated from a single source during the Archaean Eon that lasted from 4 billion to 2.5 billion years ago. In 2011, an international group of scientists led by David Wacey discovered fossils of a sulphur-metabolizing single-celled organism in 3.49-billion-year-old rocks from Western Australia (Figure 1.15). At that time in Earth’s history there was very little oxygen present, and the first living organisms, likely not much different from those identified in fossil form, metabolized sulphur-containing compounds for growth. Organisms with similar metabolism exist today around hot springs and thermal vents.

Figure 1.15 Ancient fossilized single-celled organisms. These single-celled sulphur-metabolizing organisms are fossilized in 3.49-billion-year-old rocks collected in Western Australia.

These early life-forms have given rise to a dazzling array of species, most now extinct. Some of those extinct ancestors, however, gave rise to the modern species that inhabit every conceivable ecological niche on Earth, from the most temperate to the most extreme.

Darwin’s Theory of Evolution

Over the millennia since life originated, untold millions of species have come and gone, through the operation of shared processes that faithfully replicated their DNA and passed it on to the next generation while also allowing for the accumulation of variation that drives diversification. This variation, the changes life has undergone, is explained by the theory of evolution, which says that all organisms are related by common ancestry and have diversified over time. The four widely recognized evolutionary processes are described below, but first some general comments on Charles Darwin’s theory of evolution by natural selection.

This view of evolution was proposed separately and independently by both Darwin and Alfred Wallace in the late 1850s. Both authors based their proposals on firsthand observations of the distribution and diversity of life across the globe. Each author described higher rates of survival and reproduction of certain forms of a species over alternative forms through the process of natural selection that favors the survival and reproduction of the most fit individuals in each generation. Unlike the other processes we describe in this overview of evolution, natural selection works at the phenotypic level, but like all evolutionary processes, its effectiveness is based on underlying genetic variation. Natural selection operating to favor one morphological form over others increases the frequency of the favored form in the population and, by doing so, increases the frequencies of the alleles controlling the favored form. Over many generations, forms that produce more offspring also leave more copies of the alleles that control the
phenotype, creating the hallmark of evolutionary change: change in the genetic makeup of the population.

Darwin’s theory of evolution by natural selection is now a firmly established scientific fact incorporating three principles of population genetics that were obvious to many naturalists in Darwin’s day but were not assembled into a coherent model until Darwin articulated their connection in his 1859 publication On the Origin of Species by Means of Natural Selection. Darwin’s union of observation and principles into an evolutionary theory had a revolutionary effect on biology and laid the foundation of the modern biological sciences. Darwin’s principles of populations are

1. Variation exists among the individual members of populations with regard to the expression of traits.
2. Hereditary transmission allows the variation in traits to be passed from one generation to the next.
3. Certain variant forms of traits give the individuals that carry them a higher rate of survival and reproduction in particular environmental conditions. These organisms leave more offspring and increase the frequency of the variant form in the population.

Yet although Darwin laid out the general process by which species evolved, he never understood the underlying hereditary mechanisms that allowed the process to occur. Today, however, nearly 160 years after Darwin introduced his revolutionary proposal, biologists fully understand the role of genetics in evolution. With regard to Darwin’s evolutionary principles, biology has established that

1. Phenotypic variation of expressed traits reflects inherited genetic variation. DNA-sequence differences (allelic variation) must be the cause of phenotypic variation if evolution is to occur.
2. Hereditary transmission of phenotypic variation requires that offspring inherit and express the alleles that were responsible for the variation in parental organisms.
3. Organisms carrying alleles that are favored by natural selection have a reproductive advantage over organisms that do not carry favored alleles. The former group therefore leave more copies of their alleles in the next

selective advantage nor a selective disadvantage to their bearer, yet their evolutionary basis is fundamentally the same as that of adaptive evolution, as the following paragraphs attest.

Four Evolutionary Processes

The foundations of evolutionary genetics (which, you will recall, studies and compares genetic changes in populations and species over time) were established in the first four decades of the 20th century by several notable evolutionary biologists and innumerable lesser-known individuals. Interestingly, this work took place before DNA was identified as the hereditary material and before the chemical structure of genes was defined and understood. Ronald Fisher, Sewall Wright, J. B. S. Haldane, and many others devised mathematical and statistical models of gene frequency distribution and evolution in populations and species, leading to evolutionary hypotheses that have been tested and verified countless times in laboratory and natural populations.

Through this massive body of work, evolutionary biology has confirmed Darwin’s model of the evolution of species by natural selection and expanded the description of evolution to include three additional processes. Thus, biologists identify four processes of evolution, each leading to changes in the frequencies of alleles in a population over time, a hallmark characteristic of evolutionary change. The four evolutionary processes are

1. Natural selection: the differential survival and reproduction of members of a population owing to possession of favored traits. Population members with the best-adapted morphological form are best able to survive and reproduce, and they leave more offspring than those possessing less-adaptive forms. Over time, the frequency of the best-adapted form and the alleles that produce it increase in the population.
2. Migration: the movement of individual organisms from one population to another. This migratory movement transfers alleles from one population to another, and if the allele frequencies between the populations
generation, causing the population to evolve through a are different and if the number of migrating individu-
change in allele frequency. als is large enough, migration can rapidly alter allele
frequencies.
In other words, progressive phenotypic change in a popula-
tion is paralleled by genetic changes. 3. Mutation—the slow acquisition of inherited variation
In this particular process of evolution—evolution by that increases the diversity of populations and serves as
natural selection—one form reproduces in greater numbers the “raw material” of evolutionary change. Mutation,
than others in a population because of being better adapted occurring in many different ways in genomes, provides
to the conditions driving natural selection. This process, the genetic diversity that is essential for evolution.
also known as adaptive evolution, is common; but many 4. Genetic drift—the random change of allele frequen-
examples of so-called nonadaptive evolution (or neutral cies due to chance in randomly mating populations.
evolution), the evolution of characteristics that are repro- Genetic drift occurs in all populations, but it is most
ductively or functionally equivalent to other forms in the pronounced in very small populations, where statisti-
population, are also observed. Nonadaptive traits are neu- cally significant fluctuations in allele frequencies can
tral with respect to natural selection, conferring neither a occur from one generation to the next.
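The effect of two of these processes on allele frequency can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names `migrate` and `drift` are hypothetical, migration is modeled as one-way replacement of a fraction `m` of the recipient population each generation, and drift is modeled with the idealized Wright-Fisher sampling scheme (each generation draws 2N allele copies at random), which the text does not name.

```python
import random


def migrate(p_recipient, p_source, m):
    """Allele frequency after one generation of one-way migration.

    A fraction m of the recipient population is replaced by migrants
    drawn from a source population with allele frequency p_source.
    """
    return (1 - m) * p_recipient + m * p_source


def drift(p0, pop_size, generations, seed=0):
    """Allele-frequency trajectory under random sampling alone.

    Assumes an idealized diploid population of constant size: each
    generation, 2 * pop_size allele copies are drawn at random using
    the previous generation's allele frequency.
    """
    rng = random.Random(seed)
    copies = 2 * pop_size
    freqs = [p0]
    for _ in range(generations):
        p = freqs[-1]
        count = sum(1 for _ in range(copies) if rng.random() < p)
        freqs.append(count / copies)
    return freqs


if __name__ == "__main__":
    # Migration: recipient at 0.2, source at 0.8, 10% migrants per generation.
    print(migrate(0.2, 0.8, 0.1))
    # Drift: compare a very small and a large population from the same start.
    for n in (10, 1000):
        trajectory = drift(0.5, n, 20, seed=1)
        print(n, [round(f, 2) for f in trajectory[:6]])
```

Running the drift function with a small and a large `pop_size` from the same starting frequency gives a feel for why chance fluctuations matter most in very small populations.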
By the middle of the 20th century, the modern synthesis of evolution (the name given to the merging of evolutionary theory with the results of experimental, mathematical, and molecular population biology) emerged as a unified view of evolution. The modern synthesis tells the story of morphological and molecular evolution of plant and animal species using experimentally verified processes and mechanisms.
Among the best-known principal architects of the modern synthesis are Theodosius Dobzhansky and Ernst Mayr, who drew together ideas from Darwin, Fisher, Wright, Haldane, and others to demonstrate how evolution operates in real populations. Dobzhansky and Mayr profoundly influenced the thinking and research of generations of biologists by demonstrating that evolutionary events revealed by laboratory investigations and in natural populations are consistent with the predictions made by Fisher, Wright, and Haldane. In simple terms, Dobzhansky and Mayr showed that evolution in populations and evolution in species occur as predicted by evolutionary theory. Today, having been fleshed out by the work of countless researchers, the modern synthesis gives a clear and virtually complete picture of the factors that produce the evolutionary changes in populations and of the mechanisms that produce the evolution of species. Evolutionary examples are incorporated into many of the chapters of this book, and Chapter 20 is devoted specifically to evolution in species and in populations.

Tracing Evolutionary Relationships

Evolutionary biologists investigate evolution by looking for evidence of morphological (physical) and molecular (DNA, RNA, and protein) changes in populations and organisms over time. Both morphological and molecular comparisons can be used to identify relationships between living species and to reveal ancestor–descendant relationships. These similarities and differences can be depicted in a diagram called a phylogenetic tree, a branching diagram that describes the ancestor–descendant relationships among species or other taxa. The tree of life shown in Figure 1.3 is one type of phylogenetic tree. These trees summarize the evolutionary histories of species by using branching points in the tree to represent the common ancestors of descendant organisms.
The most commonly used approach to phylogenetic tree construction is the cladistic approach, which depicts species’ evolutionary relationships by sorting the species into groups called clades, or monophyletic groups, based on shared derived characteristics, or synapomorphies, either morphological or molecular. Synapomorphies are shared by organisms that are members of a clade. Such sharing of traits is interpreted to indicate that the common ancestor shared by clade members also possessed the trait. Synapomorphies, whether they are of body morphology, proteins, or nucleic acid sequence, occur through homology, the presence of the trait or sequence in a common ancestor. An example of morphological homology is limb structure in vertebrates. The limbs of humans, horses, bats, and seals have different functions, but they share the same underlying structure in terms of the number and arrangement of bones in the limbs. These similarities are due to the common ancestry of vertebrates.
In some apparent cases of synapomorphy, the similarities are not a result of sharing a close common ancestor. Instead, convergent evolution has led unrelated organisms to display similar-looking traits. Such instances are known as homoplasy. One example of homoplasy is the presence of wings in birds and bats. These wings, despite the similarities brought about by convergent evolution, have independent origins.

Figure 1.16 Morphological evolution. A phylogenetic tree based on morphological and other characteristics shows the apparent evolutionary relationships between 14 species of finches inhabiting the Galápagos Islands. Groups shown, descending from a common ancestor: ground finches (seed eaters: large, medium, small; cactus flower eaters: large cactus, cactus); tree finches (insect eaters: small, large, medium, woodpecker, mangrove); vegetarian finch (bud eater); sharp-beaked finch (seed eater); warbler finches (insect eaters: gray, green).
Q What role did geographic isolation play in the evolution of Darwin’s finches?

Figure 1.16 shows a phylogenetic tree for 14 finch species that inhabit the Galápagos Islands. These finch
species were one of the groups studied by Darwin as he formulated his evolutionary theory. The tree shown here is based on a number of morphological and behavioral characteristics, including the beak shape, beak size, feeding habits, and habitat of each species, as well as its degree of isolation or separation from other species in the Galápagos Islands.

Constructing Phylogenetic Trees Using Morphology and Anatomy
Consider the features shared by various animals listed in Figure 1.17. One morphological feature common to all these animals is the presence of a backbone. This feature unites these animals into a clade we know as vertebrates that all share a common vertebrate ancestor. A second morphological feature, the presence of four legs, unites all the tetrapod animals and excludes salmon. Thus, all the animals except the salmon can be united into a clade we call tetrapods. Because fish are not within the clade of tetrapods, they form an outgroup to tetrapods. An outgroup is a taxon or group of taxa that is related to, but not included within, the clade in question. The species within the clade of interest are called the ingroup. In our example, each successive clade is identified by grouping species based on other shared characteristics.
After a phylogenetic tree has been constructed, it may be used to infer the characters of ancestral species. For example, we can infer that the common ancestor of all the taxa in Figure 1.17 had a backbone, which would therefore be an ancestral character; but it did not have four legs, which in this case would be a derived character that evolved later, in the common ancestry of tetrapods.

Figure 1.17 labels: Morphologic characteristics (backbone; four legs; fur, milk; live young; placenta; opposable thumbs). Clades labeled include the primate clade and the kangaroo clade.

Constructing Phylogenetic Trees Using Proteins or Nucleic Acids
Phylogenetic trees based on molecular characteristics are constructed in the same manner as those based on morphological characteristics, except the shared features are DNA sequences or the amino acid sequences of proteins. Descendant groups have nucleic acid or amino acid sequences that are derived from ancient sequences possessed by their common ancestors (i.e., homology). As a consequence of DNA sequence homology, the most closely related molecular sequences are those that have the smallest number of differences between them, and they are carried by the most closely related species.
Figure 1.18 examines the first 15 nucleotides of the β-globin gene from seven species (a to g). In the figure, the sequences have been aligned vertically, and the number of differences between the top sequence and each of the other sequences is noted in the first step of the figure.
A common method of constructing a phylogenetic tree begins with pairwise comparisons of genes or nucleotide sequences, grouping the most similar sequences or genes closest together (on the assumption that they are the most closely related) and subsequently bringing in the more distantly related sequences to add to the tree. Analysis in this example begins with sequences a and b, since they are identical, and then successively attaches more distantly related sequences to the tree. Sequence information from c, which differs from a and b at one nucleotide, is appended next, followed by the other sequences. A completed phylogenetic tree constructed by following these steps recapitulates the known phylogeny of vertebrates.
Genetic Analysis 1.3 guides you in constructing a simple phylogenetic tree.
The availability of DNA sequence data and genomic data has revolutionized how we construct and view phylogenies. Some groups that were traditionally grouped together, such as mammals, birds, and amphibians, do prove, from DNA sequence and genomic data, to be monophyletic groups. However, analyses have indicated that reptiles and fish are not monophyletic groups. For example, crocodiles
Figure 1.18 Construction of a phylogenetic tree based on molecular characters, using the principle of homology.

Step 2. Identical and very closely related sequences form a clade.
   1   5    10   15
a  GTGTGCTGGCCCACA
b  GTGTGCTGGCCCACA
c  GTGTGCTGGCTCACA

Step 3. Sequence d, the next closest, differs at nucleotide positions 1, 6, and 10. At position 11, d is the same as a and b; this means C is the ancestral nucleotide at position 11. The ancestral sequence for species a–c can be inferred by comparing sequences a–c with that of an outgroup, species d.
Ancestral sequence for a–c: GTGTGCTGGCCCACA
   1   5    10   15
a  GTGTGCTGGCCCACA
b  GTGTGCTGGCCCACA
c  GTGTGCTGGCTCACA
d  TTGTGTTGGGCCACA

Successively add in the next closest sequence, etc.
   1   5    10   15
a  GTGTGCTGGCCCACA
b  GTGTGCTGGCCCACA
c  GTGTGCTGGCTCACA
d  TTGTGTTGGGCCACA
e  TCGTCTTGGCCCGAA

Step 4. Note that the T at position 11 in sequences c and f is derived through evolutionarily independent mutations from ancestral C; this is homoplasy. The ancestral sequence is ambiguous at the nodes between e and f, g.
Ancestral sequence for a–e: TTGTC?T?GCCC?CA
   1   5    10   15
a  GTGTGCTGGCCCACA
b  GTGTGCTGGCCCACA
c  GTGTGCTGGCTCACA
d  TTGTGTTGGGCCACA
e  TCGTCTTGGCCCGAA
f  TTGTCATCGCTACAA
g  TTGTCATTGCCGCAA

Step 5. This phylogeny recapitulates the known phylogeny of vertebrates.
   1   5    10   15
a  GTGTGCTGGCCCACA  Homo sapiens (human)
b  GTGTGCTGGCCCACA  Pan troglodytes (chimpanzee)
c  GTGTGCTGGCTCACA  Canis familiaris (domestic dog)
d  TTGTGTTGGGCCACA  Rattus norvegicus (Norway rat)
e  TCGTCTTGGCCCGAA  Hynobius retardatus (salamander)
f  TTGTCATCGCTACAA  Danio rerio (zebrafish)
g  TTGTCATTGCCGCAA  Salmo salar (Atlantic salmon)

Q How is change in DNA sequence through mutation related to the concept of gene homology?
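The pairwise comparison that underlies Figure 1.18 can be sketched in a few lines of code. This is a minimal illustration, not the method used to produce the figure: the helper names `differing_positions` and `pairwise_differences` are hypothetical, and the sequences are simply the seven 15-nucleotide strings printed in the figure, assumed to be already aligned.

```python
from itertools import combinations

# First 15 nucleotides of the seven sequences shown in Figure 1.18.
SEQUENCES = {
    "a": "GTGTGCTGGCCCACA",
    "b": "GTGTGCTGGCCCACA",
    "c": "GTGTGCTGGCTCACA",
    "d": "TTGTGTTGGGCCACA",
    "e": "TCGTCTTGGCCCGAA",
    "f": "TTGTCATCGCTACAA",
    "g": "TTGTCATTGCCGCAA",
}


def differing_positions(seq1, seq2):
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(seq1, seq2), start=1) if x != y]


def pairwise_differences(sequences):
    """Count the differences for every pair of named sequences."""
    return {
        (n1, n2): len(differing_positions(sequences[n1], sequences[n2]))
        for n1, n2 in combinations(sorted(sequences), 2)
    }


if __name__ == "__main__":
    table = pairwise_differences(SEQUENCES)
    # List pairs from most similar to least similar.
    for pair, count in sorted(table.items(), key=lambda kv: kv[1]):
        print(pair, count)
    # Positions at which d differs from a (the text lists 1, 6, and 10).
    print(differing_positions(SEQUENCES["a"], SEQUENCES["d"]))
```

Sorting the pairs by their difference counts mirrors the order in which the figure attaches sequences to the tree, starting with the identical pair a and b.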
GENETIC ANALYSIS 1.3

PROBLEM Evolutionary biologists have searched the genomes of pigs, whales, and cows to identify the presence or absence of six genes, labeled A to F in the table below. A gene is marked with a plus symbol (+) if it is found in a genome, or by a minus symbol (-) if it is not found. Use the information in the table to construct the most likely phylogenetic tree relating cow, whale, and pig.

Organism   Gene
           A  B  C  D  E  F
Pig        +  -  -  +  -  -
Whale      +  +  +  -  +  -
Cow        +  +  +  -  -  +

BREAK IT DOWN: Correlation of the presence or absence of certain genes in comparisons between organisms provides clues to shared ancestry. More shared genes usually indicates a closer evolutionary relationship (p. 22).

Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns the use of genetic characteristics to construct a phylogenetic tree depicting, in this case, the relationships between three mammals.
2. Identify the critical information given in the problem.
   The presence or absence of each of six genes is given for each type of mammal.

Deduce
3. Identify genes shared by all three groups, genes shared by two of the groups, and genes unique to one group.
   Of the six genes tested, gene A is found in all three organisms. Genes B and C are shared by whale and cow genomes but are not detected in the pig genome. Gene D is unique to pigs, E is unique to whales, and F is unique to cows.

Solve
TIP: Genes shared by organisms are likely to have been present in their common ancestor.
4. Assign shared genes to phylogenetic branches that in the completed tree will be shared by the corresponding organisms.
   Gene A is assigned to the base of the phylogenetic tree, which ascends (when the diagram is viewed as a tree) from the common ancestor of the three organisms. Genes B and C are assigned to a branch shared by whale and cow.

   A --+-- B, C --+-- Whale
       |          +-- Cow
       +------------- Pig

5. Assign genes unique to each genome to branches that are not shared by other organisms.
   Genes D, E, and F are unique to separate groups and therefore are placed on separate branches. The complete phylogenetic tree containing all genes is shown below.

   A --+-- B, C --+-- E -- Whale
       |          +-- F -- Cow
       +-- D ------------- Pig

For more practice, see Problems 21, 22, and 25. Visit the Study Area to access study tools. Mastering Genetics
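The Deduce and Solve steps of this problem amount to grouping genes by the set of organisms that carry them. The sketch below shows that bookkeeping under a simplifying assumption stated here rather than in the problem: each gene is taken to have arisen once and never been lost, so a gene shared by a set of organisms is assigned to the branch leading to exactly that set. The names `GENOMES`, `classify_genes`, and `genes_by_branch` are hypothetical.

```python
# Presence (+) of genes A to F in each genome, taken from the problem's table.
GENOMES = {
    "Pig": {"A", "D"},
    "Whale": {"A", "B", "C", "E"},
    "Cow": {"A", "B", "C", "F"},
}


def classify_genes(genomes):
    """Map each gene to the sorted tuple of organisms that carry it."""
    carriers = {}
    for organism, genes in genomes.items():
        for gene in genes:
            carriers.setdefault(gene, []).append(organism)
    return {gene: tuple(sorted(orgs)) for gene, orgs in carriers.items()}


def genes_by_branch(genomes):
    """Group genes by the set of organisms sharing them (one group per branch)."""
    branches = {}
    for gene, orgs in classify_genes(genomes).items():
        branches.setdefault(orgs, []).append(gene)
    return {orgs: sorted(genes) for orgs, genes in branches.items()}


if __name__ == "__main__":
    branches = genes_by_branch(GENOMES)
    # Print the most widely shared genes (the base of the tree) first.
    for orgs, genes in sorted(branches.items(), key=lambda kv: -len(kv[0])):
        print(", ".join(orgs), "->", ", ".join(genes))
```

With real genomic data, gene loss and independent gain can make the grouping ambiguous, so this direct assignment is only a starting point.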
CASE STUDY
Ancient DNA: Genetics Looks into the Past
In 1878 the last member of a now extinct relative of the zebra died in the wild in South Africa. The animal, known as the quagga (“kwa-ga”), was once numerous in South Africa, but hunting wiped the species out. A few quaggas were captured for exhibition in zoos, including the Amsterdam Zoo, where the last living quagga died in 1883, and the Regent’s Park Zoo in London, where the only known photographs of a quagga were taken in 1870 (Figure 1.19).
One hundred years after the death of the last quagga, a group of molecular biologists working with the American biochemist Allan Wilson visited a dusty warehouse in South Africa and scraped some 125-year-old dried muscle tissue off the back of several quagga hides that had languished there for decades. The researchers were hoping to find DNA derived from the intracellular organelles called mitochondria that might have been preserved in the desiccated tissue. The samples they brought back to the laboratory for analysis yielded a tiny amount of highly fragmented mitochondrial DNA (mtDNA), but it was sufficient for making a comparison between quagga mtDNA and mtDNA from living mountain zebras. The results,
SUMMARY
Mastering Genetics For activities, animations, and review quizzes, go to the Study Area.

1.1 Modern Genetics Is in Its Second Century
❚❚ Genetic principles first outlined by Gregor Mendel in 1865 were “rediscovered” in 1900 and so made modern genetics a 20th-century scientific discipline.
❚❚ Study of the transmission of morphological variation during the first half of the 20th century established transmission genetics as a central focus of genetic analysis.
❚❚ The analysis of DNA, RNA, and protein beginning in the second half of the 20th century established genetics as a molecular discipline.
❚❚ Life on Earth has three domains (Bacteria, Archaea, and Eukarya) that share a common evolutionary history.

1.2 The Structure of DNA Suggests a Mechanism for Replication
❚❚ Deoxyribonucleic acid (DNA) is the genetic material. DNA is a double helix containing two strands of nucleotides that are composed of a five-carbon deoxyribose sugar, a phosphate group, and one of four nucleotide bases: adenine (A), thymine (T), cytosine (C), or guanine (G).
❚❚ Nucleotides in a DNA strand are joined by covalent phosphodiester bonds between the 5′ phosphate of one nucleotide and the 3′ OH of the adjoining nucleotide.
❚❚ DNA strands are joined by hydrogen bonds that form between complementary base pairs. A pairs with T and C pairs with G.
❚❚ Strands of the DNA duplex are antiparallel; one strand is oriented 5′ → 3′, and the complementary strand is oriented 3′ → 5′.
❚❚ DNA replicates by a semiconservative process that produces exact copies of the original DNA double helix.
❚❚ DNA polymerase uses one strand of DNA as a template to synthesize a complementary daughter strand one nucleotide at a time in the 5′-to-3′ direction.

1.3 DNA Transcription and Messenger RNA Translation Express Genes
❚❚ The central dogma of biology (DNA → RNA → protein) identifies DNA as an information repository and describes how DNA dictates protein structure through a messenger RNA intermediary that in turn directs polypeptide synthesis.
❚❚ Transcription is the process that synthesizes single-stranded RNA from a template DNA strand.
❚❚ RNA transcripts have the same 5′ → 3′ polarity and sequence as the coding strand of DNA; they differ only in the presence of U rather than T.
❚❚ Certain DNA sequences, most commonly promoters, bind RNA polymerase and other transcriptional proteins.
❚❚ Translation is the process that uses messenger RNA (mRNA) sequences to synthesize proteins.
❚❚ Messenger RNA codons base-pair with tRNA anticodons at the ribosome.
❚❚ Each tRNA carries a specific amino acid that is added to the growing polypeptide chain.
❚❚ The genetic code contains 61 codons that specify amino acids and 3 that are stop codons.

1.4 Genetic Variation Can Be Detected by Examining DNA, RNA, and Proteins
❚❚ Gel electrophoresis efficiently separates different proteins, DNA fragments, or RNA based on their electrophoretic mobility.
❚❚ Following gel electrophoresis of DNA fragments, Southern blotting uses labeled single-stranded nucleic acid molecular probes to bind to a specific target DNA sequence on a fragment by complementary base pairing (hybridization).
❚❚ Northern blotting is performed by hybridizing a labeled single-stranded nucleic acid probe to mRNA.
❚❚ Western blotting uses labeled antibodies as molecular probes to bind to target proteins.
❚❚ Genomics, proteomics, transcriptomics, and metabolomics are new investigative strategies that can help decipher complex problems of systems biology.

1.5 Evolution Has a Genetic Basis
❚❚ Four processes (natural selection, migration, mutation, and genetic drift) drive the evolution of populations and species.
❚❚ The evolution of adaptive morphological characters occurs through natural selection pressures exerted on species by their environments. Nonadaptive characters that are neutral with respect to natural selection evolve by other evolutionary processes.
❚❚ The modern synthesis of evolution is the name applied to the union of transmission genetics, molecular genetics, Darwinian evolution, and modern evolutionary genetics.
❚❚ Phylogenetic trees describe the evolutionary relationships among modern species and trace their descent from common ancestors to identify the most likely pattern of evolution.
❚❚ Shared derived characteristics are molecular or morphological attributes that evolve in descendant species from ancient characters found in a common ancestor.
❚❚ Molecular phylogenies trace the evolution of nucleic acid or protein sequences from common ancestors to modern species.
Problems

PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help in solving problems.

1. Understand the basic terminology of genetics. Key terms are in bold when they are first defined and used in descriptions. Key terms are also defined in the Glossary at the back of the textbook.
2. Recognize the levels at which genetic information and expression are described and analyzed: the molecular level (DNA, RNA, protein, etc.), the sequence level (gene, allele, wild type, mutant, etc.), the microscopic level (chromosome, nucleus, ribosome, etc.), and the phenotypic level (wild type, mutant, etc.).
3. Be prepared to describe and analyze the relationships between DNA, RNA, and protein.
4. Reacquaint yourself with the fundamentals of DNA replication, transcription, and translation before studying the chapters where these processes are described in detail.
5. Understand the four processes that drive evolutionary change.
6. Be prepared to construct and analyze phylogenetic trees.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.

1. Genetics affects many aspects of our lives. Identify three ways genetics affects your life or the life of a family member or friend. The effects can be regularly encountered or can be one time only or occasional.
2. How do you think the determination that DNA is the hereditary material affected the direction of biological research?
3. A commentator once described genetics as “the queen of the biological sciences.” The statement was meant to imply that genetics is of overarching importance in the biological sciences. Do you agree with this statement? In what ways do you think the statement is accurate?
4. All life shares DNA as the hereditary material. From an evolutionary perspective, why do you think this is the case?
5. Define the terms allele, chromosome, and gene and explain how they relate to one another. Develop an analogy between these terms and the process of using a street map to locate a new apartment to live in next year (i.e., consider which term is analogous to a street, which to a type of building, and which to an apartment floor plan).
6. Define the terms genotype and phenotype, and relate them to one another.
7. Define natural selection, and describe how natural selection operates as a mechanism of evolutionary change.
8. Describe the modern synthesis of evolution, and explain how it connects Darwinian evolution to molecular evolution.
9. What are the four processes of evolution? Briefly describe each process.
10. Define each of the following terms:
   a. transcription
   b. allele
   c. central dogma of biology
   d. translation
   e. DNA replication
   f. gene
   g. chromosome
   h. antiparallel
   i. phenotype
   j. complementary base pair
   k. nucleic acid strand polarity
   l. genotype
   m. natural selection
   n. mutation
   o. modern synthesis of evolution
11. Compare and contrast the genome, the proteome, and the transcriptome of an organism.
12. With respect to transcription, describe the relationship and sequence correspondence of the RNA transcript and the DNA template strand. Describe the relationship and sequence correspondence of the mRNA transcript to the DNA coding strand.
13. Plant agriculture and animal domestication developed independently several times and in different locations in human history. Do a brief Internet search and then list the approximate locations, time periods, and crops developed in three of these agricultural events. What role do you think ideas about heredity may have played in these events?
14. Briefly describe the contribution each of the following people made to the development of genetics or genetic analysis.
   a. Archibald Garrod
   b. Rosalind Franklin
   c. Robert Hooke
   d. William Bateson
   e. Rudolph Virchow
   f. Edmund B. Wilson
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.

15. If thymine makes up 21% of the DNA nucleotides in the genome of a plant species, what are the percentages of the other nucleotides in the genome?
16. What reactive chemical groups are found at the 5′ and 3′ carbons of nucleotides? What is the name of the bond formed when nucleotides are joined in a single strand? Is this bond covalent or noncovalent?
17. Identify two differences in chemical composition that distinguish DNA from RNA.
18. What is the central dogma of biology? Identify and describe the molecular processes that accomplish the flow of genetic information described in the central dogma.
19. A portion of a polypeptide contains the amino acids Trp-Lys-Met-Ala-Val. Write the possible mRNA and template-strand DNA sequences. (Hint: Use A/G and T/C to indicate that either adenine/guanine or thymine/cytosine could occur in a particular position, and use N to indicate that any DNA nucleotide could appear.)
20. The following segment of DNA is the template strand transcribed into mRNA:
   5’-...GACATGGAA...-3’
   a. What is the sequence of mRNA created from this sequence?
   b. What is the amino acid sequence produced by translation?
21. Using the following amino acid sequences obtained from different species of apes, construct a phylogenetic tree of the apes.

   Pongo pygmaeus    G G P H Y R L I A V E D
   Pongo abelii      G G P H Y R L I A V E D
   Pan paniscus      G A P H F R L L A V E E
   Pan troglodytes   G A P H F R L L A V E E
   Gorilla gorilla   G A P H F R L I A V E E
   Gorilla beringei  G A P H F R L I A V E E
   Homo sapiens      G A P H F N L L A V E E
   Hylobates lar     G G P H Y R L I S V E D
   Hoolock hoolock   G G P H Y R L I S V D D
   Common ancestor   G G P H Y R L I S V D D

22. Examine Figure 1.17 and answer the following questions.
   a. How many clades are shown in the figure?
   b. What characteristic is shared by all clades in the figure?
   c. What characteristics are shared by the mammalian clade and the primate clade? What characteristic distinguishes the primates from other members of the mammalian clade?
23. Fill in the missing nucleotides (so there are three per block) and the missing amino acid abbreviations in the graphic shown here.

   DNA
   Coding 5′ GGC GA T 3′
   Template 3′ C G 5′
   mRNA codon
   5′ UAC A A 3′
   tRNA anticodon
   3′ UUA 5′
   Amino acid
   3-letter MET
   1-letter E S

24. Suppose a genotype for a protein-producing gene can have any combination of three alleles, A1, A2, and A3.
   a. List all the possible genotypes involving these three alleles.
   b. Each allele produces a protein with a distinct electrophoretic mobility. Allele A1 has the highest electrophoretic mobility, A3 has the lowest electrophoretic mobility, and the electrophoretic mobility of A2 is intermediate between them. Draw the appearance of gel electrophoresis protein bands for each of the possible genotypes. Be sure to label each lane of the gel with the corresponding genotype.
25. Shorter fragments of DNA (those with fewer base pairs) have a higher electrophoretic mobility than larger fragments. Thinking about electrophoresis gels as creating a matrix through which fragments must migrate, briefly explain why the size of a DNA fragment affects its electrophoretic mobility.
26. Four nucleic-acid samples are analyzed to determine the percentages of the nucleotides they contain. Survey the data in the table to determine which samples are DNA and which are RNA, and specify whether each sample is double-stranded or single-stranded. Justify each answer.

             A     G     T     U     C
   Sample 1  22%   28%   22%   0     28%
   Sample 2  30%   30%   0     20%   20%
   Sample 3  18%   32%   0     18%   32%
   Sample 4  29%   29%   21%   0     21%
27. What is meant by the term homology? How is that different from the meaning of homoplasy?
28. If one is constructing a phylogeny of reptiles using DNA sequence data, which taxon (birds, mammals, amphibians, or fish) might be suitable to use as an outgroup?
29. Consider the following segment of DNA:
   5’-...ATGCCAGTCACTGACTTG...-3’
   3’-...TACGGTCAGTGACTGAAC...-5’
   a. How many phosphodiester bonds are required to form this segment of double-stranded DNA?
   b. How many hydrogen bonds are present in this DNA segment?
   c. If the lower strand of DNA serves as the template transcribed into mRNA, how many peptide bonds are present in the polypeptide fragment into which the mRNA is translated?
Collaboration and Discussion For answers to selected even-numbered problems, see Appendix: Answers.

30. Ethical and social issues have become a large part of the public discussion of genetics and genetic testing. Choose two of the propositions presented here and prepare a list of arguments for and against them.
   a. The results of genetic testing for susceptibility to cancer, heart disease, and diabetes should be available to insurance companies and current or prospective employers to provide more information for decision making.
   b. Prenatal genetic testing and genetic testing of newborn infants should be available for hereditary conditions that can be treated or managed.
   c. Prenatal genetic testing and genetic testing later in life should be available for hereditary conditions that cannot currently be treated or effectively managed.
   d. Gene therapy should be used on humans when it can correct a hereditary condition such as sickle cell disease.
31. In certain cases, genetic testing can identify mutant alleles that greatly increase a person’s chance of developing a disease such as breast cancer or colon cancer. Between 50 and 70% of people with these particular mutations will develop cancer, but the rest will not. Imagine you are either a 30-year-old woman with a family history of breast cancer or a 30-year-old man with a family history of colon cancer (choose one). Each person can undergo genetic testing to identify a mutation that greatly increases susceptibility to the disease. Putting yourself in the place of the person you have chosen, provide answers to the following questions.
   a. If you have a spouse or partner, are you obligated to tell that person the result of the genetic test? Why or why not?
   b. If you have children, are you obligated to tell the children the result of the genetic test? Why or why not?
   c. If you were the spouse or partner of the person you have selected, would you encourage or would you discourage the person from having the genetic test? Why?
   d. If this person that you have selected were you, do you think you would have the genetic test or not? Can you explain the reasons for your answer?
32. What information presented in this chapter and what information familiar to you from previous general biology courses is consistent with all life having a common origin?
33. It is common to study the biology and genetics of bacteria, yeast, fruit flies, and mice to understand biological and genetic processes in humans. Why do you think this is the case?
2 Transmission Genetics
CHAPTER OUTLINE
2.1 Gregor Mendel Discovered the Basic Principles of Genetic Transmission
2.2 Monohybrid Crosses Reveal the Segregation of Alleles
2.3 Dihybrid and Trihybrid Crosses Reveal the Independent Assortment of Alleles
2.4 Probability Theory Predicts Mendelian Ratios
2.5 Chi-Square Analysis Tests the Fit between Observed Values and Expected Outcomes
2.6 Autosomal Inheritance and Molecular Genetics Parallel the Predictions of Mendel’s Hereditary Principles
ESSENTIAL IDEAS
❚❚ Mendel’s hereditary experiments with pea plants identified two laws of heredity known as segregation and independent assortment.
❚❚ Consistent and predictable phenotype ratios in generations descending from two parents differing for a single trait support the law of segregation.
❚❚ The inheritance of two or more traits is predicted by the law of independent assortment.
❚❚ The rules of probability predict genetic inheritance.
❚❚ The statistical method known as chi-square analysis is used to evaluate how closely the predicted outcomes of genetic crosses match experimental observations.
❚❚ The inheritance of certain traits in human families follows the hereditary laws of segregation and independent assortment.
❚❚ Genes controlling four traits described by Mendel have been identified and the activity of their alleles characterized.

This statue of Gregor Mendel stands in the garden outside the entrance to the Mendel Science Center on the Philadelphia, Pennsylvania campus of Villanova University. It was sculpted in 1998 by James Peniston and was inspired by the statue of Mendel at the St. Thomas monastery in Brno, Czech Republic. You can take a virtual tour of Brno’s Mendel museum at www.mendel-museum.com.

When Gregor Mendel identified and described two fundamental laws of hereditary transmission, he ushered in a new era of understanding in biology. The terms Mendelian genetics and Mendelism were coined to recognize this contribution, and they are used as synonyms for transmission genetics, the field that describes and investigates the patterns of transmission of genes and traits from parents to offspring. Like his contemporary Charles Darwin, who elegantly described the process of evolution by natural selection, Mendel articulated a new way to view the world.

Mendel was one of a long list of amateur botanists of the 18th and early 19th centuries who conducted what were then called studies of plant hybridization in many species, including the edible pea plant (Pisum sativum) that was the subject of Mendel’s experiments. Unlike those who preceded him, however, Mendel was able to describe the mechanism of hereditary transmission, thanks in large part to his unique and superior experimental design. Mendel’s experimental approach allowed him to formulate and test genetic hypotheses with a level of rigor that no one had achieved before him or would achieve for another 35 years.

In this chapter, we discuss the design and results of Mendel’s experiments and the two laws of heredity they revealed. We will see (1) how Mendel’s unprecedented experimental designs enabled him to detect genetic phenomena that escaped identification by his predecessors and (2) how the transmission of traits can be predicted using random probability theory. The chapter concludes with a description of the molecular genetics of the four genes known to control traits described by Mendel. We begin, however, with a short biography of Mendel that explains how his educational experiences shaped his approach to scientific exploration.

2.1 Gregor Mendel Discovered the Basic Principles of Genetic Transmission

Born in 1822 to a farming family of modest means in the village of Hynčice that is now part of the Czech Republic, Johann (later known by his clerical name, Gregor) Mendel completed the equivalent of high school at age 18 with a certificate attesting to exceptional academic abilities. He began his higher education at the Olomouc Philosophical Institute in 1840, but these studies took a severe toll on his mental and physical health, and he gave them up after the first year. In 1843, after attempting unsuccessfully to restart his education at Olomouc, he decided to pursue higher learning by entering the priesthood instead. Based on its strong reputation in teacher training and a recommendation from a former teacher at Olomouc, he selected St. Thomas monastery in the Czech city of Brno. Mendel’s duties at St. Thomas included temporary teaching of natural science at a middle school in Brno. His keen interest in teaching science and his desire to become a permanent teacher led monastery administrators to send Mendel to the University of Vienna in 1851 to study natural science as preparation for a teaching examination.

In Vienna, Mendel studied plant physiology and plant biology with Professor Franz Unger and physics with Professor Christian Doppler as well as Doppler’s successor, Professor Andreas von Ettinghausen. From Professor Unger, Mendel learned to think critically about prevailing theories of plant reproduction and hybridization. Doppler, an experimental physicist famous for describing the Doppler effect, espoused a “particulate” view of physics and taught Mendel how to study individual characteristics separately in experiments. Professor Ettinghausen taught Mendel the principles of combinatorial mathematics, the analysis of finite, or countable, sets of numbers. This branch of mathematics is central to probability theory. Mendel would apply all these lessons to his later research. In 1853, Mendel returned to Brno, where he took and passed the written portion of the permanent teachers’ examination but apparently never completed the oral portion, remaining a “temporary” teacher at the school in Brno until he became abbot of the monastery in 1868.

In the summer of 1856, after a 3-year period during which he pondered how he might pursue his interest in natural science, Mendel began his work on the heredity of traits in the edible pea plant Pisum sativum. This species was widely used in experimentation at the time, and Mendel had no trouble gathering seeds that produced plants with distinguishing traits. Mendel began his studies by gathering 34 different varieties of peas. Over the next 2 years, he tested each variety for its ability to uniformly reproduce identical characteristics from one generation to the next. Ultimately, he settled on 14 strains of Pisum representing seven individual traits, each of which had two easily distinguished forms of expression in a seed or plant (Figure 2.1). Such traits are called dichotomous. Mendel worked with these 14 strains for the next 5 years, concluding his experiments in 1863.

On February 8 and March 8, 1865, Mendel discussed his work on peas at two meetings of the Natural History Society of Brunn (Brno). The society published his report in its Proceedings the following year, 1866. After publication of his work, Mendel corresponded with several prominent botanists in Europe, most notably Karl Naegeli. Mendel’s letters to Naegeli have scientific significance because they clearly lay out his experiments, his results, and his conclusions. Unfortunately, neither Naegeli nor any of his contemporaries seemed to grasp the importance of Mendel’s work.

After becoming abbot of the monastery in 1868, Mendel gave up his work in genetics but continued to pursue his interests in bee keeping and meteorology. As abbot, he became involved in business activities in and around Brno, including holding a seat on the board of directors of a local bank and running a brewery that generated income for St. Thomas. He faithfully served the monastery until his death in 1884. Mendel died in scientific obscurity, never having had the importance of his experiments understood or appreciated. Sixteen years after his death, in 1900, biologists would rediscover and replicate his experiments and launch a revolution in biology.

Mendel’s Modern Experimental Approach

Mendel successfully identified principles of hereditary transmission that eluded investigators who preceded him and continued to elude investigators for many years after
the experimental design Mendel constructed is an example of the hypothesis-driven experimental approach scientists use today, known as the scientific method. This method of experimentation has six steps:
1. Make initial observations about a phenomenon or process.
2. Formulate a testable hypothesis to explain the observations.
3. Design a controlled experiment to test the hypothesis.
4. Collect data from the controlled experiment.
5. Interpret the experimental results, comparing the observed results with those expected under the assumptions of the hypothesis.
6. Draw reasonable conclusions, reformulating or retesting the hypothesis if necessary.
Mendel followed these steps to collect data on individual traits of the pea plant, formulate hypotheses to explain his phenotypic observations, and conduct independent experiments to test his predictions.

Five Critical Experimental Innovations

In addition to his use of the scientific method, five specific features of Mendel’s breeding experiments distinguish them from those of his contemporaries and were critical to his success: (1) controlled crosses between plants; (2) use of pure-breeding strains to begin the experimental controlled crosses; (3) selection of dichotomous traits; (4) quantification of results; and (5) use of replicate, reciprocal, and test crosses. These innovations are introduced briefly here and explored in greater detail as the chapter proceeds.

Controlled Crosses between Plants In nature, pea plants self-fertilize (see Figure 2.2). Self-fertilization occurs when sperm-containing pollen from the anther fertilizes an egg within the ovule. Fertilized ovules develop in the ovary and then mature in the seed pod. A mature seed pod usually contains five to seven peas, each of which results from a different fertilization event. In genetic experiments, peas can be collected and counted by their phenotypes or can be planted to produce pea plants that are counted by their traits.

Pea plants are also capable of cross-pollination. In nature, plants are cross-pollinated by insects, birds, mammals, and wind. Mendel used his familiarity with plants to carry out artificial cross-fertilization, employing carefully selected plants as pollen and egg donors to ensure that the progeny could be used to test a hereditary hypothesis. By restricting reproduction to those plants he identified beforehand as likely to yield informative results, Mendel performed what are now known as controlled genetic crosses between selected organisms.

Pure-Breeding Strains to Begin Experimental Crosses During the 2 years before beginning his hereditary experiments, Mendel performed dozens of controlled genetic crosses to obtain strains that consistently produced a single
phenotype without variation. Strains that consistently produce the same phenotype are called pure-breeding strains or true-breeding strains. For example, the self-fertilization of a pure-breeding purple-flowered plant will yield only purple flowers among progeny plants. Two plants from the same pure-breeding line can be crossed to one another and will produce progeny with the same phenotype. Mendel’s work ultimately led to the production of the 14 pure-breeding strains for the seven traits shown in Figure 2.1.

Mendel structured the experimental crosses for all seven traits in the same way. He began with two pure-breeding parental plants for a dichotomous trait, each having a different one of the two phenotypes for the trait. These were the parental generation (P generation) of the cross. The pure-breeding parental plants were artificially cross-fertilized to produce the first filial generation (F1 generation; Figure 2.4). The F1 plants were then crossed to produce the second filial generation (F2 generation). The third filial generation (F3 generation) was produced by crossing plants from the F2 generation, and so on for as many generations as needed.

Figure 2.4 Controlled genetic crosses of pea plants. Plants of the P generation are artificially cross-fertilized to produce the F1 generation. Self-fertilization or crossing of F1 generation plants to one another produces the F2 generation. F2 plants either self-fertilize or are crossed to one another to produce the F3 generation.
Figure 2.4 diagram: P, pure-breeding purple flower × pure-breeding white flower; F1, purple-flower progeny plants; F2 (from self-fertilized or artificially fertilized F1), purple, purple, purple, and white; F3 generation (from self-fertilized or artificially fertilized F2).

Selection of Single Traits with Two Phenotypes Each of the seven traits Mendel studied had two forms. The two phenotypes are readily distinguished from one another, so there can be no ambiguity of assignment. For example, one trait was seed color: every seed was either yellow or green. The alternative forms of the seven traits Mendel studied are illustrated in Figure 2.1. The 14 pure-breeding strains were bred for (1) seed color (yellow or green), (2) seed shape (round or wrinkled), (3) pod color (green or yellow), (4) pod shape (inflated or constricted), (5) flower color (purple or white), (6) flower position (axial or terminal), and (7) plant height (tall or short).

Quantification of Results Each time Mendel made a controlled cross, he carefully counted the number of progeny plants of each phenotype. This seemingly simple act, now standard in scientific data gathering, was revolutionary in Mendel’s day. By obtaining large numbers of offspring from each cross and by expressing his results numerically, Mendel could more easily analyze them for revealing patterns such as the occurrence of consistent ratios between phenotypes. These ratios were critically important to Mendel’s discovery of the rules by which he could predict transmission of alleles during reproduction, and they are the foundation of Mendel’s two laws of heredity.

Replicate-, Reciprocal-, and Test-Cross Analysis The final features that distinguished Mendel’s experiments are his use of three genetic-cross strategies that have become tried-and-true approaches to genetic analysis. Rather than simply counting the results of a single cross, for example, Mendel made many replicate crosses, producing hundreds of F1 plants and several thousand F2 plants by repeating the same cross several times.

Mendel also performed reciprocal crosses, in which plants with the same phenotypes are crossed but the sexes of the donating parents are switched. The plant providing the egg in the first cross is used as a source of pollen in the reciprocal cross. Reciprocal crosses are always performed in pairs.

Finally, Mendel performed test crosses. These are crosses designed to identify the alleles carried by an organism whose genetic makeup is not certain. We discuss the structure of test crosses and their value as tools of genetic analysis in the following sections.

2.2 Monohybrid Crosses Reveal the Segregation of Alleles

In this section we explore the results and interpretation of Mendel’s experiments on the seven traits by focusing on Mendel’s examination of two traits, pea color (yellow or green) and pea shape (round or wrinkled). The results and interpretations for those traits apply equally well to the five other traits Mendel examined. The uniformity of Mendel’s experimental results and interpretations is due to his decision to conduct experiments on each trait in the same way.

Identifying Dominant and Recessive Traits

Beginning each experiment with different pure-breeding parental plants to produce an F1 generation, Mendel consistently found that all of the F1 plants had the same phenotype
as one of the pure-breeding parents. For example, when Mendel crossed pure-breeding yellow-pea–producing plants and pure-breeding green-pea producers, he found that all the F1 plants produced yellow peas and none produced green peas (Figure 2.5). Mendel identified yellow as the dominant phenotype on the basis of its presence in the F1, and he identified green as the recessive phenotype since it is not seen among F1 progeny.

Employing letters as symbols to represent each trait, Mendel proposed a pattern of transmission from parents to offspring that explained his phenotypic observations in the F1 and later generations. Today, numerous notational systems for identifying genes and alleles are used, often differing in their particulars along species lines, but the use of letters remains a universal feature. A table describing gene naming (gene nomenclature) and other information about the genes and genomes of model genetic organisms is located inside the book back cover. Most commonly, a dominant trait is shown with an uppercase letter, and a recessive trait is shown in lowercase.

Using this scheme, Mendel signified a pure-breeding organism as having a genotype consisting of two identical symbols representing two copies of an allele. This gives us a second way of thinking about pure-breeding organisms, namely that they have a homozygous genotype, GG or gg in the example shown in Figure 2.5. If a homozygous plant is self-fertilized, or if two pure-breeding plants expressing the same trait are crossed, the progeny have the same phenotype for the trait and the same homozygous genotype as the parents. In contrast, in a genetic cross between pure-breeding parents with different traits, the progeny all have a heterozygous genotype, consisting of one genotype symbol from each of the pure-breeding parents, or Gg in this example. Looking more closely at the example in Figure 2.5, note that while the phenotype of the F1 is the same as that of the yellow parental plant, the genotype is not. This observation is explained momentarily.

Mendel next crossed F1 yellow plants to produce the F2 and observed reemergence of the recessive green phenotype. Among the F2, Mendel found that approximately three-fourths (75%) of the peas were yellow and the remaining one-fourth (25%) were green. To repeat, the yellow : green ratio in the F2 is 3/4 : 1/4, or roughly 3:1. Mendel correctly interpreted these results to indicate that F2 offspring with the dominant trait were a mixture of two genotypes (GG and Gg) and that plants with the recessive trait were homozygous recessive (gg). More generally, the dominant phenotype F2 can be classified as having the genotype G– (“G blank”), indicating that the genotype is either GG or Gg.

Mendel made similar observations for his experiments testing inheritance of pea shape. Replicate and reciprocal crosses of pure-breeding round-pea–producing plants with pure-breeding wrinkled-pea–producing plants produced F1 plants bearing exclusively round peas. This result identifies round as the dominant phenotype and wrinkled as the recessive phenotype. His F1 cross produced F2 peas in the ratio 75% round to 25% wrinkled, once again a roughly 3:1 ratio.

Tabulating results over several growing seasons for all seven traits, Mendel counted nearly 20,000 F2 peas or plants. Table 2.1 displays Mendel’s results, revealing three consistent features: (1) dominance of one phenotype over the other in the F1 generation, (2) reemergence of the recessive phenotype in the F2 generation, and (3) a ratio of approximately 3:1 (dominant : recessive) among F2 phenotypes. Mendel determined that yellow is dominant to green and round is dominant to wrinkled based on F1 results. Green pea color and wrinkled pea shape reemerge in the F2, which displays a consistent 3:1 ratio between the dominant and recessive phenotypes. For example, Mendel classified 8023 F2 peas by their color and 7324 F2 peas by their shape. Among the F2 peas classified by color, he found 6022 yellow seeds and 2001 green seeds, a ratio of almost exactly three to one. Of the F2 seeds classified for pea shape, 5474 were round and 1850 were wrinkled, again a ratio of very nearly three to one. Data for each of the other five characteristics revealed the same 3:1 ratio of dominant to recessive in the F2.

Figure 2.5 Segregation of alleles for seed color. In the cross between yellow-seeded and green-seeded pure-breeding parental plants, F1 progeny display the dominant yellow phenotype. A 3:1 phenotypic ratio and a 1:2:1 genotypic ratio are observed in the F2 generation.
Figure 2.5 diagram: P, pure-breeding GG × pure-breeding gg. Gamete formation: each homozygous parent contributes only one allele of the gene (G or g). Fertilization: the F1 is Gg; F1 heterozygotes display the dominant phenotype seen in one parent. Gamete formation and self-fertilization (a monohybrid cross): segregation of alleles from heterozygous Gg produces G-containing and g-containing gametes at equal frequency (1/2 G and 1/2 g). Punnett square for the F2: GG, Gg, Gg, gg; random union of gametes to form the F2 produces a 1:2:1 genotypic ratio and a 3:1 phenotypic ratio. Genotypic ratio: homozygous 1/4 GG, heterozygous 1/4 Gg, heterozygous 1/4 Gg, homozygous 1/4 gg. Phenotypic ratio: 3/4 yellow (G–), 1/4 green (gg).
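
The Punnett-square reasoning summarized in Figure 2.5 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of Mendel's or the text's method: the function names `punnett_square` and `phenotype_frequencies` are hypothetical, genotypes are assumed to be two-letter strings such as "Gg", and an uppercase letter is assumed to mark the dominant allele.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def punnett_square(parent1, parent2):
    """Return expected genotype frequencies for a one-gene cross.

    Each parent is a two-letter genotype string such as "Gg". Every allele
    is assumed to enter a gamete with probability 1/2, so each of the four
    gamete pairings is equally likely.
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so the uppercase (dominant) symbol is written first: "gG" -> "Gg".
        genotype = "".join(sorted(allele1 + allele2))
        counts[genotype] += 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


def phenotype_frequencies(genotype_freqs):
    """Collapse genotype frequencies into dominant and recessive classes."""
    freqs = {"dominant": Fraction(0), "recessive": Fraction(0)}
    for genotype, freq in genotype_freqs.items():
        # A single uppercase allele is enough to show the dominant phenotype.
        key = "dominant" if any(a.isupper() for a in genotype) else "recessive"
        freqs[key] += freq
    return freqs


f1 = punnett_square("GG", "gg")   # cross of the pure-breeding parents
f2 = punnett_square("Gg", "Gg")   # monohybrid cross of F1 plants
print(f1)                         # every F1 plant is Gg
print(f2)                         # 1/4 GG, 1/2 Gg, 1/4 gg
print(phenotype_frequencies(f2))  # 3/4 dominant, 1/4 recessive
```

Using exact fractions rather than floating-point numbers keeps the 1:2:1 and 3:1 ratios visible in the output.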
Table 2.1 Mendel’s Observations for Seven Monohybrid Traits in the F1 and F2 Generations

Cross between Pure-Breeding         F1 Phenotype           F2 Phenotypes                      F2 Phenotype
Parental Phenotypes (a)                                    Dominant        Recessive          Ratio
Round × wrinkled seeds              All round seeds        5474 round      1850 wrinkled      2.96:1
Yellow × green seeds                All yellow seeds       6022 yellow     2001 green         3.01:1
  (interior seed color)
Purple × white flowers (b)          All purple flowers     705 purple      224 white          3.15:1
  (gray × white seed coat,            (gray seed coat)
  or exterior seed color)
Axial × terminal flowers            All axial flowers      651 axial       207 terminal       3.14:1
Green × yellow pods                 All green pods         428 green       152 yellow         2.82:1
Inflated × constricted pods         All inflated pods      882 inflated    299 constricted    2.95:1
Tall × short plants                 All tall plants        787 tall        277 short          2.84:1
TOTAL                                                      14,949          5010               2.98:1

(a) The dominant phenotype is written first and always appears as the F1 phenotype.
(b) A single gene controls both flower color and seed-coat color. Mendel discussed both traits but recognized they were controlled by the same gene.
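
As a quick arithmetic check, the ratios in the last column of Table 2.1 can be recomputed from the counts. The sketch below is illustrative only; the dictionary name `f2_counts` and the helper `ratio_to_one` are hypothetical, and the counts are copied from the table.

```python
# F2 counts (dominant, recessive) copied from Table 2.1.
f2_counts = {
    "round x wrinkled seeds": (5474, 1850),
    "yellow x green seeds": (6022, 2001),
    "purple x white flowers": (705, 224),
    "axial x terminal flowers": (651, 207),
    "green x yellow pods": (428, 152),
    "inflated x constricted pods": (882, 299),
    "tall x short plants": (787, 277),
}


def ratio_to_one(dominant, recessive):
    """Express dominant : recessive as x : 1, rounded to two decimals."""
    return round(dominant / recessive, 2)


ratios = {cross: ratio_to_one(d, r) for cross, (d, r) in f2_counts.items()}
total_dominant = sum(d for d, _ in f2_counts.values())
total_recessive = sum(r for _, r in f2_counts.values())

for cross, ratio in ratios.items():
    print(f"{cross}: {ratio}:1")
print(f"TOTAL: {total_dominant} : {total_recessive} = "
      f"{ratio_to_one(total_dominant, total_recessive)}:1")
```

The pooled counts come to 14,949 dominant and 5010 recessive, or 19,959 F2 peas and plants in all.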
Evidence of Particulate Inheritance and Rejection of the Blending Theory

Mendel’s F1 experimental results reject the blending theory of heredity. Specifically, the observation that all F1 progeny have the same phenotype as one of the pure-breeding parents (i.e., the dominant phenotype) contradicts the blending theory prediction that the F1 would display a phenotype that is a blend of the two parental phenotypes. The persistence of the dominant phenotype and the reemergence of the recessive phenotype in the F2 also run counter to the predictions of the blending theory.

Having rejected the blending theory, Mendel examined his experimental results and proposed a new hereditary hypothesis: that each trait is determined by two “particles of heredity,” which today we call “alleles.” Mendel used the German word elemente, a term meaning “unit or element,” to describe the two discrete units of hereditary information for each trait. This idea is the basis of Mendel’s theory of particulate inheritance, which proposes that each plant carries two particles of heredity (i.e., two alleles) for each trait. A plant receives one unit of heredity (allele) in the egg and a second one in pollen. Each parental plant passes just one of its two alleles to offspring during reproduction. This means that inheritance of one G allele from the homozygous yellow parental plant is sufficient to produce the yellow phenotype, defining the G allele as the dominant allele. In contrast, the g allele that produces the green phenotype in the homozygous parental plant is the recessive allele. The recessive allele only produces the recessive phenotype when it is in a homozygous genotype.

After establishing that crosses of pure-breeding parental plants produce F1 plants that always have the dominant phenotype, Mendel crossed F1 plants (Gg × Gg) to produce the F2 generation (see Figure 2.5). This is a monohybrid cross, a term referring to a cross between two organisms that have the same heterozygous genotype for one gene. A monohybrid cross in pea plants can be made by either crossing heterozygous F1 with one another or by allowing F1 plants to self-fertilize. With a dominant and a recessive allele in their heterozygous genotype, these F1 plants donate one or the other of the alleles to each of their F2 progeny. The result of these monohybrid crosses is a 3:1 phenotypic ratio among the F2. In other words, Mendel observed that approximately 75% of the F2 had the dominant phenotype and 25% had the recessive phenotype. He also correctly predicted that the F2 generation would have three genotypes: The two homozygous genotypes (the same genotypes present in the original pure-breeding parents) each occur in about one-fourth of the F2 progeny, and the heterozygous genotype occurs in the remaining one-half of the F2 progeny. Therefore, among the F2, Mendel predicted a 1:2:1 genotypic ratio. The one-fourth of the F2 that are homozygous GG plus the one-half of F2 progeny that are heterozygous Gg are the three-fourths of the F2 with the dominant phenotype. The remaining one-fourth of the F2 contain the homozygous gg genotype and have the recessive phenotype. The same inheritance pattern occurs for all the other traits studied by Mendel.

Segregation of Alleles

Figure 2.5 uses letters as symbols to represent alleles and genotypes in parental, F1, and F2 organisms and introduces a simple and functional tool of genetic analysis called a Punnett square. The Punnett square method of diagramming the genetic content of gametes and their union to form offspring is named in honor of Sir Reginald Punnett, a famous geneticist of the early 20th century. The Punnett square separates the two alleles carried by each reproducing organism, placing the reproductive cells, or gametes,
from one parent along the vertical margin of the diagram, and those from the other parent along the horizontal margin. The squares within the body of the Punnett diagram show the results expected from the random union of the male and female gametes, each square identifying a possible genotype of offspring produced by gamete union.

Having formed the concept of particulate inheritance and having carefully counted the number of plants in each phenotype category, Mendel was able to frame a hypothesis to explain his results. This first hypothesis of Mendel’s is known as the law of segregation, sometimes also known as Mendel’s first law. It describes the particulate nature of inheritance, identifies the segregation (separation) of alleles during gamete formation (we discuss this process more fully in Chapter 3), and proposes the random union of gametes to produce progeny in predictable proportions:

The law of segregation The two alleles for each trait will separate (segregate) from one another during gamete formation, and each allele will have an equal probability (1/2) of inclusion in a gamete. Random union of gametes at fertilization will unite one gamete from each parent to produce progeny in ratios that are determined by chance.

The law of segregation means that when pure-breeding parents with different homozygous genotypes are crossed, all their F1 progeny have the dominant phenotype and have a heterozygous genotype. In the case of reproduction of heterozygous F1 plants, the law of segregation means that one-half of the reproductive cells of each F1 parent are expected to contain the dominant allele and one-half are expected to contain the recessive allele. The random union of reproductive cells from the heterozygous F1 plants leads to the 3:1 phenotypic ratio and the 1:2:1 genotypic ratio of the F2.

Hypothesis Testing by Test-Cross Analysis

Mendel proposed the law of segregation to explain the phenotype proportions he observed in the F1 and F2 generations of his breeding experiments, but two critical parts of his hypothesis could not be seen by observation of F1 and F2 phenotypes, and Mendel needed to demonstrate they were true to validate his hypothesis. Specifically, Mendel predicted that all the F1 progeny in his experiment were heterozygous and that among the F2 progeny with the dominant phenotype were plants with the homozygous genotype and plants with the heterozygous genotype.

To test the hypothesis that the F1 were heterozygous, Mendel devised what is known in genetics as a test cross. This is the cross of an organism that has the dominant phenotype to one that has the recessive phenotype to determine whether the dominant organism has the homozygous genotype or the heterozygous genotype. If the plant with the dominant phenotype is homozygous, then all the progeny of the test cross will have the dominant phenotype. In contrast, if the dominant organism is heterozygous, then there will be a roughly 1:1 ratio of progeny with the dominant phenotype to progeny with the recessive phenotype.

One of Mendel’s test crosses of F1 plants to recessive plants is shown in Figure 2.6. Based on his segregation hypothesis, Mendel predicted that test-cross progeny phenotypes would be 50% dominant and 50% recessive. Figure 2.6 illustrates Mendel’s test cross between an F1 plant producing round seeds (and suspected to have a heterozygous genotype) and a pure-breeding wrinkled-seed plant, known to be homozygous rr. In the test cross, the wrinkled-seed plant, being homozygous rr, produces only r-containing gametes. If the F1 plant is indeed heterozygous, it should produce reproductive cells with R and r genotypes at a frequency of 1/2 each. Consequently, the progeny of the cross should be 1/2 Rr and 1/2 rr, resulting in a 1:1 ratio of round : wrinkled. As the figure indicates, Mendel performed this cross and observed 193 round peas and 192 wrinkled peas, or a 1:1 ratio, in test-cross progeny. Mendel reported test-cross results for five of his traits and observed a 1:1 ratio in each case (Table 2.2). These results verify the prediction that the F1 progeny of pure-breeding crosses are heterozygous. If the F1 were homozygous dominant instead of heterozygous, the test-cross progeny would all have the dominant phenotype instead of the observed 1:1 ratio.

Figure 2.6 Test-cross analysis of F1 plants. A test cross between an F1 plant and one that is homozygous recessive produces progeny with a 1:1 ratio of the dominant to the recessive phenotype if the F1 plant is heterozygous.
Figure 2.6 diagram: P, pure-breeding RR × pure-breeding rr, cross-fertilized. F1, heterozygous Rr × pure-breeding rr: test cross of dominant F1 plant to a recessive plant to determine if the F1 is heterozygous. Test-cross fertilization, Punnett square: F1 gametes 1/2 R and 1/2 r, recessive-parent gametes r and r; progeny Rr, Rr, rr, rr. If the F1 is heterozygous, the ratio of its gametes will be 1:1. In Mendel’s test-cross experiment, he found 193 round and 192 wrinkled test-cross progeny, a 1.01:1 ratio.
Q If a test-cross experiment identical to the one shown here produces 826 progeny plants, how many plants are expected in each phenotype category?
38 CHAPTER 2 Transmission Genetics
[Figure diagram: pure-breeding RR and rr parents (P) are cross-fertilized to produce heterozygous Rr plants (F1); self-fertilization of the F1 produces RR, Rr, Rr, and rr plants (F2). Each pea results from a separate fertilization event.]

Table 2.2 Test-Cross Results from Mendel’s Experiments

Test Cross                                  Test-Cross Progeny                       Ratio
                                            Dominant           Recessive
Round seed (Rr) × wrinkled seed (rr)        193 round (Rr)     192 wrinkled (rr)     1.01:1
Yellow seed (Gg) × green seed (gg)          196 yellow (Gg)    189 green (gg)        1.04:1
Purple flower (Pp) × white flower (pp)      85 purple (Pp)     81 white (pp)         1.05:1
Tall plants (Tt) × short plants (tt)        87 tall (Tt)       79 short (tt)         1.10:1
TOTAL                                       561                541                   1.04:1
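The ratios in Table 2.2 can be recomputed directly from the progeny counts. The following is a minimal sketch, not part of Mendel’s analysis: the dictionary `TEST_CROSSES` and the helper names `ratio_to_one` and `expected_test_cross` are illustrative choices, and the 1:1 expectation assumes the dominant parent is heterozygous.

```python
# Test-cross counts transcribed from Table 2.2: (dominant, recessive) progeny.
TEST_CROSSES = {
    "round x wrinkled": (193, 192),
    "yellow x green": (196, 189),
    "purple x white": (85, 81),
    "tall x short": (87, 79),
}


def ratio_to_one(dominant, recessive):
    """Return the dominant:recessive ratio scaled so the recessive class is 1."""
    return dominant / recessive


def expected_test_cross(total):
    """Expected (dominant, recessive) counts if the dominant parent is heterozygous (1:1)."""
    return total / 2, total / 2


for name, (dom, rec) in TEST_CROSSES.items():
    exp_dom, exp_rec = expected_test_cross(dom + rec)
    print(f"{name}: observed {dom}:{rec} = {ratio_to_one(dom, rec):.2f}:1, "
          f"expected {exp_dom:.1f}:{exp_rec:.1f}")

# Column totals, as in the TOTAL row of the table.
total_dom = sum(dom for dom, _ in TEST_CROSSES.values())
total_rec = sum(rec for _, rec in TEST_CROSSES.values())
print(f"TOTAL: {total_dom}:{total_rec} = {ratio_to_one(total_dom, total_rec):.2f}:1")
```

Each observed ratio falls close to the 1:1 expectation; whether a given deviation is small enough to attribute to chance is a separate statistical question.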
Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
   Solution: The problem presents the leaf-form phenotypes of progeny produced by three separate crosses of parental plants with unknown genotypes and phenotypes. The answer must identify parental genotypes and phenotypes for each cross and use a Punnett square to diagram Cross 1.
2. Identify the critical information given in the problem.
   TIP: The numbers of progeny with each phenotype can be expressed as a ratio.
   Solution: The information given for each cross is the number of progeny with hairy (dominant) and smooth (recessive) leaves. Interpretation of the phenotype ratio of progeny is required to determine parental genotypes and phenotypes.

Deduce
3. Examine the progeny of Cross 1, and determine the approximate ratio of progeny phenotypes.
   PITFALL: Genetics experiments produce finite numbers of progeny, so phenotypes may vary from expected ratios. Don’t expect to see precise ratios in real data.
   Solution: Ratio of phenotypes in Cross 1 progeny: $\frac{32}{11} = 2.91 : 1$. This is an approximate 3:1 ratio. The recessive phenotype appears in about $\frac{1}{4}$ of the progeny $\left(\frac{11}{43}\right)$, and the remaining $\frac{3}{4}$ $\left(\frac{32}{43}\right)$ have the dominant phenotype.
7. Based on the results of Cross 2, identify the genotypes and phenotypes of the parents.
   Solution: Both parental plants in Cross 2 carry at least one copy of h. The 1:1 progeny ratio is consistent with the ratio expected for a test cross of a heterozygous organism to one that is homozygous recessive. This cross is Hh × hh.

For more practice, see Problems 10, 14, and 29.
Table 2.3 Results of Mendel’s Experiments to Identify F2 Plant Genotypes by Their F3 Progeny
outcomes and then verified the results by counting the progeny produced. The resulting data supported his segregation hypothesis and illustrate how Mendel anticipated modern scientific methods, using approaches that would not be consistently applied to genetic experiments for several decades.

Mendelian genetics is all around us. You’ll even find it in the produce aisle of your local grocery store! Experimental Insight 2.1 describes an experiment in Mendelian genetics using ears of corn that have a mixture of yellow and white kernels.

2.3 Dihybrid and Trihybrid Crosses Reveal the Independent Assortment of Alleles

Each of the seven traits investigated by Mendel showed the same pattern of hereditary transmission that is explained by the law of segregation. The predictability of phenotype proportions in F1 and F2 test-cross and self-fertilization progeny suggests that the same mechanism is responsible for allelic segregation in each one of the selected traits. But what about the inheritance of two or more traits simultaneously? Is there a pattern or ratio of phenotypes that allowed Mendel to propose a transmission mechanism when two or more genes are examined at the same time? Mendel believed that the law of segregation applied to all genes simultaneously, and he devised experiments to test this theory that led to his identification of a second law of heredity.

As Figure 2.8 illustrates, Mendel began each dihybrid cross with pure-breeding lines. Any combination of two pure-breeding traits in parental plants can be used, but here we see Mendel’s experimental cross in which one parent is pure-breeding for the two dominant pea traits of round and yellow (RRGG) and the other parent is pure-breeding for the recessive pea traits wrinkled and green (rrgg). The gametes, whether pollen or egg, produced by the round, yellow plant contain one allele for each type of gene and are RG. In contrast, gametes from the wrinkled, green plant are rg. Mendel’s model predicts that all of the F1 progeny will therefore have the genotype RrGg. These F1 are described as dihybrid, meaning heterozygous for two traits, and display the dominant parental phenotypes round and yellow.

[Figure 2.8 diagram: pure-breeding round, yellow (RRGG) and pure-breeding wrinkled, green (rrgg) parents (P) form RG and rg gametes, respectively; cross-fertilization produces RrGg F1 plants.]
Mendelism in the Produce Aisle

Many of the appealing characteristics of fruits and vegetables available in grocery stores and at farmers’ markets are the result of intensive selective breeding, a form of artificial selection in which breeders select which organisms are to reproduce and determine the crosses that will occur. For example, in recent years many new vegetable varieties have been introduced into the marketplace. Among these is a variety of corn that goes by several names, including “bicolor,” “peaches and cream,” and “yellow and white.” Most of the kernels on a cob of bicolored corn are yellow, but a sizable number are white. With close inspection and a little quantitative analysis, you should be able to identify the genetic mechanism that produces this variation in color.

An ear of corn is a mini–genetic experiment: Each kernel on the ear, like each pea in a pod, is a separate seed, produced by a fertilization event independent of the events that produced adjacent kernels. This means that each mature ear of corn carries hundreds of progeny for analysis.

Bicolor corn originates with the cross of two pure-breeding corn lines, one producing yellow kernels and the other producing white kernels. The yellow plant is WW, and the white plant is ww. When seed company geneticists cross these parental stocks, the kernels on the F1 plants are yellow and have the heterozygous Ww genotype. This F1 seed is allowed to mature and is packaged for sale to farmers and home gardeners, who plant it to produce a crop. The seed is commonly labeled “hybrid,” meaning “monohybrid,” to reflect the heterozygosity at the kernel-color gene. Owing to segregation of alleles at the kernel-color gene, the plants that grow from this F1 seed produce both yellow (W–) and white (ww) kernels on each ear.

If you saw some of this corn in your grocery store, how would you verify that the genetic basis of its yellow and white kernels is the segregation of two alleles at a single gene? The answer is that you would count the number of yellow kernels and the number of white kernels on ears of bicolor corn with the expectation of a ratio of approximately 3:1 between the yellow and white kernels.

Recent genetics classes of one of the authors examined several dozen ears of bicolor corn and counted 9304 yellow kernels and 3052 white kernels. Among the total of 12,356 kernels, this meant 75.3% were yellow and 24.7% were white, a ratio of 3.05:1. You will use these data in Problem 20 at the end of the chapter to do a statistical test to see if the observed data fit the hypothesis that this trait is the product of the segregation of alleles of a single gene. The next time you shop for fruits and vegetables, keep in mind that you are looking at Mendelian genetics in action!
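The arithmetic behind the kernel counts can be checked in a few lines. This is only a sketch under the single-gene hypothesis described above: the function name `monohybrid_expected` is a hypothetical helper, and the code stops short of the statistical test reserved for Problem 20.

```python
from fractions import Fraction

# Kernel counts reported in the text.
YELLOW, WHITE = 9304, 3052


def monohybrid_expected(total, dominant_fraction=Fraction(3, 4)):
    """Expected (dominant, recessive) counts for a Ww x Ww cross with a 3:1 ratio."""
    expected_dominant = total * dominant_fraction
    return expected_dominant, total - expected_dominant


total = YELLOW + WHITE
pct_yellow = 100 * YELLOW / total
pct_white = 100 * WHITE / total
exp_yellow, exp_white = monohybrid_expected(total)

print(f"total kernels: {total}")
print(f"observed: {pct_yellow:.1f}% yellow, {pct_white:.1f}% white, "
      f"ratio {YELLOW / WHITE:.2f}:1")
print(f"expected under 3:1: {exp_yellow} yellow, {exp_white} white")
```

The observed counts differ from the 3:1 expectation by 37 kernels in each class out of more than twelve thousand.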
[Figure 2.10 diagram: Punnett square for the cross RrGg × RrGg, with each gamete (RG, Rg, rG, rg) at a frequency of $\frac{1}{4}$ and each cell at a frequency of $\frac{1}{16}$. The genotype tallies that survive extraction include RRGg = 2/16, RrGg = 4/16, RRgg = 1/16, and Rrgg = 2/16 (together 3/16 R–gg), and rrgg = 1/16.]

Figure 2.10 Independent assortment of alleles of two genes. Crossing dihybrid F1 (RrGg) organisms to one another produces nine genotypes distributed in a 9:3:3:1 phenotypic ratio among F2 progeny.

(3) both recessive phenotypes. The F2 phenotypes appear in the ratio $\frac{9}{16} : \frac{3}{16} : \frac{3}{16} : \frac{1}{16}$.

By examining the F2 phenotype proportions, we can see the relationship between the 3:1 ratio for each trait and the 9:3:3:1 ratio when the two traits are considered simultaneously. When pea shape and pea color are considered individually, monohybrid crosses produce F2 that are $\frac{3}{4}$ dominant and $\frac{1}{4}$ recessive. The cross of two dihybrids also yields proportions of $\frac{3}{4}$ dominant to $\frac{1}{4}$ recessive for each trait, making the prediction of phenotypic ratios among the F2 for both traits combined a problem of combinatorial arithmetic involving the segregation of alleles for each of two traits. Figure 2.10 reminds us that genotypes falling into the R– and the G– classes each occur in $\frac{3}{4}$ of the progeny, while rr and gg genotype classes each occur in $\frac{1}{4}$ of the progeny. As we saw earlier, the dash in the genotypes R– and G– is a “blank” that could be filled by either a second copy of the dominant allele or a copy of the recessive allele. In either case, the resulting genotype (for example, RR or Rr) produces the dominant phenotype. The co-occurrence of the two dominant phenotypes (round, yellow) is therefore expected to have a frequency of $\left(\frac{3}{4}\right)\left(\frac{3}{4}\right) = \frac{9}{16}$, the two recessive phenotypes (wrinkled, green) will occur with a frequency of $\left(\frac{1}{4}\right)\left(\frac{1}{4}\right) = \frac{1}{16}$, and the two phenotypic classes that display one dominant and one recessive trait (round, green and wrinkled, yellow) will each be found in a frequency of $\left(\frac{3}{4}\right)\left(\frac{1}{4}\right) = \frac{3}{16}$.

This outcome illustrates Mendel’s law of independent assortment, also known as Mendel’s second law.

The law of independent assortment During gamete formation, the segregation of alleles of one gene is independent of the segregation of alleles of another gene.

the pure-breeding parents and allowing self-fertilization of the F1, Mendel counted the phenotypes among the F2 and found that both of the original parental phenotypes (round, yellow and wrinkled, green) were present along with two nonparental phenotypes (Figure 2.11a).

This F2 observation contains two features of pivotal importance to Mendel’s hypothesis. First, parental and nonparental phenotypes are seen at frequencies that differ from one another. The most numerous class of F2 progeny display the dominant parental phenotypes for each trait, round and yellow. The smallest class of F2 progeny have the two recessive parental phenotypes, wrinkled and green; and the two nonparental F2 classes (round, green and wrinkled, yellow) are intermediate and approximately equal in number. From these numbers, Mendel recognized that the ratios between the dominant and recessive forms of each trait followed the familiar 3:1 pattern. In looking at pea shape, for

[Figure 2.11 diagram]
(a) Self-fertilization of F1: heterozygous RrGg × heterozygous RrGg. Gamete formation yields RG, Rg, rG, and rg. Independent assortment results are expected in a 9:3:3:1 phenotype ratio in the F2.

F2 generation          Genotype    Number
Round, yellow          R–G–        315
Round, green           R–gg        108
Wrinkled, yellow       rrG–        101
Wrinkled, green        rrgg        32

The phenotypes are observed in a 9.8:3.4:3.2:1 ratio.

(b) Counting F2 phenotypes by trait
Rr × Rr produces: Round 315 + 108 = 423; Wrinkled 101 + 32 = 133. The phenotypes are expected in a 3:1 ratio and observed in a ratio of 3.2:1.
Gg × Gg produces: Yellow 315 + 101 = 416; Green 108 + 32 = 140. The phenotypes are expected in a 3:1 ratio and observed in almost exactly that ratio.

Figure 2.11 Phenotype proportions in the progeny of a dihybrid cross performed by Mendel. (a) The phenotypic ratio Mendel observed was close to the expected ratio of 9:3:3:1. (b) For each trait considered individually, the phenotype ratio in the progeny from the same cross is approximately 3:1.
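The combinatorial arithmetic that turns two 3:1 ratios into a 9:3:3:1 ratio can be sketched in code and set beside the counts in Figure 2.11. This is an illustrative sketch, assuming complete dominance and independent assortment; the names `SHAPE`, `COLOR`, `OBSERVED`, and `joint_frequencies` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import product

# Per-trait F2 phenotype probabilities from a monohybrid cross (3/4 : 1/4).
SHAPE = {"round": Fraction(3, 4), "wrinkled": Fraction(1, 4)}
COLOR = {"yellow": Fraction(3, 4), "green": Fraction(1, 4)}

# F2 counts from Figure 2.11a.
OBSERVED = {
    ("round", "yellow"): 315,
    ("round", "green"): 108,
    ("wrinkled", "yellow"): 101,
    ("wrinkled", "green"): 32,
}


def joint_frequencies(*traits):
    """Multiply per-trait probabilities for every combination of phenotypes."""
    result = {}
    for combo in product(*(trait.items() for trait in traits)):
        names = tuple(name for name, _ in combo)
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        result[names] = prob
    return result


expected = joint_frequencies(SHAPE, COLOR)
total = sum(OBSERVED.values())
for phenotype, freq in expected.items():
    print(f"{', '.join(phenotype)}: {freq} -> expected {float(freq * total):.2f}, "
          f"observed {OBSERVED[phenotype]}")
```

Because the function accepts any number of trait dictionaries, the same sketch extends to three or more independently assorting genes.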
Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
   Solution: This is a transmission genetic problem in which parental genotypes are given. Answers must predict the phenotypes of progeny and their expected proportions. These are predicted by determining the parental gametes and their proportions.
2. Identify the critical information given in the problem.
   Solution: Genotypes of parents are given for each cross. The genotypes are used to predict the genotypes of parental gametes and the gamete proportions.

Deduce
3. For Cross 1, identify the genetically different gametes that can be produced by each parent and calculate the predicted proportion of each gamete.
   TIP: A forked-line diagram is a useful tool for predicting the alleles in gametes and gamete frequencies.
   Solution: Each of the parents can produce two genetically different gametes at predicted frequencies of $\frac{1}{2}$ each.

   Cross 1
   Male:    1 F,   1/2 S  ->  FS   (1)(1/2) = 1/2
            1 F,   1/2 s  ->  Fs   (1)(1/2) = 1/2
   Female:  1/2 F, 1 s    ->  Fs   (1/2)(1) = 1/2
            1/2 f, 1 s    ->  fs   (1/2)(1) = 1/2

4. Identify the content and frequency of the genetically different gametes produced by the parents in Cross 2.
   PITFALL: Carefully identify the genotype of each parent to avoid errors.
   Solution: The male produces two types of gametes at a predicted frequency of $\frac{1}{2}$ each. The female produces four genetically different gametes at frequencies of $\frac{1}{4}$ each.

   Cross 2
   Male:    1 f,   1/2 S  ->  fS   (1)(1/2) = 1/2
            1 f,   1/2 s  ->  fs   (1)(1/2) = 1/2
   Female:  1/2 F, 1/2 S  ->  FS   (1/2)(1/2) = 1/4
            1/2 F, 1/2 s  ->  Fs   (1/2)(1/2) = 1/4
            1/2 f, 1/2 S  ->  fS   (1/2)(1/2) = 1/4
            1/2 f, 1/2 s  ->  fs   (1/2)(1/2) = 1/4

5. Predict the gamete content and frequencies for the parents in Cross 3.
   Solution: Both parents are dihybrids that produce four genetically different gametes at frequencies of $\frac{1}{4}$ each.

   Cross 3
   Male and female (identical):
            1/2 F, 1/2 S  ->  FS   (1/2)(1/2) = 1/4
            1/2 F, 1/2 s  ->  Fs   (1/2)(1/2) = 1/4
            1/2 f, 1/2 S  ->  fS   (1/2)(1/2) = 1/4
            1/2 f, 1/2 s  ->  fs   (1/2)(1/2) = 1/4

Solve
6. Construct a Punnett square for Cross 1 and predict the progeny phenotypes and proportions.
   Solution: The predicted Cross 1 progeny are $\frac{1}{2}$ long, spotted and $\frac{1}{2}$ long, solid.

          FS      Fs
   Fs     FFSs    FFss
   fs     FfSs    Ffss

7. Construct a Punnett square for Cross 2 and predict the progeny phenotypes and proportions.
   Solution: The progeny predicted from Cross 2 are $\frac{3}{8}$ long, spotted; $\frac{1}{8}$ long, solid; $\frac{3}{8}$ short, spotted; and $\frac{1}{8}$ short, solid.

          fS      fs
   FS     FfSS    FfSs
   Fs     FfSs    Ffss
   fS     ffSS    ffSs
   fs     ffSs    ffss

8. Construct a Punnett square for Cross 3 and predict the progeny phenotypes and proportions.
   Solution: The progeny produced by Cross 3 are predicted to be $\frac{9}{16}$ long, spotted; $\frac{3}{16}$ long, solid; $\frac{3}{16}$ short, spotted; and $\frac{1}{16}$ short, solid.

          FS      Fs      fS      fs
   FS     FFSS    FFSs    FfSS    FfSs
   Fs     FFSs    FFss    FfSs    Ffss
   fS     FfSS    FfSs    ffSS    ffSs
   fs     FfSs    Ffss    ffSs    ffss

For more practice, see Problems 6, 12, and 27.
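The forked-line and Punnett-square steps used in this analysis can be sketched as a short program. Several things here are assumptions made for illustration: the parental genotypes (FFSs × Ffss, ffSs × FfSs, and FfSs × FfSs) are inferred from the gamete lists in steps 3 through 5 rather than quoted from the problem statement, genotypes are written as strings of allele pairs, an uppercase allele is taken to be completely dominant, and the function names `gametes`, `phenotype`, and `cross` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Forked-line enumeration: pick one allele per gene, each with probability 1/2.

    `genotype` is a string of allele pairs, for example "FfSs".
    """
    genes = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    freqs = Counter()
    for alleles in product(*genes):
        freqs["".join(alleles)] += Fraction(1, 2) ** len(genes)
    return dict(freqs)


def phenotype(gamete_a, gamete_b, names):
    """Dominant phenotype if either gamete carries the uppercase allele of a gene."""
    parts = []
    for a, b, (dominant, recessive) in zip(gamete_a, gamete_b, names):
        parts.append(dominant if a.isupper() or b.isupper() else recessive)
    return ", ".join(parts)


def cross(parent1, parent2, names):
    """Punnett square: combine every gamete pair and sum frequencies by phenotype."""
    progeny = Counter()
    for (g1, p1), (g2, p2) in product(gametes(parent1).items(),
                                      gametes(parent2).items()):
        progeny[phenotype(g1, g2, names)] += p1 * p2
    return dict(progeny)


TRAITS = [("long", "short"), ("spotted", "solid")]
cross1 = cross("FFSs", "Ffss", TRAITS)
cross2 = cross("ffSs", "FfSs", TRAITS)
cross3 = cross("FfSs", "FfSs", TRAITS)
for label, result in (("Cross 1", cross1), ("Cross 2", cross2), ("Cross 3", cross3)):
    print(label, {name: str(freq) for name, freq in result.items()})
```

Multiplying gamete frequencies within each cell applies the product rule, and summing cells that share a phenotype applies the sum rule.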
[Figure 2.13 diagram: pure-breeding parents RRGGPP and rrggpp (P) form RGP and rgp gametes; fertilization produces trihybrid RrGgPp F1 plants, which are crossed to one another (RrGgPp × RrGgPp). A forked-line diagram branches on seed shape ($\frac{3}{4}$ round, $\frac{1}{4}$ wrinkled), then seed color ($\frac{3}{4}$ yellow, $\frac{1}{4}$ green), then flower color ($\frac{3}{4}$ purple, $\frac{1}{4}$ white) to give the F2 classes below.]

F2 progeny                                                       Frequency among Mendel’s 639 plants
Genotype    Phenotype                   Frequency                Expected    Observed
R–G–P–      round, yellow, purple       (3/4)(3/4)(3/4) = 27/64  269.6       269
R–G–pp      round, yellow, white        (3/4)(3/4)(1/4) = 9/64   89.9        98
R–ggP–      round, green, purple        (3/4)(1/4)(3/4) = 9/64   89.9        86
R–ggpp      round, green, white         (3/4)(1/4)(1/4) = 3/64   29.9        27
rrG–P–      wrinkled, yellow, purple    (1/4)(3/4)(3/4) = 9/64   89.9        88
rrG–pp      wrinkled, yellow, white     (1/4)(3/4)(1/4) = 3/64   29.9        34
rrggP–      wrinkled, green, purple     (1/4)(1/4)(3/4) = 3/64   29.9        30
rrggpp      wrinkled, green, white      (1/4)(1/4)(1/4) = 1/64   10.0        7

Figure 2.13 Trihybrid cross to verify independent assortment. The forked-line method can be used to determine the expected phenotype frequencies produced by a trihybrid cross. Expected and observed results for the F2 generation of Mendel’s trihybrid-cross experiment supported his hypothesis of independent assortment.
Q Thinking about the relationships of the alleles involved, (a) explain why the expected frequency of round, yellow, purple plants is greater than the expected frequency of wrinkled, green, white ones and (b) explain the reason for the difference between the expected frequencies of round, green, purple plants and wrinkled, yellow, white plants.
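The forked-line arithmetic of Figure 2.13 can be reproduced programmatically. The sketch below assumes three independently assorting genes with complete dominance; `TRAITS`, `OBSERVED`, and `expected_classes` are illustrative names, and the expected counts are printed unrounded, so they may differ from the figure in the last displayed digit.

```python
from fractions import Fraction
from itertools import product

# (dominant, recessive) phenotype for each gene, in forked-line order.
TRAITS = [("round", "wrinkled"), ("yellow", "green"), ("purple", "white")]

# Observed F2 counts from Figure 2.13.
OBSERVED = {
    ("round", "yellow", "purple"): 269,
    ("round", "yellow", "white"): 98,
    ("round", "green", "purple"): 86,
    ("round", "green", "white"): 27,
    ("wrinkled", "yellow", "purple"): 88,
    ("wrinkled", "yellow", "white"): 34,
    ("wrinkled", "green", "purple"): 30,
    ("wrinkled", "green", "white"): 7,
}


def expected_classes(traits):
    """Expected F2 frequency of each phenotype class: 3/4 dominant, 1/4 recessive per gene."""
    classes = {}
    for choice in product((0, 1), repeat=len(traits)):
        names = tuple(trait[i] for trait, i in zip(traits, choice))
        freq = Fraction(1)
        for i in choice:
            freq *= Fraction(3, 4) if i == 0 else Fraction(1, 4)
        classes[names] = freq
    return classes


n = len(TRAITS)
gamete_types = 2 ** n                  # number of different gamete genotypes
gamete_freq = Fraction(1, 2) ** n      # frequency of each gamete genotype
expected = expected_classes(TRAITS)
total = sum(OBSERVED.values())

print(f"{gamete_types} gamete genotypes, each at frequency {gamete_freq}")
for names, freq in expected.items():
    print(f"{', '.join(names)}: {freq} -> expected {float(freq * total):.2f}, "
          f"observed {OBSERVED[names]}")
```

Changing the length of `TRAITS` gives the corresponding number of classes for any number of genes, which is the general case of two phenotypes per trait.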
Mendel performed this cross, and his results almost exactly matched expectation. He found that the 207 test-cross progeny were composed of 55 round, yellow; 51 round, green; 49 wrinkled, yellow; and 52 wrinkled, green plants. This result confirmed the dihybrid genotype of the F1 plant and supported the hypothesis that alleles for pea shape assort independently of those for pea color during gamete formation and that gametes unite at random to form offspring.

Testing Independent Assortment by Trihybrid-Cross Analysis

Mendel further tested the hypothesis of independent assortment by examining the results of a trihybrid cross, a cross involving three traits (in this case, seed shape, seed color, and flower color). He began this experiment by crossing a pure-breeding round, yellow, purple-flowered parental plant (RRGGPP) to a pure-breeding wrinkled, green, white-flowered plant (rrggpp) (Figure 2.13). The F1 are presumed to be trihybrid (RrGgPp), and these plants are crossed with one another (or they can be self-fertilized) to produce the F2.

The forked-line diagram in Figure 2.13 shows the number and expected frequency of gamete genotypes generated by the trihybrid F1, and it predicts the phenotype distribution of the F2. In the general case, assuming there are two alleles for each gene, the number of different gamete genotypes is expressed as $2^n$, where $n$ = the number of genes involved. In this example, there are three genes ($n = 3$), and $2^3 = 8$ different combinations of alleles possible for the three traits in gametes from the trihybrid plant. The frequency of each gamete genotype is determined as $\left(\frac{1}{2}\right)^n$, or $\left(\frac{1}{2}\right)^3 = \frac{1}{8}$.

The diagram also predicts the expected frequency of the eight phenotypic classes in the F2. For the general case where there are two phenotypes (dominant and recessive) for each trait, there are $2^n$ phenotypes in the F2. Once again, $n$ = the number of genes. In this example, there are $2^3 = 8$ phenotypes in the F2 progeny. Computation of each expected phenotype frequency is based on the expected frequencies of $\frac{3}{4}$ dominant and $\frac{1}{4}$ recessive for each trait. The expected frequency of each trihybrid class is the product of three fractions representing the predicted probabilities of the dominant or recessive form for each trait. For the eight F2 phenotypes from a trihybrid cross, the expected phenotype ratio is $\frac{27}{64} : \frac{9}{64} : \frac{9}{64} : \frac{3}{64} : \frac{9}{64} : \frac{3}{64} : \frac{3}{64} : \frac{1}{64}$.

Mendel’s experimental results for this test are given in Figure 2.13 for 639 F2 progeny. The results were remarkably close to expectation, and Mendel took this result as validation of his hypothesis of independent assortment.

In conclusion, Mendel made observations about hereditary transmission in pea plants and devised two hypotheses (his two laws of heredity) to explain those observations. He then carried out separate experiments to test and verify his hypotheses, in keeping with the modern scientific method. Three and one-half decades after Mendel published his results, his work was rediscovered. That led quickly to the confirmation of Mendel’s two laws, which are the foundation of our understanding of transmission genetics today.

The Rediscovery of Mendel’s Work

In 1900, after remaining virtually unknown for 34 years, Mendel’s experimental results and interpretations were rediscovered almost simultaneously by three botanists working independently of one another. Carl Correns and Erich von Tschermak were both working on Pisum sativum, the same plant Mendel had used, and Hugo de Vries was working on a different plant species, when they became aware of Mendel’s 1866 paper. Each of the three, on their own, had identified the hereditary principles Mendel described. With support from the contemporaneous discoveries of the behavior of chromosomes during meiotic cell division, followed quickly by confirming evidence from other species of plants and animals, the basic principles of segregation and independent assortment were widely and rapidly disseminated in the first decade of the 20th century.

This chapter started by saying that the approach to genetic analysis it describes is often dubbed Mendelian genetics. After all, Mendel was the first scientist to offer a mechanism to explain the hereditary patterns he observed. However, Mendel was not the first person to make these observations. As Experimental Insight 2.2 shows, if Charles Naudin had thought to quantify the results of his own crosses of pea plants, he could have been the first scientist to succeed at explaining heredity. Just think, this whole discussion might have been known as “Naudinian genetics”!
Naudinian Genetics, Anyone?

Before Mendel, many “plant hybridists” experimented with pea plants and other plants, attempting to discern the mechanisms of plant reproduction and the process of hereditary transmission of traits. Mendel cited the work of several early hybridists in his 1866 paper.

Several of these plant hybridists came close to discovering the hereditary principles that today bear Mendel’s name; none succeeded fully. For example, in 1823, Thomas Andrew Knight determined that gray seed coat is dominant to white and that self-fertilization of certain gray-seeded plants produces both gray and white seed in progeny plants. In 1822, John Goss, working with a pea variety that had blue and white seeds, reported that crossing a pure-breeding white-seeded plant with a pure-breeding blue-seeded plant produced only blue seeds in first-generation plants, and that self-fertilization then produced a second generation with a mixture of white and blue seeds in plants. Carl Friedrich Gaertner came tantalizingly close to explaining segregation in 1827 when he reported results of a cross between pure-breeding gold-kernel maize and pure-breeding red-striped maize. All the F1 had gold kernels, and among the F2, 328 plants had only gold kernels and 103 had red-striped kernels. If Gaertner had been able to correctly interpret his data, he would have identified a 3.18:1 ratio in the F2. Alas, he never did and missed his “golden” opportunity to explain simple heredity.

Similar fates befell other plant hybridists, but arguably the one who came closest to explaining heredity prior to Mendel was Charles Naudin, who in 1863 seemed poised to beat Mendel to the punch by 2 years. In that year, Naudin reported the following:

❚❚ The results of reciprocal crosses are identical. (Similar observations by Mendel were important in his identification of the particulate nature of hereditary factors.)
❚❚ F1 progeny display a single phenotype (as Mendel reported 2 years later).
❚❚ F2 progeny display two phenotypes. (These observations are the result of the segregation of alleles.)
❚❚ The hereditary units for traits are separated in pollen and egg formation. (This concept was fundamental to the segregation observation of Mendel.)
❚❚ Nonparental combinations of phenotypes appear in the F2 generation. (This is identical to Mendel’s independent assortment observation.)

After making these observations, why wasn’t Naudin able to propose a hereditary mechanism to explain them? The answer is that Naudin, like his predecessors and others who would follow, failed to quantify his results. Naudin did not report the number of plants falling into different phenotypic categories, and he was therefore unable to recognize the ratios between phenotypic classes that are the key to interpreting hereditary transmission. Without quantitative data, Naudin was unable to formulate a testable hypothesis.

Alas, poor Naudin! Were it not for his failure to see the necessity of quantifying experimental results, we might well be discussing Naudinian genetics in this chapter instead of Mendelian genetics!
2.4 Probability Theory Predicts Mendelian Ratios

Mendel recognized that chance, or random probability, the same process that determines the outcome of coin flips and rolls of the dice, is the arithmetic principle underlying the operation of the law of segregation and the law of independent assortment. Our discussion of Mendel’s experiments has demonstrated that the basic rules of Mendelian inheritance are based on chance. The Mendelian probabilities we have described are formally expressed by four rules of probability theory: the product rule, the sum rule, conditional probability, and binomial probability. In this section, we look more closely at these rules as they relate to the prediction of the outcomes of genetic crosses.

The Product Rule

If two or more events are independent of one another, their joint probability, the likelihood of their simultaneous or consecutive occurrence, is the product of the probabilities of each one individually. The product rule, also called the multiplication rule, describes these circumstances.

You have already used the product rule several times in determining the outcomes of genetic crosses, and you were probably familiar with it (though perhaps not by name) even before you started this chapter. As an example of your familiarity with this rule, consider two consecutive flips of a coin and ask, “What is the chance that both flips are heads?” The answer is $\frac{1}{4}$, or one in four, which is obtained by multiplying the $\frac{1}{2}$ chance of heads on the first coin flip times the $\frac{1}{2}$ chance of heads on the second coin flip. Figure 2.5 shows how the product rule is used to determine the chance of producing an F2 plant with the recessive phenotype by crossing heterozygous F1 plants that are Gg. The probability of producing the recessive phenotype is $\left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = \frac{1}{4}$. Similarly, in Figure 2.9, the probability of any gamete from a dihybrid organism having a specific one of the four possible genotypes is predicted by applying the product rule in the forked-line diagram. Likewise, in Figure 2.10, the probability that F2 offspring will be homozygous recessive for both traits from a cross of F1 dihybrid plants with the genotype RrGg is predicted by applying the product rule.

The Sum Rule

The sum rule, also called the addition rule, calculates the joint probability of occurrence of any set of two or more outcomes when the possible outcomes for the individual events are mutually exclusive by summing the probabilities of each outcome. This rule is applied when more than one outcome satisfies the conditions of the probability question. Mutually exclusive events in this context are alternative outcomes, only one of which can occur to the exclusion of the other outcomes.

Again, you are probably already familiar with the use of this rule. Think once more about two consecutive flips of a coin, and this time ask, “What is the chance the result will be one head and one tail in either order?” The answer is $\frac{1}{2}$, which is obtained by adding the $\frac{1}{4}$ chance (i.e., $\frac{1}{2} \times \frac{1}{2}$) of getting a head first followed by a tail plus the $\frac{1}{4}$ chance (i.e., $\frac{1}{2} \times \frac{1}{2}$) of getting a tail first followed by a head. You also applied the sum rule to several genetic calculations in the preceding section. For example, in Figure 2.5 the probability that F2 progeny of the cross Gg × Gg will be heterozygous is determined by adding the probabilities of the two ways of obtaining the genotype: $\frac{1}{4} + \frac{1}{4} = \frac{1}{2}$. Similarly, in Figure 2.10, the probability that an F2 progeny of the cross of dihybrid heterozygotes (RrGg) will have the two dominant phenotypes is obtained by applying the sum rule. This probability is $\frac{1}{16} + \frac{2}{16} + \frac{2}{16} + \frac{4}{16} = \frac{9}{16}$.

Conditional Probability

Certain questions of genetic probability can be asked before a cross is made. An example is a question of Mendelian probability such as, “What is the chance two heterozygotes have a child with the heterozygous genotype?” In this case, the product rule and the sum rule are used to predict a $\frac{1}{2}$ probability that the heterozygous genotype will be produced by the cross. This is known in probability terms as a prior probability. Certain other genetic probability questions are asked after a cross has been made, such as questions about the probability that an organism produced by a cross has a particular genotype given that the organism has a particular phenotype. This kind of probability is called conditional probability, and it is applied when specific information about the outcome of the cross modifies, or “conditions,” the probability calculation.

An example of such a conditional probability might ask about the F2 progeny of an F1 cross Gg × Gg, “What is the probability that yellow-seeded progeny plants are heterozygous Gg like the parents?” Yellow seed is present in $\frac{3}{4}$ of the progeny, but this phenotypic class contains two genotypes, GG and Gg, that are not equally frequent: the genotype Gg is found in $\frac{2}{3}$ of the yellow F2 progeny, and the other yellow F2 are GG (see Figure 2.5). Under the conditional criterion that the only progeny phenotype considered is yellow seeds, any nonyellow seeds are eliminated from the analysis. Looking only at the yellow-seeded progeny, we find that they have a $\frac{2}{3}$ probability of being Gg.

Mendel dealt with a version of this conditional probability question, asking “If the yellow-seeded F2 are allowed to self-fertilize, what proportion of them are expected to breed true?” He asked this question as he devised an independent test of his segregation hypothesis (see Table 2.3 and the accompanying discussion). In Mendel’s test of his segregation hypothesis, he predicted that $\frac{1}{3}$ of the F2 with the dominant phenotype would be homozygous and that $\frac{2}{3}$ would be heterozygous. He found that $\frac{1}{3}$ of the dominant F2 bred true and that the other $\frac{2}{3}$ produced progeny of both phenotypes and were heterozygous.

Genetic Analysis 2.3 in Section 2.6 will guide you in using conditional probability to predict the likelihood of a particular outcome of a mating between two prospective parents.
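The product rule, the sum rule, and conditional probability can each be illustrated with exact fractions. The following sketch assumes a fair coin and a Gg × Gg cross with G completely dominant; the variable and function names are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Product rule: two independent coin flips are both heads.
p_two_heads = HALF * HALF

# Sum rule: head-then-tail and tail-then-head are mutually exclusive,
# so their probabilities are added.
p_one_head_one_tail = HALF * HALF + HALF * HALF


def f2_genotypes():
    """Genotype frequencies among progeny of Gg x Gg.

    The product rule gives each gamete pairing; the sum rule pools pairings
    that produce the same genotype.
    """
    gamete_freqs = [("G", HALF), ("g", HALF)]
    freqs = {}
    for (a, pa), (b, pb) in product(gamete_freqs, repeat=2):
        genotype = "".join(sorted(a + b))  # "gG" and "Gg" both become "Gg"
        freqs[genotype] = freqs.get(genotype, 0) + pa * pb
    return freqs


genotypes = f2_genotypes()

# Conditional probability: restrict attention to yellow-seeded (G-) progeny.
p_yellow = genotypes["GG"] + genotypes["Gg"]
p_heterozygous_given_yellow = genotypes["Gg"] / p_yellow

print("both heads:", p_two_heads)
print("one head, one tail:", p_one_head_one_tail)
print("F2 genotypes:", {g: str(f) for g, f in genotypes.items()})
print("P(Gg | yellow):", p_heterozygous_given_yellow)
```

Dividing by the frequency of the yellow class is what removes the nonyellow progeny from the analysis, which is why the heterozygote probability changes from one-half of all progeny to two-thirds of the yellow progeny.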
Binomial Probability

In determining the probabilities of certain kinds of outcomes, just one event need be predicted. The chance of obtaining a head or a tail on a coin flip or the chance of making the genetic cross Gg × Gg and getting gg are examples. In contrast, questions concerning a combination or sequence of such events require a different approach. For example, determining the probability of getting four yellow and two green peas in a six-seeded pod produced by a Gg × Gg cross or the risk of a recessive phenotype occurring in one or more of the children of a couple who are each heterozygous carriers of a recessive disease-producing allele requires computation of all the different outcome patterns possible for the cross in question. To make these determinations, we use binomial probability calculations, expanding the binomial expression to reflect the number of outcome combinations and the probability of each combination.

Construction of a Binomial Expansion Formula. A binomial expression contains two variables, each representing the frequency of one of the two alternative outcomes. We can express the likelihood of one outcome as having a frequency p and the alternative outcome as having a frequency q. Since the events p and q are the only outcomes possible, the sum of the two frequencies is (p + q) = 1. If we are examining the probabilities of the outcomes for a series of two alternative events, such as multiple flips of a coin or the sex of several successive children born to a couple, we can expand the binomial to the power of the number of successive events (n) to calculate the probabilities. The binomial expansion formula is written as $(p + q)^n$.

In some kinds of probability problems, the values of the binomial variables p and q will be equal; that is, p = q = 1/2, as in the probability of producing a head or a tail from a coin flip. In other cases, the two binomial values will not be equal, as in the probability that heterozygous parents will mate and produce a child with a recessive trait (1/4) versus a child with the dominant trait (3/4).

Let’s use combinatorial probability to predict the likelihood of different numbers of heads and tails produced from three consecutive flips of a coin. A combinatorial approach allows us to list all the different orders of heads and tails and to group the like combinations of outcomes into sets, or classes. The following table shows that there are $2^3$, or eight, different orders of heads and tails in three coin flips. This value is determined based on two possible outcomes (which is the integer) for three successive events (which is the exponent). The outcomes can be grouped into four sets according to number of heads and number of tails in each set.

              0 heads    1 head     2 heads    3 heads
              3 tails    2 tails    1 tail     0 tails
Orders:       TTT        TTH        THH        HHH
                         THT        HTH
                         HTT        HHT
Probability:  1/8        3/8        3/8        1/8

We can see that there is only one order in which to get either three heads (HHH) or three tails (TTT). Each of these two outcome classes (HHH or TTT) has a probability of $(1/2)^3$, or 1/8. (Notice that we use the product rule to obtain each probability.) But what about an outcome class of two tails and one head, with three possible orders, or two heads and one tail, with three possible orders? Here we must recognize that each one of the possible orders has a probability of $(1/2)^3 = 1/8$, and we use the sum rule to add together the chances of the similar results. For both of these outcome classes (one head and two tails; two heads and one tail), using the sum rule, the probability is 1/8 + 1/8 + 1/8 = 3/8.

To arrive at this conclusion arithmetically, we use the binomial expansion to the third power $[(p + q)^3]$ to represent the three successive coin flips. The general equation for this binomial expands as follows:

$$(p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3$$

Inserting the coin flip probability values of 1/2 for both p and q, the result is

$$\left(\frac{1}{2} + \frac{1}{2}\right)^3 = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8}$$

Application of Binomial Probability to Progeny Phenotypes. Binomial probability and the binomial expansion can be used whenever a probability question addresses a repeating series of events that have two alternative outcomes. Let’s look at the production of yellow and green peas in pods with six peas each. In this example, the dominant allele G determines yellow color, the recessive allele g determines green color, and the cross producing progeny peas is a self-fertilization of a yellow-seeded heterozygous (Gg) plant. The probability that a seed is yellow is 3/4, since the genotype would be either GG or Gg, and the probability that the seed is green, and therefore has the gg genotype, is 1/4. We will use the variable p to represent the probability of yellow seeds and the variable q to represent the probability of green seeds.

To repeat, there are two possible color outcomes for each pea in our example and six peas per pod (n = 6), for a total of $2^n$ ($2^6$), or 64, different orders of peas in their pods. The combinations of yellow and green peas in each pod fall into seven outcome classes. For example, five yellow and one green seed is one class, another is three yellow and three green, and so on. In most binomial genetic cases, the number of classes is n + 1, as it is in this case.

Our goal in this example is to determine the expected frequency of each outcome class. To do so, we must first ask how many of the 64 different orders of peas occur in each of the seven classes. The answer to this question can be found using the formula P = n!/(x! y!), where n is the number of events, x is the number of occurrences of one of the outcomes, and y is the number of occurrences of the other outcome. The ! symbol indicates the factorial operation. Using this equation for the case of four yellow and two green peas in a six-seeded pod, there are 6!/(4! 2!) = 720/48 = 15 different orders. To avoid having to make this calculation for
every binomial expansion problem, a convenient shortcut called Pascal’s triangle can be used (Figure 2.14).

Figure 2.15 makes use of the values taken from the n = 6 line of Pascal’s triangle (highlighted in Figure 2.14). These coefficients of the binomial expansion for n = 6 give the proportions of each of the seven outcome classes for this example. The coefficients are 1, 6, 15, 20, 15, 6, and 1, and they add up to a total of 64 different combinations. The coefficients are used to multiply the binomial probability of each outcome class. For this case where p = 3/4 and q = 1/4, the expected frequency of obtaining six yellow peas in a pod, for example, is calculated as $1(p^6) = (3/4)^6 = 0.178$; for pods containing three yellow and three green peas, the frequency is $20[(3/4)^3 (1/4)^3] = 0.132$; the proportion of pods containing two yellow and four green peas is $15[(3/4)^2 (1/4)^4] = 0.033$; and so on. The complete set of expected frequencies for different combinations of seed color is shown at the bottom of Figure 2.15. Notice that the sum of category probabilities and the sum of category frequencies are each 1.00. This correspondence verifies that all possible outcomes have been taken into account.

2.5 Chi-Square Analysis Tests the Fit between Observed Values and Expected Outcomes

Sections 2.1 through 2.4 contain numerous examples of how the principles of probability can be used to predict the likelihood of different outcomes of genetic crosses. These genetic calculations make predictions of expected outcomes based on Mendel’s two hereditary laws. But how do experimenters assess the general applicability of the experimental outcomes? Genetic experiments almost never produce the exact outcome expected. How can we decide, for example, that
Mendel’s F2 results in Table 2.1 (none of them an exact 3:1 ratio) are compatible with his segregation hypothesis predicting a 3:1 phenotype ratio? Similarly, are the observed results of Mendel’s experiment shown in Figure 2.13 compatible with the predicted outcome?

Qualitative statements such as “the observed results support the hypothesis because they are close to the expected results” are unacceptable for scientific work. Instead, a quantitative approach, or in this case a statistical approach, is needed to objectively compare the results of an experimental cross with the results predicted by probability. Mendel did not have appropriate statistical tools available to him. But in the early 1900s, the chi-square test was derived as a statistical test for comparing observed experimental results with the results that are expected when chance is generating the outcome.

By convention, observed experimental outcomes that have a probability of less than 5% (< 0.05) are often considered to represent a statistically significant difference between the observed outcome and the expected outcome. Chi-square analysis tests for statistically significant deviation in genetic experimental results. This section describes the chi-square test and its application to the analysis of genetic data, including some of Mendel’s F2 results.

Chi-Square Analysis

The chi-square (χ²) test is the most common statistical method used in genetics for comparing observed experimental outcomes with the results predicted by the hypothesis. Chi-square testing quantifies how closely an experimental observation matches the expected outcome by determining the probability of the observed outcome. The chi-square test has proven flexible and accurate in measuring the fit between observed and expected experimental results across a wide range of experiments.

Determining the chi-square value for the data set from a genetic cross is a two-step process. First, the squared difference between the number observed and number expected in each outcome category is divided by the number expected in the category; and second, the values obtained are summed for all outcome classes. The χ² formula is

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed number of offspring in each outcome class, E is the number expected for each class, and the summation (Σ) is taken over all outcome classes.

Chi-square values are not directly comparable from one experiment to the next. Instead, each experimental chi-square value is interpreted in terms of the results expected for an experiment of that size. The interpretation is done by means of a probability value (P value), which is a quantitative expression of the probability that the results of another experiment of the same size and structure will deviate as much or more from expected results by chance. P values in chi-square analysis are directly related to how closely the observed and expected results match one another. High values for P (values close to 1) are associated with low χ² values. These occur when the observed and expected results are very similar to one another; in other words, when the experimental outcome closely matches the expected results. On the other hand, low P values correspond to high chi-square values. They indicate substantial difference between observed and expected outcomes. The greater the difference between observed and expected results of an experiment, the greater the χ² value and the lower the P value.

The P value for each experiment is dependent on the number of degrees of freedom (df) in the experiment being examined. For each experiment, the df value is most often equal to the number of outcome classes (n) minus 1, or (n - 1). In a statistical sense, this df is equal to the number of independent variables in an experiment. For example, suppose we were conducting a chi-square test of 100 coin flips. There are two outcome classes, heads and tails, each of which we expect to see 50 times. However, once we record the number of events in one class, say 54 heads, the number of events in the second class becomes dependent on that first number. In our coin flip example, if we flip a coin 100 times and there are 54 heads recorded, the other 46 flips must be tails. Here the number of degrees of freedom is one because, while there are two possible outcomes, the value of one is always dependent on the value of the other.

Table 2.4 is a chi-square table. In the body of the table are the chi-square values for different degrees of freedom, which are listed along the left-hand margin of the table. The corresponding P values are listed along the top margin. To determine the P value for the chi-square value from an experiment, the first step is to determine the number of degrees of freedom. The second step is to locate the chi-square value on the line corresponding to the degrees of freedom. The P value for the result of the experiment in question is then found at the top of the column containing the chi-square value.

Interpretation of chi-square results is based on the corresponding P value. By the most common convention, mentioned above, a statistically significant result from chi-square analysis is defined as one for which the P value is less than 0.05. This means that there is less than a 5% chance (< 0.05) of obtaining the experimental observation by chance. Using this criterion, when the results of a genetic experiment produce a P value of less than 0.05, the hypothesis of chance is rejected. In other words, if the P value is less than 0.05, the difference between the observed and expected results is considered statistically significant, and the experimental hypothesis is rejected. Conversely, P values greater than 0.05 indicate a nonsignificant deviation between observed and expected values. These values result in failure to reject the chance hypothesis.

Chi-Square Analysis of Mendel’s Data

Modern statistical methods allow us to do something Mendel could not do: test his experimental data for its compatibility with the predictions of the laws of segregation and independent assortment. Table 2.1 contains data from Mendel for F2 segregation of the seven traits he tested. In the first row of the table, we see that Mendel examined 7324 F2 seeds for round or wrinkled phenotypes. Among these, he counted
5474 round and 1850 wrinkled. Based on the predictions of his segregation hypothesis, Mendel expected that 75% of the F2 would be round and the remaining 25% wrinkled. That means he expected (7324)(0.75) = 5493 round seeds and (7324)(0.25) = 1831 wrinkled seeds. There is 1 degree of freedom in the experiment, and the chi-square is calculated as

$$\chi^2 = \frac{(5474 - 5493)^2}{5493} + \frac{(1850 - 1831)^2}{1831} = 0.066 + 0.197 = 0.263$$

For df = 1, the P value falls between 0.50 and 0.70 (see Table 2.4). This is well above the cutoff value of 0.05 and consequently represents a nonsignificant deviation between the observed outcome and the values expected for an experiment of this size. We fail to reject the hypothesis that chance is responsible for the observed outcome, and we can say, therefore, that Mendel’s F2 data for seed shape are consistent with the predictions of the law of segregation.

Figure 2.11 provides data Mendel collected on seed shape and seed color that we can use to test whether his results were consistent with his predictions of independent assortment. Based on the predicted 9/16 : 3/16 : 3/16 : 1/16, or 9:3:3:1, ratio (and converting the fractions to decimal numbers: 9/16 = 0.5625, 3/16 = 0.1875, and 1/16 = 0.0625), the 556 F2 produced by Mendel would be expected to have the following distribution:

Round, yellow       (556)(0.5625) = 312.75
Round, green        (556)(0.1875) = 104.25
Wrinkled, yellow    (556)(0.1875) = 104.25
Wrinkled, green     (556)(0.0625) =  34.75
Total                               556.00

The chi-square value is calculated as

$$\chi^2 = \frac{(315 - 312.75)^2}{312.75} + \frac{(108 - 104.25)^2}{104.25} + \frac{(101 - 104.25)^2}{104.25} + \frac{(32 - 34.75)^2}{34.75} = 0.016 + 0.135 + 0.101 + 0.218 = 0.470$$

In this case, df = 3, and the P value falls between 0.90 and 0.95. This indicates a nonsignificant deviation, because the P value is above the 0.05 cutoff value. Mendel’s F2 data for seed color and seed shape are therefore also consistent with the predictions of independent assortment. A third example of chi-square analysis, using trihybrid-cross results from one of Mendel’s experiments, is shown in Table 2.5. From statistical analysis of these data we conclude that Mendel’s results are consistent with the predictions of segregation and independent assortment.

2.6 Autosomal Inheritance and Molecular Genetics Parallel the Predictions of Mendel’s Hereditary Principles

Immediately after the rediscovery of Mendel’s rules of hereditary transmission in 1900, biologists began testing Mendel’s findings in species other than pea plants. These studies were undertaken in an effort to verify the principles of heredity and to expand their application. One of the species in which hereditary transmission was studied was our own. This section discusses some of the elements of hereditary transmission in humans.
[Figure: pedigree symbols. Symbols: female, male; do not express trait; express trait; deceased (d. 0000 = date of death); unspecified sex. Lines: generation; parents; parents (closely related by blood); adoption; siblings; identical twins.]

Table 2.5 Chi-Square Analysis of Mendel’s Trihybrid-Cross Data

Phenotype                   Number observed (Mendel)   Number expected
Round, yellow, purple       269                        269.58
Round, yellow, white         98                         89.86
Round, green, purple         86                         89.86
Round, green, white          27                         29.95
Wrinkled, yellow, purple     88                         89.86
Wrinkled, yellow, white      34                         29.95
Wrinkled, green, purple      30                         29.95
Wrinkled, green, white        7                          9.98
Total                       639                        638.99
[Pedigree diagram not reproduced: generation I (individuals 1-2), generation II (1-8), generation III (1-19), and generation IV (1-12).]

Figure 2.17 Autosomal dominant inheritance. Table 2.6 summarizes common observations in families with this pattern of inheritance.

Q Using D for the dominant allele and d for the recessive allele, give the genotypes for III-15, III-16, IV-9, IV-10, IV-11, and IV-12. (Hint: Look at the cross producing III-15.)
Table 2.6 Common Characteristics of the Inheritance of Autosomal Dominant Traits Seen in Pedigrees
(see Figure 2.17)
1. Males and females have the trait in about equal frequency. (There are seven females and eight males with the dominant trait
in Figure 2.17.)
2. Each person with the trait has at least one parent with the trait. (Note this feature in each generation.)
3. Parents of either sex can transmit the trait to a child of either sex. (See generations I and II, for example.)
4. If neither parent has the trait, none of their children will have it. (See progeny of II-3 and II-4, and of II-7 and II-8, for example.)
(e.g., Aa) and in those with a certain homozygous genotype (e.g., AA). There are several common characteristics of autosomal dominant traits that can be evident in pedigrees. Table 2.6 lists some major ones, all of which can be seen in the pedigree in Figure 2.17. For example, the first common feature of autosomal dominant traits is that males and females will show the trait in approximately equal numbers. In Figure 2.17, the 15 individuals having the dominant trait (darkened circles and squares) are 7 females and 8 males.

Autosomal Recessive Inheritance

Figure 2.18 shows a human pedigree displaying the characteristics commonly observed for autosomal recessive inheritance. In this pattern of heredity, the recessive phenotype appears only in those individuals who have the genotype that is homozygous for the recessive allele (e.g., aa). The major common characteristics of autosomal recessive inheritance in pedigrees differ in several ways from those seen for autosomal dominant traits. Table 2.7 lists common characteristics of autosomal recessive traits that can be observed in the Figure 2.18 pedigree.

Prospective and Retrospective Predictions in Human Genetics

In the context of testing his hereditary laws, Mendel made prospective predictions about the outcomes of certain crosses. In other words, when setting up specific crosses between pea plants, he would make a prediction beforehand about the percentages of dominant and recessive phenotypes he expected to see among the cross progeny. That kind of prospective prediction occurs in the field of human genetics. If, for example, a man and a woman know that each is heterozygous for an autosomal recessive disease, they can ask the question, “What is the chance a child of ours will have the recessive condition?” In this case, the genetic cross is Aa × Aa, and there is a 1/4 chance that any offspring will have the homozygous genotype aa.

The study of heredity can also be retrospective. One feature making the study of inheritance in humans different from that in other organisms is that human heredity is often examined after reproduction has taken place, when questions may arise about the genotypes of individuals even though their phenotypes are known. For example, it is usually only after an adverse hereditary outcome has been detected in a family that the inheritance of the unusual trait becomes a subject of attention by medical genetic professionals. Construction of a pedigree may show the family to have a history of the hereditary condition; alternatively, it may show the hereditary condition to have previously been unknown in the family. In either case, an adverse reproductive outcome is the trigger for medical genetic investigation of the family.

Figure 2.19a shows a pedigree in which both parents (I-1 and I-2) have the dominant phenotype. The parental genotypes for this trait are initially unknown. They have had four children: three of the children also have the dominant phenotype (II-1, II-3, and II-4), but one child (II-2)
[Pedigree diagram not reproduced: generation I (individuals 1-2), generation II (1-4), generation III (1-8), generation IV (1-15), and generation V (1-15).]

Figure 2.18 Autosomal recessive inheritance. Table 2.7 summarizes common observations for families with this pattern of inheritance.
Table 2.7 Common Characteristics of the Inheritance of Autosomal Recessive Traits Seen in
Pedigrees (see Figure 2.18)
1. Males and females with the trait are approximately equally frequent. (Four males and four females in the
pedigree in Figure 2.18 have the recessive trait.)
2. Often, a child with the recessive trait has parents who both have the dominant trait and are heterozygous
carriers. (See progeny of III-4 and III-5, for example.)
3. If both parents have the trait (i.e., both are homozygous recessive), all their children will have the trait. (See
progeny of IV-5 and IV-6.)
4. The trait is not seen in every generation. Instead, it is usually seen among siblings. (See generation V.)
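Characteristics 2 and 3 in Table 2.7 follow directly from enumerating the gametes each parent can contribute. The Python sketch below is a minimal illustration under the usual Mendelian assumption that each allele of a parent is transmitted with probability 1/2; the function name `cross` and the two-letter genotype strings are hypothetical conventions introduced only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype probabilities for a one-gene cross.

    Each parent is a two-letter genotype such as "Aa"; each allele of a
    parent is assumed to be transmitted with probability 1/2.
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same genotype
        # (uppercase letters sort before lowercase letters).
        counts["".join(sorted(allele1 + allele2))] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


carriers = cross("Aa", "Aa")        # two heterozygous carriers
both_affected = cross("aa", "aa")   # both parents homozygous recessive

print(carriers)
print(both_affected)
```

In this sketch the carrier cross assigns probability 1/4 to the aa genotype, and the cross between two affected parents yields only aa offspring, matching the pedigree patterns listed in the table.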
GENETIC ANALYSIS 2.3: Solution Strategies and Solution Steps

Evaluate

1. Solution strategy: Identify the topic of this problem and the kind of information the answer should contain.
   Solution step: The problem concerns the transmission of an autosomal recessive trait in a family. The problem requires deducing either complete or partial genotypes of family members based on the transmission pattern. The genotypes of II-3 and II-4 can be evaluated as conditional probabilities.

2. Solution strategy: Identify the critical information given in the problem.
   Solution step: The autosomal recessive condition is present in one sibling each of II-3 and II-4. The phenotypes of the pedigree members in generations I and II are known, allowing genotype deductions to be made and genotype probabilities inferred for II-3 and II-4.

Deduce

3. Solution strategy: Deduce the genotypes of the members of generation I based on the emergence of the recessive condition in generation II.
   Solution step: The recessive condition occurs when an individual has the genotype dd. Since the parental pairs in generation I have each produced a child with the recessive condition, and since none of those four parents has the recessive condition (i.e., they all have the dominant phenotype), the members of each parental pair must have the heterozygous Dd genotype.

4. Solution strategy: State what is known about the genotypes of members of generation II.
   TIP: Use a Punnett square to accurately determine the possible genotypes.
   Solution step: All members of generation II are produced from crosses that are Dd × Dd. II-2 and II-5 have the recessive phenotype and must have the genotype dd. All other family members in generation II are either DD or Dd. Because their genotypes are only partly known, the genotypes of II-1, II-3, II-4, and II-6 can be written as D–.

5. Solution strategy: Assign the probabilities of each possible genotype for II-3 and II-4.
   PITFALL: To avoid errors, first consider which genotype or genotypes are not possible for these two individuals and then assess the likelihood of the remaining possibilities.
   Solution step: Both II-3 and II-4 have the dominant phenotype, so neither can have the dd genotype. The Punnett square shows that for their possible D– genotypes, each of them has a two in three (2/3) chance of having the Dd genotype and a one in three (1/3) chance of having the DD genotype.

          D     d
     D    DD    Dd
     d    Dd    dd

Solve

6. Solution strategy: Assign genotypes to members of generation I and generation II, except II-3 and II-4.
   Solution step (Answer a): The pedigree shown here includes the complete and partial genotypes assigned to members of generations I and II.

     Generation I:   I-1 Dd, I-2 Dd, I-3 Dd, I-4 Dd
     Generation II:  II-1 D–, II-2 dd, II-3 D–, II-4 D–, II-5 dd, II-6 D–

7. Solution strategy: Determine the genotypes of II-3 and II-4 that would have to be present if they were to produce a child with the recessive condition.
   Solution step: To have the recessive phenotype, the child of II-3 and II-4 must have the dd genotype. Each of its parents must be Dd for this to occur. If we look at the Punnett square and consider only the genotypes that meet the condition of producing a dominant phenotype, we see that each parent has a 2/3 chance of being heterozygous. The probability that both are heterozygous is (2/3)(2/3) = 4/9.

8. Solution strategy: Calculate the chance that III-1 would have the recessive condition.
   TIP: Use the product rule to determine the probabilities of mating outcomes.
   Solution step (Answer b): The chance that both II-3 and II-4 are heterozygous is 4/9. The cross of these heterozygotes (Dd × Dd) would have a 1/4 chance of producing a child that is dd. This probability is (4/9)(1/4) = 1/9. In other words, given the available information, there is a one in nine chance that this couple will have a child with the recessive phenotype.
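The conditional-probability reasoning in steps 5, 7, and 8 can be retraced with exact fractions. The following Python sketch is illustrative only: the variable names are hypothetical, and it assumes, as the solution does, that II-3 and II-4 are unrelated individuals whose genotypes are independent of one another.

```python
from fractions import Fraction

# Offspring genotype probabilities from a Dd x Dd cross (the Punnett square).
offspring = {"DD": Fraction(1, 4), "Dd": Fraction(1, 2), "dd": Fraction(1, 4)}

# Condition on the dominant phenotype: dd is excluded, and the remaining
# probabilities are rescaled so that they sum to 1.
dominant = {g: p for g, p in offspring.items() if g != "dd"}
p_dominant = sum(dominant.values())
conditional = {g: p / p_dominant for g, p in dominant.items()}

p_carrier = conditional["Dd"]             # chance that II-3 (or II-4) is Dd
p_both_carriers = p_carrier * p_carrier   # product rule for two independent parents
p_affected_child = p_both_carriers * offspring["dd"]  # both Dd AND child is dd

print(p_carrier, p_both_carriers, p_affected_child)
```

Using `Fraction` rather than floating-point numbers keeps each intermediate value in the same form as the fractions quoted in the solution steps, which makes the correspondence easy to verify by eye.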
For more practice, see Problems 3, 16, and 40.
genetic modes of analysis are two sides of the same coin. The Mendelian patterns of transmission of phenotype variation are traceable through examination of variation in the hereditary molecules DNA and RNA, and in protein.

Mendel did not leave any neatly labeled packets of seeds for later researchers to analyze, so the process of pinpointing the exact traits he examined and the genes and proteins responsible for them has been complicated. The first successful identification of one of Mendel’s genes was in 1990, and since then, three other of his genes have been identified. Discussion in this section and in Table 2.8 identifies these four genes, describes the differences in function of the protein products of the dominant and recessive alleles, and summarizes the processes that lead to the different phenotypes. In each case, the mutations generating the recessive allele significantly reduce or entirely eliminate the normal production or function of the protein product of the dominant allele.

Seed Shape (Round and Wrinkled, Gene Sbe1). In 1990, research published by Madan Bhattacharyya and colleagues described the identification and molecular analysis of a gene responsible for round and wrinkled seed shape. The Sbe1 gene produces the starch-branching enzyme that helps convert a linear form of starch called amylose into a complex branched form of starch called amylopectin. As a consequence of the action of fully functional starch-branching enzyme, round seeds have a much higher percentage of amylopectin and a much lower percentage of amylose than do wrinkled seeds, which do not have functional starch-branching enzyme. Amylose readily loses sugar molecules, leading to high concentrations of free sugar in the developing seeds and, consequently, excessive water uptake that swells them. As seeds mature they naturally dehydrate. The maturing wrinkled seeds lose much more water than do maturing round seeds, resulting in a partial collapse of the wrinkled seed membranes that does not occur in round seeds. See Experimental Insight 11.2 for more details about this mutation.

Stem Length (Tall and Short, Gene Le). In 1997, two research groups, one led by David Martin and the other by Diane Lester, determined that a gene called Le controls the variation in stem length that Mendel saw as tall and short plants. The Le gene produces an enzyme called gibberellin 3β-hydroxylase. This enzyme catalyzes one step of the biochemical pathway synthesizing the plant growth hormone gibberellin. Tall plants produce sufficient gibberellin to grow tall. However, a mutation in the recessive allele results in a very low level of gibberellin production and leads to short stems. See Experimental Insight 10.1 for more details about this mutation.

Seed Color (Yellow and Green, Gene Sgr). Two studies published in 2007, one by Ian Armstead and colleagues and the other by Sylvain Aubry and colleagues, identified a gene known as “stay-green,” or Sgr. The protein produced by Sgr in plants with the dominant yellow seed phenotype is an enzyme that catalyzes a step in the breakdown of chlorophyll, a green-colored compound, as the seed matures. A mutation producing the recessive allele prevents production of the chlorophyll-breakdown enzyme. The absence of chlorophyll breakdown results in the retention of green color in mutant seeds. See Experimental Insight 10.1 for more details about this mutation.

Flower Color (Purple and White, Gene bHLH). In 2010, the gene responsible for the white-flower mutation in Mendel’s pea plants was identified. A research group led by Roger Hellens determined that mutation of the bHLH gene in pea plants produces the recessive mutant white flowers rather than purple flowers, the dominant phenotype. The protein product of bHLH is a transcription factor protein that interacts with other proteins to activate the transcription of certain genes. Some of the genes whose transcription is activated are in the pathway that produces the purple-colored plant pigment called anthocyanin. Purple-flowered plants produce enough of the bHLH gene product to activate transcription of anthocyanin-producing genes. White-flowered plants, however, have a defect of the bHLH gene product and are unable to activate transcription of the anthocyanin-producing genes. See Experimental Insight 10.1 for more details about this mutation.

A common feature of each of the genes controlling Mendel’s traits is that, coincidentally, the more frequent of the two alleles of the pair is dominant to a mutant allele that is recessive. This is a consequence of the loss of function on
Trait: Seed color (yellow seed and green seed)
Gene: The gene was originally named I gene and was later renamed Sgr (called “stay green”). The gene produces an enzyme that helps break down chlorophyll.
Dominant allele: The dominant allele (I) produces an enzyme that catalyzes one step in the chlorophyll breakdown pathway, which turns seeds yellow as they mature.
Recessive allele: The recessive mutant allele (i) contains two base substitutions and a base pair insertion. The resulting mutant polypeptide has no function, leading to a blockage of the chlorophyll breakdown pathway and causing mutant seeds to retain their immature green color.
References: Armstead, I., et al. 2007. Science 315: 73. Aubry, S., et al. 2008. Plant Mol. Biol. 67: 243–256.

Trait: Flower color (purple flower and white flower)
Gene: Originally named gene A and renamed bHLH, the gene produces a protein that activates transcription of target genes.
Dominant allele: The dominant allele (A) produces a protein that activates transcription of genes required to synthesize the purple-colored plant pigment called anthocyanin.
Recessive allele: The recessive mutant allele (a) contains a base substitution that results in production of abnormal mRNA. The mutant mRNA does not produce the transcription-activating protein, thus blocking anthocyanin production and resulting in the development of white flowers.
Reference: Hellens, R. P., et al. 2010. PLoS One 5: 1–8.

Note: For a comprehensive review, see Reid, J. B., and J. J. Ross. 2011. Genetics 189: 3–10.
the part of the mutant alleles. For each of these genes, the presence of one or two copies of the dominant allele results in the dominant phenotype, whereas the mutant phenotype is produced in plants that are homozygous for the mutant allele. We discuss this and other kinds of dominance relationships between alleles in Section 4.1.

In broader terms, the conclusions from molecular studies identifying genes Mendel examined in his crosses are that (1) the inheritance of allelic variants precisely parallels the pattern of transmission of phenotypic variation and (2) phenotypic variation in pea plants results from differences in the structure and function of the proteins produced by the alleles. Molecular genetic analysis has led to (3) identification of the DNA-sequence differences between alleles, determination of the impact of those differences on mRNA, and description of the alteration of protein structures resulting from each mRNA; and also to (4) functional analysis of the protein product of each allele to describe the role it plays in producing the phenotype.
CASE STUDY
OMIM, Gene Mutations, and Human Hereditary Disease
The human genome consists of the DNA making up the 22 pairs of autosomal chromosomes and the one
pair of sex chromosomes (two X chromosomes in females and an X and a Y chromosome in males) that
are located in the nucleus of each cell. The human genome also includes the small amount of DNA
that makes up the single chromosome of mitochondria that inhabit the cytoplasm of cells. We discuss
mitochondria and their genes in Chapter 17. In all, there are a little more than 3 billion DNA bases
in the human genome, and the genome encodes approximately 22,500 genes, although the exact number
remains the subject of active research.
Many human genes are involved in determining elements of the human phenotype, which includes both
the outward appearance of the body and its many biochemical and metabolic processes. As we will
describe in later chapters, no gene really works alone to determine a phenotypic characteristic.
Instead, genes work together in pathways that involve the action of different genes at different
steps of the process to produce a trait or to execute a biological function. Despite this
cooperation among genes, or perhaps because of it, mutations of single genes can disrupt or block a
pathway. Gene mutations that prevent production of the normal protein or produce an abnormal amount
of the normal protein can lead to phenotypic abnormalities that are often identified as hereditary
diseases in humans. How many genes have mutations that are implicated in the production of such
hereditary diseases?
CATALOGING HEREDITARY DISEASES AND DISEASE GENES One way to answer this question is to determine
how many single-gene mutations are described as the source of a hereditary disease or condition.
Online Mendelian Inheritance in Man (OMIM) is a continuously updated, public database containing a
list of human genes and phenotypes associated with gene mutations. The official home page of OMIM
is at http://www.omim.org. A searchable research page is located at
https://www.ncbi.nlm.nih.gov/omim. From this page you can also search numerous other database
websites maintained by the U.S. National Institutes of Health (NIH), the National Library of
Medicine (NLM), or the National Center for Biotechnology Information (NCBI).
In 2016, OMIM celebrated its 50th anniversary. It began in 1966 as the brainchild of Victor
McKusick, a physician who took a great interest in human genetics and in the roles genes play in
human disease. OMIM started out as a comprehensive catalog then called Mendelian Inheritance in Man
(MIM), with 1486 entries, most of them genetic disease phenotypes. McKusick and his staff assembled
this first list, and they maintained and periodically updated the list for many years. Twelve
editions of a thick book containing the complete MIM list were published between 1966 and 1998. In
1987, the published information was first made available on the Internet, and in 1995 the content
was made available to the public on the World Wide Web. Since that time the catalog has been known
as OMIM.
Today, OMIM contains more than 24,000 entries. More than 8000 of these are hereditary diseases and
16,000 are human genes, from all 22 autosomes, the X chromosome, and the Y chromosome. Table 2.9
gives the number of genes on each type of chromosome that are currently found on OMIM. Of the
hereditary diseases listed on OMIM, almost 5900 have a known molecular basis. This means that the
abnormality that causes the disease is known. The genes causing about 3650 of these diseases have
been identified.
If you were interested in searching OMIM for information on a genetic disease, you could go to
either the official home page or the searchable website and enter the name of the disease or
condition. For example, if you enter “cystic fibrosis” in the search bar at either site you will be
given a number of clickable pages. If you select the page “*602421 cystic fibrosis transmembrane
conductance regulator; CFTR” you will be taken to a synopsis of the autosomal recessive condition
cystic fibrosis that is caused by mutations of the CFTR gene. The asterisk (*) preceding the
six-digit number indicates that a gene is known for this condition. Any other genetic condition of
interest can be searched in a similar manner. Often you will see a hash character (#) before a
six-digit number. This indicates that the information is for a phenotypic description. Often a
“cytogenetic location” is given. This indicates the chromosome location of a gene causing or
contributing to a disease. We discuss deciphering these chromosome location designations in
Chapter 10.
Each OMIM entry is accompanied by a six-digit number according to the following scheme:
1- - - - - and 2- - - - - (100,000 and up and 200,000 and up) are autosomal genes or phenotypes
listed before May 15, 1995.
3- - - - - (300,000 and up) are X-linked genes and phenotypes.
4- - - - - (400,000 and up) are Y-linked genes and phenotypes.
5- - - - - (500,000 and up) are mitochondrial genes and phenotypes (see Chapter 17).
6- - - - - (600,000 and up) are autosomal genes and phenotypes listed after May 15, 1995.
THE FREQUENCY OF GENE MUTATIONS How frequently do mutations of OMIM-listed genes cause a hereditary
condition to appear in a newborn infant? This question is a little more difficult to answer for
three reasons. First, most abnormalities present at the birth of a newborn infant are not the result
of gene mutation. Instead, most abnormalities at birth result from an error in fetal development
that can be caused by disease agents, malnutrition, exposure to drugs or chemicals, or a number of
other factors. Second, some of the hereditary diseases listed in OMIM are not present at birth.
Instead, the symptoms of these conditions take several months to several decades to develop.
Finally, hereditary abnormalities resulting from errors in the number or structure of chromosomes
are their own category of birth defects, not listed in OMIM. We discuss chromosome changes and their
consequences in Chapter 10.
To apply some numbers to the question, however, we can take statistics on births and birth defects
from the U.S. Centers for Disease Control and Prevention (CDC). For 2014, the most recent full year
for which the data have been published, the CDC reports that 3,988,076 babies were born in the
United States. Nearly 97% of these babies were born healthy, but about 3%, or 1 in 33, had some kind
of abnormality detected at birth. Between 20% and 30% of these birth defects are caused either by
gene mutations or by abnormalities of chromosome number or structure; the remainder are
developmental abnormalities. The CDC estimates that in 2014 about 1 in 110 to 1 in 150 babies had
defects caused by an inherited gene mutation, and an additional 1 in 150 to 1 in 200 babies were
born with chromosome defects.
As we move on in this book we will pay special attention to a number of topics relating to human
genome sequence variation, including inherited human diseases, testing for human genetic diseases,
and the management and applications of information concerning human genetic variation and inherited
diseases. Much of this discussion takes place in the application chapters distributed throughout the
book. For example, Application Chapter B titled “Human Genetic Testing” discusses genetic tests
performed on newborn infants and genetic testing done later in life to identify the presence of
genetic disease or a mutation that can cause genetic disease. Application Chapter A titled “Human
Hereditary Disease and Genetic Counseling” discusses how genetic information is managed and
presented to families in a medical context. The other application chapters, on cancer genetics,
human evolutionary genetics, and on DNA analysis in forensic genetics applications, describe
additional uses of human genetic information.
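The OMIM numbering scheme described in this case study can be expressed as a small lookup. The following Python sketch is only an illustration of that scheme: the function name `describe_mim_entry` and the category labels are hypothetical, and it relies solely on the leading digit and the optional `*` or `#` prefix as described in the text. It does not query OMIM.

```python
# Hypothetical helper illustrating the six-digit numbering scheme described above.
# The labels are paraphrases of the text, not official OMIM wording.
LEADING_DIGIT = {
    "1": "autosomal (earlier entries)",
    "2": "autosomal (earlier entries)",
    "3": "X-linked",
    "4": "Y-linked",
    "5": "mitochondrial",
    "6": "autosomal (later entries)",
}
PREFIX = {"*": "gene", "#": "phenotypic description"}


def describe_mim_entry(entry):
    """Return (chromosome category, entry type) for a string such as '*602421'."""
    entry = entry.strip()
    symbol = ""
    if entry and entry[0] in PREFIX:
        symbol, entry = entry[0], entry[1:]
    if len(entry) != 6 or not entry.isdigit():
        raise ValueError("expected a six-digit number, optionally preceded by * or #")
    if entry[0] not in LEADING_DIGIT:
        raise ValueError("leading digit is not covered by the scheme in the text")
    return LEADING_DIGIT[entry[0]], PREFIX.get(symbol, "no prefix given")


# The CFTR entry quoted in the case study.
print(describe_mim_entry("*602421"))
```

For the CFTR entry quoted above, the sketch reports an autosomal entry of the later series and, because of the asterisk, a gene entry.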
SUMMARY
Mastering Genetics For activities, animations, and review quizzes, go to the Study Area.
2.1 Gregor Mendel Discovered the Basic Principles of Genetic Transmission
❚❚ A broad education in science and mathematics prepared Mendel to design hybridization experiments
that could reveal the principles of hereditary transmission.
2.2 Monohybrid Crosses Reveal the Segregation of Alleles
❚❚ Mendel’s experimental design had five important features: controlled crosses, use of
pure-breeding parental strains, examination of discrete traits, quantification of results, and the
use of replicate and reciprocal crosses.
❚❚ Crosses between pure-breeding parental plants with different phenotypes produce monohybrid F1
progeny with the dominant phenotype.
❚❚ Monohybrid crosses produce a 3:1 ratio of the dominant to the recessive phenotype among F2
progeny and demonstrate the operation of the law of segregation.
❚❚ The law of segregation states that two alleles of a gene will separate from one another during
gamete formation, each allele has an equal probability of inclusion in a gamete, and gametes unite
at random during reproduction.
❚❚ Mendel used test-cross analysis to demonstrate that F1 plants are monohybrid, and he used the
self-fertilization of F2 plants with the dominant phenotype to demonstrate that the latter have a
2:1 ratio of heterozygotes to homozygotes.
2.3 Dihybrid and Trihybrid Crosses Reveal the Independent Assortment of Alleles
❚❚ The F2 progeny of dihybrid F1 plants display a 9:3:3:1 phenotype ratio that demonstrates the
operation of the law of independent assortment.
❚❚ Mendel used trihybrid-cross analysis to demonstrate that alleles of multiple genes are
transmitted in accordance with the predictions of the law of independent assortment.
2.4 Probability Theory Predicts Mendelian Ratios
❚❚ The product rule of probability is used to determine the likelihood of two or more independent
events occurring simultaneously or consecutively. The joint probability in this case is determined
by multiplying the probabilities of the independent events.
❚❚ The sum rule of probability is applied when two or more outcomes are possible. In this case, the
individual probabilities of the outcomes are added together to determine the joint probability.
❚❚ Conditional probability is the probability of outcomes that are contingent on particular
conditions.
❚❚ Binomial probability theory describes the outcomes of an experiment in terms of the number of
outcome classes and the frequency of each class.
2.5 Chi-Square Analysis Tests the Fit between Observed Values and Expected Outcomes
❚❚ The chi-square test (χ²) is used to compare observed results with the results predicted by a
genetic hypothesis that is based on chance. It shows how closely predictions match results.
❚❚ The significance of a chi-square value is determined by the P (probability) value corresponding
to the number of degrees of freedom in the experiment.
2.6 Autosomal Inheritance and Molecular Genetics Parallel the Predictions of Mendel’s Hereditary
Principles
❚❚ Traits transmitted by autosomal inheritance are equally likely in males and females.
❚❚ Autosomal dominant inheritance produces a vertical pattern of transmission in which each organism
with the dominant trait has at least one parent with the trait.
❚❚ Traits transmitted in an autosomal recessive pattern are usually distributed in a horizontal
pattern in which offspring with the recessive trait frequently descend from parents that are
heterozygous and have the dominant phenotype.
❚❚ Molecular analysis of four of Mendel’s traits illustrates how transmission genetics and molecular
genetics characterize the same hereditary processes at different levels.
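The chi-square comparison summarized in Section 2.5 can be sketched in a few lines of Python. This is a minimal illustration, not the book's worked procedure: the function name `chi_square` is hypothetical, the observed counts are invented for the example, and the critical values are the standard P = 0.05 cutoffs for 1 to 3 degrees of freedom.

```python
# Minimal chi-square goodness-of-fit sketch (function name is hypothetical).
def chi_square(observed, expected_ratio):
    """Return (chi-square value, degrees of freedom) for counts tested against a ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected count in each class if the hypothesis is correct.
    expected = [total * part / ratio_sum for part in expected_ratio]
    value = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return value, len(observed) - 1


# Standard critical values at P = 0.05 for 1, 2, and 3 degrees of freedom.
CRITICAL_P05 = {1: 3.841, 2: 5.991, 3: 7.815}

# Hypothetical F2 counts (dominant, recessive) tested against a 3:1 expectation.
observed_counts = [290, 110]
value, df = chi_square(observed_counts, [3, 1])
reject = value > CRITICAL_P05[df]
print(f"chi-square = {value:.3f}, df = {df}, reject chance hypothesis: {reject}")
```

With these invented counts the expected classes are 300 and 100, so the statistic is well below the df = 1 cutoff and the 3:1 hypothesis is not rejected.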
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study
Guide and Solutions Manual that accompanies this book for help with solving problems.
1. Be familiar with Mendel’s laws of segregation and independent assortment and the ways in which
probability determines the outcomes of genetic crosses involving these two laws.
2. Familiarize yourself with monohybrid and dihybrid crosses and the ratios they generate in
offspring.
3. Review test crosses and the phenotype ratios produced from test crosses.
4. Use the Punnett square and the forked-line method to predict the expected genotypic or phenotypic
proportions from genetic crosses.
5. Work backward from offspring genotypes or phenotypes to predict the genotypes or phenotypes of
parents in a cross.
6. Recognize the use of the product rule and the sum rule in predicting offspring genotype and
phenotype proportions.
7. Recognize the circumstances that dictate the use of conditional probability, and understand the
uses of binomial probability.
8. Be familiar with the use of chi-square analysis to test the fit between the observed results of a
cross and the results that are expected.
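The forked-line method and the product rule mentioned in these tips can be sketched in Python. The example below is an illustration under stated assumptions: complete dominance at every gene, independent assortment, and uppercase letters for dominant alleles. The function names (`gene_phenotypes`, `forked_line`, `binomial`) are hypothetical and do not come from the text.

```python
from fractions import Fraction
from itertools import product
from math import comb


def gene_phenotypes(parent1, parent2):
    """Phenotype probabilities for one gene, e.g. gene_phenotypes('Aa', 'Aa').

    Assumes complete dominance, with the dominant allele written in uppercase.
    """
    probs = {}
    for allele1, allele2 in product(parent1, parent2):  # four equally likely unions
        phenotype = "dominant" if allele1.isupper() or allele2.isupper() else "recessive"
        probs[phenotype] = probs.get(phenotype, Fraction(0)) + Fraction(1, 4)
    return probs


def forked_line(cross):
    """Multiply per-gene probabilities (product rule) across independently assorting genes."""
    per_gene = [gene_phenotypes(p1, p2) for p1, p2 in cross]
    result = {}
    for branch in product(*(g.items() for g in per_gene)):
        label = tuple(name for name, _ in branch)
        prob = Fraction(1)
        for _, p in branch:
            prob *= p
        result[label] = prob
    return result


def binomial(n, k, p):
    """Probability of exactly k outcomes of probability p among n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Dihybrid cross AaBb x AaBb, written gene by gene.
for phenotype, prob in forked_line([("Aa", "Aa"), ("Bb", "Bb")]).items():
    print(phenotype, prob)

# Exactly 2 dominant-phenotype offspring among 3 from a monohybrid cross.
print(binomial(3, 2, Fraction(3, 4)))
```

The dihybrid example reproduces the 9/16, 3/16, 3/16, 1/16 proportions, and exact fractions are used so the results can be compared directly with ratios worked by hand.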
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. Compare and contrast the following terms:
a. dominant and recessive
b. genotype and phenotype
c. homozygous and heterozygous
d. monohybrid cross and test cross
e. dihybrid cross and trihybrid cross
2. For the cross BB × Bb, what is the expected genotype ratio? What is the expected phenotype ratio?
3. For the cross Aabb × aaBb, what is the expected genotype ratio? What is the expected phenotype
ratio?
4. In mice, black coat color is dominant to white coat color. In the pedigree shown here, mice with
a black coat are represented by darkened symbols, and those with white coats are shown as open
symbols. Using allele symbols B and b, determine the genotypes for each mouse.
5. Two parents plan to have three children. What is the probability that the children will be two
girls and one boy?
6. Consider the cross AaBbCC × AABbCc.
a. How many different gamete genotypes can each organism produce?
b. Use a Punnett square to predict the expected ratio of offspring phenotypes.
c. Use the forked-line method to predict the expected ratio of offspring phenotypes.
7. If a chi-square test produces a chi-square value of 7.83 with 4 degrees of freedom,
a. In what interval range does the P value fall?
b. Is the result sufficient to reject the chance hypothesis?
c. Above what chi-square value would you reject the chance hypothesis for an experiment with 7
degrees of freedom?
8. Determine whether the statements below are true or false. If a statement is false, provide the
correct information or revise the statement to make it correct.
a. If a dihybrid cross is performed, the expected genotypic ratio is 9:3:3:1.
b. A student uses the product rule to predict that the probability of flipping a coin twice and
getting a head and then a tail is 1/4.
c. A test cross between a heterozygous parent and a homozygous recessive parent is expected to
produce a 1:1 genotypic and phenotypic ratio.
d. The outcome of a trihybrid cross is predicted by the law of segregation.
e. Reciprocal crosses that produce identical results demonstrate that a strain is pure-breeding.
f. If a woman is heterozygous for albinism, an autosomal recessive condition that results in the
absence of skin pigment, the proportion of her gametes carrying the allele that allows pigment
expression is expected to be 75%.
g. The progeny of a trihybrid cross are expected to have one of 27 different genotypes.
h. If a dihybrid F1 plant is self-fertilized,
(1) 9/16 of the progeny will have the same phenotype as the F1 parent.
(2) 1/16 of the progeny will be true-breeding.
(3) 1/2 of the progeny will be heterozygous at one or both loci.
9. In the datura plant, purple flower color is controlled by a dominant allele P. White flowers are
found in plants homozygous for the recessive allele p. Suppose that a purple-flowered datura plant
with an unknown genotype is self-fertilized and that its progeny are 28 purple-flowered plants and
10 white-flowered plants.
a. Use the results of the self-fertilization to determine the genotype of the original
purple-flowered plant.
b. If one of the purple-flowered progeny plants is selected at random and self-fertilized, what is
the probability it will breed true?
10. The dorsal pigment pattern of frogs can be either “leopard” (white pigment between dark spots)
or “mottled” (pigment between spots appears mottled). The trait is controlled by an autosomal gene.
Males and females are selected from pure-breeding populations, and a pair of reciprocal crosses is
performed. The cross results are shown below.
Cross 1: P: Male leopard × female mottled
         F1: All mottled
         F2: 70 mottled, 22 leopard
Cross 2: P: Male mottled × female leopard
         F1: All mottled
         F2: 50 mottled, 18 leopard
a. Which of the phenotypes is dominant? Explain your answer.
b. Compare and contrast the results of the reciprocal crosses in the context of autosomal gene
inheritance.
c. In the F2 progeny from both crosses, what proportion is expected to be homozygous? What
proportion is expected to be heterozygous?
d. Propose two different genetic crosses that would allow you to determine the genotype of one
mottled frog from the F2 generation.
11. Black skin color is dominant to pink skin color in pigs. Two heterozygous black pigs are
crossed.
a. What is the probability that their offspring will have pink skin?
b. What is the probability that the first and second offspring will have black skin?
c. If these pigs produce a total of three piglets, what is the probability that two will be pink and
one will be black?
12. A male mouse with brown fur color is mated to two different female mice with black fur. Black
female 1 produces a litter of 9 black and 7 brown pups. Black female 2 produces 14 black pups.
a. What is the mode of inheritance of black and brown fur color in mice?
b. Choose symbols for each allele, and identify the genotypes of the brown male and the two black
females.
13. Figure 2.12 shows the results of Mendel’s test-cross analysis of independent assortment. In this
experiment, he first crossed pure-breeding round, yellow plants to pure-breeding wrinkled, green
plants. The round yellow F1 are crossed to pure-breeding wrinkled, green plants. Use chi-square
analysis to show that Mendel’s results do not differ significantly from those expected.
14. An experienced goldfish breeder receives two unusual male goldfish. One is black rather than
gold, and the other has a single tail fin rather than a split tail fin. The breeder crosses the
black male to a female that is gold. All the F1 are gold. She also crosses the single-finned male to
a female with a split tail fin. All the F1 have a split tail fin. She then crosses the black male to
F1 gold females and, separately, crosses the single-finned male to F1 split-finned females. The
results of the crosses are shown below.
Black male × F1 gold female:
    Gold 32
    Black 34
Single-finned male × F1 split-finned female:
    Split fin 41
    Single fin 39
a. What do the results of these crosses suggest about the inheritance of color and tail fin shape in
goldfish?
b. Is black color dominant or recessive? Explain. Is single tail dominant or recessive? Explain.
c. Use chi-square analysis to test your hereditary hypothesis for each trait.
15. The accompanying pedigree shows the transmission of albinism (absence of skin pigment) in a
human family.
[Pedigree figure: generation I, individuals 1–3; generation II, individuals 1–9.]
a. What is the most likely mode of transmission of albinism in this family?
b. Using allelic symbols of your choice, identify the genotypes of the male and his two mates in
generation I.
c. The female I-1 and her mate, male I-2, had four children, one of whom has albinism. What is the
probability that they could have had a total of four children with any other outcome except one
child with albinism and three with normal pigmentation?
d. What is the probability that female I-3 is a heterozygous carrier of the allele for albinism?
e. One child of female I-3 has albinism. What is the probability that any of the other four children
are carriers of the allele for albinism?
16. A geneticist crosses a pure-breeding strain of peas producing yellow, wrinkled seeds with one
that is pure-breeding for green, round seeds.
a. Use a Punnett square to predict the F2 progeny that would be expected if the F1 are allowed to
self-fertilize.
b. What proportion of the F2 progeny are expected to have yellow seeds? Wrinkled seeds? Green seeds?
Round seeds?
c. What is the expected phenotype distribution among the F2 progeny?
17. Suppose an F1 plant from Problem 16 is crossed to the pure-breeding green, round parental
strain. Use a forked-line diagram to predict the phenotypic distribution of the resulting progeny.
18. In pea plants, the appearance of flowers along the main stem is a dominant phenotype called
“axial” and is controlled by an allele T. The recessive phenotype, produced by an allele t, has
flowers only at the end of the stem and is called “terminal.” Pod form displays a dominant
phenotype, “inflated,” controlled by an allele C, and a recessive “constricted” form, produced by
the c allele. A cross is made between a pure-breeding axial, constricted plant and a plant that is
pure-breeding terminal, inflated.
a. The F1 progeny of this cross are allowed to self-fertilize. What is the expected phenotypic
distribution among the F2 progeny?
b. Suppose that all of the F2 progeny with terminal flowers, i.e., plants with terminal flowers and
inflated pods and plants with terminal flowers and constricted pods, are saved and allowed to
self-fertilize to produce a partial F3 generation. What is the expected phenotypic distribution
among these F3 plants?
c. If an F1 plant from the initial cross described above is crossed with a plant that is terminal,
constricted, what is the expected distribution among the resulting progeny?
d. If the plants with terminal flowers produced by the cross in part (c) are saved and allowed to
self-fertilize, what is the expected phenotypic distribution among the progeny?
19. If two six-sided dice are rolled, what is the probability that the total number of spots showing
is
a. 4?
b. 7?
c. greater than 5?
d. an odd number?
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
20. Experimental Insight 2.1 describes data, collected by a genetics class like yours, on the
numbers of kernels of different colors in bicolor corn. To test the hypothesis that the presence of
kernels of different colors in each ear is the result of the segregation of two alleles of a single
gene, the class counted 12,356 kernels and found that 9304 were yellow and 3052 were white. Use
chi-square analysis to evaluate the fit between the segregation hypothesis and the class results.
21. The accompanying pedigree shows the transmission of a phenotypic character. Using B to represent
a dominant allele and b to represent a recessive allele,
[Pedigree figure: generation I, individuals 1–2; generation II, individuals 1–4.]
a. Give the genotype(s) possible for each member of the family, assuming the trait is autosomal
dominant.
b. Give the genotype(s) possible for each member of the family, assuming the trait is autosomal
recessive.
22. The seeds in bush bean pods are each the product of an independent fertilization event. Green
seed color is dominant to white seed color in bush beans. If a heterozygous plant with green seeds
self-fertilizes, what is the probability that 6 seeds in a single pod of the progeny plant will
consist of
a. 3 green and 3 white seeds?
b. all green seeds?
c. at least 1 white seed?
23. List all the different gametes that are possible from the following genotypes.
a. AABbCcDd
b. AabbCcDD
c. AaBbCcDd
d. AabbCCdd
24. Organisms with the genotypes AABbCcDd and AaBbCcDd are crossed. What are the expected
proportions of the following progeny?
a. A–B–C–D–
b. AabbCcDd
c. a phenotype identical to either parent
d. A–B–ccdd
25. Blue moon beans produce beans that are either the dominant color blue or the recessive color
white. The bean pods for this species always contain four seeds each. If two heterozygous plants
that each have the Bb genotype are crossed, what are the predicted frequencies of each of the five
outcome classes for combinations of blue and white seeds in pods?
26. In the fruit fly Drosophila, a rudimentary wing called “vestigial” and dark body color called
“ebony” are inherited as independently assorting genes and are recessive to their dominant
counterparts full wing and gray body color. Dihybrid dominant-phenotype males and females are
crossed, and 3200 progeny are produced. How many progeny flies are expected to be found in each
phenotypic class?
27. In pea plants, plant height, seed shape, and seed color are governed by three independently
assorting genes. The three genes have dominant and recessive alleles, with tall
(T) dominant to short (t), round (R) dominant to wrinkled (r), and yellow (G) dominant to green (g).
a. If a true-breeding tall, wrinkled, yellow plant is crossed to a true-breeding short, round, green
plant, what phenotypic ratios are expected in the F1 and F2?
b. What proportion of the F2 are expected to be tall, wrinkled, yellow? ttRRGg?
c. What proportion of the F2 that produce round, green seeds (regardless of the height of the plant)
are expected to breed true?
28. A variety of pea plant called Blue Persian produces a tall plant with blue seeds. A second
variety of pea plant called Spanish Dwarf produces a short plant with white seeds. The two varieties
are crossed, and the resulting seeds are collected. All of the seeds are white; and when planted,
they produce all tall plants. These tall F1 plants are allowed to self-fertilize. The results for
seed color and plant stature in the F2 generation are as follows:
F2 Plant Phenotype          Number
Blue seed, tall plant           97
White seed, tall plant         270
Blue seed, short plant          33
White seed, short plant        100
TOTAL                          500
a. Which phenotypes are dominant, and which are recessive? Why?
b. What is the expected distribution of phenotypes in the F2 generation?
c. State the hypothesis being tested in this experiment.
d. Examine the data in the table by the chi-square test and determine whether they conform to
expectations of the hypothesis.
29. In tomato plants, the production of red fruit color is under the control of an allele R. Yellow
tomatoes are rr. The dominant phenotype for fruit shape is under the control of an allele T, which
produces two lobes. Multilobed fruit, the recessive phenotype, have the genotype tt. Two different
crosses are made between parental plants of unknown genotype and phenotype. Use the progeny
phenotype ratios to determine the genotypes and phenotypes of each parent.
Cross 1 progeny: 3/8 two-lobed, red
                 3/8 two-lobed, yellow
                 1/8 multilobed, red
                 1/8 multilobed, yellow
Cross 2 progeny: 1/4 two-lobed, red
                 1/4 two-lobed, yellow
                 1/4 multilobed, red
                 1/4 multilobed, yellow
30. A male and a female are each heterozygous for both cystic fibrosis (CF) and phenylketonuria
(PKU). Both conditions are autosomal recessive, and they assort independently.
a. What proportion of the children of this couple will have neither condition?
b. What proportion of the children will have either PKU or CF but not both?
c. What proportion of the children will be carriers of one or both conditions?
31. A woman expressing a dominant phenotype is heterozygous (Dd) for the gene.
a. What is the probability that the dominant allele carried by the woman will be inherited by a
grandchild?
b. What is the probability that two grandchildren of the woman who are first cousins to one another
will each inherit the dominant allele?
c. Draw a pedigree that illustrates the transmission of the dominant trait from the grandmother to
two of her grandchildren who are first cousins.
32. Two parents who are each known to be carriers of an autosomal recessive allele have four
children. None of the children has the recessive condition. What is the probability that one or more
of the children is a carrier of the recessive allele?
33. An organism having the genotype AaBbCcDdEe is self-fertilized. Assuming the five genes assort
independently, determine the following proportions:
a. gametes that are expected to carry only dominant alleles
b. progeny that are expected to have a genotype identical to that of the parent
c. progeny that are expected to have a phenotype identical to that of the parent
d. gametes that are expected to be ABcde
e. progeny that are expected to have the genotype AabbCcDdE–
34. A man and a woman are each heterozygous carriers of an autosomal recessive mutation of a
disorder that is fatal in infancy. They both want to have multiple children, but they are concerned
about the risk of the disorder appearing in one or more of their children. In separate calculations,
determine the probabilities of the couple having five children with 0, 1, 2, 3, 4, and all 5
children being affected by the disorder.
35. For a single die roll, there is a 1/6 chance that any particular number will appear. For a pair
of dice, each specific combination of numbers has a probability of 1/36 of occurring. Most total
values of two dice can occur more than one way. As a test of random probability theory, a student
decides to roll a pair of six-sided dice 300 times and tabulate the results. She tabulates the
number of times each different total value of the two dice occurs. Her results are the following:
Total Value of Two Dice Number of Times Rolled d. If the first child has galactosemia, what is the prob-
ability that the second child will have galactosemia?
2 7 Explain the reasoning for your answer.
3 11
38. Sweet yellow tomatoes with a pear shape bring a high
4 23 price per basket to growers. Pear shape, yellow color, and
5 36 terminal flower position are recessive traits produced by
6 42
alleles f, r, and t, respectively. The dominant phenotypes
for each trait—full shape, red color, and axial flower
7 53 position—are the product of dominant alleles F, R, and T.
8 40 A farmer has two pure-breeding tomato lines. One is full,
9 38 yellow, terminal and the other is pear, red, axial. Design
a breeding experiment that will produce a line of tomato
10 30 that is pure-breeding for pear shape, yellow color, and
11 12 axial flower position.
12 8
39. A cross between a spicy variety of Capsicum annuum pepper
TOTAL 300 and a sweet (nonspicy) variety produces F1 progeny
plants that all have spicy peppers. The F1 are crossed, and
The student tells you that her results fail to prove among the F2 plants are 56 that produce spicy peppers
that random chance is the explanation for the outcome of and 20 that produce sweet peppers. Dr. Ara B. Dopsis, an
this experiment. Is she correct or incorrect? Support your expert on pepper plants, discovers a gene he designates
answer. Pun1 that he believes is responsible for spicy versus sweet
flavor of peppers. Dr. Dopsis proposes that a dominant
36. You have four guinea pigs for a genetic study. One male allele P produces spicy peppers and that a recessive
and one female are from a strain that is pure-breeding for mutant allele p results in sweet peppers.
short brown fur. A second male and female are from a strain a. Are the data on the parental cross and the F1 and F2
that is pure-breeding for long white fur. You are asked to consistent with the proposal made by Dr. Dopsis?
perform two different experiments to test the proposal that Explain why or why not, using P and p to indicate
short fur is dominant to long fur and that brown is dominant probable genotypes of pepper plants.
to white. You may use any of the four original pure-breed- b. Assuming the proposal is correct, what proportion of
ing guinea pigs or any of their offspring in experimental the spicy F2 pepper plants do you expect will be pure-
matings. Design two different experiments (crossing dif- breeding? Explain your answer.
ferent animals and using different combinations of pheno-
types) to test the dominance relationships of alleles for fur 40. Alkaptonuria is an infrequent autosomal recessive condi-
length and color, and make predictions for each cross based tion. It is first noticed in newborns when the urine in their
on the proposed relationships. Anticipate that the litter size diapers turns black upon exposure to air. The condition is
will be 12 for each mating and that female guinea pigs can caused by the defective transport of the amino acid phe-
produce three litters in their lifetime. nylalanine through the intestinal walls during digestion.
About 4 people per 1000 are carriers of alkaptonuria.
37. Galactosemia is an autosomal recessive disorder caused Sara and James had never heard of alkaptonuria and
by the inability to metabolize galactose, a component of were shocked to discover that their first child had the
the lactose found in mammalian milk. Galactosemia can condition. Sara’s sister Mary and her husband Frank are
be partially managed by eliminating dietary intake of lac- planning to have a family and are concerned about the
tose and galactose. Amanda is healthy, as are her parents, possibility of alkaptonuria in one of their children.
but her brother Alonzo has galactosemia. Brice has a simi- The four adults (Sara, James, Mary, and Frank) seek
lar family history. He and his parents are healthy, but his information from a neighbor who is a retired physician.
sister Brianna has galactosemia. Amanda and Brice are After discussing their family histories, the neighbor says,
planning a family and seek genetic counseling. Based on “I never took genetics, but I know from my many years
the information provided, complete the following activi- in practice that Sara and James are both carriers of this
ties and answer the questions. recessive condition. Since their first child had the condi-
a. Draw a pedigree that includes Amanda, Brice, and tion, there is a very low chance that the next child will also
their siblings and parents. Identify the genotype of have it, because the odds of having two children with a
each person, using G and g to represent the dominant recessive condition are very low. Mary and Frank have no
and recessive alleles, respectively. chance of having a child with alkaptonuria because Frank
b. What is the probability that Amanda is a carrier of the has no family history of the condition.” The two couples
allele for galactosemia? What is the probability that each have babies and both babies have alkaptonuria.
Brice is a carrier? Explain your reasoning for each a. What are the genotypes of the four adults?
answer. b. What was incorrect about the information given to
c. What is the probability that the first child of Amanda Sara and James? What is incorrect about the informa-
and Brice will have galactosemia? Show your work. tion given to Mary and Frank?
c. What is the probability that the second child of Mary and Frank will have alkaptonuria?
d. What is the chance that the third child of Sara and James will be free of the condition?
e. The couples are worried that one of their grandchildren will inherit alkaptonuria. How would you assess the risk that one of the offspring of a child with alkaptonuria will inherit the condition?

41. Humans vary in many ways from one another. Among many minor phenotypic differences are the following five independently assorting traits that (sort of) have a dominant and a recessive phenotype: (1) forearm hair (alleles F and f): the presence of hair on the forearm is dominant to the absence of hair on the forearm; (2) earlobe form (alleles E and e): unattached earlobes are dominant to attached earlobes; (3) widow’s peak (alleles W and w): a distinct “V” shape to the hairline at the top of the forehead is dominant to a straight hairline; (4) hitchhiker’s thumb (alleles H and h): the ability to bend the thumb back beyond vertical is dominant and the inability to do so is recessive; and (5) freckling (alleles D and d): the appearance of freckles is dominant to the absence of freckles. In reality, the genetics of these traits are more complicated than single-gene variation, but assume for the purposes of this problem that the patterns in families match those of other single-gene variants.
If a couple with the genotypes Ff Ee Ww Hh Dd and Ff Ee Ww Hh Dd have children, what is the chance the children will inherit the following characteristics?
a. the same phenotype as the parents
b. four dominant traits and one recessive trait
c. all recessive traits
d. the genotype Ff EE Ww hh dd

42. In chickens, the presence of feathers on the legs is due to a dominant allele (F), and the absence of leg feathers is due to a recessive allele (f). The comb on the top of the head can be either pea-shaped, a phenotype that is controlled by a dominant allele (P), or a single comb controlled by a recessive allele (p). The two genes assort independently. Assume that a pure-breeding rooster that has feathered legs and a single comb is crossed with a pure-breeding hen that has no leg feathers and a pea-shaped comb. The F1 are crossed to produce the F2. Among the resulting F2, however, only birds with a single comb and feathered legs are allowed to mate. These chickens mate at random to produce F3 progeny. What are the expected genotypic and phenotypic ratios among the resulting F3 progeny?

43. A pure-breeding fruit fly with the recessive mutation cut wing, caused by the homozygous cc genotype, is crossed to a pure-breeding fly with normal wings, genotype CC. Their F1 progeny all have normal wings. F1 flies are crossed, and the F2 progeny have a 3:1 ratio of normal wing to cut wing. One male F2 fly with normal wings is selected at random and mated to an F2 female with normal wings. Using all possible genotypes of the F2 flies selected for this cross, list all possible crosses between the two flies involved in this mating, and determine the probability of each possible outcome.

44. Situs inversus is a congenital condition in which the major visceral organs are reversed from their normal positions. Investigations into the genetics of this abnormality revealed that individuals with at least one dominant allele (SI) of an autosomal gene are normal but, surprisingly, of individuals that are homozygous for a recessive allele (si), 1/2 are situs inversus and 1/2 are normal.
a. What genotypes and phenotypes are expected in progeny from a cross of two si si individuals?
b. What genotypes and phenotypes are expected in progeny from a cross of two SI si individuals?

45. Domestic dogs evolved from ancestral grey wolves. Wolves have coats of short, straight hair and lack “furnishings,” a growth pattern marked by eyebrows and a mustache found in some domestic dogs. In domestic dogs, coat variation is controlled by allelic variation in three genes. Recessive mutant alleles in the FGF5 gene result in long hair, while dogs carrying the dominant ancestral allele have short hair. Likewise, recessive mutant alleles in the KRT71 gene result in curly hair, whereas dogs with an ancestral dominant allele have straight hair. Dominant mutant alleles in the RSPO2 gene cause the presence of furnishings, while dogs homozygous for the ancestral recessive allele have no furnishings.
A pure-breeding curly- and long-haired poodle with furnishings was crossed to a pure-breeding short- and straight-haired border collie lacking furnishings.
a. What are the genotypes and phenotypes of the puppies?
b. If dogs of the F1 generation are interbred, what proportions of genotypes and phenotypes are expected in the F2?

46. Alleles of the IGF-1 gene in dogs, encoding insulin-like growth factor, largely determine whether a domestic dog will be large or small. Dogs with an ancestral dominant allele are large, whereas dogs homozygous for the mutant recessive allele are small. Chondrodysplasia, a short-legged phenotype (as in dachshunds and basset hounds), is caused by a dominant gain-of-function allele of the FGF4 gene. The MSTN gene encodes myostatin, a regulator of muscle development. Dogs with a dominant ancestral allele of the MSTN gene have normal muscle development, while dogs homozygous for recessive mutants in the MSTN gene are “double muscled” and have trouble running quickly. However, dogs heterozygous for the mutant allele run faster than either of the homozygotes.
You breed a pure-breeding small basset hound of normal musculature with a pure-breeding “bully” whippet, a double-muscled large dog with normal legs.
a. What are the genotypes and phenotypes of the F1 puppies?
b. If the F1 of this cross are interbred, what proportion of the F2 are expected to be fast runners and what proportion normal-speed runners?

47. The accompanying pedigree shows a family in which one child (II-1) has an autosomal recessive condition. On the basis of this fact alone, provide the following information.
[Pedigree: generation I, individuals 1 and 2; generation II, individuals 1 through 4.]
c. What are the probabilities for each of the possible genotypes for II-2, II-3, and II-4?
d. What is the probability that all three of the children in generation II who have the dominant phenotype are Aa?
e. What is the chance that among the three children in generation II who have the dominant phenotype, one of them is AA and two of them are Aa? (Hint: Consider all possible orders of genotypes.)

Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.

48. A pea plant that has the genotype RrGgwwdd is crossed to a plant that has the rrGgWwDd genotype. The R gene controls round versus wrinkled seed, the G gene controls yellow versus green seed, the W gene controls purple versus white flower, and the D gene controls tall versus short plants. Determine the following:
a. What is the phenotype of each plant?
b. What proportion of the progeny are expected to have the genotype RrGGwwDd?
c. What proportion of the progeny are expected to have the genotype rrggwwdd?
d. What proportion of the progeny are expected to be round, yellow, purple, and tall?

49. Go to the OMIM website (http://www.ncbi.nlm.nih.gov/omim) and locate the Search button at the top of the page. Use the Search function to look up, one by one, the following three human hereditary diseases that are relatively common in certain populations: “Tay–Sachs disease” (select OMIM number 272800 from the search results list); “cystic fibrosis” (select OMIM number 602421 from the search results list); and “sickle cell anemia” (select OMIM 603903 from the search results list). For each of these diseases, look through the information and provide the following details:
a. On which chromosome is the gene for the disease located?
b. What gene is mutated in the disease?
c. Briefly describe the disease.
d. In which population(s) does the disease most commonly occur?

50. Select a human hereditary disease or condition you would like to know more about. Using the OMIM website (http://www.ncbi.nlm.nih.gov/omim), search for the disease and prepare a short synopsis of your findings. Include the following information:
a. The gene mutated in the disease and its chromosome location.
b. A description of the disease or condition.
c. Any available information about the population(s) in which the disease is most common.

51. For a number of human hereditary conditions, genetic testing is available to identify heterozygous carriers. Some heterozygous carrier testing programs are community-based, often as part of an organized effort targeting specific populations in which a disease and carriers of a disease are relatively frequent. For example, carrier genetic testing programs for Tay–Sachs disease target Ashkenazi Jewish populations, and sickle cell disease carrier testing programs target African American populations. The testing is usually free or available at minimal cost, the wait time for results is short, and the results are confidential and unavailable to third parties such as insurance companies. Neither the Tay–Sachs nor sickle cell allele produces serious consequences for heterozygous carriers.
a. From a genetic perspective, what is the value of the information obtained by genetic testing of the type described?
b. In a broader sense, what is the value of a community-based effort targeting specific populations for selected diseases?
c. Do you personally think you would participate in the kind of carrier genetic testing described if you were a member of a population targeted for such testing?

52. In humans, the ability to bend the thumb back beyond vertical is called hitchhiker’s thumb and is dominant to the inability to do so (OMIM 274200; see Problem 41). Also, the presence of attached earlobes is recessive to unattached earlobes (OMIM 128900).
a. Check your own phenotype and those of several friends or classmates.
b. Using all available and willing members of your family, or members of another family if yours is not easily accessible, trace the transmission of both traits in a pedigree. Use allelic symbols H and h for the thumb and E and e for earlobes, and identify the genotypes for each family member as completely as possible. Bring the pedigree back to share with your group.
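
Several problems in this chapter (for example, Problems 41 and 48) rely on the product rule for independently assorting genes. The following Python sketch illustrates that calculation for a hypothetical dihybrid cross. The function names and the example cross (Aa Bb x Aa Bb) are illustrative assumptions and are not taken from the text.

```python
from fractions import Fraction
from itertools import product


def single_gene_offspring(parent1, parent2):
    """Return offspring genotype probabilities for one gene, e.g. 'Aa' x 'Aa'.

    Each parent transmits either of its two alleles with probability 1/2.
    Genotypes are normalized so the uppercase (dominant) allele comes first.
    """
    probs = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))  # 'A' sorts before 'a'
        probs[genotype] = probs.get(genotype, Fraction(0)) + Fraction(1, 4)
    return probs


def genotype_probability(parent1, parent2, target):
    """Product rule: multiply per-gene probabilities across independent genes."""
    result = Fraction(1)
    for gene1, gene2, wanted in zip(parent1, parent2, target):
        result *= single_gene_offspring(gene1, gene2).get(wanted, Fraction(0))
    return result


def all_dominant_probability(parent1, parent2):
    """Probability that offspring show the dominant phenotype for every gene."""
    result = Fraction(1)
    for gene1, gene2 in zip(parent1, parent2):
        per_gene = single_gene_offspring(gene1, gene2)
        # A genotype is phenotypically dominant if it carries an uppercase allele.
        result *= sum(p for g, p in per_gene.items() if g[0].isupper())
    return result


# Hypothetical cross: Aa Bb x Aa Bb, with the two genes assorting independently.
p1 = ["Aa", "Bb"]
p2 = ["Aa", "Bb"]
print(genotype_probability(p1, p2, ["Aa", "bb"]))  # 1/2 * 1/4 = 1/8
print(all_dominant_probability(p1, p2))            # 3/4 * 3/4 = 9/16
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with ratios worked out by hand.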

3 Cell Division and Chromosome Heredity

CHAPTER OUTLINE
3.1 Mitosis Divides Somatic Cells
3.2 Meiosis Produces Cells for Sexual Reproduction
3.3 The Chromosome Theory of Heredity Proposes That Genes Are Carried on Chromosomes
3.4 Sex Determination Is Chromosomal and Genetic
3.5 Human Sex-Linked Transmission Follows Distinct Patterns
3.6 Dosage Compensation Equalizes the Expression of Sex-Linked Genes

ESSENTIAL IDEAS
❚❚ The cell cycle consists of interphase, during which cells carry out regular functions and replicate their DNA, and M phase, the cell-division segment of the cycle.
❚❚ Mitosis divides somatic cells and produces two genetically identical daughter cells.
❚❚ Meiosis occurs in germ-line cells and produces four genetically different haploid cells that form gametes for reproduction.
❚❚ The separation of chromosomes and sister chromatids during meiosis is the mechanical basis of Mendel’s law of segregation and law of independent assortment.
❚❚ The chromosome theory of heredity identified chromosomes as the cell structures containing genes.
❚❚ Sex determination is controlled by chromosomal and genetic factors that vary among species.
❚❚ Dosage compensation equalizes the expression of sex-linked genes of males and females of animal species.

Cell division is a complex but precisely controlled process. In this cell in the anaphase segment of cell division, chromosomes are stained green and microtubules are stained blue. The chromosomes are in the process of migrating to opposite poles of the cell.

A number of years ago, at the moment of conception that culminated in your birth, two gametes united to form the single fertilized cell, the zygote, from which you developed. Your chromosomal sex was determined in that instant. Your mother’s egg carried an X chromosome, and your sex was determined by whether your father’s sperm carried an X chromosome, making you female (XX), or a Y chromosome, making you male (XY). Shortly after fertilization, cell division began that over the next few hours increased the tiny zygote to two cells, then four cells, then eight cells, and so on. Over several days, these cell divisions continued while the mass of cells, called a trophoblast, moved down the fallopian tube toward the uterus. About 1 week after fertilization, the cluster of
hundreds of cells, now called a blastocyst, implanted into the uterine wall; and within 2 weeks of conception, genetically controlled processes of cell differentiation and cell specialization began to form the first embryonic organs and structures. These processes eventually determined the structure and function of each cell in your body.

Since then, your body has produced thousands of generations of cells. The mechanism of cell division that produced most of them is called mitosis. It is an ongoing process that with each division creates two identical daughter cells. These two cells are exact genetic replicas of one another and of the parental cell from which they are derived. Mitosis produces somatic cells, the structural cells of the body. It is responsible for the growth and maintenance of your body, its organs, and its various structures; it repairs the damage and injury your body sustains, and it produces new cells to replace those that undergo programmed cell death (apoptosis). While you have been reading this passage, approximately 200 cells in your body have undergone mitotic division.

There are trillions of somatic cells in your body, and nearly all of them contain a nucleus in which the chromosomes are located. Human somatic cells are like those of most other animals in that their nuclei contain two sets of chromosomes: Each chromosome belongs to a homologous pair, and the total number of chromosomes is called the diploid number. Your somatic cell nuclei contain 46 chromosomes each, in 23 homologous pairs, so your diploid number is 46. The diploid number varies among species (each species having its characteristic number of pairs), so the characteristic diploid number for animal species in general is described as 2n. Some plant cell nuclei, such as those of pea plants, also have two sets of chromosomes and are diploid. Commonly, however, plant cells carry more than two chromosome sets. They may be triploid (3n), tetraploid (4n), hexaploid (6n), octoploid (8n), or some other multiple of n. The value n represents the haploid number of chromosomes, and it is one-half the diploid number. Humans have a diploid number of 2n = 46, so the human haploid number is n = 23.

Gametes, produced from germ-line cells, are the germinal, or reproductive, cells: sperm and egg in animals or pollen and egg in plants. Germ-line cells divide by meiosis. Meiotic cell division reduces the number of chromosomes in the nucleus of each daughter cell by one-half to the haploid number. In humans, the number of chromosomes in each egg and sperm nucleus is 23. Each of the 23 human chromosome pairs has one representative in each sperm or egg. The union of the sperm and egg nuclei at fertilization produces the fertilized egg with 46 chromosomes in its nucleus. Thus human reproduction, like that of other sexually reproducing organisms, ensures that exactly one-half of the genetic information in an offspring comes from each parent.

In this chapter, we examine both mitosis and meiosis, and we look closely at the connection between meiotic cell division and Mendel’s laws of heredity. We also explore patterns of sex determination in eukaryotes and look at processes that equalize the expression of genes carried on sex chromosomes, the chromosomes that determine sex. In addition, we study the special patterns of inheritance of genes on the X chromosome, and we see how the discovery of genes on the X chromosome supported the chromosome theory of heredity, the theory that chromosomes are the cell structures that carry genes.

3.1 Mitosis Divides Somatic Cells

Mitosis is among the most fundamental and important processes occurring in cells. It is a genetically controlled process that follows a precise script to enable organisms to grow and develop normally and to maintain the structures and functions of their organs, tissues, and other bodily components. Life depends on the orderly progression and proper regulation of mitosis. If too little cell division takes place or cell division occurs too slowly, an organism may fail to develop at all, or it may have morphologic abnormalities. On the other hand, too much cell division can lead to growth of structures beyond their normal boundaries, likewise producing morphologic abnormality and possible death.

The Cell Cycle

Cell division is regulated by genetic control of the cell cycle, the life cycle cells must pass through to replicate their DNA and divide. Since well-regulated cell division is such an integral part of life, it will not surprise you to learn that
Figure 3.1 The cell cycle. (a) The cell cycle is divided into interphase and M phase, which are each further subdivided. The cycles are not drawn to scale. (b) An overview of cell cycle activities.
Panel (a) labels: Interphase: G1 (Gap 1), S phase (DNA synthesis), G2, and G0. Mitosis (M phase): prophase, prometaphase, metaphase, anaphase, telophase.
Panel (b) labels: G1: Active gene expression and cell activity; preparation for DNA synthesis. G0: Terminal differentiation and arrest of cell division; the cell remains specialized but does not divide, with eventual cell death (apoptosis). S phase: DNA replication and chromosome duplication. G2: Preparation for cell division.
the cell cycles of all eukaryotes are similar and that much of the molecular machinery that controls the cell cycle is evolutionarily conserved in plants and animals. Furthermore, in powerful testament to the single origin of life, plants and animals share a number of cell cycle genes with the Bacteria and Archaea domains of life.

The eukaryotic cell cycle is divided into two principal phases: M phase, a short segment of the cell cycle during which cells divide, and interphase, the longer period between one M phase and the next (Figure 3.1a). Interphase consists of three successive stages, G1, S, and G2. During interphase the cell expresses genetic information, it replicates its chromosomes, and it prepares for entry into M phase. M phase is divided into multiple substages that correspond to the progress of the cell during its division.

When viewed under a light microscope, somatic cells in interphase may appear rather placid, but their outward appearance gives little indication of the complex activity taking place inside. Gene expression occurs continuously throughout the cell cycle, but during the G1 (or Gap 1) phase of interphase, it is particularly high (Figure 3.1b). Cells of different types vary in how many genes they express, in how they function in the body, and in how they interact with other cells. Consequently, the duration of G1 varies among different types of cells in the body. Some types of cells are rapidly dividing and spend only a short time, perhaps as little as a few hours, in G1. Other cells linger in G1 for periods of days, weeks, or more.

As they approach the end of G1, cells follow one of two alternative paths. Most cells enter the S phase, or synthesis phase, during which DNA replication (DNA synthesis) takes place. On the other hand, a small subset of specialized cells transition from G1 into a nondividing state called G0 (“G zero”), a kind of semiperpetual G1-like state in which cells express their genetic information and carry out normal functions but do not progress through the cell cycle (see Figure 3.1b). Several kinds of cells in your body, including certain cells in your eyes and bones, reach a mature state of differentiation, enter G0, and rarely if ever divide again. Most G0 cells maintain their specialized functions until they enter programmed cell death (apoptosis) and die. Cells only rarely leave G0 and resume the cell cycle.

DNA replication takes place during S phase and results in a doubling of the amount of DNA in the nucleus by creating two identical sister chromatids that are joined to form each chromosome. Prior to S phase, each chromosome is composed of a long DNA double helix. During S phase, the DNA strands separate, and each acts as a template to direct the synthesis of a new daughter strand of DNA. This DNA synthesis forms the sister chromatids that are genetically identical to one another. The completion of S phase brings about the transition to the G2, or Gap 2, phase of the cell cycle, during which cells prepare for division. Interphase ends when cells enter M phase, from which two identical daughter cells emerge.

The successive generations of cells produced through mitosis as one cell cycle follows the next are known as cell lines or cell lineages. Each cell line or cell lineage contains identical cells (i.e., clones) that are all descended from a single founder cell. Mitosis ensures that the genetic information in cells is faithfully passed to successive generations of cell lineages.

Substages of M Phase

M phase follows interphase and is divided into five substages (prophase, prometaphase, metaphase, anaphase, and telophase) whose principal features are described in
Figure 3.2 Interphase and the five stages of mitosis. The chromosomes are shown in blue, and the centrosomes, asters, and spindle fibers are shown in green.
Labels: centrosomes (with centriole pairs); chromosomes (duplicated); nucleolus; nuclear envelope; plasma membrane; early mitotic spindle; aster; chromosome, consisting of two sister chromatids; centromere; fragments of nuclear envelope; kinetochore; kinetochore microtubule; nonkinetochore microtubule; astral microtubules.
Interphase (G2): The G2 interphase cell pictured here has passed through G1 and S phases, during which the chromosomes duplicate. Although duplicated, the chromosomes are diffuse and not visible within the nucleus. An intact nuclear envelope encloses the chromosomes and one or more nucleoli. Two centrosomes, each containing a centriole pair, are located in the cytoplasm. Microtubules begin to extend from the centrosomes in radial patterns that form asters.
Prophase: Chromosome condensation begins in and progresses throughout prophase, making the coalescing chromosomes increasingly visible under the light microscope. In the cytoplasm, the paired centrosomes begin to migrate toward opposite poles of the cell, extending their microtubules to form the early mitotic spindle. By the end of prophase, the two sister chromatids that make up each chromosome can be seen. Centromeres can also be seen on late-prophase chromosomes. The nucleolus disappears.
Prometaphase: Nuclear envelope breakdown occurs during prometaphase. Having reached opposite poles of the cell, the centrosomes extend microtubules that attach to kinetochores of chromosome centromeres. Microtubules extending from opposite poles exert pulling forces in both directions. Chromosomes move toward the middle of the cell. Cohesin binds sister chromatids to resist premature separation due to pulling forces. Nonkinetochore and astral microtubules stabilize the cell.
Figure 3.2. These five substages accomplish two important functions of cell division: (1) the equal partitioning of the chromosomal material into the nuclei of the two daughter cells, a process called karyokinesis, and (2) the partitioning of the cytoplasmic contents of the parental cell into the daughter cells, a process known as cytokinesis.

During interphase chromosomes are diffuse and cannot be clearly seen by light microscopy. Chromosome condensation, a process that progressively condenses chromosomes into more compact structures, begins in early prophase. Chromosomes become visible in midprophase, and the process continues until chromosomes reach their maximum level of condensation in metaphase. Nuclear envelope breakdown also occurs in prophase, and chromosome centromeres become visible as do the sister chromatids of each chromosome. The centromere is a specialized DNA sequence on each chromosome, and its location is identified as a constriction where the sister chromatids are joined together. Centromeric DNA sequence binds a specialized protein complex called the kinetochore that facilitates chromosome movement and division later in M phase.

[Figure 3.2, continued. Labels: metaphase plate; spindle; centriole at one spindle pole; daughter chromosomes; cleavage furrow; nucleolus re-forming; nuclear envelope re-forming.]

The meaning and usage of the terms chromosome, chromatid, and sister chromatid sometimes cause confusion, and this is a good time to provide functional definitions. The term chromosome is used throughout the cell cycle to identify each DNA-containing structure that has a centromere. At the end of G1, a chromosome consists of a single DNA duplex (double helix) with associated proteins. After the completion of S phase, a chromosome consists of two replicated DNA duplexes with associated proteins. The two DNA molecules making up this chromosome are identical. Individually, these DNA molecules are identified as chromatids, and together they are identified as the sister chromatids.
Chromosome Movement and Distribution

In addition to visible changes to chromosomes, extranuclear changes (changes occurring outside the nucleus) are also apparent in prophase. In animal cells, although not in most plants, fungi, or algae, two organelles called centrosomes appear that migrate during M phase to form the two opposite poles of the dividing cell. Each centrosome contains a pair of subunits called centrioles. Centrosomes are the source of spindle fiber microtubules that emanate from each centrosome (Figure 3.3). Spindle fiber microtubules are polymers of tubulin protein subunits that elongate by the addition of tubulin subunits and shorten by the removal of tubulin subunits. Microtubules are polar; they have a “minus” (-) end anchored at the centrosome and a “plus” (+) end that grows away from the centrosome. Specialized proteins called motor proteins are associated with microtubules. Motor proteins move chromosomes and other cell structures along microtubules.

Three kinds of spindle fibers emanate from centrosomes in a 360° pattern identified as the aster:
1. Kinetochore microtubules embed in the protein complex called the kinetochore (described shortly) that assembles at the centromere of each chromatid. Kinetochore microtubules are responsible for chromosome movement during cell division.
2. Nonkinetochore microtubules extend toward each other from the two polar centrosomes and overlap to help elongate and stabilize the cell during division.
3. Astral microtubules grow toward the membrane of the cell, where they attach and contribute to cell stability.

The kinetochore is a protein complex that assembles on each chromatid at the centromere. It is composed of an outer plate and an inner plate and is attached to the plus end of a kinetochore microtubule extending from a centrosome. By the end of prometaphase, kinetochore microtubules from each centrosome are attached to the kinetochore of a different chromatid of the sister chromatid pair (see Figure 3.3).

Metaphase chromosomes have condensed more than 10,000-fold in comparison with their form at the beginning of prophase. This makes them easily visible under the microscope and allows them to be easily moved within the cell. Because they are tethered to kinetochore microtubules from opposite centrosomes, the sister chromatids experience
Figure 3.3 Microtubules in dividing cells emanate from centrosomes. Astral microtubules and nonkinetochore microtubules control and stabilize cell shape during division. Kinetochore microtubules attach to chromosome kinetochores to move chromosomes.
Labels: centrosome (containing centrioles); microtubule, attached at the centriole at its (-) end, with its (+) end extending away; kinetochore microtubule; nonkinetochore microtubule; astral microtubule (emanating from centriole); kinetochore (one on each chromatid) with outer plate and inner plate; sister chromatids; fibers containing motor proteins; motor protein; polymerization and depolymerization (tubulin subunits gained and lost).
opposing forces that are critical to the positioning of chromosomes along an imaginary midline at the equator of the cell. This imaginary line is called the metaphase plate.
The tension created by the pull of kinetochore microtubules is balanced by a companion process known as sister chromatid cohesion. Sister chromatid cohesion is produced by the protein cohesin that localizes between the sister chromatids and holds them together to resist the pull of kinetochore microtubules (Figure 3.4). Cohesin is a four-subunit protein; its central component is a polypeptide produced by the gene Scc1 for “sister chromatid cohesion.” Cohesin coats sister chromatids along their entire length but is most concentrated near centromeres, where the pull of microtubules is greatest. As microtubules move chromosomes toward the midline of the cell, cohesin helps keep the sister chromatids together, to ensure proper chromosome positioning and to prevent their premature separation.
Anaphase is the part of M phase during which sister chromatids separate and begin moving to opposite poles in the cell. Anaphase includes two distinct events tied to microtubule action: anaphase A, characterized by the separation of sister chromatids, and anaphase B, characterized by the elongation of the cell into an oblong shape.
Anaphase A begins abruptly with two simultaneous events. First, the enzyme separase initiates cleavage of polypeptides in cohesin, thus breaking down the connection between sister chromatids. Second, kinetochore microtubules begin to depolymerize at their (+) ends to initiate chromosome movement toward the centrioles. The separation of sister chromatids in anaphase A is called chromosome disjunction. As anaphase progresses, sister chromatids complete their disjunction and eventually congregate around the centrosomes at the cell poles.
The next part of anaphase, anaphase B, is characterized by the polymerization of polar microtubules that extends their length and causes the cell to take on an oblong shape. The oblong shape facilitates cytokinesis at the end of telophase, which leads to the formation of two daughter cells.

Completion of Cell Division
In telophase, nuclear membranes begin to reassemble around the chromosomes gathered at each pole, eventually enclosing the chromosomes in nuclear envelopes. Chromosome decondensation begins and ultimately returns chromosomes to their diffuse interphase state. At the same time, microtubules disassemble. As telophase comes to an end, two identical nuclei are observed within a single elongated cell that is about to be divided into two daughter cells by the process of cytokinesis.
In animal cells, a contractile ring composed of actin microfilaments creates a cleavage furrow around the circumference of the cell; the contractile ring pinches the cell in two (Figure 3.5). In plant cells, cytokinesis entails the construction of new cell walls near the cellular midline. In both plant and animal cells, cytokinesis divides the cytoplasmic fluid and organelles.

[Figure 3.4 labels (caption not recovered): panels (a) Prophase, (b) Metaphase, (c) Anaphase; sister chromatids; microtubule; cohesin protein; kinetochore; kinetochore movement; separase.]
[Figure 3.5 labels (caption not recovered): (a) contractile ring and furrow; (b) cell plate.]
[Figure 3.6 axis label: Nanograms (ng) of DNA/nucleus.]
[Figure 3.7 panel text. G1 phase: This cell contains two pairs of homologous chromosomes with the genotype GgRr. Metaphase: Chromosomes align randomly along the metaphase plate with the aid of the mitotic spindle.]

Figure 3.6 presents a profile of the contents of a single nucleus that identifies the amounts of DNA, the number of chromosomes, and the number of chromatids (DNA duplexes) at the end of different stages of the cell cycle. The nucleus depicted is similar to a human nucleus that has approximately 2 nanograms (ng) of DNA in G1, with 46 chromosomes, each composed of one DNA duplex. DNA amount and the number of duplexes double (forming sister chromatids) with the completion of S phase, and the separation of sister chromatids into separate daughter cell nuclei in anaphase reduces the amount of DNA by one-half. At the end of mitotic M phase, the nucleus again contains 2 ng of DNA and 46 chromosomes composed of one duplex each, at which point the cell is ready to enter G1 stage of the following cell cycle. Notice that despite changes in the amount of DNA and chromatid number, the chromosome number remains at 46 throughout the cell cycle.
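The bookkeeping described for Figure 3.6 can be sketched in a few lines of Python. This is a minimal illustration that assumes the round values quoted in the text (2 ng of DNA and 46 chromosomes in G1); the `NuclearProfile` type, the function name, and the stage labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NuclearProfile:
    dna_ng: float      # nanograms of DNA per nucleus
    chromosomes: int   # number of chromosomes
    chromatids: int    # number of DNA duplexes


def mitotic_profiles(g1):
    """Return the nuclear profile at the end of G1, S, and M phase."""
    # S phase doubles the DNA and the duplexes; chromosome number is unchanged.
    after_s = NuclearProfile(g1.dna_ng * 2, g1.chromosomes, g1.chromatids * 2)
    # Anaphase sends one sister chromatid of each chromosome to each daughter
    # nucleus, halving DNA and duplexes while keeping the chromosome number.
    after_m = NuclearProfile(after_s.dna_ng / 2, after_s.chromosomes,
                             after_s.chromatids // 2)
    return {"G1": g1, "S": after_s, "M": after_m}


if __name__ == "__main__":
    for stage, profile in mitotic_profiles(NuclearProfile(2.0, 46, 46)).items():
        print(stage, profile)
```

With these assumed starting values, the profile at the end of M phase equals the G1 profile, which is the sense in which the chromosome number stays constant through the cycle.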
Mitosis separates the members of each pair of sister chromatids into identical nuclei, thus forming two genetically identical daughter cells. Figure 3.7 shows four chromosomes in a cell of an organism that is dihybrid (GgRr) for genes on the chromosomes shown. The figure follows major events of the cell cycle, showing the generation of sister chromatids in S phase, chromosome alignment on the metaphase plate in metaphase, and the production of two identical (GgRr) daughter cells at the end of telophase.

[Figure 3.7 panel text. Telophase: Two daughter cells are produced by mitosis. Each is GgRr following sister chromatid separation to form daughter chromosomes.]
Figure 3.7 An overview of mitosis.

Cell Cycle Checkpoints
Cell biologists find that no matter what the duration of the cell cycle, most cells follow the same basic program; this suggests that common, genetically controlled signals drive the cell cycle. Knowledge of the genes and proteins controlling the cell cycle comes largely from the study of cell lineages possessing mutations that affect their progression through the cell cycle. These studies have produced important insights into genetic control of the cell cycle, and in recent decades, biologists have discovered the identities and functions of many genes responsible for cell cycle control.
What has been learned about genetic control of the cell cycle can be applied to the study of normal cell division as well as to the study of cell division abnormalities such as those displayed in cancer.
As cells move through the cell cycle, their readiness to progress from one stage to the next is regularly assessed. The numerous cell cycle checkpoints, four of which are illustrated in Figure 3.8, are times during the cell cycle when cells are monitored by protein interactions that assess the status of the cell and its readiness to progress to the next stage. Such controls on cell division are essential for normal growth and development. Mutations that alter the normal control of the cell cycle are linked to a number of cell growth abnormalities. For example, loss of cell cycle control is a fundamental mechanism leading to cancer development. Indeed, cancer is often characterized by out-of-control cell proliferation that leads to tumor formation and the overgrowth of cancerous cells that invade and displace normal cells. We explore mutations altering cell cycle control and other gene mutations associated with cancer development and progression in Application Chapter C titled “The Genetics of Cancer.”

[Figure 3.8 labels. Cell cycle stages: first gap (G1), DNA synthesis (S), second gap (G2), mitosis (M).
G1 checkpoint: Pass if cell size is adequate, nutrient availability is sufficient, and growth factors (signals from other cells) are present.
S-phase checkpoint: Pass if DNA replication is complete and has been screened to remove base-pair mismatch or error.
G2 checkpoint: Pass if cell size is adequate and chromosome replication is successfully completed.
Metaphase checkpoint: Pass if all chromosomes are attached to mitotic spindle.]
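One way to read the four checkpoints listed for Figure 3.8 is as a sequence of pass/fail tests on the state of the cell. The sketch below is only an illustration under that reading: the `CellState` fields and the function name are hypothetical, each condition is reduced to a yes/no flag, and real checkpoints are enforced by protein interactions rather than by a lookup like this.

```python
from dataclasses import dataclass


@dataclass
class CellState:
    # Hypothetical yes/no summaries of the conditions named in Figure 3.8.
    size_adequate: bool = False
    nutrients_sufficient: bool = False
    growth_factors_present: bool = False
    replication_complete: bool = False
    mismatches_screened: bool = False
    all_chromosomes_attached: bool = False


# Checkpoints in cell cycle order; each maps to its pass condition.
CHECKPOINTS = {
    "G1": lambda c: (c.size_adequate and c.nutrients_sufficient
                     and c.growth_factors_present),
    "S": lambda c: c.replication_complete and c.mismatches_screened,
    "G2": lambda c: c.size_adequate and c.replication_complete,
    "Metaphase": lambda c: c.all_chromosomes_attached,
}


def first_failed_checkpoint(cell):
    """Return the first checkpoint the cell fails, or None if it passes all."""
    for name, passes in CHECKPOINTS.items():
        if not passes(cell):
            return name
    return None


if __name__ == "__main__":
    cell = CellState(True, True, True, True, True, False)
    print(first_failed_checkpoint(cell))
```

In the usage example, every condition is met except spindle attachment, so the cell is reported as stopping at the metaphase checkpoint.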
Figure 3.8 Major cell cycle checkpoints. Genetic mechanisms monitor cell cycle checkpoints to ensure the cell’s readiness to progress to the next stage.

3.2 Meiosis Produces Cells for Sexual Reproduction
[Figure labels (caption not recovered): 2n; homolog separation; Meiosis I (reduction division); n, n; n, n, n, n.]
different member of a pair of homologous chromosomes, as well as a central element that joins the lateral elements. The function of the synaptonemal complex is to properly align homologous chromosomes before their separation and then to facilitate recombination between homologous chromosomes.
Chromosome condensation continues in pachytene, and sister chromatids of each chromosome can be visually distinguished by light microscopy. At this stage, the paired homologs are called a tetrad in recognition of the four chromatids that are microscopically visible in each homologous pair. Within the central element of the synaptonemal complex, new structures called recombination nodules appear at intervals.
Recombination nodules play a pivotal role in crossing over of genetic material between nonsister chromatids of homologous chromosomes. The number of recombination nodules correlates closely with the average number of crossover events along each homologous chromosome arm. Two important observations have been made about recombination nodules. First, their appearance and location within the synaptonemal complex is coincident with the timing and location of crossing over; and second, recombination nodules seem to be present in organisms that undergo crossing over and absent in those that do not. Cell biologists have concluded that recombination nodules are aggregations of enzymes and proteins that are needed to carry out genetic exchange between the nonsister chromatids of homologous chromosomes during pachytene. Later chapters discuss the genetic consequences of crossing over (Chapter 5) and the molecular processes of crossing over (Chapter 11).
The chromosomes continue to condense in diplotene as the synaptonemal complex begins to dissolve. The dissolution allows homologs to pull apart slightly, revealing contact points between nonsister chromatids. These contact points are called chiasmata (singular: chiasma), and they are located along chromosomes where crossing over has occurred. Chiasmata mark the locations of DNA-strand exchange between nonsister chromatids of homologous chromosomes.
Cohesin protein is present between sister chromatids to resist the pulling forces of kinetochore microtubules (Figure 3.12). In diakinesis, kinetochore microtubules actively move synapsed chromosome pairs toward the metaphase plate, where the homologs will align side by side.
The chiasmata between homologous chromosomes are resolved in late prophase I so that the homologs can be aligned in metaphase I. This process of resolving the contacts between homologs is critical to the completion of recombination between homologous chromosomes.
Homologous chromosomes align on opposite sides of the metaphase plate in metaphase I. Kinetochore microtubules from one centrosome attach to the kinetochores of
78 CHAPTER 3 Cell Division and Chromosome Heredity
both sister chromatids of one chromosome. Meanwhile, kinetochore microtubules from the other centrosome attach to the kinetochores of the sister chromatids of the homolog. Karyokinesis takes place in anaphase I as homologous chromosomes separate from one another and are dragged to opposite poles of the cell (see Figure 3.10). The sister chromatids of each chromosome remain firmly joined by cohesin. Nuclear membrane re-formation takes place in telophase I, when a haploid set of chromosomes is enclosed at each pole of the cell. Cytokinesis follows the completion of telophase I.
Homologous chromosome disjunction (separation) in meiosis I reduces the number of chromosomes at each pole to the haploid number, so that one representative of each homologous pair of chromosomes is present. The first meiotic division is known as the reduction division, to signify the reduction of chromosome number from diploid to haploid.
Sex chromosomes differ from pairs of autosomal chromosomes in that the X chromosome and Y chromosome have very few genes in common. Even so, the X and Y chromosomes of males align as homologs in prophase I. This synapsis is accomplished with the aid of pseudoautosomal regions (PARs) on the two types of sex chromosomes. The term pseudoautosomal means “false autosomal”; a PAR is a segment of homology between otherwise different chromosomes. PARs are like homologous sequences carried on authentic autosomes. The pattern of inheritance of a pseudoautosomal region would be indistinguishable from the pattern of autosomal inheritance, as a consequence of the homology.
Human X and Y chromosomes each contain two pseudoautosomal regions, PAR1 and PAR2, that are located at opposite ends of the chromosomes (Figure 3.13). PAR1 is located on the short arms of the X and Y chromosomes and contains about 2.7 Mb (millions of base pairs) of DNA. PAR2 is located on the long arms of the chromosomes and is shorter than PAR1, at about 300,000 base pairs. Crossing over during chromosome synapsis occurs regularly between PAR1 regions. Studies estimate the rate of recombination to be as
[Figure 3.10 panel text, meiosis I (caption not recovered). Labels: mitotic spindle; astral and nonkinetochore microtubules; homologous chromosomes separate; nuclear envelope re-forms; cleavage furrow.]
Prophase I: Diakinesis. The meiotic spindle is well established, with bundles of kinetochore microtubules tethering homologous chromosomes of tetrads to opposite poles. The nuclear envelope is fully degraded. Tetrads are moved toward the middle of the cell.
Metaphase I. Tetrads are aligned along the metaphase plate, with each chromosome of a homologous pair tethered to kinetochore microtubules emanating from centrosomes at opposite poles of the cell. The kinetochores of sister chromatids are attached to the same centrosome, and sister chromatids are joined by cohesin to prevent their premature separation. Chiasmata linking nonsister chromatids are broken.
Anaphase I. Depolymerization of kinetochore microtubules begins the disjunction of homologous chromosomes, which start moving toward opposite poles. Sister chromatids remain joined by cohesin.
Telophase I and Cytokinesis. Nuclear membranes re-form around the chromosomes clustered at each pole. Each newly formed nucleus contains a haploid set of chromosomes. Chromosomes may partially decondense. Cytokinesis divides the cytoplasmic material of the cell by dividing the nuclear contents between the cells. The cytoplasmic division may be unequal.
much as 20-fold higher than for an equivalently sized region in autosomes.

Meiosis II
The second meiotic division divides each haploid product of meiosis I by separating sister chromatids from one another in a process that is reminiscent of mitosis, except that the number of chromosomes in each cell is one-half the number observed in mitosis. The products of meiosis II mature to form the gametes that contain a haploid set of chromosomes. The four stages of meiosis II (prophase II, metaphase II, anaphase II, and telophase II) are shown and described in Figure 3.10.
Meiosis II bears a general resemblance to mitosis in that kinetochore microtubules from opposite centrosomes attach to the kinetochores of sister chromatids. Also, as in mitosis, in meiosis II the chromosomes align randomly along the metaphase plate. Furthermore, sister chromatid separation is accompanied by cohesin breakdown, the action of motor proteins, and depolymerization of microtubules. Cytokinesis takes place at the end of telophase II. There are, however, only a haploid number of chromosomes present in each cell during meiosis II. Four genetically distinct haploid cells, each carrying one chromosome that represents each homologous pair, are the products of meiosis II.
Figure 3.14a shows the profile of the content of a nucleus that begins G1 with 2 ng of DNA and 46 chromosomes composed of one chromatid each. As we discussed for somatic cell nuclei, the amount of DNA and the number of duplexes double during S phase. These values are maintained until homologous chromosomes are separated in anaphase I. The end of meiosis I leaves the nucleus with one-half the DNA, chromosomes, and chromatids it contained at the end of S phase. Anaphase II brings the separation of sister chromatids and a further reduction by one-half in DNA amount and in the number of chromatids, while the chromosome number remains at 23. The products of meiosis II, containing 1 ng of DNA and 23 chromosomes composed of one chromatid each, are gametes. The union of a sperm and an egg with this nuclear profile produces a fertilized egg with 2 ng of DNA and 46 chromosomes (Figure 3.14b). This is the profile of a cell ready to initiate its first somatic cell cycle.
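The meiotic profile can be traced the same way. The sketch below follows the counts tabulated for Figure 3.14 (chromosome number 46, 46, 23, 23 and chromatid number 46, 92, 46, 23 at the end of G1, S, meiosis I, and meiosis II); the `NuclearProfile` type and the function names are hypothetical, and the starting values are the rounded ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NuclearProfile:
    dna_ng: float      # nanograms of DNA per nucleus
    chromosomes: int   # number of chromosomes
    chromatids: int    # number of chromatids (DNA duplexes)


def meiotic_profiles(g1):
    """Return the nuclear profile at the end of G1, S, meiosis I, and meiosis II."""
    # S phase doubles DNA and chromatids; chromosome number is unchanged.
    s = NuclearProfile(g1.dna_ng * 2, g1.chromosomes, g1.chromatids * 2)
    # Meiosis I separates homologs: DNA, chromosomes, and chromatids all halve.
    m1 = NuclearProfile(s.dna_ng / 2, s.chromosomes // 2, s.chromatids // 2)
    # Meiosis II separates sister chromatids: DNA and chromatids halve again,
    # while the chromosome number stays at the haploid value.
    m2 = NuclearProfile(m1.dna_ng / 2, m1.chromosomes, m1.chromatids // 2)
    return {"G1": g1, "S": s, "Meiosis I": m1, "Meiosis II": m2}


def fertilize(sperm, egg):
    """Combine two gamete nuclei into the profile of the fertilized egg."""
    return NuclearProfile(sperm.dna_ng + egg.dna_ng,
                          sperm.chromosomes + egg.chromosomes,
                          sperm.chromatids + egg.chromatids)


if __name__ == "__main__":
    stages = meiotic_profiles(NuclearProfile(2.0, 46, 46))
    gamete = stages["Meiosis II"]
    print(stages)
    print(fertilize(gamete, gamete))
```

Combining two gamete profiles returns the starting G1 profile, matching the fertilized-egg values given for Figure 3.14b.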
[Figure 3.10 panel text, meiosis II (caption not recovered). Labels: microtubules (from centrosomes); kinetochore microtubule; cleavage furrow.]
Prophase II. The nuclear envelope breaks down, and centrosomes duplicate and begin migrating to opposite poles of the cell. Microtubules emanate from the centrosomes, producing kinetochore, nonkinetochore, and astral microtubules. Chromosome recondensation takes place.
Metaphase II. Sister chromatids are attached to kinetochore microtubules from opposite poles of the cell. The force of microtubule pull and the resistance created by cohesin leads to chromosome alignment along the metaphase plate.
Anaphase II. Sister chromatid separation begins with the breakdown of cohesin by separase and the depolymerization of kinetochore microtubules. As the sister chromatids move toward opposite poles, polymerization of nonkinetochore microtubules elongates the cell.
Telophase II and Cytokinesis. Chromosome migration is completed, and the chromosomes begin to decondense. The nuclear envelope re-forms around chromosomes. Cytokinesis separates the newly formed nuclei and divides the cytoplasmic material, perhaps unevenly.
[Figure 3.11 labels: maternal chromosome (chromatids M1 and M2); chromatids P1 and P2; DNA; synaptonemal complex (assembly, recombination, disassembly); lateral elements; central element; central space; transverse filaments; recombination nodule.]
Figure 3.11 The synaptonemal complex. From electron micrograph analysis, the synaptonemal complex is thought to be a three-layer structure that assembles during prophase. Associated recombination nodules are sites of crossing over between homologous chromosomes.
Q Does the synaptonemal complex form between sister chromatids or chromatids of homologous chromosomes? Draw a chromosome pair consisting of two sister chromatids each and indicate where the synaptonemal complex is found.
[Figure 3.12 labels: kinetochore microtubule; kinetochore movement; chiasma; cohesin protein; kinetochore; spindle fibers to centrioles.]
Figure 3.12 Homolog separation in meiosis I. (a) In diplotene and diakinesis of prophase I, crossing over between homologs is complete, and contacts between homologs (chiasmata) are resolved. (b) Spindle fibers pull chromosomes to align them on the metaphase plate. Cohesin protein adheres sister chromatids against the pull of spindle fibers. (c) Homologous chromosomes separate at anaphase I.
[Figure 3.13 labels: X chromosome and Y chromosome; PAR1; centromere; SRY; PAR2.]
Figure 3.13 The pseudoautosomal regions of the X and Y chromosomes.

[Figure labels, single gene pair G/g (caption not recovered): Interphase, unreplicated chromosomes; chromosome replication in S phase; Metaphase I, homolog synapsis; Meiosis I; Metaphase II; Meiosis II; Gametes: 1/2 G, 1/2 g. Homolog separation is the basis of segregation.]

(a)
End of:                       G1    S     Meiosis I   Meiosis II
Number of:
Chromosomes                   46    46    23          23
Chromatids (or equivalents)   46    92    46          23

(b)                           Sperm + Egg     Fertilized egg
DNA:                          1 ng    1 ng    2 ng
Chromosomes:                  23      23      46
Chromatids (or equivalents):  23      23      46

Figure 3.14 Meiosis. (a) A profile of the nuclear contents of a cell through the phases of meiosis. (b) Gametic contributions to fertilization.

Obviously, when the cell undergoes meiosis, only one or the other of these alternative arrangements will occur; thus, each cell undergoing metaphase I of meiosis will have either “arrangement I” or “arrangement II.” Over a large number of meiotic divisions, arrangement I and arrangement II are equally frequent. Arrangement I in Figure 3.15 has chromosomes carrying dominant alleles on one side of the metaphase plate, and chromosomes carrying recessives on the opposite side. Arrangement II has a dominant-bearing and a recessive-bearing chromosome on each side of the metaphase plate. The first meiotic division segregates G from g and R from r to create the haploid products of meiosis I division.
If we now follow each haploid product of meiosis I through the meiosis II division, we see that the four gametes produced by arrangement I have the genotypes GR and gr in equal frequency. In contrast, the four gametes produced by arrangement II have the genotypes Gr and gR in equal frequency. Taking both possible arrangements of these homologous chromosomes at metaphase I into account, eight gametes are generated with four equally frequent genotypes. The four possible gamete genotypes (GR, Gr, gR, and gr) are produced in a frequency of 25% each as predicted by the law of independent assortment.
Genetic Analysis 3.1 gives you practice identifying the principles of Mendelian transmission in meiotic cell division.

3.3 The Chromosome Theory of Heredity Proposes That Genes Are Carried on Chromosomes
The early 20th century was a time of rapid expansion of genetic knowledge, fueled in large part by the rediscovery of Mendel’s hereditary principles in 1900 and by the
[Figure 3.16 labels: Interphase; S phase; Prophase I; Metaphase I (Arrangement I, Arrangement II); Metaphase II; Gametes: 1/4 GR, 1/4 gr, 1/4 Gr, 1/4 gR.]
Figure 3.16 Meiosis and the law of independent assortment. Assessing the results of meiosis in numerous cells with the GgRr genotype, four genetically different gametes, GR, Gr, gR, and gr are produced at frequencies of 25% each.
Q What event reduces the amount of DNA by one-half during meiosis I? What event reduces DNA amount by an additional one-half in meiosis II?
independent discoveries of Sutton and Boveri that chromosome segregation in meiosis mirrored the hereditary transmission of genes. Many biologists turned their work toward testing the new “gene hypotheses” of segregation and independent assortment in an array of organisms.
Thomas Hunt Morgan, initially skeptical of the gene hypothesis, began working on the tiny fruit fly Drosophila melanogaster shortly after 1900. Morgan intended to rigorously test Mendel’s rules in a natural species, not a domesticated one like Pisum sativum. Unlike Mendel, however, Morgan had no readily available phenotypic variants to examine. So, he and his students set out from their laboratory at Columbia University in New York City to the then-rural landscape of Long Island to attract fruit flies by hanging buckets of rotting fruit on trees. Once captured and transported back to the laboratory, the flies were examined under the microscope to identify phenotypic variants. Flies captured from the wild were easily sexed by their morphology, and they almost invariably had the same phenotype for each trait examined. Morgan’s group referred to these phenotypes as the “wild type.” We use the term wild type today to signify the phenotype that is the most common in a population.
Morgan found Drosophila an easy organism to maintain and reproduce in small glass bottles filled with a semisolid mixture of cornmeal, sugar, and water. The life cycle of Drosophila is between 12 and 14 days depending on growth conditions, so 25 to 30 generations could be raised in a year.
GENETIC ANALYSIS 3.1
PROBLEM A diploid organism has the dihybrid genotype D1D2E1E2 for alleles of gene D and alleles of gene E. Gene D and gene E are on different chromosomes. In the diagrams requested, illustrate only these two pairs of chromosomes and label each copy of each allele on chromosomes and sister chromatids.
a. Diagram any correct mitotic metaphase arrangement for these two pairs of chromosomes and label the alleles.
b. Diagram any correct meiotic metaphase I arrangement for these two pairs of chromosomes and label the alleles.
c. Describe the differences between the diagrams with respect to homolog and chromosome alignment.
d. Compare the outcome of mitosis with the outcome of meiosis in terms of the number of chromosomes and the genotype of the cells produced.
BREAK IT DOWN: This organism is a dihybrid (heterozygous for two genes). A total of four chromosomes (two homologous pairs) must be illustrated (p. 83).
BREAK IT DOWN: There is more than one correct response for this and other parts of this problem. Follow the rules of segregation and independent assortment (p. 83).
BREAK IT DOWN: Figures 3.7 and 3.9 provide overviews of mitosis and meiosis in terms of chromosome division (pp. 75 and 77).

Solution Strategies and Solution Steps
Evaluate
1. Strategy: Identify the topic of this problem and the kind of information the answer should contain.
   Step: This problem concerns comparisons of mitosis and meiosis. Parts (a) and (b) require illustration of chromosome alignments at metaphase in mitosis and in meiosis I. Part (c) requires an explanation of the differences in those alignments, and part (d) requires comparison of the outcomes of mitosis and meiosis.
2. Strategy: Identify the critical information given in the problem.
   Step: The organism is identified as a dihybrid for a pair of autosomal genes on different chromosomes.
Deduce
3. Strategy: DNA duplicates in S phase. Identify the distribution of the different alleles on homologous chromosomes following completion of S phase.
   TIP: Heterozygous organisms carry different alleles on homologous chromosomes, but the alleles on sister chromatids are identical.
   Step: Sister chromatids carry identical alleles as a result of DNA replication in S phase. Thus, for example, the sister chromatids of one chromosome will each carry a copy of D1. In each of the other three chromosomes, the sister chromatids will be identical for one of the other alleles.
4. Strategy: Review the overall patterns of chromosome alignment along the metaphase plate during mitotic and meiotic divisions.
   Step: During mitotic metaphase, chromosomes align in single file and in an arbitrary order along the metaphase plate. In meiotic metaphase I, homologs align opposite one another along the metaphase plate.
Solve
Answer a
5. Strategy: Diagram chromosome alignment during mitotic metaphase.
   Step: Any order of the four chromosomes in single file along the metaphase plate is a correct order. One example is shown.
   [Diagram: chromosomes D1, E2, D2, E1 in single file; both sister chromatids of each chromosome carry the same allele.]
Answer b
6. Strategy: Diagram any correct chromosome alignment during meiotic metaphase I.
   Step: Homologous chromosomes align opposite one another along the metaphase plate in meiotic metaphase I. The two correct arrangements of order of homologous chromosomes are shown.
   [Diagram: one arrangement places D2 and E2 on one side of the metaphase plate and D1 and E1 on the other; the second arrangement places D2 and E1 on one side and D1 and E2 on the other.]
Answer c
7. Strategy: Describe the diagram differences with respect to homologs.
   Step: Homologous chromosomes synapse in meiosis, but not in mitosis. The consequence of synapsis is that homologs align next to one another and on opposite sides of the metaphase plate in metaphase I. The absence of synapsis in mitosis leads chromosomes to align in any order along the metaphase plate in mitotic metaphase.
Answer d
8. Strategy: Describe the different outcomes of mitosis and meiosis.
   Step: Mitosis produces two diploid daughter cells that are genetically identical to one another and to the parental cell they are derived from. Meiosis produces four haploid daughter cells that are genetically different.
For more practice, see Problems 1, 5, and 32. Visit the Study Area to access study tools. Mastering Genetics
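The two metaphase I arrangements in Answer b, and the gametes they lead to, can be enumerated with a short Python sketch. It is an illustration only: the function names are hypothetical, crossing over is ignored, and every orientation of every homologous pair is assumed to be equally likely, which is the premise of independent assortment.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes_from_meiosis(pairs, orientation):
    """Return the four gametes from one meiosis with the given orientation.

    pairs: one (allele, allele) tuple per homologous chromosome pair.
    orientation: one 0/1 value per pair, choosing which homolog faces pole A.
    """
    pole_a = tuple(pair[o] for pair, o in zip(pairs, orientation))
    pole_b = tuple(pair[1 - o] for pair, o in zip(pairs, orientation))
    # Meiosis II separates identical sister chromatids, so each pole of
    # meiosis I gives rise to two gametes with the same genotype.
    return [pole_a, pole_a, pole_b, pole_b]


def gamete_frequencies(pairs):
    """Tally gamete genotypes over every metaphase I orientation."""
    counts = Counter()
    for orientation in product((0, 1), repeat=len(pairs)):
        counts.update(gametes_from_meiosis(pairs, orientation))
    total = sum(counts.values())
    return {"".join(g): Fraction(n, total) for g, n in counts.items()}


if __name__ == "__main__":
    dihybrid = [("D1", "D2"), ("E1", "E2")]
    for genotype, freq in sorted(gamete_frequencies(dihybrid).items()):
        print(genotype, freq)
```

For the D1D2E1E2 dihybrid the tally gives four gamete genotypes at one-quarter each, the same 25% pattern described for GgRr under the law of independent assortment.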
Evaluate
1. Strategy: Identify the topic of this problem and the kind of information the answer should contain.
   Step: The patterns of transmission of two Drosophila traits and the genotypes of organisms are to be determined based on the phenotypes of male and female F1 progeny.
2. Strategy: Identify the critical information given in the problem.
   Step: Pure-breeding parental phenotypes are given along with the phenotypes of male and female progeny in the F1.
Deduce
3. Strategy: Consider the F1 phenotype results in light of the parental phenotypes.
   TIP: Cross results that appear equally in both sexes are consistent with autosomal inheritance. Sex-dependent differences in a cross suggest X-linked inheritance.
   Step: All F1 progeny have full-sized wings and none have vestigial wings, suggesting that full-sized wing is dominant. The F1 males are exclusively yellow-bodied, whereas F1 females are exclusively gray-bodied. The F1 male body color is identical to that of the parental female, whereas the F1 females’ body color is identical to that in the male parent.
4. Strategy: Hypothesize the modes of inheritance of body color and wing form from the F1 data.
   TIP: Test the hypothesized mode of inheritance by comparing the predicted and observed F1 progeny ratios.
   Step: The observation of one body color in F1 males and another in females suggests this is an X-linked trait. Since hemizygous males have yellow body and females have gray body, it is likely that gray body is dominant and yellow body is recessive. The F1 results for wing form are the same for both sexes, suggesting that this trait is autosomal.
Solve
Answer a
5. Strategy: Test the proposed mode of transmission of wing form.
   Step: The F1 of both sexes have full-sized wings, consistent with an autosomal trait. The pure-breeding full-winged parent transmits the dominant allele to all progeny, and the pure-breeding vestigial parent transmits the recessive allele. The F1 are predicted to be heterozygous and display the dominant trait.
6. Strategy: Test the mode of transmission of body color.
   TIP: Compare observed and expected F2 progeny to test the hypothesized mode of inheritance.
   PITFALL: Remember that males are hemizygous for X-linked traits. Describing their genotype as homozygous or heterozygous is incorrect.
   Step: The sex-dependent difference in body color among F1 males and females strongly suggests this trait is X-linked. The F1 males inherit the maternal recessive allele for yellow body color and express the trait because they are hemizygous. F1 females inherit a recessive allele on the maternal X chromosome and a dominant allele on the paternal X and are heterozygous, thus displaying the dominant phenotype.
Answer b
7. Strategy: Determine genotypes for parental and F1 flies. Use X^y for yellow body, X^y+ for gray body, v+ for full wing, and v for vestigial wing.
   Step: The genotypes of pure-breeding parents are X^y/X^y; v+/v+ for yellow-bodied, full-winged females and X^y+/Y; v/v for gray-bodied, vestigial-winged males. The F1 females are X^y/X^y+; v+/v, and the F1 males are X^y/Y; v+/v.
For more practice, see Problems 12, 15, and 25. Visit the Study Area to access study tools. Mastering Genetics
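The F1 prediction in this solution can be checked by enumerating the gametes of each parent. The sketch below is a minimal illustration: the dictionary layout and the `cross` function are hypothetical, the allele labels follow Answer b (y is the recessive yellow allele, y+ the dominant gray allele, v+ the dominant full-wing allele, v the recessive vestigial allele), and the parental genotypes are the ones stated there.

```python
from collections import Counter
from itertools import product


def cross(mother, father):
    """Tally offspring classes as (sex, body color, wing form).

    Each parent is a dict with "sex" (two sex-chromosome alleles, with "Y"
    standing for the Y chromosome) and "wing" (two autosomal alleles).
    """
    offspring = Counter()
    for mx, px, mw, pw in product(mother["sex"], father["sex"],
                                  mother["wing"], father["wing"]):
        sex = "male" if px == "Y" else "female"
        # Only X chromosomes carry a body color allele; males are hemizygous.
        x_alleles = [a for a in (mx, px) if a != "Y"]
        body = "gray" if "y+" in x_alleles else "yellow"
        wing = "full" if "v+" in (mw, pw) else "vestigial"
        offspring[(sex, body, wing)] += 1
    return offspring


if __name__ == "__main__":
    yellow_full_female = {"sex": ("y", "y"), "wing": ("v+", "v+")}
    gray_vestigial_male = {"sex": ("y+", "Y"), "wing": ("v", "v")}
    for phenotype, count in cross(yellow_full_female, gray_vestigial_male).items():
        print(phenotype, count)
```

The tally contains only gray-bodied, full-winged females and yellow-bodied, full-winged males in equal numbers, matching the F1 described in the solution steps.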
3.4 Sex Determination Is Chromosomal and Genetic 89
are all male, whereas flies that are XX or XXY are female. In Drosophila, flies that are XXX are very rarely observed, and those that are YO are never seen.
A ratio of one X chromosome to the number of haploid sets of autosomes (that is, 1X:2A, as in XY) is seen in males. Flies in which the ratio is 2X:2A (as in XX) are females. Bridges called this the X/A ratio, or the X/autosome ratio. At the molecular level, we now know that Drosophila sex is determined by regulatory proteins that relay the number of X chromosomes present in nuclei of cells in Drosophila embryos. These proteins control expression of the sex-lethal (Sxl) gene in XX flies. As we discuss in the Case Study at the end of Chapter 8, Sxl protein controls the expression of additional genes that drive sex development.

[Figure labels (caption not recovered): undifferentiated gonad; Wolffian duct; Müllerian duct; SRY absent: ovaries; SRY present: SRY (expressed in males, not females); anti-Müllerian factor; Müllerian duct degeneration; cholesterol; CYP21 mutation (leads to congenital adrenal hyperplasia); testosterone; androgen-sensitive cells; internal male structures.]
sex chromosomes that are the same. To avoid confusion with the XX/XY system, a different lettering system called the Z/W system is used in these cases. In the Z/W system, the letters Z and W are used to highlight the different sex-chromosome compositions associated with each sex. Males are identified as having two Z sex chromosomes, or a sex chromosome composition of ZZ. In contrast, females have two different sex chromosomes and are identified as ZW.
The sex-chromosome differences in the Z/W system cause reciprocal crosses involving Z-linked genes to produce different results, just as there are reciprocal cross differences for X-linked genes. Figure 3.22 shows reciprocal crosses between pure-breeding hens (female) and roosters (male) involving a Z-linked dominant allele for barred feathers (Z^B) and its recessive counterpart for nonbarred feathers (Z^b). The F1 results of the reciprocal crosses reveal differences consistent with sex-linked inheritance. Cross A produces barred hens (Z^B W) and barred roosters (Z^B Z^b) in the F1, whereas Cross B produces nonbarred hens (Z^b W) and barred roosters (Z^B Z^b). The F2 results of these crosses also yield differences consistent with sex-linked inheritance. We can conclude that the mechanism of transmission of Z-linked genes in the Z/W system is analogous to that of X-linked genes in the XX/XY system except that the patterns are the reverse of those in placental mammals.
Sex chromosome content is even more unusual in
monotremes, like the platypus, an egg-laying mammal that
is native to Australia. Male platypus sex chromosomes are
(a) Cross A represented as X1Y1X2Y2X3Y3X4Y4X5Y5 and female platy-
pus sex chromosomes as X1X1X2X2X3X3X4X4X5X5. Mul-
tiple sets of sex chromosomes have also been documented
in some plant species, termites, and spiders. In dioecious
P ×
plants (those with male plants and female plants), sex chro-
mosomes are often not obvious at all, and they are therefore
difficult to study. And, in certain reptiles and fishes, sex is
Z bW Z BZ B
dependent on environmental variables such as temperature.
In other words, the sex of an individual can change during
its lifetime, even though its chromosomes do not.
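The reciprocal crosses for the Z-linked barred-feather gene can be worked through with a short Python sketch. It assumes only what the text states: hens are ZW, roosters are ZZ, and the barred allele (written here as "B") is dominant to the nonbarred allele ("b"). The function name `zw_cross` and the tuple encoding of genotypes are hypothetical choices for the example.

```python
from collections import Counter
from itertools import product

BARRED, NONBARRED = "B", "b"  # Z-linked alleles; B is dominant


def zw_cross(hen_z: str, rooster_zz: tuple) -> Counter:
    """Tally F1 (sex, phenotype) classes from a ZW hen and a ZZ rooster."""
    offspring = Counter()
    # The hen contributes either her single Z or her W; the rooster contributes one Z.
    for hen_gamete, rooster_z in product([hen_z, "W"], rooster_zz):
        if hen_gamete == "W":
            sex, alleles = "hen", (rooster_z,)            # ZW daughter is hemizygous
        else:
            sex, alleles = "rooster", (hen_gamete, rooster_z)
        phenotype = "barred" if BARRED in alleles else "nonbarred"
        offspring[(sex, phenotype)] += 1
    return offspring


cross_a = zw_cross(NONBARRED, (BARRED, BARRED))     # nonbarred hen x barred rooster
cross_b = zw_cross(BARRED, (NONBARRED, NONBARRED))  # barred hen x nonbarred rooster
print("Cross A:", dict(cross_a))
print("Cross B:", dict(cross_b))
```

In this sketch Cross A gives only barred F1 birds of both sexes, whereas Cross B gives nonbarred hens and barred roosters, which is the reciprocal-cross difference described for Figure 3.22.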
gene transmission refer specifically to the expression of traits in females. For X-linked alleles, females can be homozygous or heterozygous, but males are hemizygous and express the allele on their X chromosome regardless of the hereditary pattern in females. Second, the probability of transmission of X-linked alleles to offspring is not the same for the two sexes as it is for autosomal alleles. Female X-linked transmission is identical to autosomal transmission, but hemizygous males always transmit their X chromosome to female offspring and their Y chromosome to male offspring. Lastly, whereas females receive one copy of X-linked alleles from each parent, males receive their X-linked alleles from their mother and their Y chromosome from their father. This means that Y-linked inheritance, the inheritance of genes on the Y chromosome, is an exclusively patrilineal (father to son) pattern of hereditary transmission. From an evolutionary perspective, this pattern suggests that only those genes that play a role in male fertility, male-specific metabolism, or other male-specific features are inherited on the Y chromosome.

Expression of X-Linked Recessive Traits
X-linked recessive traits are expressed in hemizygous males who carry the recessive allele and in females who are homozygous for the recessive allele. Because hemizygous males express the single copy of a recessive X-linked allele in their phenotype, one of the hallmarks of X-linked recessive inheritance is the observation that many more males than females express the traits. Table 3.2 lists several human X-linked disorders, including three that we use as examples in this section: color blindness that affects perception of red and green color, hemophilia A (a blood-clotting disorder), and congenital generalized hypertrichosis (CGH, characterized by excessive hair growth all over the body). Five common features characterizing X-linked recessive inheritance are illustrated in Figure 3.23, which features the inheritance of red–green color blindness.
1. As a result of male hemizygosity, more males than females have the recessive phenotype. The pedigree has six recessive males and one recessive female.
2. Often, the transmission of the recessive allele from grandfather to daughter to grandson gives the appearance of generation skipping. See the transmission of c from I-1 to II-2 to III-1.
3. If a recessive male (cY) mates with a homozygous dominant female (CC), all progeny have the dominant phenotype. All female offspring are heterozygous carriers (Cc), and all male offspring are hemizygous for the dominant allele (CY). See the cross I-1 × I-2 and their progeny.
4. Matings of recessive males (cY) and carrier females (Cc) can produce the recessive phenotype in females. About one-half of the offspring of these matings have the dominant trait and one-half have the recessive trait. See the results of the mating between II-5 and II-6 and their progeny.
5. Mating of a homozygous recessive female (cc) and a hemizygous dominant male (CY) produces male progeny with the recessive trait (cY) and female offspring who have the dominant trait and are heterozygous carriers of the recessive allele (Cc). See the mating between III-4 and III-5 and their progeny.
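The three matings named in features 3 through 5 can be checked with a small Python sketch. It assumes the genotype notation used in the list (two letters for a female, one letter plus Y for a male); the function name `x_linked_cross` is a hypothetical helper, not part of any library.

```python
from collections import Counter
from fractions import Fraction


def x_linked_cross(mother: str, father: str, recessive: str = "c") -> dict:
    """Return phenotype proportions among daughters and among sons.

    mother is written as two X-linked alleles (for example "Cc");
    father is written as one X-linked allele plus Y (for example "cY").
    """
    paternal_x = father[0]
    half = Fraction(1, 2)
    result = {"daughters": Counter(), "sons": Counter()}
    for maternal_x in mother:  # each maternal X is transmitted with probability 1/2
        # Daughters receive the paternal X; they are recessive only if both alleles are.
        both_recessive = maternal_x == recessive and paternal_x == recessive
        result["daughters"]["recessive" if both_recessive else "dominant"] += half
        # Sons receive the paternal Y, so the maternal allele alone sets the phenotype.
        result["sons"]["recessive" if maternal_x == recessive else "dominant"] += half
    return result


for label, mother, father in [
    ("feature 3 (CC x cY)", "CC", "cY"),
    ("feature 4 (Cc x cY)", "Cc", "cY"),
    ("feature 5 (cc x CY)", "cc", "CY"),
]:
    print(label, x_linked_cross(mother, father))
```

Reporting daughters and sons separately makes the sex difference explicit: in the cc × CY mating, for example, every son is recessive while every daughter is a dominant-phenotype carrier.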
Table 3.2 A Short List of Human X-Linked Recessive and X-Linked Dominant Traits
Disease Symptoms
[Figure 3.23 (a) pedigree genotypes, by generation and individual number]
I: 1 cY; 2 CC
II: 1 CY; 2 Cc; 3 CY; 4 Cc; 5 Cc; 6 cY
III: 1 cY; 2 C–; 3 C–; 4 CY; 5 cc; 6 CY; 7 cY; 8 Cc
IV: 1 cY; 2 Cc; 3 cY; 4 Cc
Figure 3.23 X-linked recessive inheritance of red-green color blindness in a family. (a) See the text for
key features of this pattern of inheritance. (b) The number 57 is seen by those with full color vision, whereas
those with red-green color blindness do not see a number.
Q Explain how you know with certainty that II-4 is a heterozygous carrier of the recessive c allele.
Hemophilia A, a serious blood-clotting disorder, is caused by mutation of an X-linked gene called factor VIII (F8) that produces a blood-clotting protein called factor VIII protein. Hemophilia A is transmitted in an X-linked recessive manner, most often by a carrier mother who passes the mutant allele to an affected son. In typical X-linked recessive fashion, approximately one-half of the sons of carrier mothers have the disease. As is also common for X-linked recessive conditions, hemophilia often appears to “skip” a generation because the mutant allele is passed from affected father to carrier daughter and on to an affected grandson.
In some families, a de novo (newly occurring) mutation of the F8 gene is responsible for the appearance of hemophilia. An example occurred in the royal families of England and Europe: An apparent de novo mutation of the F8 gene affected Queen Victoria of England (Figure 3.24). Victoria had five sons, one of whom had hemophilia, along with four daughters, two of whom were known carriers. Victoria’s carrier daughters had normal blood clotting but introduced the mutation to the royal families of Russia, Germany, and Spain through intermarriage. These daughters passed the mutation to their sons who had hemophilia and to their daughters who were carriers like their mothers. Genetic Analysis 3.3 analyzes the hereditary transmission of hemophilia A.

X-Linked Dominant Trait Transmission
Transmission of traits such as CGH (see Table 3.2) that are controlled by X-linked dominant alleles has two distinctive characteristics, one indicating transmission from a female and one indicating transmission from a male. A family with CGH is illustrated in Figure 3.25. When the transmitting parent is a heterozygous female with the dominant trait (Hh) and her mate is a male with the recessive trait (hY), about half the progeny of each sex have the dominant condition. (See the nine combined progeny of I-1 and I-2 and II-3 and II-4. Five of the nine children, three males and two females, have the dominant condition.) When the transmitting parent is a hemizygous male with the dominant trait (HY) and his mate is a female with the recessive trait (hh), we see a hallmark that distinguishes autosomal dominant transmission from X-linked dominant transmission. In these matings, the dominant trait appears in all daughters, who are Hh, and in no sons, who are hY. (See the nine progeny of II-5 and II-6.)

Y-Linked Inheritance
The key to Y-linked inheritance is that the Y chromosome is found only in males. This means Y-linked genes are transmitted in a male-to-male pattern. In mammals, fewer than 50 genes are found on the Y chromosome; and like SRY, those genes are likely to play a role in male sex determination or development. The genes on the human Y chromosome do not have counterparts on the X chromosome, although the DNA sequences in the pseudoautosomal regions are shared by the X and Y chromosomes to facilitate synapsis of the chromosomes during meiosis. There is crossing over between the pseudoautosomal regions, but this does not involve expressed genes.
Females never carry Y chromosomes, so from an evolutionary perspective it makes sense that the genes carried on a Y chromosome should be male-specific, having to do with either male sex determination or reproduction. Indeed, the most recent genomic evidence suggests that the mammalian Y chromosome has rapidly evolved over the past 300 million to 350 million years, undergoing multiple changes in structure but preserving a handful of genes that are essential to male fertility and survival. The fascinating evolution of the mammalian Y chromosome is the subject of the Case Study at the end of this chapter.
94 CHAPTER 3 Cell Division and Chromosome Heredity
[Figure 3.24 pedigree, generations I to VII. I: Edward, Duke of Kent, and Victoria, Princess of Saxe-Coburg. II: Victoria, Queen of England (de novo mutation). III: Victoria, Frederick of Germany, Edward VII of England, Alice, Leopold, Beatrice. IV: George V, Irene, Henry of Prussia, Alix, Nikolas II of Russia, Alice, Alfonso XIII of Spain, Victoria, Leopold, Maurice. VI: Margaret, Elizabeth II, Juan Carlos of Spain. VII: Anne, Charles, Andrew, Edward, followed by George and Charlotte. Key: normal male, normal female, affected male. The British royal family has no affected descendants; the Spanish royal family is also shown.]
Figure 3.24 Hemophilia A in the royal families of Europe. The disease in these families originated with
a de novo mutation in Queen Victoria. Note that some parents are omitted from the pedigree for clarity. In
all cases, these individuals carry and contribute wild-type alleles.
[Figure 3.25 pedigree genotypes, by generation and individual number]
I: 1 Hh; 2 hY
II: 1 hh; 2 hY; 3 Hh; 4 hY; 5 HY; 6 hh
III: 1 Hh; 2 hh; 3 HY; 4 HY; 5 hh; 6 Hh; 7 Hh; 8 Hh; 9 Hh; 10 Hh; 11 hY; 12 hY; 13 hY; 14 hY
Figure 3.25 X-linked dominant congenital generalized hypertrichosis (CGH) in a family. See the text for
key features of this pattern of inheritance.
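The two hallmarks of X-linked dominant transmission can be checked against the genotypes listed for Figure 3.25 with a short Python sketch. The genotype strings are taken from the figure; assigning II-1, II-2, II-3, and II-5 as the children of I-1 and I-2 is an inference from the text's count of nine combined progeny, and the helper names are hypothetical.

```python
def affected(genotype: str) -> bool:
    """A genotype carrying the dominant allele H shows the dominant trait."""
    return "H" in genotype


def is_male(genotype: str) -> bool:
    return "Y" in genotype


# Heterozygous affected mothers (Hh) with hY mates:
# children of I-1 x I-2 (assumed II-1, II-2, II-3, II-5) and of II-3 x II-4 (III-1 to III-5).
from_mothers = ["hh", "hY", "Hh", "HY"] + ["Hh", "hh", "HY", "HY", "hh"]
affected_sons = sum(affected(g) and is_male(g) for g in from_mothers)
affected_daughters = sum(affected(g) and not is_male(g) for g in from_mothers)
print(affected_sons, "affected sons and", affected_daughters,
      "affected daughters among", len(from_mothers), "children")

# Affected father (HY) with an hh mate: children of II-5 x II-6 (III-6 to III-14).
from_father = ["Hh"] * 5 + ["hY"] * 4
all_daughters_affected = all(affected(g) for g in from_father if not is_male(g))
no_sons_affected = not any(affected(g) for g in from_father if is_male(g))
print("all daughters affected:", all_daughters_affected,
      "| no sons affected:", no_sons_affected)
```

The second check is the one that separates X-linked dominant from autosomal dominant transmission: an affected father passing the trait to every daughter and to no son would be very unlikely for an autosomal allele.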
GENETIC ANALYSIS 3.3
PROBLEM Hemophilia A is an X-linked recessive blood-clotting disorder caused by mutation of the factor VIII gene. Suppose a heterozygous woman with normal blood clotting has children with a man who also has normal blood clotting. Determine the probability of each of the following outcomes.
a. The probability of a son having hemophilia A.
b. The probability of a child of either sex having normal blood clotting.
c. The probability of having three children, each of whom has hemophilia A.
d. The probability of having four children, two of whom have hemophilia A and two of whom have normal blood clotting.
BREAK IT DOWN: The information given about the pattern of inheritance of hemophilia A and the status of the woman and the man allows identification of their genotypes (p. 92).
BREAK IT DOWN: The woman can transmit the recessive allele to a child of either sex, but the man transmits his X-linked allele to daughters and his Y chromosome to sons (p. 92).
BREAK IT DOWN: Parts (a) and (b) can be predicted using a Punnett square (p. 36); part (c) uses the product rule and part (d) is an application of binomial probability (p. 48).
Evaluate
1. Identify the topic this problem addresses and describe the nature of the required answers.
  1. This problem addresses inheritance probabilities of an X-linked recessive trait for the parental genotype and phenotypes given. The answers should be stated as fraction, decimal, or percentage probabilities.
2. Identify the critical information given in the problem.
  2. The inheritance pattern of the trait in question is identified as X-linked recessive, the phenotype of each parent is given, and the woman is identified as a heterozygote.
Deduce
3. Identify the genotypes of the woman and the man.
  TIP: Remember that males are hemizygous for X-linked traits.
  3. The woman is described as being heterozygous and so her genotype is X^HX^h, where the uppercase and lowercase superscripts represent the dominant and recessive alleles, respectively. The man has normal blood clotting, so he is hemizygous for the wild-type allele. His genotype is X^HY.
4. Determine the possible phenotypes and phenotype probabilities for children of this couple.
  TIP: Use a Punnett square to assist you in accurately predicting the possible outcomes of mating.
  4. The Punnett square predicts four different genotypes among the possible children of this couple.
           X^H                  Y
  X^H      X^HX^H Healthy       X^HY Healthy
  X^h      X^HX^h Healthy       X^hY Hemophilia A
Solve
Answer a
5. Determine the probability of a son of this couple having hemophilia A.
  5. Taking sex into account, we find that approximately one-half the offspring are male and one-half are female. The Punnett square shows two possible male genotypes, one healthy and one a hemizygous male with hemophilia A. The probability that a son will have hemophilia A is therefore one-half, or 50%.
Answer b
6. Determine the probability of a child with normal blood clotting being produced by this couple.
  6. The Punnett square shows that three of the four possible offspring genotypes would produce normal blood clotting. The probability that a child of this couple has normal blood clotting is 0.75, or 75%.
Answer c
7. Calculate the probability that if the couple has three children, each of them will have hemophilia A.
  7. The risk that each child will have hemophilia A is 25%. For three children with hemophilia A, the probability is $(0.25)(0.25)(0.25) = 0.0156$, or $\left(\tfrac{1}{4}\right)\left(\tfrac{1}{4}\right)\left(\tfrac{1}{4}\right) = \tfrac{1}{64}$.
Answer d
8. Calculate the probability that if the couple has four children, two will have hemophilia A and two will have normal blood clotting.
  TIP: Use binomial probability to calculate the likelihood of consecutive outcomes.
  8. The chance the couple has four children, two of whom have hemophilia A and two of whom are healthy, is predicted by the binomial expansion. There are six different ways (birth orders) in which to produce two healthy and two affected children. The probabilities are $\tfrac{3}{4}$ for a healthy child and $\tfrac{1}{4}$ for a child with hemophilia A, so the requested probability is $6\left[\left(\tfrac{3}{4}\right)\left(\tfrac{3}{4}\right)\left(\tfrac{1}{4}\right)\left(\tfrac{1}{4}\right)\right] = \tfrac{54}{256}$, or 0.2109.
For more practice, see Problems 12, 13, and 25. Visit the Study Area to access study tools. Mastering Genetics
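The four answers in Genetic Analysis 3.3 can be reproduced with a brief Python sketch that enumerates the Punnett square and then applies the product rule and the binomial coefficient. The variable names and the `has_hemophilia` helper are hypothetical; the helper is written only for this cross, in which the father's X chromosome carries the dominant allele.

```python
from fractions import Fraction
from math import comb

# Punnett square: the mother (X^H X^h) gives H or h; the father (X^H Y) gives his X (H) or Y.
children = [(maternal, paternal) for maternal in ("H", "h") for paternal in ("H", "Y")]


def has_hemophilia(child: tuple) -> bool:
    maternal, paternal = child
    # Only a son (paternal Y) who receives the maternal h allele is affected in this cross.
    return maternal == "h" and paternal == "Y"


sons = [child for child in children if child[1] == "Y"]

p_son_affected = Fraction(sum(has_hemophilia(c) for c in sons), len(sons))      # part a
p_affected = Fraction(sum(has_hemophilia(c) for c in children), len(children))
p_healthy = 1 - p_affected                                                      # part b
p_three_affected = p_affected ** 3                                              # part c: product rule
p_two_of_four = comb(4, 2) * p_affected ** 2 * p_healthy ** 2                   # part d: binomial

print("a:", p_son_affected)
print("b:", p_healthy)
print("c:", p_three_affected, float(p_three_affected))
print("d:", p_two_of_four, float(p_two_of_four))
```

`Fraction` reduces 54/256 to 27/128, which is the same value as the 0.2109 given in the solution.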
3.6 Dosage Compensation Equalizes the Expression of Sex-Linked Genes

In this final section of the chapter, we turn our attention to mechanisms that carry out the essential function of balancing the amount of gene expression of sex-linked genes. In animals there is an imbalance between the sexes in the copy number of genes on the sex chromosomes. Specifically, females are generally XX and have two copies of each X-linked gene, whereas males are generally XY and have just one copy of each X-linked gene. This is a potential problem because animals are extraordinarily sensitive to gene dosage imbalance such as could be caused by the presence of the “extra” X chromosome in females if all X chromosomes were to express genes at the same level. The expression of the right number of genes in the correct amounts is essential for normal embryonic development and normal biological processes. If the gene dosage balance is off, the consequences can be severe or even fatal for the animal.
Evolution has provided multiple mechanisms that compensate for differences in the number of copies of genes due to the different chromosome constitutions of males and females. There are at least four major mechanisms to balance X-linked gene expression in placental and marsupial mammals, fruit flies, and nematode worms (Table 3.3). Collectively, these are called dosage compensation mechanisms.
Placental mammals, including humans, use random X inactivation as their dosage compensation mechanism. Early in mammalian gestational development, about 2 weeks after fertilization in humans, when the female early embryo consists of a few hundred cells, one of the two X chromosomes in each somatic cell of a female is randomly inactivated. This idea was first proposed in 1961 by Mary Lyon in her random X inactivation hypothesis, also known as the Lyon hypothesis. In approximately one-half of the somatic cells in a female embryo, the maternally derived X chromosome is inactivated; and in the other half of the somatic cells, inactivation silences the paternally derived X chromosome. At the end of this process, each somatic cell of a female has one active X chromosome that is equally likely to be the maternal X or the paternal X.
Random X inactivation takes place in every cell with two or more X chromosomes. Following inactivation, the inactive chromosome can be seen as a tightly condensed mass adhering to the nuclear envelope. The inactive X chromosome is known as a Barr body, having first been visualized by Murray Barr in 1949.
X inactivation is a permanent feature of somatic cells of placental mammalian females. Because some cells have an active maternal X chromosome and an inactive paternal X chromosome and other cells have the opposite pattern, normal placental mammalian females are, in terms of X chromosomes, a mosaic of two kinds of cells (Figure 3.26). One cell type (pink in the figure) expresses the maternally derived X chromosome, and the other (blue) expresses the paternally derived X chromosome. Each individual cell expresses the allelic information of only one of those chromosomes, with all descendant cells maintaining the same inactivation pattern as the original ancestral cell.
In most cases, the silencing of one X chromosome in each cell of a female has no detectable effect on the function of a tissue or on the phenotype. Occasionally, however, female carriers of X-linked recessive traits display a phenotypic manifestation of the recessive allele. Calico and tortoiseshell coat-color patterning in female cats is a product of mosaicism created by random X inactivation (Figure 3.27). Females with an allele for black coat color on one X chromosome and orange coat color on the homologous X chromosome have black and orange patches of fur corresponding to portions of skin where each X chromosome is active. The sizes and the distribution of the orange and black sectors of these cats reflect the locations of the clonal descendants of the cells in which each X chromosome was originally inactivated. The specific pattern of X inactivation is unique to each female cat embryo, and the patterns of cellular migration are variable as well. As a result, each adult female calico or tortoiseshell cat has a unique pattern of black and orange sectors marking its coat.
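The mosaic that results from random X inactivation followed by clonal inheritance can be imitated with a toy Python simulation. This is a sketch under simplifying assumptions that are not from the text: 200 founder cells, three rounds of division per founder, and no cell death or migration. The function names are hypothetical.

```python
import random


def inactivate(n_founders: int, rng: random.Random) -> list:
    """Each founder cell independently keeps the maternal (M) or paternal (P) X active."""
    return [rng.choice("MP") for _ in range(n_founders)]


def expand_clones(founders: list, divisions: int) -> list:
    """Every descendant keeps the active X chosen in its founder cell."""
    return [[active_x] * (2 ** divisions) for active_x in founders]


rng = random.Random(0)          # fixed seed so the run is repeatable
founders = inactivate(200, rng)
clones = expand_clones(founders, divisions=3)
cells = [cell for clone in clones for cell in clone]

fraction_maternal = cells.count("M") / len(cells)
print(f"{len(cells)} cells, fraction with active maternal X: {fraction_maternal:.2f}")
```

Because the choice is made once per founder and then inherited, each clone is uniform while the whole population is mixed, which is the pattern that produces patches in a calico coat.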
Organism             Males   Females   Dosage compensation mechanism
Fruit fly            XY      XX        Expression of X-linked genes in males is doubled relative to female X-linked gene expression.
Roundworm            XO      XX^a      Gene expression of each X chromosome in the hermaphrodite (“female”) is decreased to one-half that of the X chromosome in the male.
Marsupial mammals    XY      XX        The paternally derived X chromosome is inactivated in all female somatic cells.
Placental mammals    XY      XX        One X chromosome is randomly inactivated in each female somatic cell.
^a XX worms are hermaphrodites.
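The arithmetic behind each mechanism in the table can be laid out in a short Python sketch. Expression is measured in arbitrary units in which one uncompensated X chromosome contributes 1; that unit, the dictionary layout, and the function name are assumptions made for illustration.

```python
from fractions import Fraction

# organism -> sex -> (number of active X chromosomes, relative expression per active X)
MECHANISMS = {
    "fruit fly": {"male": (1, Fraction(2)), "female": (2, Fraction(1))},
    "roundworm": {"male": (1, Fraction(1)), "hermaphrodite": (2, Fraction(1, 2))},
    # In mammals one X is silenced in each female somatic cell, leaving one active X.
    "marsupial mammals": {"male": (1, Fraction(1)), "female": (1, Fraction(1))},
    "placental mammals": {"male": (1, Fraction(1)), "female": (1, Fraction(1))},
}


def x_linked_output(active_x: int, per_x: Fraction) -> Fraction:
    """Total X-linked expression per cell in the arbitrary units described above."""
    return active_x * per_x


totals = {
    organism: {sex: x_linked_output(*values) for sex, values in sexes.items()}
    for organism, sexes in MECHANISMS.items()
}
for organism, by_sex in totals.items():
    print(organism, {sex: str(total) for sex, total in by_sex.items()})
```

In every row the two sexes end with the same total, even though the route differs: doubling in the fly male, halving in the worm hermaphrodite, and silencing of one X in female mammals.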
[Figure 3.26 labels: maternal (M) and paternal (P) X chromosomes; random X inactivation; active X chromosome; inactive Barr body.]
Figure 3.26 Random X inactivation in female placental mammals. One X chromosome is randomly inactivated in each nucleus. Descendant cells maintain the initial inactivation, leading to clusters of descendant cells with the same active X chromosome. M represents the maternally derived X chromosome and P the paternally derived X chromosome.
Figure 3.27 Calico coat, produced by X inactivation in female cats. Coat color patches are the result of gene expression from the one active X chromosome in each cluster of cells.
Not all genes on the “inactivated” X chromosome are transcriptionally silent. A 2005 study of 624 X-linked genes showed that about 15% of the genes on the inactivated chromosome escape complete silencing. On average, transcription of those genes is reduced by about 50–85% in comparison to transcription on the active X chromosome. The genes that escape inactivation are largely clustered on the short arm of the chromosome near PAR1.
Random X inactivation requires a gene on the X chromosome called the X-inactivation–specific transcript (XIST) that encodes a large RNA molecule. XIST RNA spreads out from the gene, “painting” the X chromosome as it accumulates. X chromosomes that are painted with XIST RNA have all, or nearly all, of their genes silenced. The XIST gene is expressed on only one of the two X chromosomes, and its RNA accumulates only on the chromosome transcribing the gene; it does not spread to the homologous X chromosome. In other words, XIST acts only in cis (on the same chromosome) and not in trans (on the homologous chromosome). Examination of inactivated chromosomes detects XIST RNA coating the Barr body in the nucleus.
CASE STUDY
The (Degenerative) Evolution of the Mammalian Y Chromosome
Mammalian X and Y chromosomes are the “odd couple” of homologous chromosomes. They are very different from each other in size and are only homologous in their pseudoautosomal regions. Further, because the Y chromosome is exclusively found in males, the genes it contains are, naturally enough, only expressed in males. For example, the human Y chromosome contains only about one-third as many base pairs as the X chromosome. Whereas the human X chromosome carries more than 2000 genes, the Y chromosome contains just a few dozen.
The small pseudoautosomal regions of the X and Y chromosomes make up just a few percent of the total sequence of either chromosome. The PARs are sufficient for synapsis in prophase I, and recombination between X and Y is frequent in these regions, but only about 5% of the Y chromosome participates in recombination. The other 95% of the chromosome experiences no crossing over. Studies in evolutionary genetics reveal that the mammalian Y chromosome has evolved very rapidly over the past 300 million years or so, shrinking in size and genetic content as essential genes have been shifted to other chromosomes, leaving just a handful of genes behind.

A STORY OF DEGENERATION Beginning with the work of Bruce Lahn and David Page in 1999, the composition and evolution of the mammalian Y chromosome have been subjects of active investigation. The view of Y chromosome evolution first proposed by Lahn and Page has been supported and verified by additional studies and by genome sequencing, and it tells the story of an evolutionary pathway that features progressive degeneration.
In 1999, Lahn and Page studied the human X and Y chromosomes and identified 19 genes that are present on both chromosomes, called X–Y shared genes. These genes are left over from a time when the chromosomes were much more similar and regularly recombined. Lahn and Page reasoned that they could trace the evolution of the X–Y shared genes by studying differences between their DNA sequences. Their starting premise was that in general more differences accrue the longer genes have been separated. What they found was quite surprising: The differences between the X–Y shared genes followed a distinct and suggestive pattern. X–Y shared genes nearest each other on the X chromosome short arm were most similar to their Y-chromosome counterparts, but X–Y shared genes on the long arm of the X chromosome were the most different from their Y-chromosome counterparts. In all, Lahn and Page identified four well-defined “strata” among the X–Y shared genes, each stratum having its own distinct level of sequence similarity. Within each of the strata, the level of X–Y shared-gene similarity was remarkably consistent, but there were substantial differences in gene similarity between strata. This suggested four major evolutionary events that reshaped the Y chromosome, resulting in structural changes that progressively restricted recombination between the X and the Y chromosomes.

MAJOR RESTRUCTURING EVENTS By comparing DNA sequences across species, Lahn and Page determined that the autosomal precursors of X and Y were very similar at the time reptiles diverged from mammals, about 350 million years ago (mya). The monotremes (such as the platypus and echidna) separated from the placental mammals 240–320 mya, but not before the SRY gene evolved in their common ancestor. Both monotremes and mammals have SRY, but reptiles do not. This implies that SRY developed about 350 mya (Figure 3.28). The SRY gene produces TDF, the protein that initiates a cascade of events that produces males. With the acquisition of SRY, the Y chromosome became different from the X chromosome, and the region surrounding SRY (the first of Lahn and Page’s four strata) became the first region of the Y chromosome to be unable to recombine with the X chromosome. This event also contributed to the shrinkage of the Y chromosome.
About 130–170 mya, a structural change altered the Y chromosome and produced a second stratum that was unable to recombine with the X chromosome. Marsupials (such as kangaroos) retain the old Y-chromosome structure, so the generation of the second stratum demarcates the separation of marsupial and placental mammals. Another structural change to the Y chromosome, between 80 and 130 mya, created a third stratum of divergence, further restricting recombination with the X chromosome and shrinking the Y chromosome. This change marks the separation of the monkeys from nonsimian placental mammals. Most recently, about 30–50 mya, the fourth stratum was created by another structural change to the Y chromosome. This change, present in the human lineage that includes our great ape relatives but not present in monkeys, limited recombination to the end of the Y chromosome and reduced its size. In humans, recombination between X and Y chromosomes is limited to PAR1 (on the short arm), the largest of the remaining regions of X–Y homology. Little if any recombination occurs in PAR2.
The functioning of genes remaining on the Y chromosome was directly affected by the events that prevented X–Y recombination. Without recombination, Y-linked genes were subject to mutational degradation that would eventually render them nonfunctional. Strong natural selection operated to prevent this by moving essential genes off the Y chromosome to other chromosomes. The genes that remain on the human Y chromosome are almost exclusively important in male development or sperm production, but even these remain subject to mutational degradation.
What will be the ultimate fate of the human Y chromosome? Is it destined to be lost? Scientists don’t know what will happen, but recent genomic data may provide a clue. The Y chromosome, it seems, has backup copies of its genes. These duplicated copies are also on the Y chromosome, and they may serve to protect the Y chromosome from the loss of critical information.
[Figure 3.28 timeline, from the past to the present. Matching regions of the X and Y chromosomes are able to recombine (to swap segments); non-matching regions are unable to recombine.
350 million years ago: SRY gene arises on an autosome pair in the reptile–mammal common ancestor; the chromosomes are identical and able to recombine.
240–320 million years ago: first structural change, with recombination failure and chromosome shrinkage (first stratum); monotreme–mammal divergence.
130–170 million years ago: second structural change, with further recombination failure and decay (second stratum); marsupial–placental mammal divergence.
80–130 million years ago: third structural change, with additional recombination failure and Y chromosome degradation (third stratum); monkey–nonsimian divergence.
30–50 million years ago: fourth structural change, with recombination failure and severe shrinkage (fourth stratum); human–monkey divergence.]
Figure 3.28 The proposed evolutionary development of the mammalian Y chromosome through four
major structural rearrangements.
SUMMARY
Mastering Genetics For activities, animations, and review quizzes, go to the Study Area.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. The terminology of cell division is important for understanding and communicating during problem solving. Be able to define terms such as chromosome and sister chromatid in the context of mitosis and meiosis.
2. From the perspective of genetics, meiotic cell division provides the mechanism for transmission of genes and alleles from one generation to the next. Be sure you have a clear picture of how and when homologous chromosomes and sister chromatids separate and how these events lead to segregation and independent assortment.
3. Be prepared to analyze hereditary transmission of genes on autosomal chromosomes and on sex chromosomes.
4. Remember that for X-linked genes females are either “homozygous” or “heterozygous” but males are “hemizygous.” Be careful to use these terms correctly and also to indicate the corresponding genotypes correctly.
5. Understand the chromosomal basis of sex determination and the mechanisms of gene dosage compensation for X-linked genes in mammals.
6. As with problem solving in Chapter 2 (“Transmission Genetics”), the use of Punnett squares and the forked-line method will aid you in finding solutions to problems concerning heredity.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers
1. Examine the following diagrams of cells from an organism with diploid number 2n = 6, and identify what stage of M phase is represented.
(a) (b) (c) (d)
to ensure efficient separation of chromatids at mitotic anaphase or in meiotic anaphase II. Explain why sister chromatid cohesion is important, and discuss the role of the proteins cohesin and separase in sister chromatid separation.
5. The diploid number of the hypothetical animal Geneticus introductus is 2n = 36. Each diploid nucleus contains 3 ng of DNA in G1.
a. What amount of DNA is contained in each nucleus at the end of S phase?
b. Explain why a somatic cell of Geneticus introductus has the same number of chromosomes and the same amount of DNA at the beginning of mitotic prophase as one of these cells does at the beginning of prophase I of meiosis.
c. Complete the following table by entering the number of chromosomes and amount of DNA present per cell at the end of each stage listed.
End of Cell Cycle Stage    Number of Chromosomes    Amount of DNA
Telophase I
Mitotic telophase
Telophase II
9. Alleles A and a are on one pair of autosomes, and alleles B and b are on a separate pair of autosomes. Does crossover between one pair of homologs affect the expected proportions of gamete genotypes? Why or why not? Does crossover between both pairs of chromosomes affect the expected gamete proportions? Why or why not?
10. How many Barr bodies are found in a normal human female nucleus? In a normal male nucleus?
11. Describe the role of the following structures or proteins in cell division:
a. microtubules
b. cohesin protein
c. kinetochores
d. synaptonemal complex
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.

12. A woman’s father has ornithine transcarbamylase deficiency (OTD), an X-linked recessive disorder producing mental deterioration if not properly treated. The woman’s mother is homozygous for the wild-type allele.
a. What is the woman’s genotype? (Use D to represent the dominant allele and d to represent the recessive allele.)
b. If the woman has a son with a man who does not have OTD, what is the chance the son will have OTD?
c. If the woman has a daughter with a man who does not have OTD, what is the chance the daughter will be a heterozygous carrier of OTD? What is the chance the daughter will have OTD?
d. Identify a male with whom the woman could produce a daughter with OTD.
e. For the instance you identified in part (d), what proportion of daughters produced by the woman and the man are expected to have OTD? What proportion of sons of the woman and the man are expected to have OTD?
13. In humans, hemophilia A (OMIM 306700) is an X-linked recessive disorder that affects the gene for factor VIII protein, which is essential for blood clotting. The dominant and recessive alleles for the factor VIII gene are represented by H and h. Albinism is an autosomal recessive condition that results from mutation of the gene producing tyrosinase, an enzyme in the melanin synthesis pathway. A and a represent the tyrosinase alleles. A healthy woman named Clara (II-2), whose father (I-1) has hemophilia and whose brother (II-1) has albinism, is married to a healthy man named Charles (II-3), whose parents are healthy. Charles’s brother (II-5) has hemophilia, and his sister (II-4) has albinism. The pedigree is shown below.
[Pedigree: generation I, individuals 1 to 4; generation II, individuals 1 to 5, with Clara and Charles labeled and “?” below them. Key: hemophilia, albinism.]
a. What are the genotypes of the four parents (I-1 to I-4) in this pedigree?
b. Determine the probability that the first child of Clara and Charles will be a
i. boy with hemophilia
ii. girl with albinism
iii. healthy girl
iv. boy with both albinism and hemophilia
v. boy with albinism
vi. girl with hemophilia
c. If Clara and Charles’s first child has albinism, what is the chance the second child has albinism? Explain why this probability is higher than the probability you calculated in part (b).
14. A wild-type male and a wild-type female Drosophila with red eyes and full wings are crossed. Their progeny are shown below.

Males                                Females
3/8 full wing, red eye               3/4 full wing, red eye
3/8 miniature wing, red eye          1/4 purple eye, full wing
1/8 purple eye, full wing
1/8 miniature wing, purple eye

a. Using clearly defined allele symbols of your choice, give the genotype of each parent.
b. What is/are the genotype(s) of females with purple eye? Of males with purple eye and miniature wing?
15. A woman with severe discoloration of her tooth enamel has four children with a man who has normal tooth enamel. Two of the children, a boy (B) and a girl (G), have discolored enamel. Each has a mate with normal tooth enamel and produces several children. G has six children: four boys and two girls. Two of her boys and one of her girls have discolored enamel. B has seven children: four girls and three boys. All four of his daughters have discolored enamel, but all his boys have normal enamel. Explain the inheritance of this condition.
16. In a large metropolitan hospital, cells from newborn babies are collected and examined microscopically over a 5-year period. Among approximately 7500 newborn males, six have one Barr body in the nuclei of their somatic cells. All other newborn males have no Barr bodies. Among 7500 female infants, four have two Barr bodies in each nucleus, two have no Barr bodies, and the rest have one. What is the cause of the unusual number of Barr bodies in a small number of male and female infants?
17. In cats, tortoiseshell coat color appears in females. A tortoiseshell coat has patches of dark brown fur and patches of orange fur that each in total cover about half the body but have a unique pattern in each female. Male cats can be either dark brown or orange, but a male cat with
tortoiseshell coat is rarely produced. Two sample crosses between males and females from pure-breeding lines produced the tortoiseshell females shown.

Cross I    P: dark brown male × orange female
           F1: orange males and tortoiseshell females
Cross II   P: orange male × dark brown female
           F1: dark brown males and tortoiseshell females

b. Determine which other pattern(s) of transmission is/are possible. For each possible mode of transmission, specify the genotypes necessary for transmission to occur.
c. Identify which pattern(s) of transmission is/are impossible. Specify why transmission is impossible.
Pedigree A
gray body and full wings is made. Based on an analysis of the progeny of the cross shown below, determine the genotypes of parental and progeny flies.

Phenotype                       Number of Males    Number of Females
Yellow body, full wing          296                301
Yellow body, vestigial wing     101                98
Gray body, full wing            302                298
Gray body, vestigial wing       101                103
                                800                800

24. In a species of fish, a black spot on the dorsal fin is observed in males and females. A fish breeder carries out a pair of reciprocal crosses and observes the following results.

Cross I    Parents: black-spot male × nonspotted female
           Progeny: 22 black-spot males
                    24 black-spot females
                    25 nonspotted males
                    21 nonspotted females
Cross II   Parents: nonspotted male × black-spot female
           Progeny: 45 black-spot males
                    53 nonspotted females

a. Why does this evidence support the hypothesis that a black spot is sex linked?
25. Lesch–Nyhan syndrome (OMIM 300322) is a rare
over between the chromosomes. The diagram below shows SRY in relation to the pseudoautosomal region.
[Diagram: Y and X chromosomes, with SRY and the pseudoautosomal region (PAR) labeled]
About 1 in every 25,000 newborn infants is born with sex reversal; the infant is either an apparent male, but with two X chromosomes, or an apparent female, but with an X and a Y chromosome. Explain the origin of sex reversal in human males and females involving the SRY gene. (Hint: See Experimental Insight 3.1 for a clue about the mutational mechanism.)
27. In an 1889 book titled Natural Inheritance (Macmillan, New York), Francis Galton, who investigated the inheritance of measurable (quantitative) traits, formulated a law of “ancestral inheritance.” The law stated that each person inherits approximately one-half of his or her genetic traits from each parent, about one-quarter of the traits from each grandparent, one-eighth from each great grandparent, and so on. In light of the chromosome theory of heredity, argue either in favor of Galton’s law or against it.
28. In Drosophila, the X-linked echinus eye phenotype disrupts formation of facets and is recessive to wild-type eye. Autosomal recessive traits vestigial wing and ebony body assort independently of one another. Examine the progeny from the three crosses shown below, and identify the genotype of parents in each cross.

Parental Phenotype             Progeny Phenotype      Proportion
Female        Male                                    Female    Male
a. Wild type  Echinus          Wild type              3/8       3/8
                               Echinus                3/8       3/8
                               Vestigial              1/8       1/8
                               Echinus, vestigial     1/8       1/8
b. Wild type  Wild type        Vestigial, ebony       2/32      1/32
                               Wild type              18/32     9/32
                               Vestigial, ebony       1/32      1/32
                               Vestigial              3/32      3/32
                               Ebony                  3/32      3/32
                               Wild type              9/32      9/32

29. A wild-type Drosophila male and female are crossed, producing 324 female progeny and 161 male progeny. All their progeny are wild type.
a. Propose a genetic hypothesis to explain these data.
b. Design an experiment that will test your hypothesis, using the wild-type progeny identified above. Describe the results you expect if your hypothesis is true.
30. Drosophila has a diploid chromosome number of 2n = 8, which includes one pair of sex chromosomes (XX in females and XY in males) and three pairs of autosomes. Consider a Drosophila male that has a copy of the A1 allele on its X chromosome (the Y chromosome is the homolog) and is heterozygous for alleles B1 and B2, C1 and C2, and D1 and D2 of genes that are each on a different autosomal pair. In the diagrams requested below, indicate the alleles carried on each chromosome and sister chromatid. Assume that no crossover occurs between homologous chromosomes.
a. What is the genotype of cells produced by mitotic division in this male?
b. Diagram any correct alignment of chromosomes at mitotic metaphase.
c. Diagram any correct alignment of chromosomes at metaphase I of meiosis.
d. For the metaphase I alignment shown in (c), what gamete genotypes are produced at the end of meiosis?
e. How many different metaphase I chromosome alignments are possible in this male? How many genetically different gametes can this male produce? Explain your reasoning for each answer.
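One way to check the counting asked for in part (e) is to enumerate the possibilities directly. The sketch below is only an illustration under stated assumptions: no crossover occurs (as the problem specifies), each homolog pair orients independently at metaphase I, and two alignments that are mirror images of each other are counted as the same alignment. The chromosome labels and function names are hypothetical, chosen for this example.

```python
from itertools import product

# Hypothetical labels for the four homolog pairs of the male in Problem 30.
# Each tuple lists the two homologs that pair with each other at metaphase I.
HOMOLOG_PAIRS = [
    ("X(A1)", "Y"),  # sex chromosomes; the X carries allele A1
    ("B1", "B2"),
    ("C1", "C2"),
    ("D1", "D2"),
]


def gametes(pairs):
    """Return every gamete formed by taking one homolog from each pair."""
    return [" ".join(choice) for choice in product(*pairs)]


def metaphase_alignments(pairs):
    """Return the distinct metaphase I alignments as (pole A, pole B) tuples.

    The first pair is held in a fixed orientation because flipping every
    pair at once gives the mirror image of the same alignment (an
    assumption of this sketch), not a new one.
    """
    first, rest = pairs[0], pairs[1:]
    alignments = []
    for orientation in product((0, 1), repeat=len(rest)):
        pole_a = [first[0]] + [p[o] for p, o in zip(rest, orientation)]
        pole_b = [first[1]] + [p[1 - o] for p, o in zip(rest, orientation)]
        alignments.append((tuple(pole_a), tuple(pole_b)))
    return alignments


all_gametes = gametes(HOMOLOG_PAIRS)
all_alignments = metaphase_alignments(HOMOLOG_PAIRS)
print(len(all_alignments), "alignments;", len(all_gametes), "gamete genotypes")
```

Each alignment sends one set of homologs to each pole, so every alignment contributes two gamete genotypes, and the alignments taken together account for every genotype returned by `gametes`.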
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.

31. The cell cycle operates in the same way in all eukaryotes, from single-celled yeast to humans, and all share numerous genes whose functions are essential for the normal progression of the cycle. Discuss why you think this is the case.
32. From a piece of blank paper, cut out three sets of four cigar-shaped structures (a total of 12 structures). These will represent chromatids. Be sure each member of a set of four chromatids has the same length and girth. In set one, label two chromatids “A” and two chromatids “a.” Cut each of these chromatids about halfway across near their midpoint and slide the two “A” chromatids together at the cuts, to form a single set of attached sister chromatids. Do the same for the “a” chromatids. In the second set of four chromatids, label two “B” and two “b.” Cut and slide these together as you did for the first set, joining the “B” chromatids together and the “b” chromatids together. Repeat this process for the third set of chromatids, labeling them as “D” and “d.” You now have models for three pairs of homologous chromosomes, for a total of six chromosomes.
a. Give the genotype of the cell with six chromosomes.
b. Align the chromosomes as they might appear at metaphase of mitosis.
c. Are there any alternative alignments of the chromosomes for this cell division stage? Explain.
d. Separate the chromosomes and chromatids as though mitotic anaphase and telophase have taken place.
e. What are the genotypes of the daughter cells?
f. Align the chromosomes as they might appear at metaphase I of meiosis.
g. Are there any alternative alignments of the chromosomes for this cell division stage? Explain.
h. Separate the chromosomes as though meiotic anaphase I and telophase I have taken place.
i. Align the chromosomes of each daughter cell as they might appear in metaphase II of meiosis.
j. Are there any alternative alignments of the chromosomes for this cell division stage? Explain.
k. Separate the chromosomes as though anaphase II and telophase II have taken place.
l. What are the genotypes of the daughter cells?
m. Repeat steps (h) through (l) for the alternative alignment of chromosomes you identified in step (g).
n. Combining your work in steps (f) through (m), provide a written explanation of the connection between meiotic cell division and Mendel’s law of independent assortment.
33. Form a small discussion group and decide on the most likely genetic explanation for each of the following situations:
a. A man who has red–green color blindness and a woman who has complete color vision have a son with red–green color blindness. What are the genotypes of these three people, and how do you explain the color blindness of the son?
b. Cross A performed by Morgan and shown in Figure 3.18 is between a mutant male fruit fly with white eyes and a female fruit fly from a pure-breeding, red-eye stock. The figure shows that 1237 F1 progeny were produced, all of them with red eyes. In reality, this isn’t entirely true. Among the 1237 F1 progeny were 3 male flies with white eyes. Give two possible explanations for the appearance of these white-eyed males.
34. Duchenne muscular dystrophy (DMD; OMIM 310200) and Becker muscular dystrophy (BMD; OMIM 300376) are both X-linked recessive conditions that result from different mutations of the same gene, known as dystrophin, on the long arm of the chromosome. BMD and DMD are quite different clinically. DMD is a very severe disorder that first appears at a young age, progresses rapidly, and is often fatal in the late teens to 20s. BMD, on the other hand, is much milder. Often symptoms don’t first appear until the 40s or 50s, the progression of the disease is slow, and fatalities due to BMD are infrequent. Go to http://www.ncbi.nlm.nih/omim and survey the information describing the gene mutations causing these two conditions. Discuss the information you find with a few others in a small group, and write a single summary explaining your findings.
35. Red–green color blindness is a relatively common condition found in about 8% of males in the general population. From this, population biologists estimate that 8% is the frequency of X chromosomes carrying a mutation of the gene encoding red and green color vision. Based on this frequency, determine the approximate frequency with which you would expect females to have red–green color blindness. Explain your reasoning.
4 Gene Interaction

CHAPTER OUTLINE
4.1 Interactions between Alleles Produce Dominance Relationships
4.2 Some Genes Produce Variable Phenotypes
4.3 Gene Interaction Modifies Mendelian Ratios
4.4 Complementation Analysis Distinguishes Mutations in the Same Gene from Mutations in Different Genes
Coat colors in Labrador retrievers, black (left), yellow (center), and chocolate (right), are determined by the interaction of two genes, one determining the production of coat color pigment and the other, pigment distribution.

ESSENTIAL IDEAS
❚❚ Dominance relationships between alleles have a molecular basis. The biological effects of gene products determine what type of dominance is observed.
❚❚ Gene expression can be affected by nongenetic (environmental) factors and also as a consequence of factors related to sex.
❚❚ Gene expression can be affected by interactions with other genes, causing characteristic changes in Mendelian ratios.
❚❚ Mutation of different genes can produce the same effect on phenotype. The number of genes causing mutation of a phenotype is discovered by genetic complementation analysis.

Mendel’s laws of segregation and independent assortment encapsulate the basic rules of genetic transmission in diploid organisms. We see the results of these rules in the relative proportions of progeny with different phenotypes from crosses. By assessing the molecular basis for the phenotypic variation, we can also glimpse the connection between hereditary transmission of phenotypic traits and DNA, RNA, and protein sequence variability.
Mendel’s success in identifying and describing the law of segregation and the law of independent assortment was partly because of his use of traits whose phenotypic characteristics are determined exclusively by inheritance of alleles for single genes. In interpreting the inheritance of these traits, he did not
have to contend with phenotypic variation introduced by other genes or by environmental (nongenetic) factors. In Mendel’s experiments, each of the seven traits was decided by a single pair of alleles, one fully dominant and one fully recessive, for a gene determining that particular trait; and environmental factors played a minimal role in the phenotypic variation he observed.
The simple case in which just two alleles influence a trait and environment plays no meaningful role is relatively rare in nature. Although a diploid organism can have no more than two alleles for a given gene, because such individuals have just two copies of each chromosome, there may be more than two alleles for a single gene within a population, and these different alleles may produce different phenotypic effects. In addition, alleles can exhibit dominance relationships other than the simple dominance and recessiveness we saw in Chapters 2 and 3; and a few alleles are expressed differently in males and females. Two other phenomena influencing phenotype development are important to consider as well. First, many phenotypes are the consequence of two or more genes interacting with one another, and second, the phenotypic expression of some genes is influenced by environmental factors. Taken together, these circumstances impart a further dimension to the way geneticists view the function of genes in determining phenotypes. The phrase “extensions of Mendelian inheritance” is frequently used to include these gene–gene and gene–environment interactions.
These interactions, detectable in all organisms, are particularly relevant to humans when medical conditions are considered. As we discuss in later chapters, numerous common human diseases, including heart disease, diabetes, and cancers, can have an inherited component that increases disease risk. Environmental factors play a major role in producing these diseases, however, and interactions between genes and the environment can be critically important in the disease process.
In this chapter, we examine several examples of allele interactions that are different from those described by Mendel and we also examine interactions between genes and between genes and environmental factors. The concepts presented here include the following:
❙❙ There may be more than two alleles for a given gene within the population.
❙❙ Dominance of one allele over another may not be complete.
❙❙ Two or more genes may affect a single trait.
❙❙ The expression of a trait may be dependent on the interaction of two or more genes, on the interaction of genes with nongenetic factors, or both.

4.1 Interactions between Alleles Produce Dominance Relationships
Mendel wisely chose to examine traits presenting in one of two easily distinguishable forms. One form of each trait he studied displayed complete dominance over the other form. Complete dominance makes the phenotype of a heterozygous organism indistinguishable from that of an organism homozygous for the dominant allele; thus, only organisms homozygous for the recessive allele display the recessive phenotype. The complete dominance of one allele also results in the exclusive expression of the dominant phenotype among the heterozygous F1 progeny of a cross between pure-breeding homozygous parents, while the F2 progeny display a 3:1 ratio of dominant to recessive phenotypes. We now know that the phenotypes of the seven traits that Mendel studied are controlled by two alternative alleles at seven different genes. For the four traits of Mendel that have been described at the molecular level (see Section 2.6), the dominant alleles produce full function of the gene, while the recessive alleles encode gene products with reduced or no functional activity.
Questions concerning the molecular basis of dominant and recessive alleles drove genetic research in the early and mid-20th century. Questions such as how dominance of an allele could be ascertained, why certain mutations are recessive whereas others are dominant, and whether mutations always cause genes to lose function or whether mutations can impart new or additional functions to alleles were commonly asked.

The Molecular Basis of Dominance
A character is called dominant if the same phenotype is seen in organisms with the homozygous and heterozygous genotypes. The correlative character is called recessive if it is observed only in a single homozygous genotype. In this sense, dominance and recessiveness have a phenotypic basis. The phenotypes are, however, a consequence of the characteristics of proteins produced by the alleles of a gene. In this sense, dominance and recessiveness also have a molecular basis. The dominance of one allele over another
is determined by the protein products of the allele, that is, by the manner in which the protein products of alleles work to produce the phenotype.
Let’s compare two examples to illustrate the molecular basis of dominance and recessiveness. In both examples, a wild-type allele produces an enzyme with full activity and a mutant allele produces either very little enzyme activity or none at all. In the first example the mutant allele is recessive, but in the second example the mutant allele is dominant. Recall from our discussion at the beginning of Section 3.3 that the term wild type derives from the work of Thomas Hunt Morgan, who determined that most flies in his wild populations had the same phenotype. The wild-type trait or allele is the most common allele in a natural (wild) population.

Haplosufficient Wild-Type Allele Is Dominant
In the first example, gene R has a dominant wild-type allele R+ and a recessive mutant allele r. Gene R produces an enzyme that must generate 40 or more units of catalytic activity to drive a critical reaction step. Successful completion of this step produces the wild-type phenotype, whereas failure to complete the step generates a mutant phenotype. Each copy of allele R+ produces 50 units of enzyme activity. The mutant allele r produces no functional enzyme and leads to 0 units of activity. Homozygous R+R+ organisms produce 100 units of enzyme activity (50 units from each copy of R+), far exceeding the minimum required to achieve the wild-type phenotype. Heterozygous organisms (R+r) produce a total of 50 units of enzyme activity, which is sufficient to produce the wild-type phenotype. Homozygous rr organisms produce no enzymatic action, however, and display the mutant phenotype. Based on its ability to catalyze the critical reaction step and produce the wild-type phenotype in either a homozygous (R+R+) or heterozygous (R+r) genotype, R+ is dominant over r. Dominant wild-type alleles of this kind are identified as haplosufficient since one (haplo) copy is sufficient to produce the wild-type phenotype in the heterozygous genotype.

Haploinsufficient Wild-Type Allele Is Recessive
The second example involves gene T, for which the wild-type allele is recessive to a mutant allele. Gene T produces an enzyme required to catalyze a critical reaction step that produces a wild-type phenotype if it is completed. The inability to complete the reaction step results in a mutant phenotype. For the reaction step in question, 18 units of enzyme activity are required. The wild-type allele T1 produces 10 units of activity. A mutant allele, T2, generates 5 units of enzyme activity. Homozygous T1T1 organisms generate 20 units of catalytic enzyme activity, enough to catalyze the critical reaction step and produce the wild-type phenotype. Heterozygous organisms, on the other hand, produce only 15 units of enzymatic activity and have the mutant phenotype because they fall short of the 18 units required to catalyze the reaction step. Similarly, homozygous T2T2 organisms, which produce 10 units of enzyme activity, also have a mutant phenotype. In this case, the mutant allele T2 is dominant over the wild-type allele T1 since both the heterozygous (T1T2) and homozygous (T2T2) organisms have a mutant phenotype. In cases like this, the wild-type allele is identified as haploinsufficient because a single copy is not sufficient to produce the wild-type phenotype in the heterozygous genotype.

Functional Effects of Mutation
The study of mutations and their consequences is a central tool of genetic analysis. In many instances, the study of mutations provides clues to the production of the wild type and to the underlying causes of abnormal outcomes. In the study of mutations, a central question concerns the mechanism through which the mutation disrupts normal (wild-type) gene function and leads to the mutant phenotype.
From a functional perspective, organisms with two copies of the wild-type allele have the wild-type phenotype (Figure 4.1a). The same would be true if an organism had a single copy of a fully dominant wild-type allele. Using the level of activity of the protein products of the wild-type allele as the basis for comparison, mutant alleles can often be placed into either a loss-of-function or a gain-of-function category. A loss-of-function mutation results in a significant decrease or in the complete loss of the functional activity of a gene product. This common mutational category includes mutations like those described in the R-gene and T-gene examples. Loss-of-function mutant alleles are usually recessive, but under certain circumstances, they may be dominant, depending on whether the wild-type allele is haplosufficient or haploinsufficient.
Gain-of-function mutations identify alleles that have acquired a new function or have their expression altered in a way that gives them substantially more activity than the wild-type allele. Gain-of-function mutations are almost always dominant and usually produce dominant mutant phenotypes in heterozygous organisms. As a consequence of their newly acquired functions, certain gain-of-function mutations are lethal in a homozygous state.

Loss-of-Function Mutations
As the previous discussion suggests, mutations resulting in a loss of function vary in the extent of loss of normal activity of the gene product. A loss-of-function mutation that results in a complete loss of gene function in comparison with the wild-type gene product is identified as a null mutation, also known as an amorphic mutation (Figure 4.1b). The word null means “zero” or “nothing,” and the word amorphic means “without form.” These mutant alleles produce no functional gene product and are often lethal in a homozygous genotype. The elimination of functional gene products can result from various types of mutational events, including those that block transcription, produce a gene product that lacks activity, or result in deletion of all or part of the gene.
Alternatively, a mutation resulting in partial loss of gene function may be identified as a leaky mutation, also known
Figure 4.1 The functional consequences of mutation. (a) Wild type. (b), (c), and (d) Loss-of-function mutations.
(e) and (f) Gain-of-function mutations. The “X” indicates the presence of a mutation in a copy of a gene.
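The threshold reasoning in the R-gene and T-gene examples can be sketched in a few lines of Python. This is a minimal illustration, assuming that the activities of the two alleles simply add and that the phenotype is wild type whenever the total meets the stated minimum. The function names and the dictionary layout are hypothetical; only the unit values and thresholds come from the text.

```python
def phenotype(allele_units, genotype, threshold):
    """Classify a genotype by summing the enzyme activity of its two alleles.

    Assumes activities are additive, a simplification used for illustration.
    """
    total = sum(allele_units[allele] for allele in genotype)
    label = "wild type" if total >= threshold else "mutant"
    return label, total


def wild_type_is_haplosufficient(allele_units, wild_type, mutant, threshold):
    """True if a single wild-type copy (the heterozygote) reaches the threshold."""
    label, _ = phenotype(allele_units, (wild_type, mutant), threshold)
    return label == "wild type"


# Unit values and thresholds from the R-gene and T-gene examples in the text.
R_GENE = {"units": {"R+": 50, "r": 0}, "threshold": 40}
T_GENE = {"units": {"T1": 10, "T2": 5}, "threshold": 18}

for genotype in [("R+", "R+"), ("R+", "r"), ("r", "r")]:
    print(genotype, phenotype(R_GENE["units"], genotype, R_GENE["threshold"]))

for genotype in [("T1", "T1"), ("T1", "T2"), ("T2", "T2")]:
    print(genotype, phenotype(T_GENE["units"], genotype, T_GENE["threshold"]))
```

Under these assumptions the R+r heterozygote totals 50 units and is classified as wild type, so R+ behaves as haplosufficient, whereas the T1T2 heterozygote totals 15 units and is classified as mutant, so T1 behaves as haploinsufficient.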
as a hypomorphic mutation (Figure 4.1c). Hypomorphic means “reduced form”; like the term leaky, it implies that a small percentage of normal functional capability is retained by the mutant allele but at a lower level than is found for the wild-type allele. The severity of the phenotypic abnormality depends on the residual level of activity from the leaky mutant allele. A greater percentage of activity from a leaky allele results in a less severely affected phenotype than when the mutation incurs a more substantial loss of function. Both null and hypomorphic loss-of-function mutations are often recessive and homozygous lethal.
Dominant loss-of-function mutations are also known to occur. Some of these produce dominant mutant phenotypes through alterations in the function of a multimeric protein of which the mutant polypeptide forms a part (Figure 4.1d). Multimeric proteins, composed of two or more polypeptides that join together to form a functional protein, are particularly subject to dominant negative mutations as a consequence of some change that prevents the polypeptides from interacting normally to produce a functional protein. A multimeric protein that contains an abnormal polypeptide may suffer a reduction or total loss of functional capacity. Mutations of this kind are dominant due to the substantial loss of function of the multimeric protein (as illustrated in the following paragraph). These mutations are characterized as “negative” due to the spoiler effect of the abnormal polypeptide on the multimeric protein.
An example of dominant negative mutation is seen in the human hereditary disorder osteogenesis imperfecta (OMIM 116200, 116210, and 116220), which is caused by defects in the bone protein collagen and has multiple forms with different severity. Collagen protein is composed of three interwoven polypeptide strands: two polypeptides from the COL1A1 gene and one polypeptide from the COL1A2 gene. The trimeric collagen protein is subject to dominant negative mutation as a consequence of COL1A1 mutations that produce a defective polypeptide. The trimeric structure of collagen and the 2:1 ratio of incorporation of COL1A1 polypeptide over COL1A2 polypeptide means that in individuals who are homozygous wild type for COL1A2 and heterozygous for COL1A1 mutation, most collagen protein contains one or two mutant COL1A1 proteins. As a result, most collagen protein is defective, and osteogenesis imperfecta develops.

Gain-of-Function Mutations
Mutations resulting in a gain of function fall into two categories that depend on the functional behavior of the new mutation. Hypermorphic (“greater than wild-type form”) mutations produce more gene activity per allele than the wild type (Figure 4.1e) and are usually dominant. The gene product of a hypermorphic allele is indistinguishable from that of the wild-type allele, but it is present in a greater amount and thus induces a higher level of activity. The excess concentration is the functional equivalent of overdrive, pushing processes forward more rapidly, at the wrong time, in the wrong place, or for a longer time than normal. Hypermorphic mutants often result from regulatory mutations that increase gene transcription, block the normal response to regulatory signals that silence transcription, or increase the number of gene copies by gene duplication. The phenotypic effect may be more severe in mutation homozygotes than in heterozygotes, but often, particularly in humans, mutant homozygotes are not seen because homozygosity is lethal.
Gain-of-function mutations resulting from neomorphic (“new form”) mutations acquire novel gene activities not found in the wild type (Figure 4.1f) and are usually dominant. The gene products of neomorphic mutants are functional but have structures that differ from the wild-type gene product. The altered structures lead the mutant protein to function differently than the wild-type protein. Homozygotes for a neomorphic allele may exhibit a more severely affected phenotype than do heterozygotes.

Notational Systems for Genes and Allele Relationships
Our description of the molecular basis of dominance and of loss-of-function and gain-of-function mutations provides a conceptual basis for understanding how different patterns of dominance relationships can develop among alleles of a gene. These concepts apply to all diploid organisms, but the various notational systems used to identify genes and alleles in different species do not all depict these relationships in the same ways. Historically, these different gene notation systems developed along species lines due to the propensity of early 20th-century biology to study one species in isolation from other species. Biology today is far more interdisciplinary. For example, in discussing Mendel’s work in Chapter 2, we mostly used a notational system in which an uppercase letter (for example, A) indicates a dominant allele and the same letter in lowercase (a) designates a recessive allele. When the dominance of one allele is not complete, however, a different notational system, one that avoids implying dominance or recessiveness, is used. In this nomenclature system, alleles can be symbolized with either upper- or lowercase letters plus a suffix that may be a number or a letter. Examples of how pairs of alleles with incomplete dominance can be designated are A1 and A2, B1 and B2, d1 and d2, and wa and wb. We apply some of these notational systems in the following section.
It is not surprising that there are a number of different notational systems employed in genetics, involving various uses of italics, capital letters, and symbols such as “+” for wild-type alleles and “-” for mutant alleles. They developed in the early years of genetics research when genetic experiments were being carried out by experts in widely divergent fields of biology with little intercommunication. Geneticists studying fruit flies developed one notation system for identifying wild-type and mutant alleles, geneticists studying yeast developed another, and geneticists studying plants developed another. As the table inside the back cover illustrates, each model organism has its own unique style of
110 CHAPTER 4 Gene Interaction
gene description and nomenclature. These various styles are the conventions we follow throughout this book for discussing the genetics of different model organisms.
Incomplete Dominance
Mendel’s description of inheritance of traits controlled by single genes having a dominant and a recessive allele is a simple hereditary process that is relatively rare in nature. More commonly with single-gene traits, the dominance of one allele over another is not complete but instead is described as incomplete dominance, also known as partial dominance. When incomplete dominance exists among alleles, the phenotype of the heterozygous organism is distinctive; it falls somewhere on a phenotypic continuum between the phenotypes of the homozygotes and is typically more similar to one homozygous phenotype than the other. When traits display incomplete dominance, two pure-breeding parents with different phenotypes produce F1 heterozygotes having a phenotype different from that of either parent.
One of the many traits displaying incomplete dominance is the trait described as flowering time in Mendel’s pea plants (Pisum sativum). In peas, the first appearance of flowers is under the genetic control of a gene that we will call T, for flowering time. The earliest-flowering strain of pea plants has the homozygous genotype T1T1; the flowering time of this strain is described as day 0.0. The latest-flowering strain is homozygous T2T2, and it flowers 5.2 days later on average than T1T1 plants. A cross of pure-breeding early-flowering and late-flowering strains produces T1T2 heterozygous progeny that begin to flower 3.7 days later on average than the earliest-flowering strain (Figure 4.2a).
Genetic crosses show that flowering time is controlled by a single locus. Self-fertilization of T1T2 plants produces a 1:2:1 ratio of early-, intermediate-, and late-flowering progeny (Figure 4.2b). We say the T2 allele is partially dominant, but not completely dominant, to T1 because the heterozygous phenotype is distinct from either homozygous phenotype but more closely resembles the late-flowering strain.
Codominance
Codominance, like incomplete dominance, leads to a heterozygous phenotype different from the phenotype of either homozygous parent. Unlike incomplete dominance, however, codominance is characterized by the detectable expression of both alleles in heterozygotes. Codominance is most clearly identified when the protein products of both alleles are detectable in heterozygous organisms, typically by means of some sort of molecular analysis or a biochemical assay that can distinguish between the different proteins. An example of codominance is presented in the following discussion of ABO blood type.
Dominance Relationships of ABO Alleles
More than one pattern of dominance between the alleles of a gene can occur under certain circumstances. Here we examine the codominance of two alleles and the recessiveness of a third allele of the gene determining human ABO blood type.
All of us have one of the four common blood types—type O, type A, type B, or type AB—that result from our genotype at the ABO blood group gene located on chromosome 9 (OMIM 110300).
The three alleles of the ABO gene are identified as I^A, I^B, and i, and the four blood groups are phenotypes produced by six genotypes. On the basis of genotype–phenotype (i.e., blood type) correlation, geneticists have concluded that I^A and I^B have complete dominance over i, and that I^A and I^B are codominant to one another. The complete dominance of I^A and I^B to i is indicated by the identification of blood type A in individuals whose genotype is I^A I^A or I^A i, and of blood type B in individuals whose genotype is I^B I^B or I^B i. The completely recessive nature of the i allele is confirmed by the observation that only ii homozygotes have blood type O. Lastly, codominance of I^A and I^B to one another is confirmed by the observation that blood type AB occurs only in individuals who have the heterozygous genotype I^A I^B.
Determining ABO Blood Type ABO blood type is identified by an antigen–antibody reaction on a microscope slide. The test involves placing a drop of blood into a drop of anti-A antiserum in one well of a microscope slide and placing another drop of blood into anti-B antiserum in the other well of the slide. The two antisera contain antibodies, molecules produced by the immune system that bind to a specific antigen (for each kind of antibody there is a specific antigen). Each antigen in the case of ABO blood type is a carbohydrate group (sugar) embedded on the surface of red blood cells. A positive reaction occurs when an antibody
[Figure: Biosynthesis of the ABO antigens. The H antigen, an oligosaccharide attached to a lipid, is produced by action of the H gene. Enzyme products of the ABO gene can modify the H antigen. A-transferase, encoded by I^A, adds N-acetylgalactosamine to the H antigen to convert it to A antigen. B-transferase, encoded by I^B, adds galactose to the H antigen, converting it to B antigen. No functional transferase is encoded by i, so the H antigen is unmodified.]
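The relationship between ABO genotype, red cell antigens, and antiserum reactions can be sketched in a few lines of Python. This is a minimal illustration rather than a laboratory protocol: the allele names "IA", "IB", and "i" and both function names are labels chosen here for the example, and a "reaction" simply means that the antibody's target antigen is present on the cells.

```python
from itertools import combinations_with_replacement

# Antigen produced from the H antigen by each allele's transferase.
# None marks the null allele i, which encodes no functional enzyme.
# (Allele labels are illustrative stand-ins for I^A, I^B, and i.)
ANTIGEN = {"IA": "A", "IB": "B", "i": None}

def antigens_present(genotype):
    """Set of ABO antigens on red cells for a two-allele genotype."""
    return {ANTIGEN[allele] for allele in genotype} - {None}

def abo_blood_type(genotype):
    """Blood type implied by the antigens present (O if none)."""
    return "".join(sorted(antigens_present(genotype))) or "O"

def antiserum_reactions(genotype):
    """Whether anti-A and anti-B antisera find their target antigen."""
    found = antigens_present(genotype)
    return {"anti-A": "A" in found, "anti-B": "B" in found}

# Three alleles give six possible genotypes and four blood types.
genotypes = list(combinations_with_replacement(["IA", "IB", "i"], 2))
for g in genotypes:
    print(g, abo_blood_type(g), antiserum_reactions(g))
```

Because a single copy of I^A or I^B is enough to place its antigen on the cell surface, the lookup needs only to ask which antigens are present, which is why heterozygotes with i are indistinguishable from the corresponding homozygotes in this test.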
a different sugar, galactose, and produces a six-sugar oligosaccharide known as the B antigen. Molecular analysis reveals that the A and B alleles differ in several nucleotides, causing four amino acids of the resulting transferase enzymes to differ and leading to differences in enzymatic activity. In contrast, the i allele is due to a single base-pair deletion and is a null allele that does not produce a functional gene product capable of adding a sixth sugar to the H antigen.
At the cellular level, anti-A antibody recognizes the N-acetylgalactosamine addition mediated by I^A, and anti-B antibody identifies the galactose addition produced by the action of I^B. Neither of these antibodies has any reactivity with the unmodified H antigen, so unmodified H antigen, present in individuals with blood type O, is not recognized by either antibody. One copy of the I^A or the I^B allele in a genotype is sufficient to produce an ABO antigen detectable by the corresponding antibody; and both I^A and I^B are dominant to i, since I^A and I^B produce enzymes that modify the H antigen but i does not. When the I^A I^B genotype is present, on the other hand, both A-transferase and B-transferase are produced, resulting in the addition of N-acetylgalactosamine to some H antigens and the addition of galactose to other H antigens. In this case, all red blood cells carry both types of H-antigen modifications; about one-half of the red cell surface antigens are A antigens, and the rest are B antigens. As a result, the action of both alleles is detected in the phenotype, leading to the conclusion that I^A and I^B are codominant to one another.
Many nonhuman primates have a blood group system that is essentially identical to the human ABO blood group system. ABO blood groups have been identified in the great apes (chimpanzee, gorilla, and orangutan) as well as in numerous Old World monkey species, including macaques (genus Macaca) and baboons (genus Papio). Two important evolutionary observations derive from this finding. First, the ABO blood group is a long-standing feature of the immune system genetics in primates, one that evolved early in the ancestral history of primates and was retained over tens of millions of years as primates diversified. Second, the retention of the ABO blood group system in primates demonstrates the importance of this immune system response in protecting primates from infectious and foreign antigens. Natural selection has played a preeminent role in maintaining this system. The ABO blood group genes are one example of the shared evolutionary history that can be identified through the examination of the taxonomic distribution of genes in lineages. Genetic Analysis 4.1 examines
GENETIC ANALYSIS 4.1
PROBLEM The MN blood group in humans is an autosomal codominant system with two alleles, M and N. Its three blood group phenotypes, M, MN, and N, correspond to the genotypes MM, MN, and NN. The ABO blood group assorts independently of the MN blood group.
A male with blood type O and blood type MN has a female partner with blood type AB and blood type N. Identify the blood types that might be found in their children, and state the proportion for each type.
BREAK IT DOWN: The discussion on p. 113 about the relationships among ABO alleles will help you to identify the parental genotypes from the phenotypes given here.
BREAK IT DOWN: Alleles of the ABO system have both dominant-recessive and codominant relationships (p. 114).
Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
   The problem concerns the inheritance of two blood types. The gene determining ABO blood type carries three alleles: I^A and I^B are codominant to one another and dominant to i. The MN blood group gene carries two alleles that are codominant. The answer requires finding the possible blood types, and their expected proportions, of the children of parents whose blood types are given.
2. Identify the critical information given in the problem.
   The blood types of the parents are given.
Deduce
3. Deduce the blood group genotypes of the male parent.
   TIP: Blood type O is the recessive phenotype, and blood type MN is due to codominance of alleles.
   The male has blood types O and MN. Type O results from homozygosity for the recessive i allele, whereas MN is produced in heterozygotes carrying both alleles. The male genotype is ii MN.
4. Deduce the blood group genotypes of the female parent.
   TIP: Blood type AB is due to codominance, and blood type N is due to homozygosity.
   The female has blood groups AB and N. The AB blood type is found in heterozygotes, and blood type N in homozygotes. The female blood group genotype is I^A I^B NN.
Solve
5. Identify the gamete genotypes and their frequencies for the male.
   Independent assortment predicts two gamete genotypes for the male: All gametes contain i, half carry M, and half carry N.
6. Identify the female gamete genotypes and their frequencies.
   Independent assortment predicts two gamete genotypes for the female: All gametes contain N, half contain I^A, and half contain I^B.
7. Predict the progeny genotypes and phenotypes.
   TIP: Use a Punnett square to evaluate this cross.
   Blood types A and B are each expected in 50% of the offspring of this cross, as are blood types MN and N. Four different blood group phenotypes, each with an expected frequency of 25%, are predicted.
                    Male gametes
   Female gametes   M i                      N i
   N I^A            MN I^A i                 NN I^A i
                    Blood types: MN and A    Blood types: N and A
   N I^B            MN I^B i                 NN I^B i
                    Blood types: MN and B    Blood types: N and B
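The Punnett square above can also be checked by enumerating gametes in code. The following Python sketch assumes independent assortment of the two genes and equal gamete frequencies; the single-letter allele labels "A", "B", and "i" stand in for I^A, I^B, and i, and the function names are chosen only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def gametes(abo_alleles, mn_alleles):
    """Equally likely gametes: one ABO allele paired with one MN allele."""
    return list(product(abo_alleles, mn_alleles))

def abo_type(a1, a2):
    """A and B are codominant; i is recessive to both."""
    antigens = {a for a in (a1, a2) if a != "i"}
    return "".join(sorted(antigens)) or "O"

def mn_type(m1, m2):
    """M and N are codominant, so the phenotype lists the alleles present."""
    return "".join(sorted({m1, m2}))

def cross(male, female):
    """Expected phenotype proportions among offspring of two parents."""
    counts = Counter()
    for (a1, m1), (a2, m2) in product(gametes(*male), gametes(*female)):
        counts[(abo_type(a1, a2), mn_type(m1, m2))] += 1
    total = sum(counts.values())
    return {pheno: Fraction(n, total) for pheno, n in counts.items()}

male = (("i", "i"), ("M", "N"))      # genotype ii MN
female = (("A", "B"), ("N", "N"))    # genotype I^A I^B NN
offspring = cross(male, female)
for phenotype, share in sorted(offspring.items()):
    print(phenotype, share)
```

Counting every gamete pairing, rather than only the distinct gamete types, keeps the proportions correct even when a parent is homozygous at one of the genes.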
For more practice, see Problems 6, 9, and 31. Visit the Study Area to access study tools. Mastering Genetics
the inheritance of blood group phenotypes, where alleles have a variety of dominance relationships.
Allelic Series
Diploid genomes contain pairs of homologous chromosomes; thus, each individual organism can possess at most two alleles at a locus. In populations, however, the number of alleles is theoretically unlimited, and some genes have scores of alleles. At the population level, a locus possessing three or more alleles is said to have multiple alleles; and like the ABO gene, many multiallelic genes display a variety of dominance relationships among the alleles. Commonly, an order of dominance emerges among the alleles, based on the activity of each allele’s protein product, forming a sequential series known as an allelic series. Alleles in an allelic series can be completely dominant or completely recessive, or they can display various forms of incomplete dominance or codominance.
The C-Gene System for Mammalian Coat Color
Genetic analysis of coat color in mammals reveals that many genes are required to produce and distribute pigment to the hair follicles or skin cells, where they are displayed as coat color or skin color. Although various interactions among these genes can modify color expression, we focus
here on just one gene, the C (color) gene that is responsible for coat color in mammals such as cats, rabbits, and mice. This gene has dozens of alleles that have been identified over nearly a century of genetic analysis, but we limit our discussion to just four alleles that form an allelic series. The C gene produces the enzyme tyrosinase, which is active in the first two steps of a multistep biochemical pathway that synthesizes the pigment melanin, which imparts coat color in furred mammals and skin color in humans. In the initial melanin pathway steps, tyrosinase is responsible for the breakdown (catabolism) of the amino acid tyrosine.
The C-gene alleles form an allelic series that is revealed by the phenotypes of offspring of various matings. Allele C is dominant to all other alleles of the gene, and any genotype with at least one copy of C produces wild-type coat color. These genotypes are written as C– to indicate that regardless of the second allele in the genotype, the phenotype is dominant. Three other alleles, producing tyrosinase enzymes with reduced or no tyrosinase activity, form an allelic series with C (Figure 4.5). The allele c^ch in homozygotes produces a phenotype called chinchilla, a diluted coat color. This allele is hypomorphic and generates reduced coat color as a result of the reduced level of activity of the gene product. The c^h allele in homozygotes produces the Himalayan phenotype, characterized by fully pigmented extremities (paws, tail, nose, and ears) but virtually absent pigmentation on other parts of the body. This allele is temperature sensitive, as we describe momentarily. Finally, the c allele produces a protein product with no enzymatic activity. This is a fully recessive null (amorphic) allele that does not produce a functional gene product. Homozygosity for this allele produces an albino phenotype.
Crosses between animals with different genotypes at the C gene indicate the dominance relations of the alleles. For example, in Crosses A, B, and C in Figure 4.6, complete dominance of C over other alleles in the series is demonstrated by the finding that all of the progeny of an animal with the genotype CC have full color, regardless of the genotype of the mate. The dominance order of alleles in the series is revealed by the pattern of 3:1 ratios obtained from crosses of various heterozygous genotypes shown in Figure 4.6. Cross D shows that chinchilla is completely dominant over albino. Himalayan, too, is completely dominant over albino (Cross E). Cross F shows that the chinchilla allele (c^ch) is partially dominant over the Himalayan allele (c^h). Note the F2 of this cross have a 1:2:1 ratio of phenotypes, with the heterozygous F2 displaying Himalayan markings and dilute coat color over the rest of the body that are both somewhat lighter than in their homozygous counterparts. The dominance relationships within this allelic series can be expressed as C > c^ch > c^h > c.
The Molecular Basis of the C-Gene Allelic Series Tyrosinase enzymes produced by different C-gene alleles have distinctive levels of catabolic activity that are the basis for the dominance relationships between the alleles. The allele C is a dominant wild-type allele producing fully active tyrosinase that is defined as 100% activity. The percentage of wild-type tyrosinase activity produced by each allele explains the order observed for the allelic series. Biochemical examination reveals that the enzyme produced by the c^ch hypomorphic allele has much less activity than the wild-type enzyme. In the homozygous c^ch c^ch genotype or heterozygous genotypes c^ch c^h or c^ch c, only a small amount of melanin is synthesized. This leads to a decreased amount of pigment, and it has the effect of muting the coat color, more so in heterozygous genotypes, where just one c^ch allele is present, than in the c^ch c^ch homozygous genotype, where two alleles are present.
The tyrosinase enzyme produced by the hypomorphic c^h (Himalayan) allele is unstable and is inactivated at a temperature very near the normal body temperature of most mammals. This type of gene product is an example of a temperature-sensitive allele. Cats with the Siamese
[Figure 4.6 shows six crosses, A to F, each carried from pure-breeding parents (P) through the F1 to the F2.]
Cross A: P CC (full color) × c^ch c^ch (chinchilla). F1: C c^ch (full color). F2: CC, C c^ch, C c^ch (full color) and c^ch c^ch (chinchilla).
Cross B: P CC (full color) × c^h c^h (Himalayan). F1: C c^h (full color). F2: CC, C c^h, C c^h (full color) and c^h c^h (Himalayan).
Cross C: P CC (full color) × cc (albino). F1: Cc (full color). F2: CC, Cc, Cc (full color) and cc (albino).
Cross D: P c^ch c^ch (chinchilla) × cc (albino). F1: c^ch c (chinchilla). F2 gametes: c^ch and c.
Cross E: P c^h c^h (Himalayan) × cc (albino). F1: c^h c (Himalayan). F2 gametes: c^h and c.
Cross F: P c^ch c^ch (chinchilla) × c^h c^h (Himalayan). F1: c^ch c^h (chinchilla). F2 gametes: c^ch and c^h.
Figure 4.6 The genetics of C gene dominance. Crosses A to F illustrate the complete dominance of C, the recessiveness of c, and the incomplete dominance of c^ch over c^h. Dominance in this allelic series is C > c^ch > c^h > c.
Q Based on the activities of C gene alleles described in this chapter, explain why one-half of the F2 progeny shown in Cross F have chinchilla fur and dark paws, nose, and ears.
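The dominance order in this allelic series lends itself to a short Python sketch that predicts F2 phenotype counts for the crosses in Figure 4.6. It is a simplification under stated assumptions: phenotype is assigned from the higher-ranked allele alone, the one case of incomplete dominance (c^ch with c^h) is given the label used in the figure question, and the lighter shade of c^ch c relative to c^ch c^ch is ignored. The allele labels "cch" and "ch" and the function names are chosen here for illustration.

```python
from collections import Counter
from itertools import product

# Dominance order of the four C-gene alleles, most dominant first.
ORDER = ["C", "cch", "ch", "c"]
PHENOTYPE = {"C": "full color", "cch": "chinchilla",
             "ch": "Himalayan", "c": "albino"}

def phenotype(genotype):
    """Phenotype of a two-allele genotype under the series C > cch > ch > c."""
    top, other = sorted(genotype, key=ORDER.index)
    if (top, other) == ("cch", "ch"):
        # Incomplete dominance: heterozygote differs from both homozygotes.
        return "chinchilla with dark extremities"
    return PHENOTYPE[top]

def f2_counts(allele1, allele2):
    """F2 phenotype counts after crossing two pure-breeding strains.

    The F1 is heterozygous allele1/allele2; the F2 comes from F1 x F1.
    """
    f1_gametes = (allele1, allele2)
    return dict(Counter(phenotype(pair)
                        for pair in product(f1_gametes, f1_gametes)))

for cross in [("C", "cch"), ("C", "ch"), ("C", "c"),
              ("cch", "c"), ("ch", "c"), ("cch", "ch")]:
    print(cross, f2_counts(*cross))
```

Five of the six crosses yield a 3:1 split, while the c^ch × c^h cross yields 1:2:1, which is the pattern the text uses to distinguish complete from incomplete dominance within the series.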
Lethal Alleles
Certain single-gene mutations are so detrimental that they cause death early in life or terminate gestational development. These life-ending mutations affect genes whose products are essential to life. Homozygosity for mutation of these essential genes is lethal, and the mutations are identified as lethal alleles. As a rule, recessive lethal alleles have low frequencies in populations, although they may persist in some populations over a long period of time. Natural selection can eliminate copies of the allele when they occur in homozygous genotypes; however, recessive lethal alleles are “hidden” by dominant wild-type alleles in heterozygous genotypes, thus evading natural selection.
Detection in Plants In flowering plants, the effects of lethal alleles can be observed directly either as embryonic lethals that fail to produce homozygous lethal progeny or as gametophytic lethals that fail to generate lethal allele–carrying gametes (Figure 4.7). For example, mutation of the RPN1a gene that encodes a subunit of the 26S proteasome, a multiprotein complex involved in protein degradation, has produced a loss-of-function null allele (rpn1a) that results in embryonic lethality in Arabidopsis thaliana and other plant species. In an RPN1a/rpn1a × RPN1a/rpn1a cross, a 3:1 segregation ratio of living seeds (RPN1a/_) to dead seeds (rpn1a/rpn1a) can be observed in the fruit. When the living seeds are planted, approximately two-thirds are heterozygous for the lethal allele (RPN1a/rpn1a) and one-third are homozygous for the wild-type allele (RPN1a/RPN1a).
Figure 4.7 Evidence of lethal mutations in plants. Embryonic lethality is detected by observing a 3:1 ratio of viable to nonviable seeds, and gametophytic lethality is detected by observing a 1:1 ratio. Arrows indicate undeveloped seeds.
Lethal mutations that result in female gametophytic lethality are also detectable in flowering plants. Consider a plant heterozygous for a female gametophytic allele, FER/fer, in which the wild-type FER allele was derived from its mother and the mutant fer allele came from its father. During megasporogenesis, one-half of all megaspores will inherit the FER allele and the other half will inherit the fer allele. Embryo sacs derived from megaspores inheriting the fer allele will die, so that only one-half of all ovules develop into seeds. The alleles segregate in a 1:1 ratio that is observed among the developing seeds in a fruit. Note that the 1:1 ratio is a direct observation of Mendelian ratios in the haploid gametes of a heterozygous organism. Thus, a 1:1 ratio distinguishes female gametophytic lethality from embryonic lethality, which results in a 3:1 ratio among seeds. Plants usually produce pollen in excess, similar to the excess of sperm production relative to egg production in animals, and so male gametophytic lethality is not observable by looking at developing seeds in the fruit. It can be detected, however, by looking for plants in which half of all the pollen grains are dead.
Detection in Animals In contrast to their detection in flowering plants, lethal alleles in animals are usually detected by a distortion in segregation ratios due to failure to produce the affected category of progeny. The first case of a lethal allele was identified in 1905 by Lucien Cuenot, who studied a lethal mutation in mice carrying a dominant mutation for yellow coat color. In mice, wild-type
coat color is a brown color, called “agouti” (a-GOO-tee), produced by the presence of yellow and black pigments in each hair shaft (Figure 4.8a). Agouti hairs are black at the base and tip, with yellow pigment in the central portion of the shaft. Yellow coat color is seen when yellow pigment is deposited along the entire length of the hair shaft, not just in the middle portion as it is in agouti (Figure 4.8b). The Agouti gene is one of the pigment-producing genes found in mammals with furry coats. It produces a yellow pigment called pheomelanin that is found in the hairs of mammalian coats. An independently assorting gene produces the black pigment that is also visible in the hair shafts in Figure 4.8a. The wild-type allele for agouti coat color is designated A, and its normal activity leads to the production of a moderate amount of yellow pigment. The mutant allele, designated A^Y, is a hypermorphic allele. It is a dominant gain-of-function mutation that produces substantially more yellow pigment than
Figure 4.8 Coat color in mice. (a) Wild-type agouti coat color is a mixture of black and yellow pigment in hair shafts. (b) Yellow coat occurs when yellow pigment produced by the overly active mutant allele A^Y displaces black pigment.
[Figure 4.9: (a) P AA (agouti) × AA^Y (yellow); progeny are AA (agouti) and AA^Y (yellow), with 1/2 AA agouti. (b) P AA^Y (yellow) × AA^Y (yellow); progeny are AA (agouti), AA^Y (yellow), AA^Y (yellow), and A^Y A^Y (lethal), with 1/3 AA agouti.]
information, two important observations about the genetics of the yellow allele can be made. First, mating an agouti mouse and a yellow mouse will always result in a 1:1 ratio of agouti and yellow among progeny (Figure 4.9a). Second, crosses between two yellow mice (both of which are necessarily heterozygous) produce evidence of the recessive lethal nature of the A^Y allele (Figure 4.9b). The outcome of these crosses is a 2:1 ratio of yellow to agouti, rather than the 3:1 ratio that is anticipated when heterozygotes expressing a dominant allele are crossed. The genetic interpretation of this observation is that alleles of heterozygous yellow mice segregate normally in gamete formation and unite at random to produce a 1:2:1 ratio at conception, but that A^Y A^Y zygotes do not survive gestation. Recessive lethality of A^Y prevents embryonic development of homozygotes, eliminating that class among progeny and resulting in the 2:1 ratio seen among progeny of heterozygous parents.
Nearly a century after Cuenot first identified homozygous lethality of the mutant A^Y allele, the molecular basis of the lethality was identified. Much to the surprise of geneticists, the lethality had little to do with yellow coat color itself; instead, yellow coat was an almost inadvertent consequence of a mutation that deleted part of a gene near the coat-color gene.
The mutation producing the A^Y allele results from a deletion that affects two genes, the Agouti gene and a neighboring gene identified as Raly. Raly produces a protein that is essential for mouse embryo development. Each of these genes has its own promoter. The wild-type Raly promoter
drives a high level of transcription, whereas the Agouti gene promoter is considerably less active (Figure 4.10). The dominant mutation producing yellow coat color comes about by a deletion of approximately 120,000 base pairs that deletes the entire Raly gene and the Agouti gene promoter, thus bringing the Agouti gene under the control of the Raly promoter, leading to a mutant hypermorphic agouti allele. The Raly promoter drives a high level of Agouti gene transcription that results in excess yellow pigment that displaces black pigment in hair shafts and leads to the mutant yellow phenotype. At the same time the absence of the Raly gene means the mutant allele fails to produce the Raly protein. Heterozygotes with the AA^Y genotype have yellow coats and survive due to haplosufficiency of the single copy of Raly. Homozygous A^Y A^Y mice are unable to produce the essential protein product from the Raly gene and fail to develop, resulting in the skewed 2:1 Mendelian ratio that characterizes the progeny of two heterozygous yellow-coated mice.
An Allele That Is Both Dominant and Recessive The A^Y allele is an example of an allele that can be classified as both dominant and recessive. This may sound confusing and contradictory, but it is based on the phenotypes produced by genotypes of the Agouti gene. We refer to the mutant allele as dominant or as recessive depending on the particular phenotype we happen to be examining.
When we look at the ratio of agouti versus yellow coat color among the progeny produced by a yellow mouse mating with an agouti mouse, we see a 1:1 ratio that indicates dominance of the mutant allele over the wild-type allele. Dominance in this instance is due to the gain-of-function of yellow pigment by the mutant allele. If, on the other hand, we look at the ratio of progeny with yellow versus agouti coat color in the cross of two yellow mice, we see a 2:1 ratio that is the result of the homozygous lethality of the mutant allele. In this context, lethality only affects homozygotes, and the mutant allele is recessive to the wild type. This relationship is due to the loss of function of the Raly gene caused by its deletion. We have, therefore, the odd circumstance of one mutant allele that is both dominant and recessive, depending on how its phenotypic effect is examined.
Delayed Age of Onset
From an evolutionary perspective, it is easy to understand that a dominant lethal allele can be efficiently eliminated by the action of natural selection when it is expressed during gestation or very early in life. Even so, there are numerous examples of dominant lethal hereditary conditions, and a pertinent evolutionary genetic question concerns how these mutations persist in populations. One reason, in the case of a small number of dominant lethal alleles, is that they sidestep natural selection by having a delayed age of onset; the abnormalities they produce do not appear until after affected organisms have had an opportunity to reproduce and transmit the mutation to the next generation.
One well-characterized human hereditary disorder displaying delayed age of onset of a dominant lethal allele is the condition called Huntington disease (HD). This progressive neuromuscular disorder, usually fatal within 10 to 15 years of diagnosis, is caused by mutation of a gene near one end of chromosome 4. The HD mutant allele persists in the population because symptoms do not begin in about half of all cases until the person’s late thirties or early forties, well after most people have begun having children (Figure 4.11).
Functionally, the onset of symptoms of HD is delayed because the symptoms are due to neuron death, which usually takes place over an extended period of time that often stretches over several decades.
[Figure 4.10: The A allele carries the Raly promoter and Raly gene followed by the Agouti promoter and Agouti gene. Chromosomes carrying wild-type A alleles produce Raly protein required for mouse embryonic development, and a moderate amount of yellow pigment. In the A^Y allele, 120,000 base pairs are deleted by mutation, leaving the Raly promoter adjacent to the Agouti gene. Chromosomes carrying the mutant A^Y allele produce no Raly protein and a very high level of yellow pigment due to the hypermorphic mutation.]
Figure 4.10 Mutation of Raly and Agouti producing yellow coat.
Q Refer back to Figure 4.1. Using the letters (a) through (f) in that figure, identify the type of mutation causing yellow coat color and the type of mutation producing lethality.
4.2 Some Genes Produce Variable Phenotypes
To interpret phenotype ratios and identify the distribution of genotypes among phenotypic classes, geneticists make the assumption that phenotypes differ because their underlying genotypes differ. This assumption is valid only to the extent that a particular genotype always produces the same phenotype. If the correspondence between genotype and phenotype holds true in every case, the trait is identified as having complete penetrance. When the correspondence between genotype and phenotype does not consistently hold true—if instead the same genotype can produce different phenotypes—the usual reasons are gene–environment
interaction or interactions with alleles of other genes in the genome.
Figure 4.11 The age-of-onset curve for Huntington disease (HD). Horizontal axis: age (years), 0 to 80.
In this section, we describe four phenomena in which a certain genotype does not always produce the same
phenotype. We first discuss sex-limited traits and sex-influenced traits, two categories of traits in which
the sex of the organism influences how certain genotypes are expressed. In these cases, the hormonal
environment is the critical factor influencing phenotypic expression of the genotypes. The other two
phenomena, referred to as incomplete penetrance and variable expressivity, are circumstances in which
phenotypic variation among organisms with the same genotype is due to some sort of unspecified or unknown
genetic or environmental interaction.
Sex-Limited Traits
The sex of an organism can exert an influence on its gene expression, due to the differences in hormone
profiles that characterize males and females of a species. These sex-dependent differences amount to
expressing genes in different environments. One form such influence can take is described as sex-limited
traits. Both sexes typically carry the genes for sex-limited traits, but the genes produce a phenotype in
just one sex.
In mammals, for example, the development of breasts and the ability to produce milk are traits limited to
females. Horn development is a trait limited to males in some species of sheep, cows, and other hoofed
animals. Behavioral traits in some species, particularly traits related to mating, are also strongly
influenced by sex. For example, the courtship behavior of crowned cranes includes an elaborate display of
body positioning, neck intertwining, and vocalization that is performed differently by males and females
of the species. In the case of male canary vocalization, changes in male singing patterns are initiated in
late winter by an increase in male hormones released by the brain in response to increased day length and
warmer temperatures. In this case, male hormones are thought to stimulate enlargement of the testes and
increased production of
Sex-Influenced Traits
Sex-influenced traits are those in which the inheritance pattern for a trait in one sex differs from the
inheritance pattern for the trait in the other sex, even when the genotype is the same. As with sex-limited
traits, hormones influence this pattern of differential gene expression between the sexes.
The appearance of a chin beard versus the absence of a beard, the beardless phenotype, in certain goat
breeds is an example of a sex-influenced trait. Bearding is inherited as an autosomal trait determined by
two alleles, B1 and B2, which are present in three genotypes in each sex. In both sexes, B1B1 homozygotes
are beardless, and homozygotes of either sex with the B2B2 genotype are bearded. It is thought that
androgenic hormones are a principal factor influencing the bearded phenotype. The effect of different
levels of androgenic hormones on bearding in the sexes is seen by comparing females and males with the
heterozygous genotype (B1B2). Heterozygous males have a beard, whereas heterozygous females are beardless.
Figure 4.12 illustrates the results of a cross between two heterozygotes that produces different ratios of
bearded to beardless males and females. Mendelian inheritance occurs, but as a consequence of
sex-influenced expression, the cross yields a 3:1 ratio of bearded to beardless males and a 3:1 ratio of
beardless to bearded females. In short, the dominance relationship of these alleles varies with sex. Allele
B1 is dominant to B2 in females, since females that are heterozygous B1B2 have the same beardless phenotype
as do B1B1 females. On the other hand, allele B2 is dominant over B1 in males since heterozygotes are
bearded just like B2B2 homozygotes. Analogous to the classification of the
Figure 4.12 Sex-influenced inheritance of beard appearance in goats. Dominance of the B1 and B2 alleles is
expressed differently in males and females.
Cross: B1B2 (beardless female) × B1B2 (bearded male)
        B1                                         B2
B1      B1B1: females beardless, males beardless   B1B2: females beardless, males bearded
B2      B1B2: females beardless, males bearded     B2B2: females bearded, males bearded
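The 3:1 ratios described for the goat cross can be checked by enumerating the offspring of B1B2 × B1B2. The following is a minimal sketch, not part of the original text: the function names `beard_phenotype` and `cross_ratios` are illustrative, and the phenotype rule is taken directly from the description above (B1B1 beardless, B2B2 bearded, B1B2 bearded only in males).

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def beard_phenotype(genotype, sex):
    """Apply the rule stated in the text: B1B1 is beardless, B2B2 is bearded,
    and the B1B2 heterozygote is bearded in males but beardless in females."""
    n_b2 = genotype.count("B2")
    if n_b2 == 0:
        return "beardless"
    if n_b2 == 2:
        return "bearded"
    return "bearded" if sex == "male" else "beardless"


def cross_ratios(parent1=("B1", "B2"), parent2=("B1", "B2")):
    """Enumerate the equally likely gamete pairings and tally phenotypes per sex."""
    offspring = [tuple(sorted(pair)) for pair in product(parent1, parent2)]
    ratios = {}
    for sex in ("male", "female"):
        counts = Counter(beard_phenotype(g, sex) for g in offspring)
        ratios[sex] = {p: Fraction(n, len(offspring)) for p, n in counts.items()}
    return ratios


if __name__ == "__main__":
    for sex, freqs in cross_ratios().items():
        print(sex, {phenotype: str(f) for phenotype, f in freqs.items()})
```

The genotype classes segregate 1:2:1 in both sexes; only the mapping from the heterozygote to a phenotype differs, which is what turns one Mendelian ratio into two opposite 3:1 phenotype ratios.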
120 CHAPTER 4 Gene Interaction
Incomplete Penetrance
When the phenotype of an organism is consistent with the organism’s genotype, the organism is said to be
penetrant for the trait. In such a case, if the organism carries a dominant allele for the trait in
question, the dominant phenotype is displayed. Sometimes an organism with a particular genotype fails to
produce the corresponding phenotype, in which case the organism is nonpenetrant for the trait. Traits for
which a genotype is always expressed in the phenotype are identified as fully penetrant. In contrast,
traits that are nonpenetrant in some individuals are characterized as displaying incomplete penetrance.
The human condition known as polydactyly (“many digits”) is an autosomal dominant condition that displays
incomplete penetrance. Individuals with polydactyly have more than five fingers and toes; the most common
alternative number is six (Figure 4.13). Polydactyly occurs in hundreds of families around the world, and
in these families the dominant allele is nonpenetrant in about 25–30% of individuals who carry it. Most
people who carry the dominant mutant polydactyly allele have extra digits; but at least one in four people
with the mutant allele do not have extra digits and instead express the normal five digits. The gene
mutated to produce polydactyly was recently identified (see Chapter 18).
Figure 4.13 Polydactyly is an autosomal dominant trait with incomplete penetrance.
Figure 4.14 shows a family in which polydactyly segregates as a dominant mutation. Nine individuals in the
family carry a copy of the polydactyly allele. Six of them are penetrant for the phenotype (meaning that
they express the phenotype), but at least three family members (II-6, II-10, and III-10) are nonpenetrant.
Each of these individuals has a child or grandchild with polydactyly; thus, each carries the dominant
allele for polydactyly but is nonpenetrant for the condition. When nonpenetrant individuals are relatively
common, the magnitude of frequency of penetrance can be quantified. Penetrance values vary between
families, but for the family shown in Figure 4.14, the penetrance of polydactyly is 6/9, or 66.7%, which
is about the average seen worldwide among hundreds of families with polydactyly.
Variable Expressivity
Sometimes the discrepancy between genotype and phenotype is a matter of the degree or specific
manifestation of expression of a trait rather than presence or absence of the trait altogether. In the
phenomenon of variable expressivity, the same genotype produces phenotypes that vary in the degree or form
of expression of the allele of interest.
Waardenburg syndrome is a human autosomal dominant disorder displaying variable expressivity. Individuals
with Waardenburg syndrome may have any or all of four principal features of the syndrome: (1) hearing loss,
(2) different-colored eyes, (3) a white forelock of hair, and (4) premature graying of hair. In the
pedigree shown in Figure 4.15, notice that the circles and squares representing
Generation I: 1 2
Generation II: 1 2 3 4 5 6* 7 8 9 10* 11
Generation III: 1 2 3 4 5 6 7 8 9 10* 11 12 13 14
Generation IV: 1 2 3 4 5
* Nonpenetrant individual
Figure 4.14 Incomplete penetrance for polydactyly. Three nonpenetrant individuals (II-6, II-10, and III-10) are seen in this family.
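The penetrance figure quoted for this family is simple arithmetic: the number of allele carriers who express the trait divided by the number of carriers. A minimal sketch follows, using only the counts given in the text (nine carriers, three of them nonpenetrant); the `penetrance` helper and the `family` dictionary are illustrative names, not part of the original.

```python
from fractions import Fraction


def penetrance(carriers, expressing):
    """Fraction of allele carriers who show the phenotype."""
    if carriers <= 0 or not 0 <= expressing <= carriers:
        raise ValueError("expressing must lie between 0 and the number of carriers")
    return Fraction(expressing, carriers)


# Counts taken from the description of the family in Figure 4.14.
family = {"carriers": 9, "nonpenetrant": ["II-6", "II-10", "III-10"]}
expressing = family["carriers"] - len(family["nonpenetrant"])

p = penetrance(family["carriers"], expressing)
print(f"penetrance = {expressing}/{family['carriers']} = {float(p):.1%}")
```

Because the text says "at least three" members are nonpenetrant, the value computed this way is best read as an upper estimate for this family.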
family members with Waardenburg syndrome may be entirely or only partly colored. Each quadrant of the
symbols represents one of the principal features of the syndrome. The diversity of symbol darkening
demonstrates the variation in expressivity of Waardenburg syndrome in this family. Molecular genetic
analysis tells us that each family member with Waardenburg syndrome carries exactly the same dominant
allele, yet among the six affected members of the family, there are five different patterns of phenotypic
expression.
Figure 4.15 Variable expressivity of Waardenburg syndrome. Pedigree of generations I through IV; symbol
quadrants are keyed to premature graying, hearing loss, white forelock, and different-colored eyes.
Q What are the phenotypes of the two females in generation IV?
Pinpointing the cause of incomplete penetrance or variable expressivity is a challenging task. Three kinds
of interactions may be responsible: (1) other genes that act in ways that modify the expression of the
mutant allele, (2) environmental or developmental (i.e., nongenetic) factors that interact with the mutant
allele to modify its expression, or (3) some combination of other genes and environmental factors
interacting to modify expression of the mutation. Indeed, the characterization of a trait as having
incomplete penetrance or variable expressivity is an acknowledgment that an as yet unknown factor is
interacting with gene expression to produce variability in expressivity or to reduce penetrance.
Gene–Environment Interactions
Genes control innumerable differences between species. The genome of an organism lays out the body plan and
biochemical pathways of the organism, and it controls the progress of development from conception to
death. But genes alone are not responsible for all the variation seen between organisms. The environment,
the myriad of physical substances, events, and conditions an organism encounters at different stages of
life, is the other essential contributor to observable variation between organisms. Gene–environment
interaction is the term describing the influence of environmental factors (i.e., nongenetic factors) on
the expression of genes and on the phenotypes of organisms.
As an example, consider the tall and short pure-breeding lines of pea plants studied by Mendel. Inherited
genetic variation dictates that one line will produce tall plants and the other line will produce short
plants, but the environment in which the individual plants are grown also has a significant influence on
plant height. Environmental factors such as variations in water, light, soil nutrients, and temperature
each influence plant growth. It is not hard to imagine that genetically identical plants of a type adapted
to temperate zones might grow to different heights if one plant has an ideal growth environment while the
other faces a hot, arid environment with poor soil.
Phenotypic expression of genotypes can also depend on the interaction of genetically controlled
developmental programs and external factors operating on organisms. For example, the seasonal change in
coat color observed in arctic mammals that are nearly white in winter but have darker coats in spring and
summer results from an interaction between numerous genes and external environmental cues such as day
length and temperature. Similarly, environmental cues that induce plants to bloom in the spring trigger
changes in gene expression that stimulate the growth and development of multiple plant structures,
including flowers and reproductive structures. Such capacities to make seasonal changes evolved by aiding
the survival of these organisms, and they suggest that gene–environment interaction is pivotal in
understanding and interpreting phenotypic variation.
Environmental Modification to Prevent Hereditary Disease In some cases, the expression of a given gene is
entirely dependent on the presence of certain environmental conditions. An example of this kind of
gene–environment interaction (or, more precisely, an example of the manipulation of this relationship to
achieve a desired outcome) is found in an element of the medical management that prevents development of
the human autosomal recessive condition known as phenylketonuria (PKU) (OMIM 261600). PKU is caused by the
absence of the enzyme phenylalanine hydroxylase (PAH), which catalyzes the first step of the pathway that
breaks down the amino acid phenylalanine, a common component of dietary protein.
At one time, PKU accounted for thousands of cases of severe mental retardation every year. PKU occurred in
1 out of 10,000 to 1 out of 20,000 newborns in most populations around the world. Infants with PKU are
normal at birth, but over the first several months of life the body’s inability to carry out the normal
breakdown of phenylalanine leads to the buildup of a compound that is toxic to developing neurons. As
neurons die, mental and motor capacities are irretrievably lost, making full manifestation of PKU
inevitable. In the 1960s, a simple blood test became available to detect PKU in the first days of life. The
test identifies the disease before the disease has had a chance to manifest itself and begin to damage the
body. PKU was among the first, and is now one of dozens of rare hereditary disorders for which newborn
infants are routinely screened in U.S. hospitals and in hospitals around the world. The key feature shared
by all of the hereditary diseases
screened by newborn genetic testing is that the disease symptoms can be prevented or substantially reduced
in severity by strict and consistent dietary management. Dietary control either prevents individuals from
consuming compounds that allow the disease to develop, or it provides the essential compound missing in
those with the disease. Application Chapter B (Human Genetic Screening) discusses newborn genetic testing.
The key dietary control for management of PKU is elimination of the amino acid phenylalanine from the diet.
Phenylalanine is a component of almost all proteins, but a diet consisting of specially selected and
processed proteins that have had phenylalanine removed is started as soon as PKU is diagnosed. This usually
happens in the first hours or days after birth. An infant who is started on the phenylalanine-free diet
soon after birth and kept on it through adolescence avoids the complications of PKU and will develop and
function normally despite having PKU. Thousands of people with PKU are living fully normal and productive
lives today, thanks to this simple environmental modification that prevents the expression of the
devastating PKU phenotype. In this case, people who are homozygous recessive for the mutant PKU allele do
not express the trait if they are raised in a largely phenylalanine-free environment.
Dietary hazards abound for children and young adults with PKU, particularly in the form of the artificial
sweetener known as aspartame. This sweetener is made by a chemical reaction that fuses the amino acids
phenylalanine and aspartic acid to form a compound we perceive to taste sweet. Once consumed, aspartame is
quickly broken down into its two constituent amino acids, and phenylalanine is released. Regular intake of
aspartame is dangerous for those with PKU; for this reason, a dietary caution reading “Phenylketonurics:
Contains phenylalanine” appears on the packaging of food products containing aspartame. Look for it on the
next artificially sweetened product you pick up!
Pleiotropic Genes
Pleiotropy is a phenomenon describing the alteration of multiple features of the phenotype by the presence
of one mutation. It is distinguished from variable expressivity by the fact that variable expressivity
affects one trait, whereas pleiotropy alters several aspects of the phenotype. Most mutations displaying
pleiotropy do so either by altering the development of phenotypic features through the direct action of the
mutant protein or as a secondary result of a cascade of problems stemming from the mutation.
Pleiotropy through the direct action of a mutant protein product is frequently encountered in studies of
development. One example is the activity of the Drosophila hormone called juvenile hormone (JH), which is
active throughout the Drosophila life cycle and influences numerous attributes of development and
reproduction. Increased production or increased activity of JH has been shown to prolong developmental
time, decrease adult body size, promote early sexual maturity, raise fecundity (the ability to produce
offspring), and decrease life span. An evolutionary tradeoff is associated with changes in JH level or
activity. On the one hand, producing more JH can lead to production of more offspring through earlier
sexual maturity and higher fecundity. On the other hand, body size decreases and life span is shortened
because of increased JH activity.
Pleiotropy in the human hereditary condition sickle cell disease (SCD) is an example of the phenotypically
diverse secondary effects that can occur due to a mutant allele. SCD (OMIM 603903) is an autosomal
recessive condition caused by mutation of the β-globin gene that, in turn, affects the structure and
function of hemoglobin, the main oxygen-carrying molecule in red blood cells. Many of the red blood cells
of people with SCD take on a sickle shape and cause numerous physical problems and complications
(Figure 4.16).
4.3 Gene Interaction Modifies Mendelian Ratios
No gene operates alone to produce a phenotypic trait. Rather, genes work together to build the complex
structures and organ systems of plants and animals. What we see as a phenotype is the physical
manifestation of the action of many genes that have each played a role and have worked in complex but
coordinated ways to produce a trait or structure. At the cellular and molecular levels, the mutual reliance
of genes on one another requires each gene to carry out its activity in the right place, at the right time,
and at the appropriate level.
Think of this process as analogous to a symphony orchestra playing a piece of classical music. The
orchestra has many instruments and players, each with their own notes, tones, keys, and volume. If the
players use their instruments as directed by the sheet music, the result will be smooth and harmonious. If,
however, one musician is playing off-time or off-key, the error might disrupt the entire performance. The
same can be said of genes: Each must play its part correctly (that is, give a wild-type performance) or the
integrity of the trait will be at risk. For example, the products of several genes interact in biosynthetic
pathways to produce pigments that are responsible for flower color. Similarly, a complex phenotypic
attribute like the ability to hear requires many genes to produce the various structures of the ear that
convert acoustical vibrations into the electrical impulses that are transmitted to the brain and converted
into what we perceive as sound.
In this section, we look in detail at gene interaction, the collaboration of multiple genes in the
production of a single phenotypic character or a group of related characteristics. First, however, let’s
examine the genetic control of phenotypes from a perspective we have not yet explored.
Gene Interaction in Pathways
Genes commonly work together in pathways, multistep biochemical processes that operate either as
biosynthetic pathways, synthesizing complex compounds such as amino acids,
[Figure 4.16 diagram. Mutation: normal DNA 5′ CCT GAA GAG 3′ / 3′ GGA CTT CTC 5′; sickle cell DNA
5′ CCT GTA GAG 3′ / 3′ GGA CAT CTC 5′. Normal allele: normal development. Sickle cell allele: deoxygenation
of hemoglobin in tissue. Consequences listed: impaired mental function, impaired ability to fight
infection, kidney failure, bone deformity, pain crises, heart failure, jaundice, decreased growth.]
Figure 4.16 Pleiotropy in sickle cell disease. The sickling of red blood cells has a range of phenotypic
consequences, due primarily to excessive red blood cell destruction and the reduced oxygen-delivery
capacity in those with the disease.
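The single-base change shown in Figure 4.16 can be followed through to the protein with a small codon lookup. This is a sketch under two stated assumptions: the 5′-to-3′ strand printed in the figure is read as the coding strand, and only the four codons that appear in the figure are included in the lookup table (their amino acid assignments come from the standard genetic code). The names `CODON_TO_AA`, `translate`, and `differences` are illustrative.

```python
# Codons that appear in Figure 4.16, written as coding-strand DNA.
# Assignments follow the standard genetic code (T in DNA corresponds to U in mRNA).
CODON_TO_AA = {"CCT": "Pro", "GAA": "Glu", "GAG": "Glu", "GTA": "Val"}


def translate(coding_strand):
    """Translate a space-separated string of DNA codons into amino acid names."""
    return [CODON_TO_AA[codon] for codon in coding_strand.split()]


def differences(seq_a, seq_b):
    """List (codon position, amino acid in a, amino acid in b) wherever the two differ."""
    pairs = zip(translate(seq_a), translate(seq_b))
    return [(i, a, b) for i, (a, b) in enumerate(pairs, start=1) if a != b]


normal = "CCT GAA GAG"   # sequence as printed in the figure (assumed coding strand)
sickle = "CCT GTA GAG"

print(translate(normal))
print(translate(sickle))
print(differences(normal, sickle))
```

One amino acid substitution (glutamic acid to valine) is the only primary change; every item in the figure's list of consequences is a downstream effect, which is what makes the allele pleiotropic.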
or as degradation pathways, breaking complex compounds down into simpler or elemental constituents.
Biosynthetic pathways result from the expression of genes whose products help build complex compounds or
molecules that are the end product of the pathway. Through successive reaction steps that produce a series
of intermediate compounds, these pathways, known broadly as anabolic pathways, lead ultimately to the
production of an end product such as a pigment, amino acid, hormone, or nucleotide. The opposite process,
the breakdown of compounds into intermediate compounds and often into elemental constituents, is undertaken
by catabolic pathways.
An anabolic pathway that synthesizes the amino acid methionine is shown in Figure 4.17a. The production of
methionine, the end product of the pathway, requires the expression of four genes that each produce an
enzyme catalyzing a distinct step of the pathway. Homozygosity for a mutant allele of any of these genes
can block the pathway and would prevent methionine synthesis.
The catabolic pathway that breaks down the amino acid phenylalanine is shown in Figure 4.17b. It, too,
utilizes the enzyme products of multiple genes. The figure identifies several steps of the pathway that are
blocked by mutations of
(a) In anabolic pathways the sequential action of gene products catalyzes steps of a biosynthesis.
Gene labels in the diagram: Met 2, Met B, Met C, Met E
(b) The action of gene products in catabolic pathways breaks down complex compounds into simpler compounds.
Labels in the diagram: dietary protein; phenylalanine hydroxylase; tyrosinemia; aminotransferase;
HGA oxidase; alkaptonuria (OMIM 203500); maleylacetoacetic acid
certain genes. Each of these mutations causes a distinct human hereditary disorder, including PKU, which we
just described.
In addition to biosynthetic (anabolic) pathways and catabolic pathways, other pathways such as signal
transduction pathways and developmental pathways also feature the interaction of multiple genes in the
production of a trait or characteristic. Signal transduction pathways are responsible for receiving a
variety of chemical signals generated outside a cell and initiating a response inside a cell. Operating by
way of hormones and other compounds, signal transduction pathways culminate in the activation or repression
of gene expression in response to an intracellular or extracellular signal.
Developmental pathways direct the growth, development, and differentiation of body parts and structures.
Researchers have discovered the functions of genes in numerous developmental pathways through experimental
analyses of mutant phenotypes.
The One Gene–One Enzyme Hypothesis
The concept of pathways requiring gene action originated with Archibald Garrod’s suggestion in 1902 that
the inability to produce the enzyme homogentisic acid oxidase (HGA oxidase) is the cause of the autosomal
recessive human hereditary condition known as alkaptonuria (see Figure 4.17b). It was not until the middle
of the 20th century, however, that details of specific genetic pathways began to emerge. George Beadle and
Edward Tatum were among the first to investigate biosynthetic pathways, in research that laid the
groundwork for the later definition and examination of signal transduction and developmental pathways.
Beadle and Tatum’s experiment studied growth variants of the fungus Neurospora crassa, and its details are
described in Experimental Insight 4.1. The idea behind their experiment was simple: to generate single-gene
growth mutations in Neurospora and interpret the normal function of genes by
observing the phenotypic consequences of their mutation. The famous hereditary proposal known as the one
gene–one enzyme hypothesis came out of this experiment. It says that each gene produces an enzyme, and each
enzyme has a specific functional role in a biosynthetic pathway. Beadle and Tatum observed that single-gene
mutations block the completion of biosynthetic pathways and lead to the production of mutant fungi that are
deficient in their ability to grow without specific nutritional supplementation. Their hypothesis proposed
that each mutant phenotype was attributable to the loss or defective function of a specific enzyme. The
consequence of these enzyme losses or defects was the blockage of a biosynthetic pathway and the absence of
the end product of the pathway. Since each enzyme defect was inherited as a single-gene defect, the one
gene–one enzyme hypothesis identifies the direct connection between genes, proteins, and phenotypes. Two
new terms that are used multiple times in this section appear in Experimental Insight 4.1. The term
prototroph, or prototrophic, means “wild type” and derives from prototype, meaning “the original version.”
In contrast, the term auxotroph, or auxotrophic, means “mutant.”
The one gene–one enzyme concept has undergone modifications since it was first proposed. These changes take
account of three observations: (1) many protein-producing genes do not produce enzymes but produce
transport proteins, structural proteins, regulatory proteins, or other nonenzyme proteins; (2) some genes
produce RNAs rather than proteins; and (3) some proteins (e.g., β-globin) must join with other proteins to
acquire a function. Despite these modifications, Beadle and Tatum’s fundamental conclusion linking each
gene to a particular product is valid and forms the basis for understanding gene function.
1 Irradiate prototrophic Neurospora crassa growing on minimal medium. (Diagram label: X-rays)
5 Transfer auxotrophs to minimal media supplemented with one amino acid to identify the defective pathway.
Amino acid supplements: Alanine, Arginine, Asparagine, Aspartic acid, Cysteine, Glutamic acid, Glutamine,
Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine,
Tryptophan, Tyrosine, Valine
Genetic Dissection to Investigate Gene Action
Beadle and Tatum’s experiments opened the way to investigation of the roles of individual-gene mutations in
biosynthetic pathways. These investigations began with three assumptions about biosynthetic pathways that
have proven to be correct: (1) Biosynthetic pathways consist of sequential steps, (2) completion of one
step generates the substrate for the next step in the pathway, and (3) completion of every step is
necessary for production of the end product of the pathway. These assumptions support the conclusion that
wild-type strains are able to complete each pathway step, and that mutant strains are unable to complete a
pathway because one or more pathway steps are blocked by mutation.
Genetic dissection in this context is an experimental approach that separately tests the ability of a
mutant to execute each step of a biosynthetic pathway and assembles the steps of a pathway by determining
the point at which the pathway is blocked in each mutant. The strategy of genetic dissection is illustrated
for met– strains in Figure 4.18 using experimental data collected in 1947 by Norman Horowitz on four
independently isolated Neurospora crassa met– mutants.
The goals of Horowitz’s genetic dissection analysis were to (1) determine the number of intermediate steps
within the methionine biosynthetic pathway, (2) determine the order of steps in the pathway, and
(3) identify the step affected by each mutation. In designing his experiment, Horowitz relied on previous
biochemical work identifying homoserine as the first compound in the methionine biosynthetic pathway and
identifying cysteine, homocysteine, and cystathionine as later intermediates in the pathway. Horowitz
tested the control prototroph (met+) and four methionine-requiring auxotrophs (Met 1 to Met 4) for their
ability to grow on (1) minimal medium, (2) minimal medium plus cysteine only, (3) minimal medium plus
cystathionine only, (4) minimal medium plus homocysteine only, and (5) minimal medium plus methionine only.
Figure 4.18a shows growth (+) or no growth (-) of the four met– mutants and the wild-type strain (met+) on
each of the experimental media. The wild-type strain grows on all media, since supplementation of minimal
medium with any of the intermediates has no effect on its growth. Each methionine mutant grows on minimal
medium plus methionine, the end product of the biosynthetic pathway, but they show different growth
patterns with other supplemented media. The following is an analysis of each mutant:
1. Met 1 grows only on minimal medium plus methionine, thus indicating that a mutation in the last step of
the pathway prevents conversion of the final intermediate product to methionine. Only the addition of
methionine to minimal medium bypasses the pathway block.
2. Met 2 exhibits growth with supplementation by either methionine or homocysteine, thus indicating a block
at the step that produces homocysteine. This result also tells us that homocysteine is the substrate
converted to methionine in the biosynthetic pathway.
3. Met 3 grows on minimal medium supplemented with either methionine, homocysteine, or cystathionine, but
not on minimal medium plus cysteine. This tells us that Met 3 is blocked at the step that produces
cystathionine and that cystathionine precedes homocysteine in the pathway.
4. Met 4 grows with any supplementation of minimal medium. This tells us that Met 4 is defective at a step
that precedes the production of cysteine.
Figure 4.18b shows the steps of the biosynthetic pathway for methionine as determined by analysis of these
mutants. The pathway step that is blocked in the mutant is identified based on the logic that
supplementation by a compound needed after the blockage will permit growth, whereas adding a compound used
before the blockage will not aid growth. The blocked step is also identified by the substance that
accumulates in the auxotroph: In each mutant, a different intermediate substance builds up because the step
that would convert it to the next intermediate in the pathway is defective. Accumulation of cysteine by
Met 3, cystathionine by Met 2, and homocysteine by Met 1 supports the assignment of these mutants to
specific steps in the pathway. Genetic Analysis 4.2 illustrates genetic dissection of a biosynthetic
pathway by assessment of the growth habits of auxotrophs.
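The supplementation logic used for the met– mutants can be written out as a short lookup. This is a hedged sketch, not Horowitz's procedure: it assumes a strictly linear pathway in the order pieced together from the text (homoserine, cysteine, cystathionine, homocysteine, methionine), and the growth sets are transcribed from the four numbered observations above. `PATHWAY`, `GROWTH`, and `blocked_step` are illustrative names.

```python
# Assumed linear order of compounds, as described in the text.
PATHWAY = ["homoserine", "cysteine", "cystathionine", "homocysteine", "methionine"]

# Supplements that support growth of each mutant, from the numbered analysis.
GROWTH = {
    "Met 1": {"methionine"},
    "Met 2": {"homocysteine", "methionine"},
    "Met 3": {"cystathionine", "homocysteine", "methionine"},
    "Met 4": {"cysteine", "cystathionine", "homocysteine", "methionine"},
}


def blocked_step(rescuing, pathway=PATHWAY):
    """Return (substrate, product) for the step a mutant cannot carry out.

    A compound made after the block supports growth; a compound made before
    it does not. The blocked step is therefore the one that produces the
    earliest rescuing compound.
    """
    if not rescuing:
        raise ValueError("no supplement supports growth")
    positions = sorted(pathway.index(compound) for compound in rescuing)
    earliest = positions[0]
    if earliest == 0 or positions != list(range(earliest, len(pathway))):
        raise ValueError("growth pattern does not fit one block in a linear pathway")
    return pathway[earliest - 1], pathway[earliest]


for mutant, rescuing in GROWTH.items():
    substrate, product = blocked_step(rescuing)
    print(f"{mutant}: blocked at {substrate} -> {product}; expected to accumulate {substrate}")
```

The substrate returned for Met 1, Met 2, and Met 3 matches the accumulated compound reported in the text (homocysteine, cystathionine, and cysteine, respectively). For Met 4 the text says only that the block precedes cysteine production, so naming homoserine as the substrate depends on the linear-pathway assumption.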
Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
   1. This problem deals with mutants of the zmt-synthesis pathway and requires an analysis of the defect
   in each mutant as well as ordering of the intermediates in the zmt-synthesis pathway.
2. Identify the critical information given in the problem.
   2. The problem provides growth information for wild-type zmt+ bacteria as well as four zmt– mutant
   strains when plated on minimal medium and on media individually supplemented with zmt or one of five
   intermediates in the zmt-synthesis pathway.
Deduce
3. Compare and evaluate the patterns of growth supported by the supplements.
   TIP: A supplement that supports growth of all or most mutants is likely to be near the end of the
   pathway.
   3. All mutants grow with zmt supplementation and with supplementation by compound S. None grows without
   any supplementation, and none obtains growth support from compound D. Compounds F, M, and R each support
   growth of one or more mutants.
4. Identify the final product of the pathway and next-latest pathway intermediate compound.
   TIP: A supplement supporting growth of the fewest mutants is likely to be at the beginning of the
   pathway.
   4. The compound zmt is the final product of the pathway. Compound S also supports the growth of all
   mutants and is likely the immediate precursor of zmt.
Solve
5. Identify the first compound synthesized in the pathway.
   5. Compound D does not support growth of any of the zmt– mutants and likely occurs before any of the
   synthesis steps affected by mutations. Compound D is the first compound shown in the pathway.
6. Identify the second, third, and fourth compounds synthesized in the pathway.
   TIP: Medium supplemented with an intermediate compound that occurs after the pathway step blocked by a
   mutation will support growth.
   6. Compound R supports the growth of only one mutant, zmt-2, indicating the compound bypasses the step
   blocked in zmt-2. Compound R likely follows compound D in the pathway, and zmt-2 is defective in its
   ability to convert D to R. zmt-2 grows on intermediate compounds that occur after its point of pathway
   blockage, but not on compound D that comes before the zmt-2 blockage.
   Compound M supports growth of zmt-2 and zmt-4, bypassing the blockage in both mutants. Growth of zmt-4
   is not supported by compounds D or R that occur before the conversion step blocked in zmt-4. The
   conclusion is that compound M follows R and that zmt-4 is unable to convert R to M. Compounds F, M, and
   S each support growth of zmt-4, so each bypasses the blockage.
   Compound F supports growth of zmt-3 and follows compound M in the pathway. zmt-3 is unable to convert M
   to F. Compound S supports new growth of zmt-1, indicating that it follows compound F in the pathway and
   that zmt-1 fails to convert compound F to S.
   TIP: To confirm this solution, verify that growth of each mutant is supported by supplementation with
   compounds that follow the blockage but not by supplementation with compounds that precede the blockage.
7. Assemble the zmt-synthesis pathway, and identify the mutants at each pathway step.
   7. D → R → M → F → S → zmt
   Blocked steps: zmt-2 (D → R), zmt-4 (R → M), zmt-3 (M → F), zmt-1 (F → S)
For more practice, see Problems 4, 18 and 19.
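The two TIPs about counting how many mutants a supplement supports amount to a sorting rule, which can be sketched as follows. The growth sets below are reconstructed from the solution text rather than from the original data table, so treat them as an assumption, as are the names `RESCUES`, `order_pathway`, and `blocked_steps`.

```python
# Mutants whose growth each intermediate supports, as stated in the solution.
# (Reconstructed from the prose; the original growth table is not reproduced here.)
RESCUES = {
    "D": set(),
    "R": {"zmt-2"},
    "M": {"zmt-2", "zmt-4"},
    "F": {"zmt-2", "zmt-3", "zmt-4"},
    "S": {"zmt-1", "zmt-2", "zmt-3", "zmt-4"},
}
END_PRODUCT = "zmt"


def order_pathway(rescues, end_product):
    """Order intermediates from fewest to most mutants supported, then add the end product."""
    intermediates = sorted(rescues, key=lambda compound: len(rescues[compound]))
    for earlier, later in zip(intermediates, intermediates[1:]):
        # In a single linear pathway, each later compound supports every mutant
        # an earlier compound supports, plus at least one more.
        if not rescues[earlier] < rescues[later]:
            raise ValueError("growth data do not fit a single linear pathway")
    return intermediates + [end_product]


def blocked_steps(rescues, order):
    """Assign each mutant to the step that produces the first compound supporting its growth."""
    steps = {}
    for prev, cur in zip(order, order[1:]):
        newly_supported = rescues.get(cur, set()) - rescues.get(prev, set())
        for mutant in sorted(newly_supported):
            steps[mutant] = (prev, cur)
    return steps


order = order_pathway(RESCUES, END_PRODUCT)
print(" -> ".join(order))
for mutant, (substrate, product) in sorted(blocked_steps(RESCUES, order).items()):
    print(f"{mutant}: cannot convert {substrate} to {product}")
```

The subset check is the programmatic version of the final TIP: if any mutant grew on an early compound but not on a later one, the data would not describe one unbranched pathway and the sketch raises an error instead of guessing.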
FIGURE 4.19 Phenotype patterns in the F2 of a dihybrid cross that result from epistatic gene interaction.
[Figure not reproduced. Panels recoverable from the page layout:]
Complementary gene interaction (9:7), CcPp × CcPp: C–P– (9/16) purple; C–pp (3/16), ccP– (3/16), and ccpp (1/16) white (7/16 in total). Complementary gene interaction occurs when genes must act in tandem to produce a phenotype. The wild-type action from both genes is required to produce the wild-type phenotype. Mutation of one or both genes produces a mutant phenotype.
Dominant gene interaction (9:6:1), AaBb × AaBb: A–B– (9/16) disk; A–bb (3/16) and aaB– (3/16) sphere (6/16 in total); aabb (1/16) long. Dominant gene interaction occurs between genes that each contribute to a phenotype, producing one phenotype if dominant alleles are present at each gene, a second phenotype if recessive alleles are homozygous for either gene, and a third phenotype if recessive homozygosity occurs at both genes.
Recessive epistasis (9:3:4), BbEe × BbEe. Example: Labrador retriever coat color. B–E– (9/16) black; bbE– (3/16) chocolate; B–ee (3/16) and bbee (1/16) yellow (4/16 in total). Recessive epistasis occurs when recessive alleles at one gene mask or reduce the expression of alleles at the interacting locus.
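The modified ratios named in this section can be checked by brute force. The following Python sketch, offered as an illustration and not as part of the original text, enumerates the 16 equally likely gamete combinations of a dihybrid cross and collapses them into phenotypes under each interaction rule. The function name `f2_ratio` and the `rules` dictionary are hypothetical; each rule receives two booleans stating whether a dominant allele is present at the first and at the second gene.

```python
from collections import Counter
from itertools import product


def f2_ratio(phenotype):
    """Count F2 phenotypes (out of 16) from a dihybrid x dihybrid cross.

    phenotype(first_dom, second_dom) takes two booleans: whether at least
    one dominant allele is present at the first and at the second gene.
    """
    counts = Counter()
    gametes = list(product("Aa", "Bb"))  # AB, Ab, aB, ab
    for (a1, b1), (a2, b2) in product(gametes, repeat=2):
        first_dom = "A" in (a1, a2)
        second_dom = "B" in (b1, b2)
        counts[phenotype(first_dom, second_dom)] += 1
    return dict(counts)


# One rule per interaction type; gene order follows the text
# (C,P / P,R / A,B / B,E / W,Y / L,D).
rules = {
    "complementary (9:7)": lambda c, p: "purple" if c and p else "white",
    "duplicate (15:1)": lambda p, r: "purple" if p or r else "white",
    "dominant interaction (9:6:1)": lambda a, b: (
        "disk" if a and b else ("sphere" if a or b else "long")
    ),
    "recessive epistasis (9:3:4)": lambda b, e: (
        "yellow" if not e else ("black" if b else "chocolate")
    ),
    "dominant epistasis (12:3:1)": lambda w, y: (
        "white" if w else ("yellow" if y else "green")
    ),
    "dominant suppression (13:3)": lambda l, d: "blue" if l and not d else "white",
}

for name, rule in rules.items():
    print(name, f2_ratio(rule))
```

Because every rule is applied to the same 9:3:3:1 genotype classes, the sketch makes explicit that each modified ratio is a regrouping of those four classes.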
four feather-color phenotypes. As predicted by independent assortment, green feather color (wild type) is observed in 9/16 of the progeny, blue feathers and yellow feathers are each seen in 3/16 of the F2, and the white-feather phenotype appears in 1/16 of the F2 progeny.

Complementary Gene Interaction (9:7 Ratio) William Bateson (an enthusiastic proponent of “Mendelism” in the early 20th century) and Reginald Punnett (of Punnett square fame) were the first biologists to document an example of epistasis, in experiments conducted between 1906 and 1908. Bateson and Punnett studied heredity in sweet peas (Lathyrus odoratus), an ornamental plant different from Mendel’s edible pea (Pisum sativum). Wild-type sweet peas have purple flowers, and the experiments began with crossing two pure-breeding mutant plants that had white flowers. Bateson and Punnett expected these mutants to produce mutant (white-flowered) progeny, but to their surprise, the F1 generation all had purple flowers. When Bateson and Punnett crossed F1 plants, the F2 produced a ratio of 9/16 purple-flowered plants to 7/16 white-flowered plants.

Bateson and Punnett recognized that their results could be explained if two genes interacted with one another to produce sweet pea flower color. Assuming two genes are responsible for a single pigment that gives the sweet pea flower its purple color, each parental line, represented by the genotypes ccPP and CCpp, is pure-breeding for white flowers as a result of homozygosity for recessive alleles at one of the genes. The cross of these two lines of pure-breeding white parents produces dihybrid purple-flowered F1 plants (genotype CcPp) because the dominant allele at each locus enables completion of each step of the pathway leading to the synthesis of purple pigment. Independent assortment of alleles in the F2 results in four genotypic classes, C–P–, ccP–, C–pp, and ccpp, produced in the 9:3:3:1 ratio that is expected from a dihybrid cross. Among the F2, however, only 9/16 carry the C–P– genotype that confers the ability to produce purple pigment. The remaining 7/16 of the F2 are homozygous either for one of the recessive alleles c and p or for both sets of alleles. None of these plants is able to synthesize pigment, due to the absence of functional gene products from one or both loci, and they all have the same mutant phenotype.

A 9:7 phenotypic ratio results from complementary gene interaction that requires genes to work in tandem to produce a single product. Figure 4.21 (panel 1) shows that at the molecular level, purple flower color in sweet peas is produced when the pigment anthocyanin is deposited in petals. Since anthocyanin production requires the action of the product of C as well as the product of P, both steps must be successfully completed for anthocyanin production and deposition in flower petals. The presence of the homozygous recessive genotype at the C locus (cc), the P locus (pp), or both results in blockage of the pathway and production of white flowers containing no pigment.

The ability of two mutants with the same mutant phenotype to produce progeny with the wild-type phenotype is called genetic complementation, and it indicates that more than one gene is involved in determining the phenotype. We discuss the details of genetic complementation in the last section of this chapter.

Duplicate Gene Action (15:1 Ratio) Two genes that duplicate one another’s activity constitute a redundant genetic system in which any genotype possessing at least one copy of a dominant allele at either locus will produce the dominant phenotype. Only when recessive homozygosity is present at both loci does the recessive phenotype appear. The genes in a redundant system are said to have duplicate gene action; they either encode the same gene product, or they encode gene products that have the same effect in a single pathway or compensatory pathways.

Figure 4.21 (panel 2) provides an illustration and explanation of duplicate gene action identified inadvertently by Mendel in an experiment involving flower color in bean plants. Near the end of his famous 1866 paper describing inheritance in peas, Mendel described an experiment with beans that began with the cross of a pure-breeding purple-flowered bean plant to a pure-breeding white-flowered bean plant. The F1 plants all had purple flowers, and Mendel probably assumed that flower color determination in beans would follow the same pattern as in peas. Among the 32 F2 plants Mendel produced, however, 31 had purple flowers and only 1 had white flowers. Among the F2 plants, 15/16 have a genotype containing at least one copy of either P or R, and only 1/16 have the genotype pprr and the white-flowered phenotype.

Figure 4.21 (panel 2) shows that the protein product of the dominant allele of either gene is capable of catalyzing the conversion of a precursor to anthocyanin and producing the dominant phenotype. Conversely, if homozygous recessive alleles are present at both loci, no functional gene product is produced, and the synthesis pathway is not completed. White flowers result from the absence of pigment in the 1/16 of the F2 progeny that are homozygous recessive for alleles of both genes.

Dominant Gene Interaction (9:6:1 Ratio) The shape of summer squash is classified as either long, spherical, or disk-shaped. Plants that bear long fruit are consistently pure-breeding, indicating that these plants are homozygous for genes controlling fruit shape. On the other hand, plants producing disk-shaped fruit or spherical fruit are sometimes pure-breeding and sometimes not, indicating that plants producing disk-shaped or spherical fruit can be either homozygous or heterozygous for the genes controlling their shape. Figure 4.21 (panel 3) illustrates and describes dominant interaction between two genes controlling squash fruit shape. Dominant
interaction is characterized by a 9:6:1 ratio of phenotypes in the progeny of a dihybrid cross.

A cross of pure-breeding disk plants (AABB) to pure-breeding long plants (aabb) produces dihybrid F1 plants with disk-shaped fruit. The F2 progeny of these dihybrids are 9/16 disk, 6/16 spherical, and 1/16 long, a 9:6:1 ratio. The phenotype of an F2 plant depends on whether a dominant allele is present for both genes, one gene, or neither gene. The molecular model of fruit shape production assumes that each gene produces a distinct protein that contributes to fruit shape.

Recessive Epistasis (9:3:4 Ratio) Black, chocolate, and yellow coat colors in Labrador retrievers result from the interaction of two genes, one that produces pigment and another that distributes the pigment to hair follicles. This form of gene interaction, in which homozygosity for a recessive allele at one locus can mask the phenotypic expression of a second gene, is called recessive epistasis and has the characteristic 9:3:4 ratio of phenotypes illustrated by Figure 4.21 (panel 4).

Crossing pure-breeding chocolate to pure-breeding yellow dogs produces F1 progeny with black coats. That the F1 progeny are dihybrid is revealed by the F2 generation, in which 9/16 of the progeny carry the genotypes in the B–E– class and have black coats, 3/16 have a genotype that is bbE–, resulting in chocolate-colored coats, and 4/16 carry genotypes that are either B–ee or bbee and have yellow coats.

The molecular explanation for this genetic system is tied to production of the hair pigment melanin. Dogs can produce eumelanin that gives hair a black or brown color and pheomelanin that gives hair a reddish or yellowish tone. The E gene is MC1R, which controls eumelanin distribution. A single copy of the wild-type allele E yields full eumelanin deposition, but allele e homozygosity blocks deposition. Gene B is TYRP1, which controls eumelanin synthesis, with B producing a large amount of eumelanin that overwhelms the pheomelanin present to produce a black coat color. The alternative allele b produces a reduced amount of eumelanin. When mixed with pheomelanin in the coat, the resulting color is brown, sometimes called “chocolate.” Dogs that are B–E– produce, transport, and deposit large amounts of eumelanin and have black coats. Dogs that are bbE– produce less eumelanin due to their bb genotype and have chocolate (brown) coats. Dogs that are homozygous ee are unable to transport and deposit eumelanin and instead deposit only pheomelanin. These dogs have yellow coat color.

Dominant Epistasis (12:3:1 Ratio) Determination of fruit color in summer squashes provides an example of dominant epistasis. In this type of epistatic interaction, the dominant allele of one gene blocks the expression of alleles of the second gene. Summer squash occur in three colors: white, yellow, and green. In Figure 4.21 (panel 5), the cross of dihybrid WwYy (white) plants yields a 12:3:1 ratio of white:yellow:green plants. Plants with one or two copies of W, that is, W–Y– (9/16) and W–yy (3/16), produce white squash due to the inhibition of conversion of the colorless precursor compound to green pigment. The protein products of the Y gene require a pigment substrate for their action, and because plants that are W– do not produce a substrate, the action of the protein products of alleles of the Y gene does not occur. Plants that are homozygous ww are able to convert the colorless precursor to green pigment, yielding substrate for Y gene activity. The dominant allele of the Y gene produces an enzyme that converts green pigment to yellow pigment. Homozygosity for the recessive allele (yy) leaves the green pigment unaltered and green squash are produced. Notice that in ww plants, segregation of Y gene alleles in a cross of Yy monohybrids produces a 3:1 ratio of Y– (yellow) and yy (green) squash. This ratio can be seen by looking at plants that are wwY– (3/16) and wwyy (1/16).

Dominant Suppression (13:3 Ratio) Our final example of epistatic gene interaction is dominant suppression, illustrated in Figure 4.21 (panel 6). In dominant suppression, the dominant allele of one gene suppresses expression of the other gene. In the blue pimpernel plant, production of the blue flower pigment is controlled by the L gene. Plants that are L– are capable of producing blue pigment, whereas those that are ll produce no pigment and are white. A second gene, D, has a dominant allele that suppresses the expression of the L gene; thus, plants that are D– are white regardless of the L gene genotype, because the D allele controls L gene expression. Plants that are dd allow L gene expression. Crosses between pure-breeding blue-flowered plants (LLdd) and pure-breeding white-flowered plants (llDD) produce white-flowered F1 that are dihybrid (LlDd), and the F2 have a 13:3 ratio that is characteristic of dominant suppression. Flowers that are L–D– are white because the dominant D allele suppresses L gene expression. Plants that are L–dd are blue because the L gene is not suppressed and the L allele catalyzes pigment production. Plants that are llD– are white due to the presence of D and the inability of recessive ll plants to produce pigment. Lastly, plants that are lldd are white due to the inability of ll plants to produce pigment.

Genetic Analysis 4.3 tests your ability to analyze crosses involving epistatic gene interaction.

4.4 Complementation Analysis Distinguishes Mutations in the Same Gene from Mutations in Different Genes

Suppose you are a geneticist working in California and you have identified a recessive mutation causing petunia flowers to be white rather than the wild-type purple color. A friend of yours, also a geneticist, is working on petunias in the Netherlands and contacts you because she has also
GENETIC ANALYSIS 4.3
PROBLEM Dr. Ara B. Dopsis, a famous plant geneticist, decides to try his hand at iris propagation. He selects two pure-breeding irises, one red and the other blue, and crosses them. To his surprise, all F1 plants have purple flowers. He decides to create more purple irises by self-fertilizing the F1 irises. Dr. Dopsis produces 320 F2 plants consisting of 182 with purple flowers, 59 with blue flowers, and 79 with red flowers.
BREAK IT DOWN: Neither red nor blue is dominant (p. 135).
BREAK IT DOWN: Examine the ratio of progeny phenotypes carefully to propose a mechanism of inheritance (p. 133).
a. From the information available, identify the genetic phenomenon that produces the phenotypic ratio observed in the F2 plants. Include the number of genes that are involved in this trait.
b. Using clearly defined symbols of your own choosing, identify the genotypes of parental and F1 plants.
Evaluate
1. Identify the topic this problem addresses and describe the nature of the required answer.
   This problem concerns the interpretation of F1 and F2 results; it requires identification of the genetic mechanism responsible for the observed results, and the assignment of genotypes to parental and F1 plants in a manner consistent with the genetic mechanism.
2. Identify the critical information given in the problem.
   The problem states that the blue- and red-flowered parents are pure-breeding and that their F1 are exclusively purple flowered. Among the F2, purple is predominant, but red and, to a lesser extent, blue are also observed.
Deduce
3. Deduce the potential genetic mechanisms that could account for producing purple-flowered F1 plants from the pure-breeding red and blue parental plants.
   Two potential mechanisms are suggested by these data. First, a single gene with incomplete dominance might generate a phenotype in F1 heterozygous plants that is different from that of either homozygous parent. Second, two genes displaying an epistatic interaction might account for a phenotype in an F1 dihybrid that is distinct from either pure-breeding parent.
4. Determine the relative phenotype proportions predicted by the possible genetic mechanisms and compare them with the observed phenotype ratio.
   TIP: Compare the relative percentages of each phenotype to see which genetic model most closely predicts the observed percentages.
   A single-gene model predicts that the self-fertilization of an F1 heterozygote will result in a 1:2:1 (25%:50%:25%) ratio in the F2. A two-gene epistasis model producing three F2 phenotypes could be dominant gene interaction (9:6:1 ratio), dominant epistasis (12:3:1 ratio), or recessive epistasis (9:4:3 ratio). Recessive epistasis predictions are a closer match to the observations than dominant epistasis predictions. Recessive epistasis predicts phenotype percentages of approximately 56%:25%:19%. The observed percentages of F2 phenotypes are 182/320 = 56.9% purple, 79/320 = 24.7% red, and 59/320 = 18.4% blue.
Solve
Answer a
5. Identify the genetic mechanism most likely to account for the outcomes of these crosses.
   TIP: See Foundation Figure 4.21 for the phenotype ratios characteristic of each type of epistatic interaction.
   Comparison of the F2 predictions of the single-gene incomplete dominance model and the two-gene recessive epistasis model determines that recessive epistasis is a better match with the relative progeny proportions. The likely genetic model explaining these data is recessive epistasis. (For confirmation, the number of F2 observed in each category can be compared with the number expected by chi-square analysis.)
Answer b
6. Assign genotypes to parental and F1 plants.
   TIP: Foundation Figure 4.21 identifies genotypes associated with each phenotype.
   Using symbols A and a for one gene and B and b for the second gene, the genotypes of plants are
   Parents: aaBB (red) and AAbb (blue)
   F1: AaBb (purple).
For more practice, see Problems 5, 10, and 22. Visit the Study Area to access study tools. Mastering Genetics
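The chi-square confirmation mentioned in step 5 can be sketched as follows. This is a minimal illustration in plain Python, not the textbook's own procedure: the dictionary layout and the function name `chi_square` are hypothetical, and the assignment of phenotypes to ratio classes for the models other than recessive epistasis is an assumption made so that purple is always the largest class. The critical value 5.991 is the standard chi-square threshold for 2 degrees of freedom at P = 0.05.

```python
# Observed F2 counts from the problem statement.
observed = {"purple": 182, "red": 79, "blue": 59}

# Candidate models as ratio parts per phenotype (class assignments for the
# non-matching models are assumptions made for illustration).
models = {
    "incomplete dominance (1:2:1)": {"purple": 2, "red": 1, "blue": 1},
    "recessive epistasis (9:4:3)": {"purple": 9, "red": 4, "blue": 3},
    "dominant epistasis (12:3:1)": {"purple": 12, "red": 3, "blue": 1},
    "dominant interaction (9:6:1)": {"purple": 9, "red": 6, "blue": 1},
}

CRITICAL_DF2_P05 = 5.991  # chi-square critical value, 2 df, P = 0.05


def chi_square(observed, ratio):
    """Return the chi-square statistic for observed counts against a ratio."""
    total = sum(observed.values())
    parts = sum(ratio.values())
    statistic = 0.0
    for phenotype, count in observed.items():
        expected = total * ratio[phenotype] / parts
        statistic += (count - expected) ** 2 / expected
    return statistic


results = {name: chi_square(observed, ratio) for name, ratio in models.items()}
for name, statistic in results.items():
    verdict = "not rejected" if statistic < CRITICAL_DF2_P05 else "rejected"
    print(f"{name}: chi-square = {statistic:.2f} ({verdict})")
```

Under these assumptions, only the recessive epistasis model gives a statistic below the critical value, which agrees with the conclusion reached in Answer a.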
identified a recessive mutation resulting in white-flowered petunias. Since there has been no contact between California petunias and Netherlands petunias, the mutations have arisen independently. When geneticists encounter organisms with the same mutant phenotype, two initial questions are (1) do these organisms have mutations on the same gene or on different genes, and (2) how many genes are responsible for the mutations observed?

Mutations of different genes can produce the same, or very similar, abnormal phenotypes. This phenomenon is known as genetic heterogeneity, and several examples have been seen in this chapter. For example, in a multistep pathway whose end point is the production of a pigment that colors flower petals, it is possible that a mutation of any of the genes in the pathway could block production of the pigment and produce mutant flower color. In this section, we discuss genetic complementation analysis, an experimental analysis of crosses designed to test alternative genetic explanations of an abnormal phenotype. The results of genetic complementation analysis can determine whether mutant organisms carry mutations of different genes that produce the abnormal phenotype or if the abnormal phenotype occurs due to allelic mutations on the same gene.

Genetic complementation testing is done by crossing pure-breeding mutants for a recessive mutation and observing the phenotype of F1 progeny. If the F1 progeny have the wild-type phenotype, genetic complementation has occurred, and the conclusion is that the mutant alleles are of different genes. On the other hand, if the mutant alleles are of the same gene, the progeny of two pure-breeding mutants will have a mutant phenotype. This result indicates that no genetic complementation has taken place.

Let’s look at an example using two genes we identified in Figure 4.21. In discussing complementary gene interaction, we described production of the purple-colored pigment anthocyanin as requiring the action of dominant alleles of the C gene and the P gene. Figure 4.22 shows three crosses involving four pure-breeding white-flower mutants. Cross 1, between mutant A and mutant B, produces F1 progeny that have wild-type purple flowers. The genetic interpretation of this result is that genetic complementation is observed. Genotypes given for each mutant indicate homozygosity for recessive alleles on different genes in the parents and a dihybrid genotype in the F1. In contrast to this result, Cross 2 and Cross 3 are also made using pure-breeding white-flower parentals. In both crosses, however, the F1 have the mutant phenotype. This indicates that there is no genetic complementation and that the mutant parents in the respective crosses carry mutations on the same gene. Cross 2 illustrates mutant parental plants that are homozygous for recessive alleles of the C gene (ccPP), and Cross 3 illustrates mutant parental plants that are homozygous for recessive alleles of the P gene (CCpp).

Genetic complementation analyses using numerous crosses of different pure-breeding mutants can determine which mutants represent mutations of a certain gene, which represent mutations of certain other genes, and how many different mutant genes are represented in a group of mutants. A genetic complementation table organizing each of the crosses made to test genetic complementation of nine different mutations of eye color in the fruit fly Drosophila is shown in Figure 4.23. Crosses of pure-breeding parental eye color mutants that produce wild-type eye color in F1 progeny are indicated by plus symbols (+), signaling genetic complementation (i.e., mutations in different genes). Parental crosses producing mutant F1 are indicated by minus symbols (–), signaling no genetic complementation (i.e., mutations in the same gene).

Complementation analysis of this type initially focuses on crosses that indicate no complementation, as this is a sign of mutations that are in the same gene. Mutations that mutually fail to complement one another are identified as a complementation group, which can consist of one or more mutant alleles of a single gene. All members of a complementation group will fail to complement other members of the group, but they will complement members of other complementation groups that represent mutations of other genes. In the genetic context, a “complementation group” is synonymous with a “gene” because the mutant alleles of each complementation group all affect the same phenotypic characteristic. Thus, in genetic complementation analysis, the number of complementation groups equals the number of genes.

Assessment of the complementation testing data in Figure 4.23 finds that apricot, buff, cherry, coral, and white
Figure 4.23 Genetic complementation analysis of Drosophila eye color mutants. Genetic complementation testing among nine distinct Drosophila eye color mutants reveals five complementation groups corresponding to five genes. Five mutant alleles of white mutually fail to complement and are assigned to the same gene. The other four mutants each complement one another and the white gene mutants and are assigned to their own gene. Complementation is indicated by “+” and no complementation by “–”.

Mutation   Apricot    Brown      Buff       Carnation  Cherry     Claret     Coral      Vermilion  White
Apricot    –          +          –          +          –          +          –          +          –
Brown                 –          +          +          +          +          +          +          +
Buff                             –          +          –          +          –          +          –
Carnation                                   –          +          +          +          +          +
Cherry                                                 –          +          –          +          –
Claret                                                            –          +          +          +
Coral                                                                        –          +          –
Vermilion                                                                               –          +
White                                                                                              –

Complementation group   Mutant (allele)
I                       Apricot (w^a), buff (w^b), cherry (w^ch), coral (w^co), white (w)
II                      Carnation (c)
III                     Claret (cl)
IV                      Brown (b)
V                       Vermilion (v)

Q If a tenth eye color mutation fails to complement carnation but complements the other eight mutations, into which group is it placed?
all exhibit a mutual failure to complement. This result identifies the five mutations as occurring in the same gene. The conclusion is that apricot, buff, cherry, coral, and white are mutant alleles of the white (w) gene in Drosophila. These mutations form complementation group I. In contrast, the mutations brown, carnation, claret, and vermilion each complement all other mutations. This observation tells investigators that each of these mutant alleles represents a separate gene. In other words, because each of these mutants complements mutants of group I (gene w), they are not mutations of gene w. Further, the mutations carnation, claret, brown, and vermilion all complement one another, thus each represents a mutation of a gene of its own (i.e., complementation groups II through V). Therefore, among the nine Drosophila eye color mutants examined, five genes (five complementation groups) are identified. One gene is represented by five mutants, and the other four genes are represented by one mutation each.

Genetic complementation analysis is an important tool of genetic analysis. The rare human cancer-prone disorder xeroderma pigmentosum (various OMIM designations) can result from inherited mutations of any of seven genes that were originally identified by genetic complementation analysis. The following Case Study outlines this analysis.
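The grouping of the nine eye color mutants can be reproduced from the Figure 4.23 data with a short Python sketch. It is an illustration only: the `rows` strings transcribe the upper-triangular table (each row starts at the mutant's own column, with "-" for no complementation and "+" for complementation), and the function name `complementation_groups` is hypothetical. Mutants that fail to complement are merged into the same group with a simple union-find structure.

```python
mutants = ["apricot", "brown", "buff", "carnation", "cherry",
           "claret", "coral", "vermilion", "white"]

# Upper-triangular rows transcribed from Figure 4.23; row i begins at column i.
rows = ["-+-+-+-+-", "-+++++++", "-+-+-+-", "-+++++",
        "-+-+-", "-+++", "-+-", "-+", "-"]


def complementation_groups(mutants, rows):
    """Group mutants that fail to complement one another (union-find)."""
    parent = {m: m for m in mutants}

    def find(m):
        # Follow parent links to the group representative, halving the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for i, row in enumerate(rows):
        for offset, symbol in enumerate(row):
            if symbol == "-":  # no complementation: same gene
                root_a = find(mutants[i])
                root_b = find(mutants[i + offset])
                if root_a != root_b:
                    parent[root_a] = root_b

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    # Largest group first; ties keep the order in which mutants are listed.
    return sorted(groups.values(), key=len, reverse=True)


groups = complementation_groups(mutants, rows)
for number, members in enumerate(groups, start=1):
    print(number, ", ".join(members))
```

The sketch treats failure to complement as transitive, which holds for the data in Figure 4.23. The group numbering it prints follows group size and listing order, so it does not necessarily match the Roman numerals assigned in the figure.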
CASE STUDY
Complementation Groups in a Human Cancer-Prone Disorder
In this case study, we examine the use of genetic complementation analysis to identify the number of genes involved in a rare human cancer-prone condition called xeroderma pigmentosum (XP). XP is characterized by severe sensitivity to ultraviolet (UV) irradiation from sunlight and by an increase of up to a thousandfold in the rate of sun-induced skin cancer. People with XP are deficient in a type of DNA damage repair called nucleotide excision repair (NER), one of the normal processes the body uses to repair UV-induced damage in DNA. In NER, a short section of DNA containing a UV-induced lesion is removed, and the gap is filled by new DNA (see Section 11.5).

COMPLEMENTATION GROUPS Research work that began in the late 1970s identified seven complementation groups representing seven different genes that are mutated in different forms of XP. Each form of XP has its own OMIM number, and the forms differ in their severity and clinical presentation as a result of these different mutant genes.

Two approaches were used to identify these groups. Anthony Andrews and his colleagues obtained cultured skin cells from XP patients and from normal controls and tested the ability of the cells to grow after exposure to measured doses of UV irradiation (Figure 4.24). The cells were exposed to UV light for different amounts of time, and their growth was measured as the percentage of original cells able to form colonies after UV exposure. These researchers identified five distinct patterns of response to UV exposure that are designated as complementation groups A to E.

Other researchers measured the response of cultured XP cells to UV exposure by determining the level of NER taking place in XP cell cultures taken from different XP individuals in comparison with normal cells. The results showed that XP cell lines vary in their levels of NER from less than 5% of normal to about 50% of normal. These results could be due to the mutations being in different genes or, alternatively, to different hypomorphic alleles of the same gene.

Genetic complementation analysis was then used in the study of XP cell cultures with low NER to identify cell lineages carrying different XP gene mutations. For this analysis, many tests were done in which two cells from lineages with low
SUMMARY
Mastering Genetics For activities, animations, and review quizzes, go to the Study Area.

4.1 Interactions between Alleles Produce Dominance Relationships
❚❚ Loss-of-function mutations decrease or eliminate gene activity. Gain-of-function mutations can cause overexpression or result in new functions.
❚❚ Incomplete dominance produces heterozygotes with phenotypes that differ from those of either homozygote but are closer to one homozygous phenotype than the other.
❚❚ Codominant alleles are both detected in the heterozygous phenotype.
❚❚ The levels of activities of allelic products and their effects on phenotypes determine the dominance relationship between alleles.
❚❚ ABO blood types are produced by alleles whose protein products produce dominance or codominance depending on the genotype.
❚❚ Multiple alleles of a single gene can display a variety of dominance relationships that establish an allelic series.
❚❚ Lethal alleles can kill gametes, can prevent the gestational development of certain classes of progeny, or can have their lethal effect later in life.

4.2 Some Genes Produce Variable Phenotypes
❚❚ Sex-limited and sex-influenced traits are expressed differently in the sexes due to the influences of hormones.
❚❚ In incomplete penetrance, a genotype does not always have the expected corresponding phenotype.
❚❚ In variable expressivity, organisms with the same genotype have different degrees of phenotypic expression.
❚❚ Pleiotropic mutations affect two or more distinct and seemingly independent attributes of the phenotype.

4.3 Gene Interaction Modifies Mendelian Ratios
❚❚ Epistasis is revealed by six alternative ratios that are modifications of the 9:3:3:1 ratio expected among the progeny of a dihybrid cross.
❚❚ The types of epistasis and their ratios are complementary gene interaction (9:7), duplicate gene action (15:1), dominant gene interaction (9:6:1), recessive epistasis (9:3:4), dominant epistasis (12:3:1), and dominant suppression (13:3).

4.4 Complementation Analysis Distinguishes Mutations in the Same Gene from Mutations in Different Genes
❚❚ Genetic complementation produces progeny with the wild-type phenotype from parents that are pure-breeding for similar mutant phenotypes. The detection of genetic complementation means the mutations occur in different genes.
❚❚ The failure to detect genetic complementation from the cross of two similar mutant organisms identifies the mutant alleles as being carried by the same gene.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Dominance relationships between the alleles of a gene are determined by the activity of the allelic gene products. Do not assume the mutations are always recessive. Instead, use the transmission pattern to determine the dominance relationships of alleles to one another.
2. Genes determine phenotypes by the sequential action of their gene products in multistep pathways. Usually, one step must be completed before the next step can occur. Fit genetic data to molecular models of pathways.
3. When building a genetic hypothesis, use the results of genetic crosses. Begin with the simplest model and devise more complex models only when the data do not fit a simpler model.
4. Once you have formed a genetic hypothesis, assign genotypes or make predictions about phenotypes and their frequencies based on the hypothesis.
5. Be familiar with the ratios commonly observed in epistatic interactions, and be prepared to use those ratios to interpret the results of crosses.
6. Be familiar with the rules and interpretation of the results of genetic complementation analysis.
Chapter Concepts    For answers to selected even-numbered problems, see Appendix: Answers.
1. Define and distinguish incomplete penetrance and variable expressivity.
2. Define and distinguish epistasis and pleiotropy.
3. When working on barley plants, two researchers independently identify a short-plant mutation and develop homozygous recessive lines of short plants. Careful measurements of the height of mutant short plants versus normal tall plants indicate that the two mutant lines have the same height. How would you determine if these two mutant lines carry mutation of the same gene or of different genes?
4. Fifteen bacterial colonies growing on a complete medium are transferred to a minimal medium. Twelve of the colonies grow on minimal medium.
a. Using terminology from the chapter, characterize the 12 colonies that grow on minimal medium and the 3 colonies that do not.
b. The three colonies that do not grow on minimal medium are transferred to minimal medium supplemented with the amino acid serine (min + Ser), and all three colonies grow. Characterize these three colonies.
c. The serine biosynthetic pathway is a three-step pathway in which each step is catalyzed by the enzyme product of a different gene, identified as enzymes A, B, and C in the diagram below.

3-Phosphoglycerate --Enzyme A--> 3-Phospho-hydroxypyruvate (3-PHP) --Enzyme B--> 3-Phosphoserine (3-PS) --Enzyme C--> Serine (Ser)

Mutant 1 grows only on min + Ser. In addition to growth on min + Ser, mutant 2 also grows on min + 3-PHP and min + 3-PS. Mutant 3 grows on min + 3-PS and min + Ser. Identify the step of the serine biosynthesis pathway at which each mutant is defective.
5. In a type of parakeet known as a “budgie,” feather color is controlled by two genes. A yellow pigment is synthesized under the control of a dominant allele Y. Budgies that are homozygous for the recessive y allele do not synthesize yellow pigment. At an independently assorting gene, the dominant allele B directs synthesis of a blue pigment. Recessive homozygotes with the bb genotype do not produce blue pigment. Budgies that produce both yellow and blue pigments have green feathers; those that produce only yellow pigment or only blue pigment have yellow or blue feathers, respectively; and budgies that produce neither pigment are white (albino).
a. List the genotypes for green, yellow, blue, and albino budgies.
b. A cross is made between a pure-breeding green budgie and a pure-breeding albino budgie. What are the genotypes of the parent birds?
c. What are the genotype(s) and phenotype(s) of the F1 progeny of the cross described in part (b)?
d. If F1 males and females are mated, what phenotypes are expected in the F2, and in what proportions?
e. The cross of a green budgie and a yellow budgie produces offspring that are 12 green, 4 blue, 13 yellow, and 3 albino. What are the genotypes of the parents?
6. The ABO and MN blood groups are shown for four sets of parents (1 to 4) and four children (a to d). Recall that the ABO blood group has three alleles: I^A, I^B, and i. The MN blood group has two codominant alleles, M and N. Using
Problems 139
your knowledge of these genetic systems, match each child with every set of parents who might have conceived the child, and exclude any parental set that could not have conceived the child.

        Mother          Father
        ABO    MN       ABO    MN
1       O      M        B      M
2       B      N        B      N
3       AB     MN       B      MN
4       A      N        B      MN

Children
        ABO    MN
a       B      M
b       O      M
c       AB     MN
d       B      N

7. The wild-type color of horned beetles is black, although other colors are known. A black horned beetle from a pure-breeding strain is crossed to a pure-breeding green female beetle. All of their F1 progeny are black. These F1 are allowed to mate at random with one another, and 320 F2 beetles are produced. The F2 consists of 179 black, 81 green, and 60 brown. Use these data to explain the genetics of horned beetle color.
8. Two genes interact to produce various phenotypic ratios among F2 progeny of a dihybrid cross. Design a different pathway explaining each of the F2 ratios below, using hypothetical genes R and T and assuming that the dominant allele at each locus catalyzes a different reaction or performs an action leading to pigment production. The recessive allele at each locus is null (loss-of-function). Begin each pathway with a colorless precursor that produces a white or albino phenotype if it is unmodified. The ratios are for F2 progeny produced by crossing wild-type F1 organisms with the genotype RrTt.
a. 9/16 dark blue : 6/16 light blue : 1/16 white
b. 12/16 white : 3/16 green : 1/16 yellow
c. 9/16 green : 3/16 yellow : 3/16 blue : 1/16 white
d. 9/16 red : 7/16 white
e. 15/16 black : 1/16 white
f. 9/16 black : 3/16 gray : 4/16 albino
g. 13/16 white : 3/16 green
9. The ABO blood group assorts independently of the Rhesus (Rh) blood group and both assort independently of the MN blood group. Three alleles, I^A, I^B, and i, occur at the ABO locus. Two alleles, R, a dominant allele producing Rh+, and r, a recessive allele for Rh-, are found at the Rh locus, and codominant alleles M and N occur at the MN locus. Each gene is autosomal.
a. A child with blood types A, Rh-, and M is born to a woman who has blood types O, Rh-, and MN and a man who has blood types A, Rh+, and M. Determine the genotypes of each parent.
b. What proportion of children born to a man with genotype I^A I^B Rr MN and a woman who is I^A i Rr NN will have blood types B, Rh-, and MN? Show your work.
c. A man with blood types B, Rh+, and N says he could not be the father of a child with blood types O, Rh-, and MN. The mother of the child has blood types A, Rh+, and MN. Is the man correct? Explain.
10. In rats, gene B produces black coat color if the genotype is B–, but black pigment is not produced if the genotype is bb. At an independent locus, gene D produces yellow pigment if the genotype is D–, but no pigment is produced when the genotype is dd. Production of both pigments results in brown coat color. If neither pigment is produced, coat color is cream. Determine the genotypes of parents of litters with the following phenotype distributions.
a. 4 brown, 4 black, 4 yellow, 4 cream
b. 3 brown, 3 yellow, 1 black, 1 cream
c. 9 black, 7 brown
11. In the rats identified in Problem 10, a third independently assorting gene involved in determination of coat color is the C gene. At this locus, the genotype C– permits expression of pigment from genes B and D. The cc genotype, however, prevents expression of coat color and results in albino rats. For each of the following crosses, determine the expected phenotype ratio of progeny.
a. BbDDCc × BbDdCc
b. BBDdcc × BbddCc
c. bbDDCc × BBddCc
d. BbDdCC × BbDdCC
12. Using the information provided in Problems 10 and 11, determine the genotype and phenotype of parents that produce the following progeny:
a. 9/16 brown : 3/16 black : 4/16 albino
b. 3/8 black : 3/8 cream : 2/8 albino
c. 27/64 brown : 16/64 albino : 9/64 yellow : 9/64 black : 3/64 cream
d. 3/4 brown : 1/4 yellow
13. Total cholesterol in blood is reported as the number of milligrams (mg) of cholesterol per 100 milliliters (mL) of blood. The normal range is 180–220 mg/100 mL. A gene mutation altering the function of cell-surface cholesterol receptors restricts the ability of cells to collect cholesterol from blood and draw it into cells. This defect results in elevated blood cholesterol levels. Individuals who are heterozygous for a mutant allele and a wild-type allele have levels of 300–600 mg/100 mL, and those who are homozygous for the mutation have levels of 800–1000 mg/100 mL. Identify the genetic term that best describes the inheritance of this form of elevated cholesterol level, and justify your choice.
14. Flower color in snapdragons results from the amount of the pigment anthocyanin in the petals. Red flowers are produced by plants that have full anthocyanin production, and ivory-colored flowers are produced by plants that lack the ability to produce anthocyanin. The allele
140 CHAPTER 4 Gene Interaction
An1 has full activity in anthocyanin production, and the allele An2 is a null allele. Dr. Ara B. Dopsis, a famous genetic researcher, crosses pure-breeding red snapdragons to pure-breeding ivory snapdragons and produces F1 progeny plants that have pink flowers. He proposes that this outcome is the result of incomplete dominance, and he crosses the F1 to test his hypothesis. What phenotypes does Dr. Dopsis predict will be found in the F2, and in what proportions?
15. A plant line with reduced fertility comes to the attention of a plant breeder who observes that seed pods often contain a mixture of viable seeds that can be planted to produce new plants, and withered seeds that cannot be sprouted. The breeder examines numerous seed pods in the reduced fertility line and counts 622 viable seeds and 204 nonviable seeds.
a. What single-gene mechanism best explains the breeder’s observation?
b. Propose an additional experiment to test the genetic mechanism you propose. If your hypothesis is correct, what experimental outcome do you predict?
16. In cattle, an autosomal mutation called Dexter produces calves with short stature and short limbs. Embryos that are homozygous for the Dexter mutation have severely stunted development and either spontaneously abort or are stillborn. What progeny phenotypes do you expect from the cross of two Dexter cows? What are the expected proportions of the expected phenotypes?
Application and Integration    For answers to selected even-numbered problems, see Appendix: Answers.
17. The coat color in mink is controlled by two codominant alleles at a single locus. Red coat color is produced by the genotype R1R1, silver coat by the genotype R1R2, and platinum color by R2R2. White spotting of the coat is a recessive trait found with the genotype ss. Solid coat color is found with the S– genotype.
a. What are the expected progeny phenotypes and proportions for the cross SsR1R2 × ssR2R2?
b. If the cross SsR1R2 × SsR1R1 is made, what are the progeny phenotypes, and in what proportions are they expected to occur?
c. Two crosses are made between mink. Cross 1 is the cross of a solid, silver mink to one that is solid, platinum. Cross 2 is between a spotted, silver mink and one that is solid, silver. The progeny are described in the table below. Use these data to determine the genotypes of the parents in each cross.

Cross   Offspring
        Spotted,   Spotted,  Spotted,  Solid,     Solid,   Solid,
        platinum   silver    red       platinum   silver   red
1       2          3         0         6          5        0
2       3          7         2         4          5        3

18. Strains of petunias come in four pure-breeding colors: white, blue, red, and purple. White petunias are produced when plants synthesize no flower pigment. Blue petunias and red petunias are produced when plants synthesize blue or red pigment only. Purple petunias are produced in plants that synthesize both red and blue pigment (the mixture of red and blue makes purple). Flower-color pigments are synthesized by gene action in two separate pigment-producing biochemical pathways. Pathway I contains gene A that produces an enzyme to catalyze conversion of a colorless pigment designated white 1 to blue pigment. In Pathway II, the enzymatic product of gene B converts the colorless pigment designated white 2 to red pigment. The two genes assort independently.

Pathway I:  White 1 --gene A--> Blue
Pathway II: White 2 --gene B--> Red
Blue + Red = Purple

a. What are the possible genotype(s) for pure-breeding red petunias?
b. What are the possible genotype(s) for true-breeding blue petunias?
c. True-breeding red petunias are crossed to pure-breeding blue petunias, and all the F1 progeny have purple flowers. If the F1 are allowed to self-fertilize and produce the F2, what is the expected phenotypic distribution of the F2 progeny? Show your work.
19. Feather color in parakeets is produced by the blending of pigments from two biosynthetic pathways shown below. Four independently assorting genes (A, B, C, and D) produce enzymes that catalyze separate steps of the pathways. For the questions below, use an uppercase letter to indicate a dominant allele producing full enzymatic activity and a lowercase letter to indicate a recessive allele producing no functional enzyme. Feather colors produced by mixing pigments are green (yellow + blue) and purple (red + blue). Red, yellow, and blue feathers result from production of one colored pigment, and white results from absence of pigment production.

Pathway I:  Compound I (colorless) --Enzyme A--> Compound II (red) --Enzyme B--> Compound III (yellow)
Pathway II: Compound X (colorless) --Enzyme C--> Compound Y (colorless) --Enzyme D--> Compound Z (blue)

a. What is the genotype of a pure-breeding purple parakeet strain?
b. What is the genotype of a pure-breeding yellow strain of parakeet?
c. If a pure-breeding blue strain of parakeet (aa BB CC DD) is crossed to one that is pure-breeding purple, predict the genotype(s) and phenotype(s) of the F1. Show your work.
d. If F1 birds identified in part (c) are mated at random, what phenotypes do you expect in the F2 generation? What are the ratios among phenotypes? Show your work.
20. Brachydactyly type D is a human autosomal dominant condition in which the thumbs are abnormally short and broad. In most cases, both thumbs are affected, but occasionally just one thumb is involved. The accompanying pedigree shows a family in which brachydactyly type D is segregating. Filled circles and squares represent females and males who have
involvement of both thumbs. Half-filled symbols represent family members with just one thumb affected.

[Pedigree figure: generations I (individuals 1–2), II (1–8), III (1–11), and IV (1–6)]

a. Is there any evidence of variable expressivity in this family? Explain.
b. Is there evidence of incomplete penetrance in this family? Explain.
21. A male and a female mouse are each from pure-breeding albino strains. They have a litter of 10 pups, all of which

6    +  –  +  +  –  –
7    –  +  –  +  +  +  –
8    +  +  +  –  +  +  +  –
9    +  +  +  +  +  +  +  +  –
10   +  –  +  +  –  –  +  +  +  –
     1  2  3  4  5  6  7  8  9  10
     Mutant

23. Three strains of green-seeded lentil plants appear to have the same phenotype. The strains are designated G1, G2, and G3. Each green-seeded strain is crossed to a pure-breeding yellow-seeded strain designated Y. The F1 of each cross are yellow; however, self-fertilization of F1 plants produces F2 with different proportions of yellow- and green-seeded plants as shown below.

Parental Strain     F1 Phenotype    F2 Phenotype
Green    Yellow                     Green    Yellow
G1       Y          All yellow      1/4      3/4
G2       Y          All yellow      7/16     9/16
G3       Y          All yellow      37/64    27/64

25. The crosses shown on the following page are performed between morning glories whose flower color is determined as described in Problem 24. Use the segregation data to determine the genotype of each parental plant.
Parental Phenotypes     Offspring Phenotypes
a. blue × blue          3/4 blue : 1/4 purple
b. purple × purple      1/4 blue : 1/2 purple : 1/4 red
c. blue × red           1/4 blue : 1/2 purple : 1/4 red
d. purple × red         1/2 purple : 1/2 red
e. blue × purple        3/8 blue : 1/2 purple : 1/8 red

26. Two pure-breeding strains of summer squash producing yellow fruit, Y1 and Y2, are each crossed to a pure-breeding strain of summer squash producing green fruit, G1, and to one another. The following results are obtained:

Cross   P                             F1           F2
I       Y1 (yellow) × G1 (green)      All yellow   3/4 yellow : 1/4 green
II      Y2 (yellow) × G1 (green)      All green    3/4 green : 1/4 yellow
III     Y1 (yellow) × Y2 (yellow)     All yellow   13/16 yellow : 3/16 green

a. Examine the results of each cross and predict how many genes are responsible for fruit-color determination in summer squash. Justify your answer.
b. Using clearly defined symbols of your choice, give the genotypes of parental, F1, and F2 plants in each cross.
c. If the F1 of Crosses I and II are mated, predict the phenotype ratio of the progeny.
27. Marfan syndrome is an autosomal dominant disorder in humans. It results from mutation of a gene on chromosome 15 that produces the connective tissue protein fibrillin. In its wild-type form, fibrillin gives connective tissues, such as cartilage, elasticity. When mutated, however, fibrillin is rigid and produces a range of phenotypic complications, including excessive growth of the long bones of the leg and arm, sunken chest, dislocation of the lens of the eye, and susceptibility to aortic aneurysm, which can lead to sudden death in some cases.
Different sets of symptoms are seen among various family members, as shown in the pedigree below. Each quadrant of the circles and squares represents a different symptom, as the key indicates.

[Pedigree figure. Key to symbol quadrants: Long bones, Sunken chest, Lens dislocation, Aortic aneurysm]

All cases of Marfan syndrome are caused by mutation of the fibrillin gene, and all family members with Marfan syndrome carry the same mutant allele. What do the differences shown in the phenotypes of family members say about the expression of the mutant allele?
28. Yeast are single-celled eukaryotic organisms that grow in culture as either haploids or diploids. Diploid yeast are generated when two haploid strains fuse together. Seven haploid mutant strains of yeast exhibit similar normal growth habit at 25°C, but at 37°C, they show different growth capabilities. The table below displays the growth pattern.

[Table: Strain growth at 25°C and 37°C for strains A to G. Key: Normal growth, Slow growth, No growth]

a. Hypothesize about the nature of the mutation affecting each of these mutant yeast strains, including why strains B and G display different growth habit at 37°C than the other strains.
b. Researchers induce fusion in pairs of haploid yeast strains (all possible combinations), and the resulting diploids are tested for their ability to grow at 37°C. The results of the growth experiment are shown below.

[Table: 37°C growth data for pairwise combinations of strains A to G]

How many different genes are mutated among these seven yeast strains? Identify the strains that represent each gene mutation.
29. During your work as a laboratory assistant in the research facilities of Dr. O. Sophila, a world-famous geneticist, you come across an unusual bottle of fruit flies. All the flies in the bottle appear normal when they are in an incubator set at 22°C. When they are moved to a 30°C incubator, however, a few of the flies slowly become paralyzed; and after about 20 to 30 minutes, they are unable to move. Returning the flies to 22°C restores their ability to move after about 30 to 45 minutes.
With Dr. Sophila’s encouragement, you set up 10 individual crosses between single male and female flies that exhibit the unusual behavior. Among 812 progeny, 598 exhibit the unusual behavior and 214 do not. When you leave one of the test bottles in the 30°C incubator too
long, you discover that more than 2 hours at high temperature kills the paralyzed flies. When you tell this to Dr. Sophila, he says, “Ah ha! I know how to explain this condition.” What is his explanation?
30. Dr. Ara B. Dopsis and Dr. C. Ellie Gans are performing genetic crosses on daisy plants. They self-fertilize a blue-flowered daisy and grow 100 progeny plants that consist of 55 blue-flowered plants, 22 purple-flowered plants, and 23 white-flowered plants. Dr. Dopsis believes this is the result of segregation of two alleles at one locus and that the progeny ratio is 1:2:1. Dr. Gans thinks the progeny phenotypes are the result of two epistatic genes and that the ratio is 9:3:4.
The two scientists ask you to resolve their conflict by performing chi-square analysis on the data for both proposed genetic mechanisms. For each proposed mechanism, fill in the values requested on the form the researchers have provided for your analysis.
a. Use the form below to calculate chi square for the 1:2:1 hypothesis of Dr. Dopsis.

Phenotype   Observed   Expected
Blue        55         ___________
Purple      22         ___________
White       23         ___________
Chi-square value: _________   df: _______   p value > ________

b. Use the form below to calculate chi square for the 9:3:4 hypothesis of Dr. Gans.

Phenotype   Observed   Expected
Blue        55         ___________
Purple      22         ___________
White       23         ___________
Chi-square value: ________   df: ________   p value > ________

c. What is your conclusion regarding these two genetic hypotheses?
d. Using any of the 100 progeny plants, propose a cross that will verify the conclusion you proposed in part (c). Plants may be self-fertilized, or one plant can be crossed to another. What result will be consistent with the 1:2:1 hypothesis? What result will be consistent with the 9:3:4 hypothesis?
31. Human ABO blood type is determined by three alleles, two of which (I^A and I^B) produce gene products that modify the H antigen produced by protein activity of an independently assorting H gene. A rare abnormality known as the “Bombay phenotype” is the result of epistatic interaction between the gene for the ABO blood group and the H gene. Individuals with the Bombay phenotype appear to have blood type O based on the inability of both anti-A antibody and anti-B antibody to detect an antigen. The apparent blood type O in Bombay phenotype is due to the absence of H antigen as a result of homozygous recessive mutations of the H gene. Individuals with the Bombay phenotype have the hh genotype. Use the information above to make predictions about the outcome of the cross shown below.

I^A I^B Hh × I^A I^B Hh

32. In rabbits, albinism is an autosomal recessive condition caused by the absence of the pigment melanin from skin and fur. Pigmentation is a dominant wild-type trait. Three pure-breeding strains of albino rabbits, identified as strains 1, 2, and 3, are crossed to one another. In the table below, F1 and F2 progeny are shown for each cross. Based on the available data, propose a genetic explanation for the results. As part of your answer, create genotypes for each albino strain using clearly defined symbols of your own choosing. Use your symbols to diagram each cross, giving the F1 and F2 genotypes.

Cross                             F1 Progeny      F2 Progeny
Cross A   strain 1 × strain 2     56 albino       192 albino
Cross B   strain 1 × strain 3     72 pigmented    181 pigmented, 139 albino
Cross C   strain 2 × strain 3     34 pigmented    89 pigmented, 72 albino

33. Dr. O. Sophila, a close friend of Dr. Ara B. Dopsis, reviews the F2 results Dr. Dopsis obtained in his experiment with iris plants described in Genetic Analysis 4.3. Dr. Sophila thinks the F2 progeny demonstrate that a single gene with incomplete dominance has produced a 1:2:1 ratio. Dr. Dopsis insists his proposal of recessive epistasis producing a 9:4:3 ratio in the F2 is correct. To test his proposal, Dr. Dopsis examines the F2 data under the assumptions of the single-gene incomplete dominance model using chi-square analysis. Calculate and interpret this chi-square value. Can Dr. Dopsis reject the single-gene incomplete dominance model on the basis of this analysis? Explain why or why not.
34. In a breed of domestic cattle, horns can appear on males and on females. Males and females can also be hornless. The following crosses are performed with parents from pure-breeding lines.

Cross I                                          Cross II
Parents: horned male × hornless female           Parents: hornless male × horned female
F1: males horned, females hornless               F1: males horned, females hornless
F2: males are 3/4 horned, 1/4 hornless;          F2: males are 3/4 horned, 1/4 hornless;
    females are 1/4 horned, 3/4 hornless             females are 1/4 horned, 3/4 hornless

Explain the inheritance of this phenotype in cattle, and assign genotypes to all cattle in each cross.
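Problems 30 and 33 call for chi-square goodness-of-fit calculations. The arithmetic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the example counts (52, 98, 50), and the small table of 0.05 critical values are illustrative and are not taken from the problems above.

```python
# Critical chi-square values at p = 0.05 for 1, 2, and 3 degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square_fit(observed, ratio):
    """Goodness-of-fit chi-square for observed counts against a ratio.

    observed: counts per phenotype class, in the same order as ratio.
    ratio:    hypothesized ratio parts, e.g. (1, 2, 1) or (9, 3, 4).
    Returns (chi_square_value, degrees_of_freedom, expected_counts).
    """
    if len(observed) != len(ratio):
        raise ValueError("observed and ratio must list the same classes")
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # classes minus one
    return chi2, df, expected


if __name__ == "__main__":
    # Hypothetical F2 counts tested against a 1:2:1 hypothesis.
    chi2, df, expected = chi_square_fit([52, 98, 50], (1, 2, 1))
    print("expected:", expected)
    print("chi-square:", round(chi2, 3), "df:", df)
    # The hypothesis is rejected at p = 0.05 only if chi2 exceeds the cutoff.
    print("reject at 0.05:", chi2 > CRITICAL_0_05[df])
```

The class order matters: each observed count must be paired with the ratio part that the hypothesis assigns to that phenotype, so testing the same data against two hypotheses may require listing the classes in a different order for each.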
Collaboration and Discussion    For answers to selected even-numbered problems, see Appendix: Answers.
35. Cross 1 shown in Figure 4.22 illustrates genetic complementation of flower-color mutants. The F1 produced from this cross of two pure-breeding mutant parental plants are dihybrid (CcPp) and have wild-type flower color. If these F1 are allowed to self-fertilize, what phenotypes are expected in the F2 and what are the expected ratios of the phenotypes?
36. The wild-type allele of a gene has an A–T base pair at a particular location in its sequence, and a mutant allele of the same gene has a G–C base pair at the same location. Otherwise, the sequences of the two alleles are identical. Does this information tell you anything about the dominance relationship of the alleles? Explain why or why not.
37. Epistatic gene interaction results in a modification of the F2 dihybrid ratio.
a. What is the expected F2 ratio?
b. What genetic principle is the basis of this expected F2 ratio?
c. Give two examples of modified F2 ratios produced by epistatic gene interactions and describe how gene interaction results in the ratios.
38. Draw a pedigree containing two parents and four children. Both of the parents have AB blood type. The first child is type A, the second child is type AB, and the third child is type B.
a. Assign the genotypes to these five people.
b. The fourth child tests as having blood type O, which is not possible given the parental genotypes. Look at Figure 4.4 and read the description of the molecular process that generates ABO blood group antigens. What other mutation could account for this observation?
c. What is the name of the genetic phenomenon producing this observation?
5 Genetic Linkage and Mapping in Eukaryotes
CHAPTER OUTLINE
5.1 Linked Genes Do Not Assort Independently
5.2 Genetic Linkage Mapping Is Based on Recombination Frequency between Genes
5.3 Three-Point Test-Cross Analysis Maps Genes
5.4 Multiple Factors Cause Recombination to Vary
5.5 Human Genes Are Mapped Using Specialized Methods
ESSENTIAL IDEAS
❚❚ Genetic linkage occurs between genes that lie so close to one another on a chromosome that alleles are unable to assort independently.
❚❚ Genetic linkage produces significantly more progeny with parental phenotypes and significantly fewer progeny with nonparental phenotypes than are expected by chance.
❚❚ Crossing over between homologous chromosomes results in recombination of alleles on chromosomes in gametes.
❚❚ Geneticists use the frequency of recombination between genes to construct gene maps identifying the relative order of and distance between genes on chromosomes.
❚❚ Cytological evidence demonstrates that recombination results from crossing over between homologous chromosomes.
❚❚ Specialized statistical methods aid in mapping human genes.
❚❚ Recombination creates substantial new genetic diversity that is favored by evolution. It also randomizes the arrangements of alleles of linked genes on chromosomes.

Recombination between homologous chromosomes reshuffles the genetic information in genomes. These two homologs show multiple chiasmata that indicate the locations of crossing over between the chromosomes.

In 1933, Thomas Hunt Morgan won the Nobel Prize for Physiology or Medicine in recognition of his many contributions to genetics. These include his work establishing sex-linked inheritance and the chromosome theory of heredity, which we explored in Chapter 3, and also his role in identifying and explaining genetic linkage and recombination and their application to genetic linkage mapping, which we discuss in this chapter. Morgan, like all successful scientists, was assisted by dedicated colleagues, including many exceptional students. Among the latter were Calvin Bridges, whose work we discussed in connection with the chromosome theory of heredity, and Alfred Sturtevant, who as an undergraduate researcher
146 CHAPTER 5 Genetic Linkage and Mapping in Eukaryotes
in Morgan’s laboratory became the first person to use genetic linkage data to assemble a genetic map. A number of less well remembered researchers, including Morgan’s wife Lilian, were also important members of the research enterprise.
The work of Morgan, his colleagues, and numerous others led to the validation of three foundational theories in genetics. First, the work validated the chromosome theory of heredity (the idea that genes are carried on chromosomes) and expanded the theory by showing that each chromosome carries many genes in a specific order. Second, the research validated the concept of the gene as a physical entity that is an integral part of a chromosome, and led to work that expanded understanding of gene structure by demonstrating that genes are composed of nucleotides between which recombination may occur. Third, the work validated evolutionary theory by confirming that closely related species have a similar number of chromosomes and a similar arrangement of genes on chromosomes; and it expanded evolutionary theory by suggesting that recombination could be a mechanism through which variation in chromosome number and in the arrangement of genes on chromosomes could accrue as species diverge from a common ancestor.
The investigation of genetic linkage and recombination is a central tool of genetic analysis, and recombination itself is an essential biological process. Along with mutation, recombination generates the raw genetic diversity on which evolution depends. Along with sexual reproduction, recombination operates to increase diversity between generations. In addition, it has an important functional role in mammalian meiosis. Homologous chromosome synapsis and segregation does not occur normally in the absence of recombination. Given its pivotal functions, one might be tempted to think that recombination would be ubiquitous among sexually reproducing organisms and would occur to an equal degree throughout the genome of such an organism. In fact, genome-based analysis of recombination reveals that none of these presumptions is true. Recombination is highly variable within any one genome, and it is highly variable among different organisms. These findings open new avenues for investigating the evolutionary biology of organisms and the role of recombination in genetics.
The detection and analysis of genetic linkage and recombination; the principles of genetic linkage mapping; and the role of recombination in evolution are the topics on which we focus in this chapter. In the process, we explore recent developments in gene mapping, molecular genetic marker mapping, and the investigation of chromosome evolution.

5.1 Linked Genes Do Not Assort Independently

Genes that are located on the same chromosome are called syntenic genes. When two syntenic genes are so close to one another that their alleles are unable to assort independently, the genes display genetic linkage. Genetic linkage produces a distinctive pattern of gamete genotypes that can be quantified and analyzed to map the locations of genes on chromosomes.
Homologous recombination is the process that occurs as a result of crossing over in prophase I of meiosis in eukaryotic cells. It takes place through the equal exchange of genetic material contained in homologous chromosomes. It is a reciprocal process, meaning that neither of the participating chromosomes has more or less genetic material at the end of the process than at the start. At the end of meiosis the outcome is the generation of recombinant chromosomes or nonparental chromosomes that come about by the reshuffling of alleles residing on recombining chromosomes. In sexually reproducing organisms this means that recombinant chromosomes contain a combination of alleles initially carried by the different parents of the organism in which recombination is occurring. In contrast, homologous chromosomes that do not undergo crossing over during meiosis retain all the same alleles they had when they were transmitted from a parent. To distinguish them from recombinant chromosomes, these are called parental chromosomes or nonrecombinant chromosomes.
Syntenic genes located very near each other on a chromosome tend to recombine less often during crossing over than do genes located farther apart on the chromosome. This creates a distinguishing pattern by which linkage can be recognized and quantified.
On the other hand, syntenic genes located far apart on a chromosome, and genes located on separate chromosomes, always assort independently according to the predictions of Mendel’s law of independent assortment. The independent assortment of genes on separate chromosomes is explained by the movement of chromosomes and chromatids in meiosis, as Figure 3.15 illustrates. The independent assortment of syntenic genes is a product of there being sufficient
recombination along the homologous chromosomes containing those genes to randomize the allele combinations. Thus, to establish the presence of genetic linkage requires a statistical demonstration of the absence of independent assortment. The chi-square statistic discussed in Section 2.5 is used to compare the observed and expected outcomes of crosses for this purpose. In experimental analysis of genetic linkage, independent assortment is the expected result of crosses; to be indicative of genetic linkage, a cross outcome must have a statistically significant deviation from cross expectations.

In this chapter section and throughout the remainder of the chapter, these basic concepts of genetic linkage and some of the experimental results that support them are elaborated and explained. To help focus the discussion, we offer the following observations and conclusions, all of which are essential to understanding the linkage phenomenon.

1. Linked genes are always syntenic, and they are always located near one another on a chromosome. When syntenic genes are so far apart on the chromosome that crossing over between them generates independent assortment of the alleles, the genes are not linked.
2. Genetic linkage leads to the production of a significantly greater number of gametes containing chromosomes with parental combinations of alleles than would be expected under assumptions of independent assortment, and to a significantly smaller number of gametes containing chromosomes with alleles that are different from the parental combinations.
3. Crossing over is less likely to occur between linked genes that are close to one another than between genes that are farther apart on a chromosome. The frequency of crossing over is roughly proportionate to the distance between genes, a relationship that allows genes to be mapped.

The discovery of genetic linkage, made more than a century ago, opened the door to the development of several applications. The first of these was genetic linkage mapping, which plots the positions of genes on chromosomes. Over the ensuing century, new methods for identifying genetic variants and new applications for mapping genes and variants have added to the analytical arsenal of genetics. Genetic linkage and its old and new mapping applications remain a strong central pillar of genetic analysis.

Detecting Genetic Linkage

Genetic linkage can be detected by comparing the observed frequencies of gamete genotypes, or the corresponding progeny phenotypes, with the frequencies expected under the assumptions of independent assortment. If genes are linked, parental gametes (also known as nonrecombinant gametes) that contain parental combinations of the alleles will be produced significantly more often than predicted by chance. The excess parental gametes will also result in progeny in which parental phenotypes (or parental combinations of alleles) will be detected significantly more often than predicted by chance.

Figure 5.1 illustrates the consequences of genetic linkage by comparing the frequencies of gamete genotypes for two crosses. In Figure 5.1a, gene A and gene B are on different chromosomes, and alleles of the genes assort independently. The parental organisms are AABB and aabb, and their gametes AB and ab are the parental gametes. The F1 progeny are dihybrid (AaBb), and independent assortment predicts these dihybrids will produce four genetically different gametes in a ratio of 1:1:1:1. Notice that the frequency of parental gametes (AB and ab) is 50%, and that the frequency of nonparental gametes (Ab and aB) is also 50%.

Figure 5.1b illustrates gamete-genotype production for syntenic genes D and E that are linked. The DDee parent produces parental gametes that are De, and the ddEE parent produces dE gametes. The dihybrid F1 progeny are DdEe, carrying alleles D and e on one chromosome and d and E on the homolog. This arrangement of alleles can be written De/dE, with the slash (“/”) separating the alleles carried on one member of the homologous chromosome pair from the alleles carried on the other member of the pair. The use of a slash to separate the alleles of homologous chromosomes is usually reserved for linked genes. A genotype designated De/dE is the same as DdEe, the difference being that in the former case the genes are linked and the alleles on each homolog are known.

A characteristic of genetic linkage is that the rate of recombination between linked genes is low, and parental allele combinations usually stay together during meiosis, leading to the production of parental gametes (De and dE) at a combined frequency that is significantly greater than 50% (>> 50%), as in Figure 5.1b. The low frequency of crossing over between closely linked genes results in the production of recombinant, or nonparental, gametes (DE and de) at a combined frequency that is significantly less than 50% (<< 50%). Note that the term “parental” refers to the combination of alleles carried by parental organisms and “nonparental” to allele combinations not on the parental chromosomes.

Complete genetic linkage is observed when no recombination at all occurs between linked genes. Complete genetic linkage can be identified, for example, in cases where a dihybrid produces two equally frequent gametes containing only parental allele combinations and no recombinant gametes (Figure 5.2a). The absence of recombination between homologs usually has a specific biological basis. Certain organisms, including Drosophila males and other males in the insect order Diptera (of which Drosophila is a member), exhibit complete genetic linkage. There is no recombination between homologous chromosomes in these male flies. The biological basis of the absence of recombination in these organisms remains unknown.

Incomplete genetic linkage is far more common for linked genes. The resulting recombination between the homologs produces a mixture of parental and nonparental gametes. In the F1 dihybrid shown in Figure 5.2b, recombination produces four genetically different gametes, of which two are parental (nonrecombinant) and two are nonparental (recombinant). The two parental gametes each have
148 CHAPTER 5 Genetic Linkage and Mapping in Eukaryotes
[Figure 5.1 panels. (a) Independent assortment: gametes AB = 25% and ab = 25% (parental gametes, approximately 50%); aB = 25% and Ab = 25% (nonparental gametes, approximately 50%). Independent assortment predicts 25% of each gamete type, with parental and nonparental gametes each totaling 50%. (b) Genetic linkage: gametes De >> 25% and dE >> 25% (parental gametes, >> 50%); DE << 25% and de << 25% (nonparental gametes, << 50%). Parental gametes are significantly more frequent (>>) and nonparental gametes significantly less frequent (<<) than predicted by independent assortment.]

Figure 5.1 Independent assortment versus genetic linkage. (a) For this dihybrid, four genetically different gametes are expected at 25% each when the genes assort independently. (b) When genes are linked, parental gametes are significantly more frequent than expected by chance, and their individual and combined frequencies are much greater than nonparental gametes.
approximately the same frequency, and their total is significantly greater than 50% of all gametes. In this example, the frequency of each parental gamete (RT and rt) is 40%, and the total frequency of parental gametes is 80%. Recombinant gametes, which have nonparental combinations of alleles, are approximately equal to one another in frequency and constitute significantly less than 50% of all gametes. In this case, a total of 20% of gametes are recombinant: 10% of the gametes are Rt and 10% are rT.

The proportion of recombinant to parental chromosomes or gametes from a cross depends on the frequency of crossing over between syntenic genes. This proportion differs among different pairs of genes and is expected to be greater for syntenic genes that are farther apart and smaller for genes that are closer together on a chromosome. Note that the percentages of different gametes obtained for the cross in Figure 5.2c are different from those in Figure 5.2b, and also notice that the parental alleles on chromosomes in Figure 5.2c are a dominant and a recessive allele: Mn/mN. Once again, parental chromosomes are defined by the specific combinations of alleles that are present on the homologs of the parents in the cross.

The recombination frequency, expressed in the general formula as the variable r, identifies the rate of recombination for a given pair of syntenic genes. The value of r is expressed as

\[ r = \frac{\text{number of recombinants}}{\text{total number of progeny}} \]

As stated above, recombination frequency varies between different pairs of syntenic genes, depending roughly on the distance separating the genes on the chromosome. Comparing Figure 5.2b and Figure 5.2c, for example, we see that recombination frequency is 20% (r = 0.20) in Figure 5.2b and 40% (r = 0.40) in Figure 5.2c. The greater recombination frequency in Figure 5.2c compared with Figure 5.2b is most likely the consequence of a greater distance between genes N and M than between genes T and R. The correlation between recombination frequency and gene distance can be expressed in two equivalent ways: (1) crossing over occurs at a higher rate between genes that are separated by a greater distance, and at a lower rate for genes that are closer together; and (2) linked genes with higher recombination frequencies are more distant from one another than linked genes with lower recombination frequencies.
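The link between r and the gamete percentages quoted for Figure 5.2 can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original text: the function name `gamete_frequencies` and its dictionary keys are hypothetical, and it assumes that the two parental gametes are equally frequent and that the two recombinant gametes are equally frequent.

```python
def gamete_frequencies(r):
    """Expected gamete frequencies for a dihybrid with recombination frequency r.

    Assumption: the two parental gametes are equally frequent, and so are
    the two recombinant gametes.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r is expected to lie between 0 and 0.5")
    return {
        "each_parental": (1 - r) / 2,
        "each_recombinant": r / 2,
        "total_parental": 1 - r,
        "total_recombinant": r,
    }


# Values of r taken from the text: complete linkage, Figure 5.2b,
# Figure 5.2c, and independent assortment.
for label, r in [
    ("complete linkage", 0.0),
    ("Figure 5.2b", 0.20),
    ("Figure 5.2c", 0.40),
    ("independent assortment", 0.50),
]:
    f = gamete_frequencies(r)
    print(
        f"{label}: parental {f['each_parental']:.2f} each, "
        f"recombinant {f['each_recombinant']:.2f} each"
    )
```

With r = 0.20 the sketch gives 40% for each parental gamete and 10% for each recombinant gamete, the values quoted for the RT/rt cross.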
[Figure 5.2 panels. (a) Complete genetic linkage (no crossover): P cross FG/FG × fg/fg. The syntenic genes are linked, and the slash (“/”) separates the alleles on each homolog. (b) Incomplete genetic linkage (crossover in 20% of gametes): P cross RT/RT × rt/rt, linked genes. Parental gametes are 80% and recombinant gametes are 20% for these genes. (c) Incomplete genetic linkage (crossover in 40% of gametes): P cross Mn/Mn × mN/mN, linked genes.]

Figure 5.2 Complete versus incomplete genetic linkage. (a) Genes exhibiting complete genetic linkage do not recombine, and all gametes are parental. (b) Linked genes with a recombination frequency of 20% produce 20% nonparental gametes and 80% parental gametes. (c) Linked genes with a recombination frequency of 40% produce 60% parental gametes and 40% nonparental gametes.
both traits in the same plants, intending to test the law of independent assortment. They crossed pure-breeding plants with the two dominant traits, purple flowers and long pollen (PPLL), to pure-breeding recessive plants with red flowers and round pollen (ppll). As expected, the F1 consisted exclusively of purple-flowered, long-pollen plants, and these plants were crossed to obtain the F2. But then, instead of the 9:3:3:1 ratio predicted by the independent assortment hypothesis, a far larger than expected portion of F2 progeny showed parental combinations of phenotypes, and many fewer showed nonparental combinations (Table 5.1). Although the chi-square test was not applied to the data by Bateson and Punnett, its use today identifies p < 0.05, a significant deviation between observed and expected numbers.

In the F2, Bateson and Punnett observed that the two parental phenotypes (purple, long and red, round) were substantially in excess of expected frequencies, and that the two nonparental phenotypes (purple, round and red, long) were substantially less frequent than expected. This observation led Bateson and Punnett to suggest that the two combinations of alleles carried in the parents (PL and pl) remained together very frequently, by an unknown mechanism, when they were passed through gametes to subsequent generations. Bateson and Punnett described these alleles as exhibiting “coupling.” They described the appearance of new, nonparental phenotypes in the F2 as indicating “repulsion” of the parental alleles, to produce nonparental phenotypes in progeny.

In 1911, Morgan performed the first of a series of experimental crosses that confirmed genetic linkage, explained the apparent coupling and repulsion identified by Bateson and Punnett, and led to the development of the first genetic linkage map. Morgan had by this time identified several genes on the X chromosome of his wild-caught fruit flies. The X-linked genes identified included w (white eye) and m (miniature wing). Figure 5.3 illustrates one of Morgan’s experimental crosses, this one between a female pure-breeding for white eyes and miniature wings (wm/wm) and a hemizygous wild-type male displaying red eye and full wing (w+m+/Y). The F1 progeny were dihybrid wild-type females (w+m+/wm) and white, miniature (wm/Y) hemizygous males. Here the slash (“/”) identifies the alleles on each homologous X chromosome in females and indicates the X-linked alleles and the Y chromosome in males.

Morgan produced an F1 and then an F2 generation, crossing a dihybrid F1 female (w+m+/wm) to a hemizygous F1 male (wm/Y). He predicted a 1:1:1:1 ratio in the F2 based on the assumption of independent assortment of the genes. Instead, Morgan found substantial deviation from expectations. As in the Bateson and Punnett experiment, Morgan observed that parental phenotypes predominated (791 + 750 = 1541, or 63.1%) and that fewer than the expected number of nonparental phenotypes were produced. The recombination frequency for this experiment is r = (445 + 455)/2441 = 0.369, or 36.9%. Notice that the two parental phenotypes are observed in an approximate 1:1 ratio (791:750), as are the nonparental phenotypes (455:445), as expected from segregation.

Based on this result, Morgan proposed that parental phenotypes are produced when the gametes of the F1 female predominantly contain X chromosomes with one of the original parental sets of alleles, in this case w+m+ and wm. Eggs containing parental alleles unite with sperm carrying w and m on the X chromosome or carrying the Y chromosome, and parental phenotypes are produced. Conversely, nonparental phenotypes are the result of recombination between homologous X chromosomes during F1 female meiosis (Figure 5.4). The production of recombinant chromosomes carrying either w+m or wm+ requires the physical rearrangement (recombination) of homologous X chromosomes. Morgan confirmed this explanation through the examination of many other pairs of linked genes on the fruit fly X chromosome.

Detecting Autosomal Genetic Linkage through Test-Cross Analysis

Turning his attention to autosomal genes and employing 20/20 hindsight, Morgan realized that Bateson and Punnett had detected genetic linkage but were unable to explain it because, with respect to experimental design, they had performed the wrong cross! The F2 progeny in the Bateson and Punnett experiment fell into four phenotypic classes, but three of those classes contained multiple genotypes (e.g., PPLL, PpLL, and PPLl all had the same phenotype), owing to the dominance relationships among the alleles (see Figure 2.11). Bateson and Punnett were unable to determine which alleles in the progeny derived from each F1 parent because they had no way of ascertaining the high frequency of parental combinations of alleles and the low frequency of recombinants in F1 gametes.

Morgan realized that the linkage of autosomal genes in Drosophila could be fully interpreted through the use of two-point test-cross analysis in which a dihybrid F1 fly is crossed to a pure-breeding mate with the recessive phenotypes. The “two points” in these analyses are the two genes being tested. In two-point test-cross analysis, the homozygous recessive fly contributes only recessive alleles to test-cross progeny. In contrast, the dihybrid fly can contribute

Table 5.1 Bateson and Punnett’s Observed and Expected Phenotypes in F2 Sweet Peas

                              Number of Progeny
Phenotype       Genotype      Observed    Expected (9:3:3:1 ratio)
Purple, long    P–L–          4831        (6952)(9/16) = 3910.5
Purple, round   P–ll           390        (6952)(3/16) = 1303.5
Red, long       ppL–           393        (6952)(3/16) = 1303.5
Red, round      ppll          1338        (6952)(1/16) = 434.5
Total                         6952        6952.0
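The chi-square comparison mentioned for Bateson and Punnett's F2 data can be sketched as follows. This is an illustrative calculation, not taken from the original text: the variable names are hypothetical, the counts are those of Table 5.1, and the critical value 7.815 is the standard chi-square threshold for 3 degrees of freedom at the 0.05 level.

```python
from fractions import Fraction

# Observed F2 counts from Table 5.1 (Bateson and Punnett's sweet peas).
observed = {
    "purple, long": 4831,
    "purple, round": 390,
    "red, long": 393,
    "red, round": 1338,
}

# 9:3:3:1 proportions predicted by independent assortment.
ratio = {
    "purple, long": Fraction(9, 16),
    "purple, round": Fraction(3, 16),
    "red, long": Fraction(3, 16),
    "red, round": Fraction(1, 16),
}

total = sum(observed.values())
expected = {k: float(total * ratio[k]) for k in observed}

# Chi-square statistic: sum over classes of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
)

CRITICAL_VALUE_DF3_P05 = 7.815  # four classes, so 3 degrees of freedom

print(f"chi-square = {chi_square:.1f}")
print("significant deviation:", chi_square > CRITICAL_VALUE_DF3_P05)
```

The statistic far exceeds the critical value, consistent with the statement that the deviation from the 9:3:3:1 expectation is significant at p < 0.05.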
[Figure 5.3 diagram. P: female wm/wm (white eye, miniature wing) × male w+m+/Y (red eye, full wing). F1: female wm/w+m+ (red eye, full wing) × male wm/Y (white eye, miniature wing).]
either a dominant allele of a gene, in which case the progeny display the dominant phenotype, or the recessive allele, thus producing the recessive form of the trait.

In one experiment, Morgan used test-cross analysis to examine genetic linkage of autosomal genes affecting eye color and wing shape. Drosophila eye color is red if an autosomal dominant allele pr+ is present, whereas the recessive purple eye color is produced when the only allele present is pr. Full-sized wing is the product of an autosomal dominant allele vg+, and its recessive counterpart, vestigial wing, is determined by the allele vg. Morgan crossed fruit flies that are pure-breeding for red eyes and full wing with pure-breeding purple-eyed, vestigial-winged flies (Figure 5.5a). The F1 were uniformly red eyed and full winged (pr+ vg+/pr vg). Morgan then test-crossed dihybrid F1 females to purple-eyed, vestigial-winged males (pr vg/pr vg). In this cross, males contributed only recessive alleles (pr and vg), but females could produce any one of four gamete genotypes. The alleles of the female gamete thus controlled the phenotype of test-cross progeny. If the female contributed a dominant allele to progeny, the phenotype for that trait was dominant; and conversely, if the donated
[Figure 5.4 diagram. Homologous X chromosomes carrying w+ m+ and w m, shown before and after crossing over between the w and m genes.]

Q Draw a chromosome pair with Ab/aB. Illustrate a crossover between the two genes and identify the resulting parental and recombinant chromosomes.

[Figure 5.5 panels. (a) P: red eye, full wing (vg+ pr+/vg+ pr+) × purple eye, vestigial wing (vg pr/vg pr). F1 full-wing, red-eye females (vg+ pr+/vg pr) are test-crossed to vestigial-wing, purple-eye males (vg pr/vg pr). (b) Gamete formation and test-cross progeny: parentals = (1339 + 1195)/2839 = 0.893; recombinants = (151 + 154)/2839 = 0.107.]

Figure 5.5 Morgan’s test-cross analysis of genetic linkage between autosomal genes. (a) Dihybrid F1 females (pr+ vg+/pr vg) are test-crossed to males homozygous for recessive mutant purple eye color and vestigial wing (pr vg/pr vg), permitting identification of progeny as carrying either a parental or a recombinant chromosome. (b) Single crossover during female meiosis leads to parental and recombinant gametes at frequencies specified by recombination or by chance, and gamete union produces test-cross progeny.
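The proportions shown in Figure 5.5 follow directly from the test-cross counts. The short Python sketch below recomputes them; the function name `recombination_frequency` is hypothetical, and the counts are the four progeny classes reported in the figure (two parental classes and two recombinant classes).

```python
def recombination_frequency(parental_counts, recombinant_counts):
    """r = number of recombinants / total number of progeny."""
    total = sum(parental_counts) + sum(recombinant_counts)
    return sum(recombinant_counts) / total


# Test-cross progeny counts from Figure 5.5.
parental_counts = (1339, 1195)
recombinant_counts = (151, 154)

r = recombination_frequency(parental_counts, recombinant_counts)
total_progeny = sum(parental_counts) + sum(recombinant_counts)

print(f"total progeny = {total_progeny}")
print(f"parental fraction = {1 - r:.3f}")
print(f"recombination frequency r = {r:.3f}")
```

Because a test cross reveals the allelic content of each female gamete, counting phenotype classes is equivalent to counting parental and recombinant chromosomes.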
female allele was recessive, the phenotype was recessive. Test-cross progeny phenotypes corresponded directly to the alleles contributed by F1 females, thus making it possible to unambiguously identify the allelic content of chromosomes in female gametes.

Under the assumption of independent assortment, dihybrid females should produce four equally frequent gametes, and test-cross progeny are expected to have four phenotypes distributed in a 1:1:1:1 ratio (see Figure 2.12). With genetic linkage, however, parental combinations of alleles would occur preferentially in gametes, producing test-cross progeny with a significant excess of parental phenotypes and a significant deficit of nonparental phenotypes.

Morgan’s test-cross progeny displayed the four expected phenotypes, but in numbers that deviated dramatically from expected Mendelian proportions. Among test-cross progeny, 89.3% were parental, and just 10.7% were recombinant. The nonrecombinant progeny classes were found in approximately a 1:1 ratio (1339:1195), as were the recombinant classes (154:151); thus, the two parental chromosomes were transmitted equally frequently, as were the two recombinant chromosomes. Figure 5.5b shows that among the 89.3% of parental female gametes, one-half, or 44.65%, should be of each parental type. Similarly, among the 10.7% of gametes that are recombinant, each recombinant type should have a frequency of 5.35%.

In the years immediately following Morgan’s explanation of genetic linkage, other biologists, working on plant species and animal species, used test-cross analysis to verify Morgan’s hypothesis. The collective results of these experimental observations can be summarized as follows:

1. Genetic linkage is a physical relationship between genes that are located near one another on a chromosome.
2. Recombination between linked genes on homologous chromosomes occurs in significantly less than 50% of meiotic divisions. Significantly more than 50% of gametes contain parental combinations of alleles.
3. The recombination frequency varies among linked genes and is roughly proportionate to the distance between genes on a chromosome.

Genetic Analysis 5.1 takes you through the identification of parental and recombinant progeny and the determination of recombination frequency.

Cytological Evidence of Recombination

Morgan’s hypothesis that gene recombination required physical exchange between homologous chromosomes was a functional working hypothesis, but direct evidence of exchange was not obtained until 20 years after Morgan proposed it. In 1931, research published by Harriet Creighton and Barbara McClintock on crossing over in corn (Zea mays), and a nearly simultaneous report by Curt Stern on crossing over in Drosophila, provided direct evidence that gene recombination and physical exchange between homologous chromosomes went hand in hand.

Creighton and McClintock studied recombination between homologous copies of chromosome 9 in corn that were distinguished by having different alleles for two linked genes, those controlling kernel color (c1) and starch type (wx) in Zea mays, and by two cytological, or structural, differences in the homologous copies of chromosome 9 that were observable under the microscope. One copy of chromosome 9 had the normal microscopic appearance and carried alleles c1 and Wx. The homologous copy of chromosome 9 carried alleles C1 and wx and was cytologically altered in two ways. On the end nearer C1, the chromosome had a darkly staining region called a “knob”; on the other end, near wx, the chromosome carried a fragment of chromosome 8 that had been transferred by a chromosome-rearrangement event called translocation (we explore this event in Section 10.5). Creighton and McClintock obtained cytological evidence that recombination involved the physical exchange between homologous chromosomes by detecting genetic recombinants (chromosomes carrying the alleles C1 and Wx or carrying the alleles c1 and wx) that were also cytologically rearranged chromosomes (Figure 5.6).

[Figure 5.6 panels. (a) c1 Wx/C1 wx heterozygote: the normal chromosome 9 carries c1 and Wx; the translocation chromosome 9 carries C1 and wx, with a knob and a chromosome 8 segment as cytological markers. (b) Homologous recombination: parental gametes are c1 Wx and C1 wx; recombinant gametes are c1 wx and C1 Wx.]

Figure 5.6 Cytological proof from Zea mays that recombination results from crossing over. Progeny displaying recombinant phenotypes are also seen to carry physically rearranged chromosomes.

Q In two or three sentences explain why the results of this experiment confirm that the genetic observation of recombination is the result of physical exchange between homologous chromosomes.
GENETIC ANALYSIS 5.1

PROBLEM In tomato plants (Lycopersicon esculentum), red fruit color (T–) is dominant to tangerine color (tt), and smooth leaf (H–) is dominant to hairy leaf (hh). Both genes are located on chromosome 7, and they have a recombination frequency of 20%. A pure-breeding plant producing tangerine-colored fruit and smooth leaves is crossed to a pure-breeding red-fruited, hairy-leaved plant. The F1 are test-crossed to a pure-breeding tangerine-fruited, hairy plant. What are the expected genotypes, phenotypes, and phenotype proportions among test-cross progeny?

BREAK IT DOWN: Pure-breeding tangerine, smooth is ttHH and pure-breeding red, hairy is TThh (pp. 151 and 152).
BREAK IT DOWN: A recombination frequency of 20% means that 80% of gametes are parental and 20% are recombinant (p. 148).
BREAK IT DOWN: The F1 are TtHh, and they are test-crossed to tthh (pp. 151 and 152).

Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   This problem concerns the prediction of inheritance in progeny of a test cross for linked genes. The answer requires that the expected frequency of each possible category of test-cross progeny be predicted from the information given about recombination frequency between the genes.
2. Identify the critical information given in the problem.
   Dominant and recessive phenotypes, the phenotypes of two pure-breeding parental plants, and the recombination frequency between genes controlling two traits are given in the problem.

Deduce
3. Identify the alleles in the gametes of the parental plants.
   Each parent is pure-breeding for a dominant and a recessive trait:
   Tangerine, smooth = ttHH
   Red, hairy = TThh
   Parental gametes = all tH from one parent and all Th from the other
4. Identify the genotype and phenotype of F1 plants, and determine the parental arrangements of alleles.
   F1 are dihybrid (tH/Th) and have the two dominant phenotypes (red and smooth). The pure-breeding parents have contributed chromosomes carrying tH and Th.

Solve
5. Determine the number and frequency of F1 gametes, given the recombination frequency of 20%.
   Four genetically different gametes are possible: tH, Th, TH, and th. Among these gametes, 20% will be recombinants and 80% parentals (100% - 20% = 80%). Chance predicts that the two parental gametes (tH and Th) are produced at equal frequency. Likewise, the two recombinant gametes (TH and th) are produced at equal frequency. The expected gamete frequencies are
   Parentals:    tH = (0.80)(1/2) = 0.40
                 Th = (0.80)(1/2) = 0.40
   Recombinants: TH = (0.20)(1/2) = 0.10
                 th = (0.20)(1/2) = 0.10
   TIP: With genetic linkage, parental combinations of alleles are significantly greater than 50% of the gametes.
6. Determine the expected outcome of the test cross.
   Test-cross progeny are expected to be 40% each tangerine, smooth and red, hairy; and 10% each red, smooth and tangerine, hairy.
   TIP: There are two equally likely parental gametes and two equally likely recombinant gametes.

   Test-cross parent gamete: th (1.0)

   F1 gamete    Progeny genotype   Frequency   Phenotype           Class
   0.40 tH      tH/th              0.40        Tangerine, smooth   Parental (40%)
   0.40 Th      Th/th              0.40        Red, hairy          Parental (40%)
   0.10 TH      TH/th              0.10        Red, smooth         Recombinant (10%)
   0.10 th      th/th              0.10        Tangerine, hairy    Recombinant (10%)

For more practice, see Problems 5, 6, and 12.
Just a few weeks after Creighton and McClintock could be quantified. If this hypothesis was correct, then
reported their evidence of a link between chromosome recombination frequencies could be used to produce a
rearrangement and genetic recombination, Stern reported genetic linkage map depicting gene order along a chro-
similar findings in Drosophila. The combined genetic and mosome and to infer the linear distances between genes.
chromosomal recombination analyses in corn and fruit fly As Morgan discussed his ideas about recombination fre-
provided convincing evidence that genetic recombination quency and gene distances, Alfred Sturtevant, then an
between homologous chromosomes is accompanied by undergraduate student working in Morgan’s laboratory,
physical exchange between the chromosomes in plants and had an epiphany. In a 1965 book, Sturtevant recalled the
in animals. moment:
In the latter part of 1911, in a conversation with Mor-
5.2 Genetic Linkage Mapping gan, I suddenly realized that the variations in strength
of linkage, already attributed by Morgan to differences
Is Based on Recombination in the spatial separation of genes, offered the possibil-
Frequency between Genes ity of determining sequences in the linear dimension
of a chromosome. I went home and spent most of the
An important outcome of Morgan’s studies of linked genes night (to the neglect of my other undergraduate home-
in Drosophila was his recognition that significantly more work) in producing the first chromosome map.
parental than recombinant progeny occurred and that the Sturtevant used the results of numerous two-point test-
proportion of recombinants varied considerably from one cross experiments on five X-linked genes in Drosophila
pair of linked genes to another. Morgan summarized this to create the first genetic linkage map. He based his map-
idea in 1911, stating, “The proportions that result are not so building approach on the idea that smaller recombination
much the expression of a numerical system as of the relative frequencies indicated genes residing closer to each other
location of the factors (genes) in the chromosome.” Morgan on the chromosome, and larger recombination frequencies
was saying that independent assortment was not determin- indicated greater distances between genes on the chromo-
ing the relative proportions of all gametes produced by an some. To construct his genetic map, Sturtevant used the data
organism. Instead, the close proximity of linked genes on in Table 5.2. His finished recombination map is illustrated in
a chromosome overrode the expected influence of indepen- Figure 5.7. In the century since Sturtevant first compiled his
dent assortment on the alleles of those genes. The linkage map, millions of progeny fruit flies have been analyzed for
of genes preferentially retained parental combinations of X-chromosome recombination. The accumulated data have
alleles and led to a much higher proportion of parental gam- led to slight modifications in Sturtevant’s estimated recom-
etes and a much lower proportion of nonparental gametes bination frequencies but have not necessitated any changes
than were expected by chance. Morgan’s intuition was cor- in gene order. Sturtevant assembled his map using logic of
rect, and his insight profoundly changed views of hereditary the kind demonstrated in the following four steps:
transmission and of the location and organization of genes
on chromosomes. In this section, we examine methods for 1. Of the genes tested, the pair with the smallest recom-
constructing genetic maps from recombination data for two bination frequency, and therefore in closest proximity,
linked genes, and in the next section, we’ll move on to con- are the gene producing white eye (w) and the gene
sider the mapping of three linked genes. carrying yellow (y) body. With their recombination
frequency of just 1%, they must be at almost the same
spot on the chromosome.
The First Genetic Linkage Map
In the context of early 20th-century biology, Morgan’s
idea that genes were on chromosomes was not novel. For
Table 5.2 Sturtevant’s Recombination Data for
example, Sutton, Boveri, and others had noted the parallel
Five X-Linked Genes in Drosophila
between hereditary transmission and chromosome division.
But biologists at the time did not know either the structure Recombination
of genes or how they were encoded on chromosomes (see Gene Pairs Frequency (r)
Section 3.3). Morgan was the first to demonstrate that genes Yellow (y) and white (w) 214/21,736 = 0.010
are on chromosomes, and his proposal that the recombina-
Yellow (y) and vermilion (v) 1464/4551 = 0.322
tion frequency for a linked pair of genes might correspond
to the distance between those genes on a chromosome was Vermilion (v) and white (w) 471/1584 = 0.297
a novel idea. Vermilion (v) and miniature (m) 17/573 = 0.030
Morgan viewed genes as inhabiting fixed locations on Miniature (m) and white (w) 2062/6116 = 0.337
chromosomes. Like cities along a road, the order of genes White (w) and rudimentary (r) 406/898 = 0.452
could be determined, the locations of genes on a chromo-
Rudimentary (r) and vermilion (v) 109/405 = 0.269
some could be specified, and the distances between genes
156 CHAPTER 5 Genetic Linkage and Mapping in Eukaryotes
Figure 5.7 The first linkage map. The original Drosophila X chromosome map of five genes assembled by Alfred Sturtevant (top) and the contemporary X-chromosome map for Drosophila based on current data (bottom). Sturtevant’s map is based in part on the recombination frequencies given in Table 5.2.
Sturtevant’s map, in map units (m.u.): y 0.0, w 1.0, v 30.7, m 33.7, r 57.6, centromere
Contemporary map, in m.u.: y 0.0, w 1.5, v 33.0, m 36.1, r 54.5, centromere
Q Consider the relative distances between and recombination frequencies for (1) the white and vermilion genes and (2) the yellow and white genes. Make a summary statement about the relationship between the physical distance between two genes and the recombination frequency between them.
2. Vermilion (v) is more distant from yellow (32.2% recombination) than it is from white (29.7% recombination), suggesting the order y–w–v.
3. Miniature (m) is close to vermilion (3% recombination) but is more distant from white (33.7% recombination) than is vermilion. Adding miniature to the gene map produces the order y–w–v–m.
4. Rudimentary (r) is very distant from white (45.2% recombination) and also fairly distant from vermilion (26.9% recombination). This information places rudimentary on the opposite side of the map from white, yielding the final map y–w–v–m–r.

Map Units
As we examine our map of the Drosophila X chromosome, the correlation between recombination frequency and physical distance on chromosomes becomes easier to understand. The recombination frequencies between genes on a chromosome can even be converted into units of physical distance, using the concept of a map unit (m.u.). A map unit is also known as a centiMorgan (cM) in honor of Thomas Hunt Morgan’s contribution to recombination mapping. It is common (at least in introductory genetics courses) to use the equivalency

1% recombination = 1 m.u. or 1 cM of distance between linked genes

This is an approximation, and not a very good one for certain regions of particular genomes, as we discuss in a later section. Despite its shortcomings, however, it is accurate enough for our instructional purposes in this textbook.

Chi-Square Analysis of Genetic Linkage Data
In our discussion of genetic linkage data, we have noted that when genes are linked, significantly more parental phenotypes than recombinant phenotypes are found among progeny. But how can we tell whether the observed data constitute evidence of genetic linkage rather than a simple case of chance variation from expected values? The question is settled by the use of chi-square analysis of observed and expected values to identify statistically significant differences. (Section 2.5 describes the chi-square test and demonstrates the calculation and interpretation of chi-square p, or probability, values.)
As an example, let’s revisit the data obtained by Morgan on the w gene affecting eye color and the m gene controlling wing form in Drosophila, presented in Figure 5.3. The cross of F1 dihybrid females (wm/w⁺m⁺) to white-eyed, miniature-winged males (wm/Y) produces an F2 generation that would have been expected to display a 1:1:1:1 phenotypic ratio. This ratio is based on the assumption that independent assortment determines the alleles contained in female gametes.
The question to answer by chi-square testing is whether the results are consistent with independent assortment or not. In other words, this is a test of a hypothesis of no genetic linkage between the genes. Using the observed and expected values, we calculate the chi-square value as follows:

\[
\chi^2 = \frac{(791 - 610.25)^2}{610.25} + \frac{(750 - 610.25)^2}{610.25} + \frac{(445 - 610.25)^2}{610.25} + \frac{(455 - 610.25)^2}{610.25} = 169.79
\]

For this analysis there are 3 degrees of freedom (df = 3), and the corresponding p value is p < 0.005 (see Table 2.4). This observed result indicates a significant deviation from expected results, suggesting that chance is not responsible for the observed distribution. Combined with the observation that the two phenotypes that exceed the expected number are parental, these data are consistent with the presence of genetic linkage between the genes.

5.3 Three-Point Test-Cross Analysis Maps Genes
Two-point test-cross analysis is an effective way to calculate the recombination frequency between two linked genes and to infer the distance between the genes, but it is not the most effective way to build genetic maps containing multiple genes. By expanding the idea of test-cross analysis to three-point test-cross analysis, however, geneticists can efficiently map three linked genes simultaneously.
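The chi-square test of independent assortment applied to Morgan’s w and m data in the preceding subsection can be checked numerically. The following is a minimal sketch, assuming only the four observed progeny counts quoted in that calculation; the function name `chi_square` and the constant `CRITICAL_DF3_P005` are illustrative, and the critical value 12.838 (df = 3, p = 0.005) is taken from a standard chi-square table rather than from this chapter.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed progeny counts from the w-m test cross:
# two parental classes (791, 750) and two recombinant classes (445, 455).
observed = [791, 750, 445, 455]
total = sum(observed)

# Independent assortment predicts a 1:1:1:1 ratio, so each class expects total/4.
expected = [total / len(observed)] * len(observed)

chi2 = chi_square(observed, expected)
df = len(observed) - 1

# Standard-table critical value for df = 3 at p = 0.005 (assumed, not from this text).
CRITICAL_DF3_P005 = 12.838

print(f"expected per class = {expected[0]:.2f}")
print(f"chi-square = {chi2:.2f} with df = {df}")
print("reject independent assortment:", chi2 > CRITICAL_DF3_P005)
```

Computed without rounding the individual terms, the statistic comes out near 169.78; the 169.79 in the text appears to reflect rounding each term to two decimals before summing.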
Identifying Parental, Single-Crossover, and Double-Crossover Gametes in Three-Point Mapping
Let’s consider a three-point test cross between a trihybrid organism (a⁺a b⁺b c⁺c) and an organism that is homozygous recessive for the three traits (aabbcc). The configuration of alleles in the trihybrid (i.e., which of the alleles are on the same homolog) does not have to be known at the start, since the three-point analysis will deduce the configuration of alleles on parental chromosomes as part of the process.
Incomplete genetic linkage of three genes in a trihybrid produces eight genetically different gamete genotypes. This is the same number of genetically different gametes expected if we assume independent assortment; but, unlike the expectations for independent assortment, the gamete frequencies are unequal if the genes are linked. Among the eight gamete genotypes are two parental genotypes that are significantly more frequent than expected by chance as well as six recombinant genotypes, each detected less often than expected. Assuming, for the purposes of this example, that the three linked genes are in the order a–b–c, we can identify parental and recombinant gametes by the relative frequencies of the corresponding test-cross progeny classes.
Suppose a trihybrid organism, designated trihybrid 1, has the genotype a⁺b⁺c⁺/abc with alleles arranged so that the three dominant alleles are on one chromosome and the three recessive alleles are on the homologous chromosome (Figure 5.8a). A total of eight genetically different chromosomes are expected: two parental, four from single crossovers, and two from double crossover. During meiosis, trihybrid 1 generates parental chromosomes (a⁺b⁺c⁺ and abc) when no crossovers occur between the genes. A single crossover occurring between genes a and b produces two recombinant chromosomes, a⁺bc and ab⁺c⁺, and likewise, a single crossover occurring between genes b and c also produces two different recombinant chromosomes, a⁺b⁺c and abc⁺. A double-crossover event that causes crossing over both between a and b and between b and c will produce a pair of double-crossover chromosomes, a⁺bc⁺ and ab⁺c.
Trihybrid 2, shown in Figure 5.8b, has a different arrangement of the dominant and recessive alleles on homologous chromosomes. Trihybrid 2 is a⁺bc⁺/ab⁺c. Trihybrid 2 produces the same eight chromosome genotypes as trihybrid 1, but since the alleles start out with different configurations on the parental chromosomes, the assignment of chromosomes to parental and recombinant categories differs from those assigned for trihybrid 1. For trihybrid 2, the parental chromosomes are a⁺bc⁺ and ab⁺c. The single-crossover chromosomes are a⁺b⁺c and abc⁺ for crossover between genes a and b. Single crossover between genes b and c produces chromosomes a⁺bc and ab⁺c⁺. A double crossover causing recombination between each pair of genes produces double-crossover chromosomes a⁺b⁺c⁺ and abc.
Figure 5.8 Gametes from trihybrid organisms with different allele configurations. (a) Trihybrid 1 is a⁺b⁺c⁺/abc. Meiosis in trihybrid 1 produces two chromosomes with parental combinations of alleles, a total of four single-crossover chromosomes in two pairs, and two double-crossover chromosomes. (b) Trihybrid 2 is a⁺bc⁺/ab⁺c. Meiosis in this organism also produces parental, single-crossover, and double-crossover chromosomes, but the alleles on these chromosomes differ from those of trihybrid 1 due to different allele configurations of the parental chromosomes.

Gamete class                                              (a) Trihybrid 1 gametes     (b) Trihybrid 2 gametes
Parental (no recombination)                               a b c; a⁺ b⁺ c⁺             a b⁺ c; a⁺ b c⁺
Single crossover (recombination between a and b)          a⁺ b c; a b⁺ c⁺             a⁺ b⁺ c; a b c⁺
Single crossover (recombination between b and c)          a⁺ b⁺ c; a b c⁺             a⁺ b c; a b⁺ c⁺
Double crossover (recombination between both pairs)       a b⁺ c; a⁺ b c⁺             a b c; a⁺ b⁺ c⁺
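The classification of gametes summarized in Figure 5.8 can be sketched in a few lines of code. This is a minimal illustration, assuming the gene order a–b–c used in the example; the function name `gamete_classes` and the dictionary keys are hypothetical labels chosen here, not terms from the text.

```python
def gamete_classes(homolog_1, homolog_2):
    """Classify the eight gametes of a trihybrid.

    Each homolog is a 3-tuple of alleles listed in map order.
    A crossover in an interval swaps everything to its right.
    """
    a1, b1, c1 = homolog_1
    a2, b2, c2 = homolog_2
    return {
        "parental": {(a1, b1, c1), (a2, b2, c2)},
        "single crossover, first interval": {(a1, b2, c2), (a2, b1, c1)},
        "single crossover, second interval": {(a1, b1, c2), (a2, b2, c1)},
        # Two crossovers restore the outer alleles and switch only the middle one.
        "double crossover": {(a1, b2, c1), (a2, b1, c2)},
    }

# Trihybrid 1 is a+b+c+/abc; trihybrid 2 is a+bc+/ab+c.
trihybrid_1 = gamete_classes(("a+", "b+", "c+"), ("a", "b", "c"))
trihybrid_2 = gamete_classes(("a+", "b", "c+"), ("a", "b+", "c"))

for name, classes in (("Trihybrid 1", trihybrid_1), ("Trihybrid 2", trihybrid_2)):
    print(name)
    for label, gametes in classes.items():
        print("  ", label, sorted(" ".join(g) for g in gametes))
```

Both trihybrids yield the same eight chromosomes, but the same chromosome can fall into a different class: a⁺b⁺c⁺ is parental for trihybrid 1 and a double crossover for trihybrid 2.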
In evaluating genetic linkage and crossing over, the two most important guidelines to follow are (1) find out what alleles are carried on the parental chromosomes (this information may be given or it may have to be deduced) and (2) expect that each of the six recombinant gametes will be observed at a frequency that is significantly less than predicted by chance. Single-crossover gametes form at frequencies determined by the relative distances between gene pairs. Within each single-crossover class, the two gametes will be equally frequent. Double-crossover gametes will be the least frequent class because both crossover events must occur. As within each single-crossover class, the two kinds of double-crossover gametes are produced at equal frequency.

Constructing a Three-Point Recombination Map
To illustrate the evaluation of genetic linkage and recombination for the purpose of mapping gene order and relative distance, we turn to the use of three-point test-cross analysis. The data presented are based on test crosses between an organism that is trihybrid (the allele arrangements may be known or may have to be deduced) and an organism that is homozygous for the three recessive alleles. This cross design ensures that the phenotypes of test-cross progeny will directly reflect the alleles contributed during mating by the trihybrid parent. The triple-recessive parent can contribute only recessive alleles, so if progeny exhibit a dominant trait, the trihybrid parent has contributed the dominant allele, and if progeny exhibit a recessive trait, the trihybrid parent has contributed a recessive allele. The data we describe are from a 1935 study by Rollins Emerson of genetic linkage in maize (Zea mays). Emerson tested three genes: the gene producing the phenotypes green seedling (V-) and yellow seedling (vv), the gene producing rough leaf (Gl-) and glossy leaf (gl gl), and the gene for normal fertility (Va-) and variable fertility (va va).
Emerson crossed pure-breeding wild-type plants having the dominant phenotypes green seedling, rough leaves, and normal fertility (which, not knowing the gene order, we will provisionally identify as V Gl Va/V Gl Va) to pure-breeding plants having the recessive phenotypes yellow seedling, glossy leaves, and variable fertility (v gl va/v gl va). The cross produced F1 trihybrid plants with the dominant phenotypes and the genotype V Gl Va/v gl va that carries three dominant alleles on one chromosome and three recessive alleles on the homolog. The F1 were then test-crossed to pure-breeding yellow, glossy, variable plants (v gl va/v gl va). The test-cross progeny are shown in Table 5.3. To create a genetic map that places the three genes in correct relative order and to calculate recombination frequencies between gene pairs, we ask and answer five questions about these data:
1. Are the data consistent with the proposal of genetic linkage?
2. What alleles are on each parental chromosome?
3. What is the gene order on the chromosome?
4. What are the recombination frequencies of the gene pairs?
5. Is the frequency of double crossovers consistent with the independent occurrence of single crossovers?
Parental cross: V Gl Va/V Gl Va (green, rough, normal) × v gl va/v gl va (yellow, glossy, variable)
Test cross: V Gl Va/v gl va (green, rough, normal) × v gl va/v gl va (yellow, glossy, variable)
Test-cross progeny:
Phenotype                       Number Observed   Number Expected   Genotype (gamete/gamete)
1. Yellow, rough, normal                 60             90.75       v Gl Va/v gl va
2. Yellow, glossy, normal                48             90.75       v gl Va/v gl va
3. Yellow, rough, variable                4             90.75       v Gl va/v gl va
4. Yellow, glossy, variable             270             90.75       v gl va/v gl va
5. Green, rough, normal                 235             90.75       V Gl Va/v gl va
6. Green, glossy, normal                  7             90.75       V gl Va/v gl va
7. Green, rough, variable                40             90.75       V Gl va/v gl va
8. Green, glossy, variable               62             90.75       V gl va/v gl va
Total                                   726            726
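The five questions listed above can also be worked through programmatically from the test-cross counts in the table. The sketch below is one possible way to do so, not the procedure used in the text: it assumes the two most frequent classes are parental and the two least frequent are double crossovers, and all variable and function names (`counts`, `recombination_frequency`, and so on) are hypothetical.

```python
# Observed test-cross progeny, keyed by the gamete from the trihybrid parent.
# Tuple positions follow the column order (v, gl, va), which is not assumed
# to be the map order.
GENES = ("v", "gl", "va")
counts = {
    ("v", "Gl", "Va"): 60,   # Class 1
    ("v", "gl", "Va"): 48,   # Class 2
    ("v", "Gl", "va"): 4,    # Class 3
    ("v", "gl", "va"): 270,  # Class 4
    ("V", "Gl", "Va"): 235,  # Class 5
    ("V", "gl", "Va"): 7,    # Class 6
    ("V", "Gl", "va"): 40,   # Class 7
    ("V", "gl", "va"): 62,   # Class 8
}
total = sum(counts.values())

# Question 1: chi-square against eight equally frequent classes.
expected = total / len(counts)
chi2 = sum((n - expected) ** 2 / expected for n in counts.values())

# Question 2: the two most frequent classes carry the parental chromosomes.
ranked = sorted(counts, key=counts.get, reverse=True)
parentals = ranked[:2]

# Question 3: the two rarest classes are double crossovers. Compared with a
# parental chromosome, the one allele that differs belongs to the middle gene.
double_crossovers = ranked[-2:]
parent = parentals[0]
dco = next(g for g in double_crossovers
           if sum(x != y for x, y in zip(g, parent)) == 1)
middle = [x != y for x, y in zip(dco, parent)].index(True)
outer = [i for i in range(3) if i != middle]

# Question 4: a gamete is recombinant for a gene pair when its two alleles
# do not occur together on either parental chromosome.
def recombination_frequency(i, j):
    parental_pairs = {(p[i], p[j]) for p in parentals}
    recombinants = sum(n for g, n in counts.items()
                       if (g[i], g[j]) not in parental_pairs)
    return recombinants / total

r_left = recombination_frequency(outer[0], middle)
r_right = recombination_frequency(middle, outer[1])
# Summing the intervals counts each double crossover twice, as the text does
# for the flanking genes.
r_flanking = r_left + r_right

# Question 5: coefficient of coincidence (c) and interference (I = 1 - c).
observed_dco = sum(counts[g] for g in double_crossovers) / total
coincidence = observed_dco / (r_left * r_right)
interference = 1 - coincidence

print(f"chi-square = {chi2:.1f}")
print("parental gametes:", parentals)
print("gene order:", "-".join(GENES[i] for i in (outer[0], middle, outer[1])))
print(f"r (left interval) = {r_left:.3f}, r (right interval) = {r_right:.3f}")
print(f"r (flanking genes) = {r_flanking:.3f}")
print(f"c = {coincidence:.2f}, I = {interference:.2f}")
```

Scoring the flanking genes directly with `recombination_frequency(outer[0], outer[1])` would give a smaller value than `r_flanking`, because double-crossover gametes carry the parental combination of outer alleles and go uncounted.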
Question 1: Are the Data Consistent with the Proposal of Genetic Linkage? Under the assumptions of independent assortment, trihybrid plants produce eight genetically different gametes at a frequency of 0.125, or 1/8, each, and test-cross progeny are expected in eight equally frequent phenotypic classes. In this experiment, with 726 test-cross progeny, the expected number of progeny in each class would be (726)(0.125) = 90.75. Chi-square analysis comparing observed and expected numbers of progeny in each class (Table 5.3) yields a chi-square value in excess of 800. There are (8 - 1) = 7 degrees of freedom, and the corresponding p value is p < 0.005. From this result, we conclude that the observed distribution of test-cross progeny deviates significantly from expectation, and we reject the independent assortment hypothesis as the explanation of these data.
If the deviation in this experiment is due to genetic linkage, then we would expect the numbers of progeny having parental phenotypes to be excessively high. Comparing the observed and expected values in each test-cross class shows that only two phenotype classes exceed expected numbers: the green, rough, normal class and the yellow, glossy, variable class. These are the two parental phenotypes. From this analysis, we conclude that the data are consistent with genetic linkage: the distribution of test-cross progeny deviates significantly from what would be expected from independent assortment, and only parental phenotypes are seen more often than expected by chance.

Question 2: What Alleles Are on Each Parental Chromosome? We can answer this question in two ways. The simpler approach is to use the phenotype information available about pure-breeding parental plants in the cross. The parent plants were pure-breeding dominant and pure-breeding recessive. From this information, we know that trihybrid F1 plants have the dominant alleles on one chromosome and the recessive alleles on the homologous chromosome. The genetic structure of the test cross is V Gl Va/v gl va × v gl va/v gl va, and so the alleles on parental chromosomes must be V Gl Va and v gl va. Test-cross progeny Classes 4 and 5 in Table 5.3 are parentals.
The second approach is necessary when we do not know the phenotypes of parents or when the alleles on each chromosome are not known. In this approach, test-cross data are used to determine parental chromosomes. The data in Table 5.3 indicate that the test-cross progeny in Class 5 (green, rough, normal; V Gl Va/v gl va) and in Class 4 (yellow, glossy, variable; v gl va/v gl va) exceed expected frequency and are therefore the parental classes. Both approaches tell us the same story: The parental chromosomes carry alleles V Gl Va and v gl va.

Question 3: What Is the Gene Order on the Chromosome? With parental chromosomes identified, the six remaining classes must be recombinants: four are single-crossover classes, and two are double crossovers. Double-crossover progeny will be the least frequent of all classes, because both crossover events must occur simultaneously to produce double recombinants, or double crossovers. From progeny numbers, we may presume that the smallest classes, Class 3 (yellow, rough, variable) and Class 6 (green, glossy, normal), are the probable double recombinants. We can use these predictions to test possible gene orders on parental chromosomes.
For these three genes there are only three possible gene orders: (1) va–v–gl, (2) v–va–gl, or (3) va–gl–v. There are no data to assist us in determining the left-to-right orientation of the chromosome, so the difference between these gene orders is defined entirely by which gene is in the middle (v, va, or gl) and which two genes flank the middle gene. Each gene order could be written in the opposite direction, since each is a relative order of the three genes. For example, va–v–gl and gl–v–va are equivalent gene orders because each has v as the middle gene.
There are two ways to determine the gene order. One procedure is to list each gene order possible for the parental chromosomes, draw the corresponding double-crossover chromosomes, and then determine whether the double-crossover gametes produced by this activity match the predicted double-crossover progeny. If a match is not seen, the gene order is incorrect, but if a match is found, the correct gene order has been identified.

1. Possible gene order va–v–gl
Parental chromosomes     Predicted double-crossover gametes
Va V Gl                  Va v Gl
va v gl                  va V gl
Result: Double-crossover gametes obtained from this gene order are not those predicted from the data (i.e., do not match Class 3 and Class 6 phenotypes).
Conclusion: The proposed gene order is incorrect; v is not the middle gene.

2. Possible gene order v–va–gl
Parental chromosomes     Predicted double-crossover gametes
V Va Gl                  V va Gl
v va gl                  v Va gl
Result: Double-crossover gametes obtained from this gene order are not those predicted from the data (i.e., do not match Class 3 and Class 6 phenotypes).
Conclusion: The proposed gene order is incorrect; va is not the middle gene.
3. Possible gene order v–gl–va
Parental chromosomes     Predicted double-crossover gametes
V Gl Va                  V gl Va
v gl va                  v Gl va
Result: Double-crossover gametes obtained from this gene order match those predicted from the data (i.e., do match Class 3 and Class 6 phenotypes).
Conclusion: This proposed gene order is correct: gl is the middle gene, and the gene order may be written as either v–gl–va or va–gl–v. This analysis confirms that test-cross progeny Classes 3 and 6 are double-crossover progeny.

The second method for determining gene order is a shortcut approach that requires some familiarity with recombination. Looking back at Figure 5.8, note that if we compare parental and double-crossover chromosomes, the alleles of the outside genes appear to remain the same while the middle allele appears to switch. In other words, when we compare one parental chromosome with one double-recombinant chromosome, two alleles match and one does not. The odd one out is the allele in the middle. If a trihybrid parent has alleles arranged as a⁺b⁺c⁺/abc, then double crossover produces gametes that are a⁺bc⁺ and ab⁺c. Parental alleles a⁺ and c⁺ match one double recombinant, and alleles b and b⁺ are switched. Similarly, the second parental gamete has alleles a and c that match the other double recombinant. Alleles of the middle gene, b and b⁺, have switched in the double recombinant compared with the parental chromosome.
Remember, we have already identified the parental and double-crossover phenotypic groups by their numbers. We now look at the double crossovers to see which two alleles match parental phenotypes and to see which allele changes and is therefore the middle gene. In our data set, double-recombinant chromosomes are V gl Va and v Gl va. In this case, alleles of the gl gene have switched, indicating that gl is the middle gene. Based on this approach, the gene order and alleles on parental chromosomes are V Gl Va and v gl va.

Question 4: What Are the Recombination Frequencies of the Gene Pairs? We calculate the recombination frequency for a pair of linked genes by counting the total number of crossovers that occur between them. Every crossover event between the two genes is counted, whether the event occurs by itself (a single crossover) or simultaneously with another event (a double crossover). In this case, there are 11 double recombinants, each with one crossover between v and gl and one crossover between gl and va, for a total of 22 crossover events. These 22 crossovers must be counted in the determination of recombination frequency, so 11 of these crossovers will be added to the number of single crossovers between v and gl, and 11 are also added to the number of single crossovers between gl and va.
Let’s continue with our presumption that the gene order is v–gl–va. Between v and gl, a single crossover produces the following:
Parental chromosomes     Predicted single-crossover gametes
V Gl Va                  V gl va
v gl va                  v Gl Va
Test-cross progeny carrying recombination between these two genes have the phenotypes yellow, rough, normal (Class 1) and green, glossy, variable (Class 8). The recombination frequency is calculated as the sum of all single crossovers for this gene pair plus the 11 crossovers seen in double recombinants divided by the total number of progeny: (60 + 62 + 4 + 7)/726 = 0.183, or 18.3%. Therefore, the distance between v and gl is approximately 18.3 cM.
Single crossover between gl and va produces the following:
Parental chromosomes     Predicted single-crossover gametes
V Gl Va                  V Gl va
v gl va                  v gl Va
Test-cross progeny carrying recombination between these two genes have the phenotypes yellow, glossy, normal (Class 2) and green, rough, variable (Class 7). Recombination frequency r = (48 + 40 + 4 + 7)/726 = 0.136, or 13.6%. The intergenic distance between gl and va is approximately 13.6 cM.
Recombination between the flanking genes, va and v, is calculated by counting all crossovers between those genes. Recombination between v and va is r = (60 + 62 + 48 + 40 + 22)/726 = 0.320, or 32%.

Question 5: Is the Frequency of Double Crossovers Consistent with the Independent Occurrence of the Single Crossovers? In most tests of genetic linkage, the number of double crossovers is less than the number expected given the frequencies of the single crossovers. Question 5 allows this common observation to be quantified. The reduction in the observed number of double crossovers relative to the number expected if the two single crossovers happened independently of one another is caused by an effect called interference (I). Interference indicates the influence of some process or processes that limit the number of crossovers that can occur in a short length of chromosome. Interference is quantified by comparing the number or frequency of observed double-crossover events with the number or frequency expected assuming each crossover event occurs independently. In Emerson’s data
set, there are 11 double crossovers among test-cross progeny, or (11/726) = 0.015 (1.5%). If each crossover were independent, the expected double-crossover frequency would be the product of the two single-crossover frequencies, (0.183)(0.136) = 0.025, or 2.5%. The expected number of double-crossover progeny would therefore be (0.025)(726) = 18.2. Observed double recombinants are divided by expected double recombinants to produce a value known as the coefficient of coincidence (c). Either the numbers or the frequencies of observed and expected double recombinants can be used to determine c:

\[
c = \frac{\text{observed double recombinants}}{\text{expected double recombinants}} = \frac{11}{18.2} = 0.60 \quad \text{(using numbers)}
\]
or
\[
c = \frac{0.015}{0.025} = 0.60 \quad \text{(using frequencies)}
\]

Interference is defined as I = 1 - c, so for this data set I = 1 - 0.60 = 0.40. Interference identifies the proportion of double recombinants that are expected but are not produced in the experiment (the difference between expectation and actuality). In this case, the number of double recombinants was 40% lower than expected. Interference is a very common observation in most regions of most genomes. On occasion, however, certain regions of some genomes generate more double recombinants than expected. In these cases I < 0, a situation called negative interference. Interference will be zero (I = 0) when the observed and expected double crossovers are equal. The molecular basis of interference is not fully understood, but current research suggests that the molecular process of crossing over operates to distribute crossover events widely on chromosomes and that there is a mechanical limit that restricts the number of recombination events in close proximity on a chromosome. We discuss the molecular process of homologous recombination in Chapter 10.

Determining Gamete Frequencies from Genetic Maps
The same principle used for constructing genetic linkage maps (the relation between relative distances and recombination frequency) can be used for making predictions in the reverse direction, that is, to determine the expected frequencies of recombinant and nonrecombinant gametes on the basis of completed genetic linkage maps.
In Figure 5.9a, two linked genes have a recombination frequency of 10%. For the dihybrid organism AB/ab, two gametes (AB and ab) are parental, and two (Ab and aB) are recombinant. Recombinant gametes equal 10% of total gametes, and each recombinant is expected to occur with the same frequency. The probability is calculated as (1/2)(0.10) = 0.05 for each recombinant gamete. In this calculation, 1/2 is the probability of each recombinant chromosome appearing in a gamete, and 0.10 is the probability of recombination between the genes. From this information, we can calculate that, conversely, parental gametes AB and
Figure 5.9 Gamete genotype frequencies calculated from genetic linkage data. (a) Gamete frequencies predicted from a map of two linked genes. (b) Gamete frequencies predicted from a map of three linked genes assuming interference is zero (I = 0).

(a) F1 dihybrid AB/ab, r = 0.10 between A and B
Gamete    Frequency               Type
A B       (1/2)(0.90) = 0.45      Parental
a b       (1/2)(0.90) = 0.45      Parental
A b       (1/2)(0.10) = 0.05      Recombinant
a B       (1/2)(0.10) = 0.05      Recombinant
Total                   1.00

(b) F1 trihybrid ABC/abc, r = 0.10 between A and B, r = 0.20 between B and C
Gamete    Frequency                    Type
A B C     (1/2)(0.9)(0.8) = 0.36       Parental
a b c     (1/2)(0.9)(0.8) = 0.36       Parental
A b c     (1/2)(0.1)(0.8) = 0.04       Single recombinant (a-b)
a B C     (1/2)(0.1)(0.8) = 0.04       Single recombinant (a-b)
A B c     (1/2)(0.9)(0.2) = 0.09       Single recombinant (b-c)
a b C     (1/2)(0.9)(0.2) = 0.09       Single recombinant (b-c)
A b C     (1/2)(0.1)(0.2) = 0.01       Double recombinant
a B c     (1/2)(0.1)(0.2) = 0.01       Double recombinant
Total                       1.00
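The frequencies in Figure 5.9 can be reproduced with a short calculation. The following is a minimal sketch, assuming interference is zero (I = 0) so that recombination in each interval is treated as an independent event; the function name `gamete_frequencies` and its parameters are hypothetical and chosen only for this illustration.

```python
from itertools import product

def gamete_frequencies(homolog_1, homolog_2, recombination):
    """Predict gamete frequencies from a linkage map, assuming I = 0.

    homolog_1, homolog_2: alleles along each parental chromosome in map order.
    recombination: recombination frequency for each adjacent gene interval.
    """
    frequencies = {}
    # Each interval either recombines (True) or does not (False).
    for pattern in product([False, True], repeat=len(recombination)):
        probability = 1.0
        for recombined, r in zip(pattern, recombination):
            probability *= r if recombined else 1 - r
        # Each pattern yields two reciprocal gametes with equal frequency.
        for strands in ((homolog_1, homolog_2), (homolog_2, homolog_1)):
            current = 0
            gamete = [strands[current][0]]
            for position, recombined in enumerate(pattern):
                if recombined:
                    current = 1 - current  # switch to the other homolog
                gamete.append(strands[current][position + 1])
            frequencies["".join(gamete)] = probability / 2
    return frequencies

# Two linked genes, r = 0.10 (Figure 5.9a).
two_point = gamete_frequencies("AB", "ab", [0.10])
# Three linked genes, r = 0.10 and r = 0.20 (Figure 5.9b).
three_point = gamete_frequencies("ABC", "abc", [0.10, 0.20])

for gamete, frequency in sorted(three_point.items(), key=lambda item: -item[1]):
    print(gamete, round(frequency, 2))
```

When interference is not zero, the double-recombinant classes would be less frequent than this product rule predicts, and the other classes would shift accordingly.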
ab are formed at a frequency equal to 100% minus 10%, or 90% of total gametes. Both of the parental gametes are also expected at equal frequency, in this case (1/2)(0.90), or 45% each.
Gamete frequencies for three linked genes are predicted in a similar manner. In Figure 5.9b, genes a and b are shown along with a third gene, c, located 20 cM from gene b. To predict gamete frequencies, we make the assumption that interference is I = 0 to simplify the calculation of the number of recombinants. For the trihybrid organism ABC/abc, parental gametes are produced when crossover does not occur in either gene interval. According to the genetic map, the probability of no crossovers between genes a and b is 90% (0.9), and between b and c it is 80% (0.8). Considering both gene pairs, the proportion of nonrecombinant gametes is (0.9)(0.8) = 0.72: there are two equally frequent parental gametes, each with an expected frequency of (1/2)(0.9)(0.8) = 0.36. Recombination frequency is 10%, or 0.1, between a and b. The two single recombinants between genes a and b each have an expected frequency of (0.1)(0.8)(1/2) = 0.04 (the frequency of recombination between a and b times the frequency of no recombination between b and c times 1/2 since there are two such gametes). Similarly, single recombinants between genes b and c have expected frequencies of (0.9)(0.2)(1/2) = 0.09 each. Each of the double-recombinant gametes, AbC and aBc, is expected with a frequency of (0.1)(0.2)(0.5) = 0.01. The sum of frequencies of the eight predicted gamete genotypes is 1.0, indicating that all gametes have been counted.

A single crossover between genes A and B in a dihybrid (AB/ab) produces two parental gametes (AB and ab) and two recombinant gametes (Ab and aB). Double crossover between the same genes, however, produces crossover gametes that are not recombinant for the A and B genes and so are indistinguishable from parentals. These crossover-nonrecombinant gametes are not counted when recombination frequency between genes is calculated, because they are not observed. Larger distances between genes provide greater opportunity for double crossover and thus greater likelihood of crossover-nonrecombinant gametes.
In theory, the relationship between recombination frequency and map distance is linear, but this is not the case in reality. Line 1 in Figure 5.10 depicts a linear relationship between recombination frequency and the distance in map units (cM). In contrast, line 2 illustrates the correspondence between recombination frequency and actual distance along the map. The lines diverge at about 8 cM, indicating that the relationship between recombination frequency and map distance is linear only for linked genes that are separated by less than 8 cM, and that observed recombination frequencies usually underestimate the physical distance between genes.
[Figure 5.10. Vertical axis: recombination frequency (r).]
The central problem in correlating recombination frequency with the number of recombination events is the difficulty of identifying the number of meioses that produce each possible number of crossovers: zero, one, two, three, four crossovers, and so on. In an attempt to correctly model
Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   This problem involves the assessment of three test crosses involving X-linked genes. The answer requires determination of genetic linkage versus independent assortment for each gene pair and, for linked genes, the calculation of recombination frequency.
2. Identify the critical information given in the problem.
   The genotypes and phenotypes of test-cross flies are given, and the number of test-cross progeny in each phenotypic category is also given.
Deduce
3. Determine the test-cross results expected under the assumption of independent assortment.
   In each cross, the dihybrid female would be expected to produce four genetically different gametes at frequencies of 25% each, and the progeny would be expected to display four phenotypes in a 1:1:1:1 ratio (250 each). In Test cross I, for example, the following results would be expected, and expected results would be similar for the other test crosses as well.
For more practice, see Problems 2, 4, and 28.
different recombination classes and to accurately assess the correlation between recombination frequency and crossover, J. B. S. Haldane in 1919 developed a mapping function that correlates map distance and recombination frequency between gene pairs. The Haldane mapping function has limitations, and several researchers proposed modifications of it to account for specific conditions affecting recombination in different species.

One consistent concern raised about Haldane’s mapping function is that it may overestimate the actual recombination frequency when interference occurs. Damodar Kosambi developed a modified mapping function to correct map distance in species with interference, and it has become one of the most widely applied improvements.

Mapping functions are a quantitative solution to the problem of variability of recombination frequencies across the genome and between species. Mapping functions are largely made obsolete by genomic sequence analysis in gene mapping that allows geneticists to use genome sequences to devise physical maps of the genes on chromosomes. Gene mapping is no less important today than it was when Alfred Sturtevant determined the first genetic map more than 100 years ago, but the methods for constructing maps continue to evolve.

5.4 Multiple Factors Cause Recombination to Vary

Despite the biological and evolutionary importance of recombination, its occurrence is variable among organisms. For example, recombination is a vital component of accurate chromosome segregation in mammalian meiosis, but it is not required for meiotic efficiency in other organisms. Most animal species undergo recombination, but in certain species, such as Drosophila, recombination is exclusive to females and does not occur in males. Furthermore, although our discussion of recombination in this chapter is limited to events taking place in meiosis, crossing over between homologous chromosomes also occurs in mitosis in many species, and rates of mitotic crossover are also highly variable.

From an evolutionary perspective, crossing over and recombination contribute to genetic diversity. Experimental evidence supports the idea that homologous recombination is a potent factor in evolution and that recombination is favored by natural selection. A meta-analysis by Sarah Otto and Thomas Lenormand in 2002 examined recombination rates in a large number of artificial selection experiments conducted by other researchers who were studying the evolution of traits that were unrelated to sex or recombination. (A meta-analysis is a study that combines the results of multiple previous studies with similar structure.) Otto and Lenormand determined that in the majority of cases, the rate of recombination had increased significantly as a result of the application of artificial selection to a trait. This result indicates that evolution is enhanced by the occurrence of recombination and that recombination rates increase in response to evolution.

The discussion of mapping functions in Section 5.3 mentioned that age, environment, sex, and other, as yet undetermined, factors may influence recombination frequency and affect the relationship between the genetic recombination map and the physical map of a chromosome. In female fruit flies, advancing age decreases the frequency of crossover between gene pairs, so that more crossovers between a specific pair of genes are seen in younger females than in older ones. Female Drosophila crossover frequency is also affected by temperature: Growing a fruit-fly colony at 22°C produces many crossovers between chromosomes. Recombination frequencies change, however, with increases or decreases in temperature. Restricting dietary levels of calcium and magnesium, important cofactors for enzymes that interact with DNA, also decreases crossover frequency in fruit flies.

Several other biological factors affecting recombination and recombination frequency in organisms are identified in the remainder of this section.

Sex Affects Recombination

The sex of an animal can have a dramatic impact on recombination frequency, which differs for males and females of most animal species. In the general pattern, the heterogametic sex, the sex with two different sex chromosomes (most often males), has a lower rate of recombination than the homogametic sex, the sex with two fully homologous sex chromosomes (most often females). The higher recombination frequency in the homogametic sex is a genome-wide phenomenon and is not limited to the sex chromosomes.

This difference is seen across the taxonomic spectrum, including in humans. Human females experience more crossing over than human males, resulting in a larger recombination map in females. A detailed recombination and genome sequencing analysis of human chromosome 19 exemplifies this phenomenon. Chromosome 19 is composed of about 65 megabases (Mb), or 65 million base pairs, in both male and female genomes (Figure 5.11). However, the length of the chromosome as determined by adding the estimated recombination distances along its entire length is a larger number of map units in females than in males. Also notice that recombination frequencies are greater in regions at the ends of the chromosome in males but are greater in females in central chromosome regions. For the human genome as a whole, the female genetic map contains about 4400 cM, and the male map about 2700 cM. Geneticists studying the human genome usually produce a “sex-averaged” human genetic map that is slightly larger than 3500 cM.

Among different species, the number of nucleotide base pairs per map unit varies. For example, the human genome
consists of a little less than 3 billion base pairs of DNA, and the sex-averaged genome contains about 830,000 bp/cM. In contrast, the Arabidopsis genome contains about 200,000 bp/cM; thus, recombination is about four times as frequent in Arabidopsis as it is in humans.

Figure 5.11 Physical distance versus recombination distance on human male and female chromosome 19. In most sexually reproducing organisms, the heterogametic sex has fewer recombination events and a shorter recombination map than does the homogametic sex. Data adapted from J. L. Weber et al. (1993).

Recombination Is Dominated by Hotspots

Estimates of average numbers of base pairs per centiMorgan, of the average recombination frequency for a species, and of distances in a sex-averaged recombination map such as the one described for humans are just that: averaged estimates. In contrast, genome-based information on organisms has led to the creation of fine-scale genetic maps of species that identify the distribution of recombination across the genome with much greater precision. Detailed assessment of recombination in human, mouse, and yeast genomes reveals a highly variable pattern of recombination within each genome that has led to the identification of recombination hotspots and recombination coldspots, even while reinforcing the general theme of a rough proportionality between recombination frequencies and the physical maps of chromosomes.

Genetic recombination maps are generated by analysis of recombination information and recombination frequency data. Physical maps of chromosomes, on the other hand, are based on genomic sequence data that identify specific genes within DNA sequence. The proportionality between genetic recombination maps and physical maps of a chromosome makes it possible to generate gene maps that locate the position and approximate distance between genes along a chromosome. This proportionality exists because almost all regions of DNA are about equally likely to initiate recombination. Nevertheless, as noted above, many genomes do contain hotspots and coldspots of recombination: segments of chromosomes that undergo substantially more or substantially less recombination than the average for a species.

Figure 5.12 [Graphic: physical length (kb) compared with genetic length (cM) for a yeast chromosome region carrying cdc24, cdc19, mak16, cys3, spo7, and cdc15, with the centromere and a cold spot marked.]
Q Considering the information in this figure, in Figure 5.11, and in the corresponding discussion in this chapter, why is the generalization that 1% recombination equals 1 map unit of distance between genes not an accurate reflection of the reality of crossing over?

Studies in yeast have examined this phenomenon in detail, and one study of yeast chromosomes has identified hotspots and coldspots side by side. In Figure 5.12, the coldspot of recombination between spo7 and cdc15 results in mapping data that appear to place the genes closer to one another than they are in the physical map. In contrast, the hotspot between cdc15 and FLO1 makes them appear to be farther apart on the genetic recombination map than on the
physical map of the chromosome. The other genes in this chromosome region have generally good proportionality between recombination and physical distances.

The reason for the existence of hotspots and coldspots of recombination may have to do with the ability of DNA regions near specific genes to initiate the molecular events associated with the first steps of crossing over. In the case of the coldspot between spo7 and cdc15 in yeast, the chromosome centromere is between the genes, which may be an additional factor contributing to the relatively low recombination between them. We discuss more about the molecular process of recombination in Section 11.6.

Genome Sequence Analysis Reveals Recombination Hotspot Distribution

Variability of recombination across the genome appears to be the rule, as verified by recent studies in Drosophila, mouse, and humans. These studies show that within the genome, recombination occurs primarily at specific hotspots, punctuated by long stretches in which little or no recombination occurs.

A 2013 study in Drosophila by Nadia Singh and colleagues examined more than 6700 crossovers in the X chromosome between the garnet gene controlling eye color and the scalloped gene controlling wing shape. The authors identified a recombination rate of 7.3% (7.3 cM) between these genes, using the kind of recombination mapping analysis described in the preceding discussion. Drosophila genome sequence information indicated that the two genes are separated by approximately 2 million base pairs. To find specifically where within the 2 million base pairs recombination occurs, the authors used 451 known sequence variations lying between the two genes to map the location of each recombination event with great precision. The 2 million base pairs between the genes were divided into blocks of 5000 base pairs, and the number of crossovers in each 5000-bp block was tabulated. The results revealed a 90-fold difference in recombination rates for different blocks. Some 5000-bp regions had low recombination rates equivalent to 0.3 cM per million base pairs, whereas other blocks had rates as high as the equivalent of 27 cM per million base pairs. This result indicates that recombinational hotspots are distributed very unevenly within the Drosophila genome and that most recombination events are limited to relatively short segments of DNA.

Studies in mammalian genomes, particularly those of mouse and human, produce similar results. In the mouse genome, recombination rates are highly uneven, with hotspots of recombination serving as the predominant locations of crossing over. Mouse results have identified thousands of regions containing a 13-bp sequence, a so-called 13-mer, that appears to be located at the sites of up to 40% of the hotspots in the genome. Strong evidence indicates that a mouse protein designated PRDM9 binds to genome regions containing the 13-mer. It has been proposed that in a large proportion of mouse recombination events, PRDM9 plays an important role in determining the site at which recombination will occur.

Recent studies in humans verify the possible involvement of PRDM9 in recombination at hotspots. In addition, human genome–aided analysis of recombination distribution finds that human recombination hotspots are located in short regions of 1000–2000 bp. The data indicate that there may be 30,000 or so such recombination hotspots in the human genome, spaced about every 50,000 to 100,000 bp.

5.5 Human Genes Are Mapped Using Specialized Methods

Until relatively recently, the human genetic map was rather sparse. Humans cannot be studied through controlled matings and in any case produce much smaller numbers of offspring than do organisms like Drosophila and Zea mays. Consequently, gene-mapping methods developed and used successfully to map genes in model organisms are difficult to apply to human gene mapping. Historically, X-linked genes, by virtue of their unique patterns of transmission, were the first and easiest human genes to map, whereas progress in mapping human autosomal genes was hampered by a scarcity of known polymorphic genes, such as those for blood group antigens and blood proteins.

Human genome mapping changed significantly in the mid-1980s, facilitated both by the emergence of molecular genetic methods to identify polymorphic DNA sequences and by advances in gene-mapping software. The various DNA sequence polymorphisms are broadly identified as genetic markers. This term includes several types of inherited DNA sequence polymorphisms that we describe below. Collectively, these genetic markers provide thousands of signposts on every chromosome to assist in gene mapping and linkage analysis. Combined with sophisticated statistical techniques and modern computer power, the use of these genetic markers has given geneticists the ability to effectively map human genes by genetic linkage analysis.

The availability of large numbers of DNA markers on each chromosome led first to the identification of linkage groups, clusters of syntenic genes that are linked to one another, and then to assignment of chromosomal locations to linkage groups. The discovery of genetic linkage between a genetic marker with a known chromosome location and any member of a linkage group assigns the linkage group to a chromosome location near the genetic marker. Different linkage groups on the same chromosome can then be organized into maps of chromosome segments and whole chromosomes.

Mapping with Genetic Markers

An array of different variants of DNA sequence constitute the genetic markers that are located along chromosomes and can be used to study the locations of expressed genes.
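The block-by-block tabulation used in the Drosophila hotspot study described earlier in this chapter (crossovers counted in 5000-bp blocks and expressed as cM per million base pairs) can be sketched as follows. This is an illustrative sketch only: the crossover positions and the number of progeny scored are hypothetical toy values, the function name is an assumption, and the calculation treats the recombination frequency in a short block as 100 × crossovers / progeny, which ignores double crossovers.

```python
from collections import Counter


def cm_per_mb_by_block(crossover_positions, region_length_bp, n_progeny, block_size=5000):
    """Recombination rate (cM per Mb) for each consecutive block of a region.

    crossover_positions: 0-based positions (bp) of mapped crossovers within the region.
    n_progeny: number of progeny scored (hypothetical here; not given in the text).
    """
    n_blocks = -(-region_length_bp // block_size)  # ceiling division
    counts = Counter(pos // block_size for pos in crossover_positions)
    rates = []
    for block in range(n_blocks):
        cm = 100.0 * counts.get(block, 0) / n_progeny   # percent recombination in block
        rates.append(cm / (block_size / 1_000_000))     # scale to cM per Mb
    return rates


# Toy data: a 20,000-bp region (4 blocks) scored in 10,000 progeny.
positions = [100, 2500, 4999, 5000, 12000, 12001, 12500, 13000, 14999]
rates = cm_per_mb_by_block(positions, 20_000, 10_000)

nonzero = [r for r in rates if r > 0]
fold_difference = max(nonzero) / min(nonzero)

for block, rate in enumerate(rates):
    print(f"block {block}: {rate:.1f} cM/Mb")
print(f"fold difference between hottest and coldest nonzero block: {fold_difference:.1f}")
```

Applied to the rates quoted in the text, the same ratio (27 cM/Mb divided by 0.3 cM/Mb) gives the reported 90-fold difference.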
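The mapping functions of Haldane and Kosambi mentioned at the end of Section 5.3 are not written out in this chapter. As a hedged illustration, the sketch below uses their standard published forms (Haldane assumes no interference; Kosambi allows for interference) to convert an observed recombination fraction r into a map distance in centiMorgans. The function names are assumptions made for this example.

```python
import math


def haldane_cm(r):
    """Map distance (cM) from recombination fraction r, Haldane form (no interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Map distance (cM) from recombination fraction r, Kosambi form (with interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


for r in (0.05, 0.10, 0.20, 0.30, 0.40):
    print(f"r = {r:.2f}   uncorrected = {100 * r:5.1f} cM   "
          f"Haldane = {haldane_cm(r):6.2f} cM   Kosambi = {kosambi_cm(r):6.2f} cM")
```

For small r all three values nearly coincide; as r grows, both corrections exceed the uncorrected value, and the Kosambi distance stays below the Haldane distance.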
These markers are almost always in noncoding regions of the genome, meaning that the sequence variation does not affect the coding or regulatory region of an expressed gene or protein and does not affect the phenotype of the organisms in any way. One kind of genetic marker is the variable number tandem repeat (VNTR). These consist of short sequences of DNA, usually 3 to 20 base pairs. The short sequences are repeated end-to-end in a chromosome region. Since these occur in noncoding regions, natural selection does not put any rigid constraints on their variation; different chromosomes can carry different numbers of repeats of the sequence, and there may be a large number of different repeat lengths among chromosomes in a population.

Each individual is either homozygous or heterozygous for alleles at a VNTR marker. Figure 5.13a illustrates the appearance of a VNTR in a pair of homologous chromosomes of a heterozygous individual. Figure 5.13b illustrates the observation of VNTRs in a gel and the codominant pattern of transmission of VNTRs on autosomal chromosomes when heterozygous parents each donate one VNTR allele to each child. Transmission of these alleles is codominant because each allele in a heterozygous genotype can be detected, and homozygotes can be distinguished from heterozygotes.

Much more commonly used than VNTRs as genetic markers are single nucleotide polymorphisms (SNPs; pronounced “snips”). SNPs are DNA sequence variants in which one base pair is substituted by another base pair; SNPs, too, are usually located in noncoding parts of the genome. Figure 5.13c shows a pair of SNP alleles: allele A1 contains an A–T base pair whereas allele A2 contains a G–C base pair. As with VNTR genotypes, individuals are either homozygous or heterozygous for SNP alleles, which also are transmitted in a codominant manner. It is estimated that there are approximately 3.3 million SNPs spread throughout the human genome, and they have proven to be enormously useful in mapping analysis of the human genome.

A type of DNA genetic marker related to SNPs is the restriction fragment length polymorphism (RFLP; pronounced “riff lip”). RFLPs result from a change in DNA sequence, but they are analyzed in a different way. Instead of sequencing the region containing the sequence variant, geneticists detect RFLPs with the aid of DNA-cutting enzymes known as restriction endonucleases (restriction enzymes, for short) that recognize and cut specific sequences of DNA. There are hundreds of different restriction enzymes. Each recognizes a different short sequence of DNA and cuts DNA at that recognition site every time the site is encountered. For example, the restriction enzyme EcoRI (pronounced “eco are one”) recognizes the double-stranded DNA sequence 5′–GAATTC–3′. EcoRI cuts DNA at this sequence, and only at this sequence. (To repeat, other restriction enzymes have their own recognition sequences.) A large genome like the human genome contains hundreds of thousands of copies of the EcoRI recognition sequence, and treating human DNA with EcoRI produces hundreds of thousands of DNA fragments called restriction fragments. These restriction fragments are detected by methods that are similar to those used to identify VNTRs. RFLPs on autosomal chromosomes are also transmitted in a codominant manner.

Figure 5.13 VNTRs and SNPs. Variable number tandem repeats (VNTRs) contain a variable number of repeat-sequence blocks. (a) A chromosome pair in which one homolog has 6 repeats and the other has 10 repeats. (b) VNTRs are inherited in a codominant manner. (c) Single nucleotide polymorphisms (SNPs) are single base-pair sequence variants, also inherited in a codominant manner.
[Graphic, panel (b): parents with VNTR genotypes 6,10 and 8,14; children with genotypes 6,8; 10,14; 8,10; and 6,14; gel lanes scaled by number of repeats (6 to 14). Panel (c): allele A1 ...ATCCGAC... / ...TAGGCTG...; allele A2 ...ATCCGGC... / ...TAGGCCG...]

The Inheritance of Disease-Causing Genes Linked to Genetic Markers

The genetic markers used to help map genes usually have known chromosome locations. SNPs, for example, are identified by detecting variation in DNA sequence at a particular location. Contemporary methods of detecting and recording genome sequences are able to identify these locations with precision, leading to a catalog of SNPs on each chromosome and a map that identifies the location of each of them.

With more than 3 million SNPs in the human genome, tens of thousands of SNPs are located on each chromosome. Syntenic SNPs that are close together in a small region of a chromosome constitute a set of closely linked variants called a haplotype. Haplotypes consist of several genetic variants closely packed along a segment of a chromosome.
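The way a single base change becomes a restriction fragment length polymorphism, as described above for EcoRI, can be illustrated with a short sketch. The two allele sequences below are invented for the example, and the function names are assumptions. The cut position (after the G in GAATTC) reflects standard EcoRI behavior rather than anything stated in this section, and because GAATTC reads the same on both strands, searching one strand is sufficient.

```python
SITE = "GAATTC"   # EcoRI recognition sequence
CUT_OFFSET = 1    # assumed cut position within the site: G^AATTC


def find_sites(seq, site=SITE):
    """Return the 0-based start positions of every occurrence of the recognition site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, site=SITE, cut_offset=CUT_OFFSET):
    """Lengths of the fragments produced by cutting a linear molecule at every site."""
    cuts = [pos + cut_offset for pos in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_2 differs from allele_1 by one base (A to G),
# which destroys the second EcoRI site.
allele_1 = "ATGCGAATTCTTAGGCCGAATTCAAGT"
allele_2 = "ATGCGAATTCTTAGGCCGAGTTCAAGT"

print("allele 1 sites:", find_sites(allele_1), "fragments:", fragment_lengths(allele_1))
print("allele 2 sites:", find_sites(allele_2), "fragments:", fragment_lengths(allele_2))
```

The two alleles yield different sets of fragment lengths, so a heterozygote would show the fragments of both alleles, consistent with codominant transmission of RFLPs.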
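The codominant transmission of VNTR alleles shown in Figure 5.13b (parents 6,10 and 8,14) can be enumerated directly. The sketch below is illustrative; the function names are assumptions, and each parent is assumed to transmit either allele with equal probability.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Expected offspring genotype frequencies when each parent donates one allele."""
    combos = Counter(tuple(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(combos.values())
    return {genotype: Fraction(count, total) for genotype, count in combos.items()}


def gel_bands(genotype):
    """Repeat counts visible as bands: two for a heterozygote, one for a homozygote."""
    return sorted(set(genotype))


children = offspring_genotypes((6, 10), (8, 14))
for genotype, freq in sorted(children.items()):
    print(genotype, freq, "bands at", gel_bands(genotype))

# Two parents heterozygous for the same alleles give the familiar 1:2:1 genotype ratio.
same_alleles = offspring_genotypes((6, 10), (6, 10))
```

Because every genotype produces a distinct band pattern, each child's genotype can be read from the gel, which is what makes the marker codominant.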
The term is a contraction of “haploid genotype,” where haploid is used to mean one chromosome. Each haplotype has a distinctive genetic makeup that allows it to be used to distinguish one chromosome from another chromosome: One or more of the SNPs in a haplotype on one of a pair of chromosomes may differ from those in the equivalent haplotype on the other chromosome.

Figure 5.14a shows a haplotype consisting of two SNPs. The alleles of one SNP are designated A1 and A2, and the alleles of the other are B1 and B2. A gene, gene D, is also shown on these chromosomes. The D allele for this gene is a dominant mutant allele causing a rare hereditary disease inherited as an autosomal dominant trait. Allele d of this gene is the recessive wild-type allele. Individual I-1 in the family tree shown in Figure 5.14b has the SNP haplotype containing alleles B1 and A1 on one chromosome and alleles B2 and A2 on the homologous chromosome. He also has the autosomal dominant disease, suggesting that the disease-causing allele D is carried on one of these chromosomes. His mate, I-2, has the wild-type phenotype and the genotype dd. Her SNP haplotypes are B2A1 and B1A2.

In generations II and III of the family tree, the autosomal dominant pattern of transmission of the disease is apparent. Looking carefully at the SNP haplotypes, we see that each child in generation II who inherits the disease also inherits the B1A1 haplotype from their father, and each child who has the wild-type phenotype has inherited the father’s B2A2 SNP haplotype. This is consistent with the D allele being on the chromosome with the B1A1 haplotype and the d allele being on the chromosome with the B2A2 haplotype. This pattern holds true for individuals III-1 through III-4.

Individuals III-5 and III-6 display different patterns of the haplotype and disease gene alleles. The mother of each child (II-6) has donated the B2A1 haplotype along with a copy of her d allele. Individual III-5 has the wild-type phenotype, has received the father’s B1A1 haplotype, and has received the d allele. This is the result of crossing over between his father’s chromosomes, as illustrated in Figure 5.14c. By similar analysis, we see that III-6 has the disease allele D inherited on her father’s chromosome that contains the B2A1 haplotype. This is also the result of crossing over between the father’s chromosomes as shown in Figure 5.14c.

Figure 5.14 Haplotype inheritance. (a) Syntenic alleles for two SNPs (A and B) and a disease gene (D) that form a haplotype. (b) Haplotype inheritance in a family is used to identify recombinant and nonrecombinant chromosomes. (c) Recombination between the homologs of II-5 produces two recombinant chromosomes that are identified by haplotype changes.
[Graphic, panel (a): haplotypes B1 A1 D, B2 A2 d, B2 A1 d, and B1 A2 d. Panel (b), generation I, individuals 1 and 2: upper haplotypes B1 A1, B2 A1; lower haplotypes B2 A2, B1 A2. Generation II, individuals 1 to 10: upper haplotypes B1A1, B2A2, B1A1, B1A1, B1A1, B2A1, B2A2, B1A1, B2A2, B2A2; lower haplotypes B2A1, B1A1, B2A1, B1A2, B2A1, B2A1, B2A1, B2A1, B2A1, B1A2. Generation III, individuals 1 to 6: upper haplotypes B1A1, B2A1, B2A1, B1A1, B1A1, B2A1; lower haplotypes B2A2, B2A2, B2A1, B2A1, B2A1, B2A1; recombinants are labeled. Panel (c): parental homologs in II-5, B1 A1 D and B2 A1 d; recombinant homologs, B1 A1 d in III-5 and B2 A1 D in III-6.]

Allelic Phase

Suppose the chromosome location of a disease-causing gene is unknown. What research strategy should be used by researchers seeking to map the gene? Often, the answer is to use genetic linkage analysis to establish linkage between genetic markers of haplotypes with known chromosome locations and the disease-causing genes whose chromosome location is sought. Determining that genetic linkage exists identifies the location of the unmapped gene.

To map genes by this approach, it is essential that parental and recombinant chromosomes be identified. Thus, one of the first challenges researchers encounter in the effort to map human genes is to determine the allelic phase (the particular combination of alleles of linked genes) on each
parental chromosome. The simplest approach to determining the allelic phase is to consider the alleles of two linked genes. Knowing, for example, the allelic phase of a marker gene and a gene at which a mutant allele causes a genetic disease of interest improves the statistical power of genetic linkage estimates used to map the location of the disease-causing gene. Figure 5.15 illustrates how allelic phase is identified in a family in which an autosomal dominant disease is present. The two pedigrees in the figure are identical in structure and in the distribution of the autosomal dominant disease that is indicated by shaded symbols. Notice, however, that individuals I-1 and I-2 are alive and so could be genotyped for the genetic marker in Family A, which is not the case in Family B. The alleles of the gene determining the disease phenotype are D and d. In addition to allelic information for the disease locus, the pedigrees show allelic information for a closely linked polymorphic DNA marker that has six alleles identified as P1 to P6.

Allelic phase for the disease allele and the genetic marker is known to be P1D in Family A because the affected woman in generation I (I-2) transmits marker allele P1 along with the dominant disease allele (D) to her son, II-1. The unaffected man in generation I (I-1) is homozygous for the recessive wild-type allele (dd) at the disease locus and heterozygous for DNA marker alleles P2 and P5. Allelic phase in II-1 is P1D/P2d; the chromosome on the left of the solidus (/) is maternal, the chromosome on the right paternal. Considering that his mate (II-2) is P3d/P4d, we can identify the transmission of parental and recombinant gametes from II-1 to his children in generation III. Children III-1, III-3, and III-4 inherited a paternal chromosome carrying P1D to produce their disease and either the P3 or P4 allele along with d on their maternal chromosome. On the other hand, III-2, III-5, III-7, and III-8 inherited alleles P2 and d on their paternal chromosome and either P3 or P4 along with d on their maternal chromosome. Child III-6 has apparently inherited a recombinant chromosome carrying alleles P2 and D from her father along with P3 and d on the maternal chromosome.

The pedigree for Family B does not allow identification of allelic phase. In this family, there is no marker information for generation I, and thus allelic phase for II-1 is unknown. He could either be P1D/P2d or P1d/P2D. For the purposes of genetic linkage analysis, each possible phase must be treated as equally likely. With allelic phase in II-1 unknown, we cannot be certain which of his children have inherited parental chromosomes and which carry recombinants. If II-1 is P1D/P2d, his children III-1 to III-5, and III-7 and III-8 are parental, and III-6 is recombinant. Alternatively, if he is P1d/P2D, then III-1 to III-5 and III-7 and III-8 are recombinant and III-6 is parental.

Figure 5.15 [Graphic: pedigrees of Family A and Family B. In both families the marker genotypes of III-1 to III-8 are P1P3, P2P3, P1P4, P1P3, P2P4, P2P3, P2P4, and P2P3.]
Family A: Allelic phase is known in family A by tracing the transmission of the disease allele (D) and the P1 genetic marker allele from I-2 to II-1 and to III-1, III-3 and III-4; III-6 is a probable recombinant.
Family B: Allelic phase is not known in family B because the disease allele carried by II-1 could be on either the chromosome carrying genetic marker allele P1 or the chromosome carrying P2.

Lod Score Analysis

The unique genetic challenges presented by the study of heredity in humans have also led to investigatory methods that rely heavily on statistics. A statistical method developed by Newton Morton in 1955 and refined and expanded since then is one of the central methods for analyzing genetic linkage in humans. Morton’s method determines whether genetic linkage exists between genes for which allelic phase is unknown by comparing the likelihood of obtaining the genotypes and phenotypes observed in a pedigree if two genes are linked versus the likelihood of getting the same pedigree outcomes if the genes assort independently. The ratio of these two likelihoods gives the “odds” of genetic linkage, and the logarithm of the odds ratio generates the lod score, a statistical value representing the probability of genetic linkage between the genes.

The numerator of the odds ratio that yields the lod score is the likelihood that the distribution of phenotypes and genotypes in the pedigree is produced by genetic linkage between the genes. The denominator is the likelihood of the same pedigree outcomes assuming independent assortment between the genes (i.e., no genetic linkage). Lod score analysis evaluates each pedigree and determines the likelihood of genetic linkage for many different recombination frequencies, each expressed as a variable called the θ value (“theta value”). Using input data on each family member that identifies presence or absence of the disease and the genotype at a potentially linked marker gene, software programs calculate the likelihoods of genetic linkage
versus no linkage and compute lod scores for each θ value specified by the investigator. The θ values are any recombination frequency between θ = 0 (complete genetic linkage) and θ = 0.50 (independent assortment). The programs determine lod scores, and because they are log values, the lod scores for a given θ value in different families can be added together. After analyzing all available family data, the lod scores for each θ value are summed, and the highest lod score value obtained in a study is designated Zmax. The Zmax corresponds to the θ value that is the most likely recombination frequency between the genes tested.

For each θ value tested, the lod score will be positive if the likelihood of genetic linkage is greater than the likelihood of independent assortment, because in that case, the numerator value (likelihood assuming genetic linkage) is greater than the denominator value (likelihood assuming independent assortment). Conversely, if the pedigree is more likely to be produced by independent assortment than by genetic linkage, the independent assortment likelihood will be larger than the genetic linkage likelihood, and the lod score will be negative.

Lod scores are calculated using the assumption that if two genes have a recombination frequency equal to θ, the probability that a particular gamete is recombinant is also equal to θ, and the probability that a gamete is nonrecombinant is 1 - θ. Table 5.4 shows calculated lod score values for the two families shown in Figure 5.15. Notice that the lod scores are higher for Family A than for Family B. This is because, with allelic phase known in Family A, the likelihood estimate for genetic linkage between the disease gene and the marker gene is more accurate and leads to a higher probability of genetic linkage in this case. For each child in generation III, the probability that the gamete from the father (II-1) is parental is 1 - θ, and the probability that a recombinant gamete is transmitted from father to child is θ. Since allelic phase is known for Family A, only the known phase is tested. In contrast, Family B does not have a known allelic phase; thus, each possible phase is assumed

sufficiently greater than the probability of independent assortment; or it can argue against genetic linkage, if the probability of independent assortment is sufficiently greater than the linkage probability. Lod scores can be interpreted for individual families, or they can be added together for as many families as are analyzed. In either case, lod score significance is interpreted by the following parameters:

1. A lod score of 3.0 or greater is considered significant evidence in favor of genetic linkage. Such a score indicates significant odds of genetic linkage at each θ value at which it occurs. The θ values identified as significant indicate the most likely number of centiMorgans between linked genes.
2. Lod score values of less than -2.0 represent significant evidence against genetic linkage. Any lod score values for single or multiple families less than -2.0 reject genetic linkage at each θ value with that result.
3. Lod score values between 3.0 and -2.0 are inconclusive, neither affirming nor rejecting genetic linkage between the genes examined. Inconclusive results can be revised as additional data are collected.

Figure 5.16 [Graphic: lod score curves plotted with lod score on the vertical axis. Labels: maximum lod score (Zmax); significant θ range; significance threshold at +3.0; curve 1, significant result favoring linkage; curve 2, inconclusive result.]

The three lod score curves shown in Figure 5.16 illustrate that lod score results may produce different patterns
Mapping a Gene for Breast and Ovarian Cancer Susceptibility

Most cases of cancer develop through the acquisition of multiple mutations in somatic cells, with no inherited mutation increasing the likelihood of cancer development. In some families, however, the frequent occurrence of a particular kind of cancer in a pattern consistent with single-gene inheritance can strongly suggest the hereditary transmission of a mutant allele that increases the susceptibility of individuals to the cancer. The identity, indeed the very existence of these genes, is not known until they are conclusively shown to contribute to cancer development. One research strategy for identifying cancer-susceptibility genes looks for genetic linkage of susceptibility genes to genetic markers that have a known chromosome location.

In the late 1970s, Mary-Claire King and several collaborators conducted a search for a gene whose mutation could increase susceptibility to breast and ovarian cancer in families. The strategy devised by King and her colleagues to maximize their chance of finding such a cancer-susceptibility gene was to carefully select families in which multiple cases of breast and ovarian cancers appeared at young ages, and in which occasional cases of bilateral cancer occurred (affecting both breasts or both ovaries in a single patient) in patterns consistent with an autosomal dominant inheritance of disease susceptibility.

King initially looked for genetic linkage between inherited cancer susceptibility and biochemical markers such as polymorphic blood proteins and enzymes. None of the dozens of biochemical markers screened produced significant evidence of genetic linkage to a breast and ovarian cancer susceptibility gene. In the early 1990s, however, King and her colleagues turned to the use of DNA genetic markers. In 1994, they identified genetic linkage between a group of tightly clustered DNA markers on human chromosome 17 and a gene named Breast Cancer 1 (BRCA1). Lod score analysis of chromosome 17, as summarized in the following table, revealed that the candidate gene has a Zmax value of 21.68 at θ = 0.13.

Five genetic markers that are part of a multipoint linkage analysis are shown. BRCA1 is most likely close to the middle of this linkage group, near the DNA marker gene D17S588.

Subsequent studies have identified and cloned the BRCA1 gene and determined that it participates with a second gene called BRCA2 in DNA mutation repair. A large number of mutations of BRCA1 have been identified, and some of them dramatically increase the likelihood that a woman will develop breast or ovarian cancer. Other mutations of BRCA1 do not appear to significantly increase breast or ovarian cancer risk. A good deal of work remains to be done to clarify the role of this gene in breast and ovarian cancer development, but the research strategy designed by King demonstrates the power of genetic linkage analysis for locating genes of interest. (We discuss more about BRCA1 and cancer in Application Chapter C.)
depending on the level of information available for the pedigree and on the actual relationship between the genes tested. Curve 1 displays data with a maximum lod score value (Zmax) of about 4.0 at θ = 0.23, suggesting the two genes are separated by 23 cM. The lod scores are significantly positive in the range of 18 to 30 map units. The curve provides significant evidence against genetic linkage at θ < 0.5. Curve 2 results from a situation in which very little genetic linkage information is available, and its lod scores are inconclusive at all distances. Curve 3 rejects genetic linkage at θ values less than 0.12 but is inconclusive through the rest of the linkage range.

A number of more comprehensive software programs permitting multipoint linkage analysis have been developed to analyze genetic linkage data for multiple genes and genetic markers simultaneously. Multipoint linkage analysis tests all possible gene orders to identify the most likely order of linked genes.

Experimental Insight 5.1 discusses the application of lod score analysis in the mapping of BRCA1, a gene whose mutation can increase susceptibility to breast and ovarian cancer in women. Genetic Analysis 5.3 guides you through the interpretation of lod score values for linkage between a disease-causing gene and a linked DNA genetic marker.
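As a rough illustration of how a two-point lod score might be computed, the sketch below assumes the standard definition of the lod score: log10 of the likelihood of the observed gametes at a given recombination fraction θ divided by their likelihood under independent assortment (θ = 0.5). The function names and the family of 8 children are hypothetical and are not taken from Table 5.4. For the phase-unknown case, the sketch assumes the two possible phases are equally likely and averages their likelihoods.

```python
import math


def lod_phase_known(theta, n_parental, n_recombinant):
    """Lod score at recombination fraction theta when allelic phase is known.

    Each parental gamete contributes (1 - theta) and each recombinant gamete
    contributes theta; with no linkage every gamete contributes 0.5.
    """
    n = n_parental + n_recombinant
    linked = (1 - theta) ** n_parental * theta ** n_recombinant
    unlinked = 0.5 ** n
    return math.log10(linked / unlinked) if linked > 0 else float("-inf")


def lod_phase_unknown(theta, n_class_1, n_class_2):
    """Lod score when phase is unknown.

    Assumption: both phases are equally likely, so the likelihood is the
    average of the likelihoods computed under each phase.
    """
    n = n_class_1 + n_class_2
    phase_a = (1 - theta) ** n_class_1 * theta ** n_class_2
    phase_b = (1 - theta) ** n_class_2 * theta ** n_class_1
    linked = 0.5 * (phase_a + phase_b)
    unlinked = 0.5 ** n
    return math.log10(linked / unlinked) if linked > 0 else float("-inf")


# Hypothetical family: 8 children, 7 of one gamete class and 1 of the other.
for theta in (0.05, 0.10, 0.20, 0.30, 0.50):
    known = lod_phase_known(theta, 7, 1)
    unknown = lod_phase_unknown(theta, 7, 1)
    print(f"theta={theta:.2f}  phase known: {known:6.3f}  phase unknown: {unknown:6.3f}")
```

With these made-up counts, the phase-known score is higher than the phase-unknown score at each θ below 0.5, because the phase-unknown likelihood is averaged with a much less likely alternative phase. Both scores are zero at θ = 0.5.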
GENETIC ANALYSIS 5.3

PROBLEM In a study of human families with an autosomal dominant disease caused by a gene whose location is unknown, geneticists use lod score analysis to test linkage between the disease gene and a variable DNA genetic marker. Provide a complete interpretation of the lod score data displayed in the following table, and identify the most likely distance between the marker gene and the disease gene.

BREAK IT DOWN: The lod score is a statistical value that allows identification of the most likely recombination distance between genes and, by extension, rejection of linkage (p. 169).

BREAK IT DOWN: Lod score values greater than +3.0 indicate statistically significant evidence in favor of genetic linkage, and values less than −2.0 significant evidence against linkage, at specified θ values (p. 170).

θ value    Lod score
0.0        −∞
0.01       −6.95
0.02       −1.10
0.03       0.20
0.04       1.22
0.05       2.25
0.06       7.23
0.08       7.02
0.10       5.11
0.15       4.23
0.20       −2.01
0.30       −6.84
0.40       −9.91
0.50       0.0

Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   This problem concerns lod score analysis assessing genetic linkage between a variable DNA genetic marker and a gene carrying a dominant mutation producing a disease. The answer requires interpretation of the lod score values, identification of potential genetic linkage, and determination of the most likely distance between the DNA marker gene and the disease gene.
2. Identify the critical information given in the problem.
   Lod score values are given for 14 θ values (map units between genes).
TIP: Survey the entire lod score table to identify significant and nonsignificant lod score values.

Deduce
3. Identify significant lod score values in the lod score table and locate Zmax.
   Significant evidence against genetic linkage occurs at θ ≤ 0.01 and at θ ≥ 0.20. Conversely, significant results in favor of genetic linkage are seen at θ = 0.06 to θ = 0.15. The Zmax value is 7.23 and corresponds to θ = 0.06 (6 m.u.).

Solve
4. Interpret the meaning of the lod scores for genetic linkage.
   The data support genetic linkage between the marker gene and the disease gene at recombination distances of between 6 m.u. and 15 m.u. Linkage between the genes is rejected at less than 2 m.u. and at 20 m.u. or more. The lod score results between 2 m.u. and 5 m.u. are inconclusive.
TIP: Note the θ values corresponding to significant lod score values.
5. Identify the most likely distance between the DNA marker gene and the disease gene.
   The Zmax value is 7.23 at θ = 0.06, thus identifying the most likely distance between the disease gene and the marker gene as 6 m.u.
TIP: The maximum lod score value corresponds to a specific distance between genes that is identified by its θ value.

For more practice, see Problems 18, 28, and 29.
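The interpretation steps in Genetic Analysis 5.3 can also be sketched in code. The example below copies the recombination values and lod scores from the table in that problem. The function names are hypothetical, and the thresholds follow the chapter's rule of +3.0 or more in favor of linkage and −2.0 or less against it.

```python
# Recombination values and lod scores copied from the Genetic Analysis 5.3 table.
THETAS = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06,
          0.08, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50]
LODS = [float("-inf"), -6.95, -1.10, 0.20, 1.22, 2.25, 7.23,
        7.02, 5.11, 4.23, -2.01, -6.84, -9.91, 0.0]


def classify(lod, favor=3.0, against=-2.0):
    """Label one lod score using the +3.0 / -2.0 significance thresholds."""
    if lod >= favor:
        return "supports linkage"
    if lod <= against:
        return "rejects linkage"
    return "inconclusive"


def interpret(thetas, lods):
    """Group recombination values by outcome and locate the maximum lod score."""
    calls = {"supports linkage": [], "rejects linkage": [], "inconclusive": []}
    for theta, lod in zip(thetas, lods):
        calls[classify(lod)].append(theta)
    z_max = max(lods)
    theta_at_max = thetas[lods.index(z_max)]
    return calls, z_max, theta_at_max


calls, z_max, theta_at_max = interpret(THETAS, LODS)
for label, values in calls.items():
    print(f"{label}: {values}")
print(f"Zmax = {z_max} at theta = {theta_at_max} ({theta_at_max * 100:.0f} m.u.)")
```

Note that the lod score at a recombination value of 0.50 is 0.0 by definition, so this sketch labels that single entry as inconclusive rather than as evidence against linkage.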
Genome-Wide Association Studies

The genetic mapping approach that links alleles for phenotypic traits to molecular markers is built on one-to-one relationships. This means that one genetic marker is linked to another genetic marker, and that a series of linked markers along a chromosome constitutes a genetic map of the chromosome. Tens of thousands of genes have been mapped by this approach in the genomes of organisms commonly used for genetic study, including fruit flies, corn, mice, and humans.

Another method of analysis known as genome-wide association studies (GWAS) takes a different approach. GWAS is designed to detect and locate the genes that as a group influence the form or appearance of traits produced by multiple genes. The multiple genes contributing to a particular trait or condition are likely to be scattered throughout the genome. GWAS helps identify where in the genome the genes influencing a trait are located.

GWAS does not create a gene map along a chromosome. Instead, it looks for associations between traits and groups of alleles in populations of organisms to spot where on different chromosomes influential genes are located. In the context of GWAS, the term “association” means that a trait co-occurs with a group of alleles more often than expected by chance. The alleles used in GWAS are usually SNPs,
as these are the most frequent type of molecular marker in most genomes. The statistical analysis that identifies associations between a SNP marker and a disease-susceptibility gene identifies the strength or level of significance of the association as a P value (probability value).

GWAS uses small haplotypes consisting of very closely linked SNPs. Many such groups of SNPs can be identified on all chromosomes throughout the genome. These groups of SNPs have known chromosome locations, usually as a result of genome sequence mapping (see Chapter 16). Because these haplotypes are most often used for purposes of GWAS, each chromosome of a homologous pair can be conveniently described by a particular haplotype. As an example, the same DNA region of two homologous chromosomes can be compared as shown here (each chromosome is represented by only one strand of its DNA duplex):

Chromosome 1: ATTCATGCTCGA
Chromosome 2: ATACATGATCTA

The third, eighth, and eleventh nucleotides of these sequences differ; thus there are three SNP variants detected. Each chromosome can also be said to carry a distinct haplotype for this region of the genome.

In populations, alleles for different genes are expected to be found in genotypes in random combinations. Generally, no allele for any one gene is associated with a given allele for any other gene in a genotype more frequently than would be expected by chance. This is a common state that is known as linkage equilibrium. For example, if allele A has a frequency of 70 percent in a population (0.70), with allele A′, the other allele of the gene, having a frequency of 30 percent (0.30), and if allele B has a frequency of 20 percent (0.20), with allele B′ having a frequency of 80 percent (0.80), then linkage equilibrium is in place when the frequency of each combination of alleles is the product of the two allele frequencies. In other words, linkage equilibrium predicts the following frequencies for the combinations of alleles of these two genes in haplotypes:

AB   = (0.70)(0.20) = 0.14
AB′  = (0.70)(0.80) = 0.56
A′B  = (0.30)(0.20) = 0.06
A′B′ = (0.30)(0.80) = 0.24
Total               = 1.00

The close proximity of SNP variants in a haplotype can severely limit the occurrence of crossing over between the variants. This delays the attainment of linkage equilibrium for many generations, since the alleles of a haplotype are passed together during reproduction. Crossing over will eventually randomize the combinations of alleles in genotypes, but until that time, alleles of any other genes in close proximity to the haplotype genes will also tend to remain syntenic to the haplotype. This relationship is called linkage disequilibrium. It reflects the nonrandom relationship between alleles of very closely linked genes. Linkage disequilibrium indicates that one particular allele of a gene is preferentially associated with the haplotype on the same chromosome. This can lead to particular alleles of the gene contributing to the trait of interest being found more frequently than expected with a particular haplotype. For example, an allele contributing to the development of a particular condition might be more commonly found on a chromosome with a certain haplotype than expected by chance. The detection of linkage disequilibrium between a haplotype and an allele of a gene that contributes to the development of a particular trait or condition can help researchers locate the contributing gene by genetic linkage to the haplotype.

GWAS analysis assesses linkage disequilibrium between alleles of genes potentially involved in generating phenotypic variation and closely linked haplotypes to map the locations of the potential contributory genes. The potential significance of associations is assessed by determining P (probability) values. Significant P values indicate the likely presence of a gene influencing the appearance of a trait or condition.

In recent years, GWAS has been used to analyze the human genome and other eukaryotic genomes in the search for genes that influence many kinds of traits to which multiple genes make a contribution. This is done by using GWAS to show that significant associations exist between an inherited trait or condition and haplotypes on multiple chromosomes. GWAS results of this kind suggest that there are multiple genes contributing to the condition or trait of interest.

One large GWAS analysis of common conditions in humans is a 2007 meta-analysis that tested for linkage disequilibrium between several thousand SNPs and seven common disease conditions in humans. The genomes of more than 14,000 patients and more than 3000 condition-free control individuals were part of the analysis. The study identified more than two dozen regions where a gene likely to contribute to the development of one of the conditions may occur. Figure 5.17 shows a “Manhattan plot” that indicates the locations of genes contributing to the development of each of these conditions in humans. Manhattan plots are so named because their high-rise profile is reminiscent of the Manhattan (New York City) skyline. The profile is scattered with green dots and bars representing locations on chromosomes where linkage disequilibrium has been detected between a SNP haplotype and a potential contributing gene. The higher the green bar, the stronger the association between a potential contributing gene and a chromosome location as determined by the P value.

Additional molecular genetic investigation is required to identify the potential contributing genes associated with each SNP haplotype. Twelve likely contributing genes are identified in the figure. To determine that a gene actually contributes to the development of a condition, it is necessary to first locate all the genes in the chromosome region showing linkage disequilibrium to a SNP haplotype. The activities of identified genes are then determined to see if
[Figure 5.17: seven Manhattan plot panels (bipolar disorder, coronary artery disease, Crohn’s disease, hypertension, rheumatoid arthritis, type 1 diabetes, type 2 diabetes). Horizontal axis: chromosome (1 to 22 and X). Vertical axis: −log10(P), 0 to 15. Labeled genes: coronary artery disease: APOE; Crohn’s disease: IL23R, ATG16L1, IBD5, IRGM, NKX23, CARD15, PTPN2; rheumatoid arthritis: HLA-DRB1, PTPN22; type 1 diabetes: HLA-DRB1, PTPN22.]

Figure 5.17 Manhattan plots resulting from a genome-wide association study (GWAS) of seven common human diseases. The vertical axes show the P value, plotted as −log10(P), for each SNP–disease association along 22 autosomes and the X chromosome. Green dots and bars mark the locations of regions yielding significant associations. Known genes mapping to these regions are shown.
their action might potentially contribute to the development of a condition.

One example shown in Figure 5.17 is the gene CARD15 on human chromosome 16 that contributes to the development of the intestinal condition Crohn’s disease (CD). CD is an inflammatory condition that affects the intestines. GWAS first identified a specific region of chromosome 16 as likely to contain a gene contributing to the development of CD. Researchers subsequently screened expressed genes in that region of chromosome 16 and identified CARD15 and several other genes. Additional screening searched for variants of these genes that co-occurred with the appearance of CD, and the researchers indeed found that certain CARD15 variants correlated with the appearance of CD. This led to investigations of the action of CARD15. It was determined that expression of the CARD15 variant alleles associated with CD increased the inflammatory response of intestinal tissue, thus contributing to the development of CD.

To date, GWAS has identified numerous genes contributing to traits and conditions in humans and other eukaryotes, and it has the potential to be instrumental in the discovery of genes contributing to some very complex conditions, including psychiatric disorders, heart disease, and diabetes. We discuss GWAS more fully and give additional examples of its application in Chapter 19.

Linkage Disequilibrium and Evolutionary Analysis

In addition to its usefulness in GWAS analysis, linkage disequilibrium can also be analyzed in an evolutionary context. Two evolutionary scenarios are observed to cause linkage disequilibrium. First, the migration of individuals into established populations can produce linkage disequilibrium by introducing haplotypes into a population. The generations that immediately follow this introduction of new
haplotypes are in linkage disequilibrium since it takes multiple generations for crossing over to randomize (establish linkage equilibrium between) the introduced haplotypes and linked alleles already present in the population. A second evolutionary mechanism generating linkage disequilibrium is the operation of natural selection in favor of a particular allele that is very closely linked to a haplotype. The effect of natural selection can be to increase the frequency both of the favored allele and of the haplotype in the population. In most cases, the alleles in the haplotype are passengers that are favored because of their close proximity to the favored allele.

Evolutionary analysis involving haplotypes takes advantage of such retention of linkage disequilibrium to study the origins of alleles that have been subject to natural selection. One example of the application of this research strategy in the assessment of human evolution concerns a specific mutation known as the βS mutation, caused by a base-pair substitution at position 6 in the wild-type allele, βA, of the human β-globin gene. The DNA base-pair substitution is shown in Figure 4.16 (p. 123). The base substitution leads to an amino acid change in the β-globin protein, altering the function of the oxygen-carrying protein hemoglobin in red blood cells and producing the autosomal recessive condition known as sickle cell disease. Pleiotropy in sickle cell disease is the subject of Figure 4.16.

Sickle cell disease exists in several human populations, notably in populations of east and central Africa, southern Europe, and the Middle East. Evolutionarily, the question is whether the βS alleles in these populations have a common evolutionary origin (that is, did they originate with a mutation in a single ancestral population) or are independent mutations that have risen to high frequency in certain populations due to natural selection. The analysis of haplotypes surrounding the β-globin gene on chromosome 11 holds the answer. The results of extensive genotyping of chromosomes carrying the βS mutation in populations in Africa, southern Europe, and the Middle East conclusively show that the chromosome 11 haplotypes are substantially different from one another. The differences are not the result of recombination and could only occur if independent βS mutations occurred on these chromosomes. This evidence clearly indicates that the human βS mutation has occurred and evolved independently at least three times.
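The haplotype comparison and the linkage equilibrium arithmetic described earlier in this section can be sketched in a few lines of Python. The function names are hypothetical. The coefficient D (observed haplotype frequency minus the product of the allele frequencies) is a standard population-genetics measure that this section does not itself define, and the observed AB frequency of 0.18 used at the end is invented purely for illustration.

```python
def snp_positions(seq_1, seq_2):
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(seq_1) != len(seq_2):
        raise ValueError("sequences must be the same length")
    return [i for i, (a, b) in enumerate(zip(seq_1, seq_2), start=1) if a != b]


def equilibrium_frequencies(freq_A, freq_B):
    """Haplotype frequencies predicted by linkage equilibrium for two
    genes with two alleles each (A/A' and B/B')."""
    freq_A_prime = 1 - freq_A
    freq_B_prime = 1 - freq_B
    return {
        "AB": freq_A * freq_B,
        "AB'": freq_A * freq_B_prime,
        "A'B": freq_A_prime * freq_B,
        "A'B'": freq_A_prime * freq_B_prime,
    }


def disequilibrium_D(observed_AB, freq_A, freq_B):
    """Departure of the observed AB frequency from the equilibrium
    expectation; zero means linkage equilibrium."""
    return observed_AB - freq_A * freq_B


# The two chromosome sequences from the haplotype example.
print(snp_positions("ATTCATGCTCGA", "ATACATGATCTA"))

# Allele frequencies from the linkage equilibrium example: A = 0.70, B = 0.20.
expected = equilibrium_frequencies(0.70, 0.20)
for haplotype, frequency in expected.items():
    print(f"{haplotype}: {frequency:.2f}")

# Hypothetical observed AB frequency of 0.18 (not from the text).
print(f"D = {disequilibrium_D(0.18, 0.70, 0.20):.2f}")
```

A nonzero D in a sample would be one simple signal that alleles of the two genes are not combining at random. Real GWAS software evaluates such departures statistically across many markers, which this sketch does not attempt.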
CASE STUDY
Mapping the Gene for Cystic Fibrosis

Cystic fibrosis (CF; OMIM 219700) is an autosomal recessive disorder caused by a defect in the cystic fibrosis transmembrane conductance regulator (CFTR) gene that is located on chromosome 7 in humans. The protein product of CFTR spans the membrane of cells, regulating the flow of chloride ions in and out of the cell. Mutations of CFTR primarily affect glands producing mucus, digestive enzymes, and sweat.

First identified in the late 1930s, CF proved to be a relatively common disorder, particularly in Caucasian populations, where it occurs at a frequency of 1 in 2500 infants, according to the American Lung Association. It is much less common in Hispanics (1 in 15,000), African Americans (1 in 30,000), and Native Pacific Islanders (1 in 100,000). In Caucasians, the frequency of heterozygous carriers of the recessive allele is approximately 4%. Numerous family studies identified CF as being caused by mutation of a single gene, although the gene was not identified until the 1980s. Many mutant alleles of the gene are known, although one mutation is very common.

The principal clinical difficulty in CF is very thick mucus that clogs the airways in the lungs and obstructs the ducts that transport digestive enzymes from the pancreas to the small intestine. Chronic and severe respiratory infections are a hallmark of CF, as are digestive difficulties that can result in chronic malnutrition, even with adequate food intake. Awareness of these complications has led to better management and improved survival. In the 1950s, CF patients rarely survived long enough to enter elementary school. By 1985, the average age of survival stood at about 25 years. By 2007, mean survival had improved to approximately 28 years. CF patients with less severe forms of the disease survive even longer.

With family studies indicating that a single autosomal gene was responsible for CF, researchers used genetic linkage mapping and lod score analysis to locate the CF gene. All 22 autosomes were studied, and initially a great deal of negative genetic linkage information was obtained. These data identified chromosomes where the gene was not located. The first important piece of positive gene mapping evidence came in 1985 when Hans Eiberg and colleagues identified the close linkage of the CF gene to the PON gene that produces the blood serum enzyme paraoxonase. Unfortunately, PON did not have a known chromosome location at the time, so despite the finding that the CF gene was near PON, the identity of the chromosome carrying the genes remained a mystery.

A few months later, however, Lap-Chee Tsui and colleagues identified a DNA RFLP marker known as D7S15 that was linked both to the CF gene and to PON (see Section 5.5). D7S15 was known to reside near the middle of the long arm of chromosome 7. Like almost all RFLPs, D7S15 is not part of an expressed gene, and it has nothing to do with causing CF. It is merely a DNA sequence variant that is detected in a noncoding segment of chromosome 7. As Table 5.5 shows, however, lod score values for D7S15–CF and D7S15–PON linkage as reported by Tsui et al. (1985) for 39 families with CF clearly demonstrated close genetic linkage between the genes and the RFLP. Lod score values greater than +3.0 are seen for D7S15–CF linkage in the range θ = 0.10 to 0.20, with a maximum lod score (Zmax) of 3.96 at θ = 0.14. For the D7S15–PON analysis, significantly positive lod scores are seen in the range θ = 0.01 to 0.20, with a Zmax value of 5.01 at θ = 0.05. Taken together, these lod score analyses indicated the order D7S15-PON-CF with a
distance of approximately 5 cM from D7S15 to PON and 14 cM from D7S15 to CF.

With the segment of chromosome 7 containing the CF gene identified, researchers examined that chromosome 7 region and found additional DNA genetic markers that were linked even more closely to the CF gene. Using these markers, they identified a segment of about 500,000 bp of DNA as the likely location of the CF gene. By examining DNA sequences for the probable presence of expressed genes and by testing for the presence of genes that were known to be expressed in sweat glands, a group of investigators led by Tsui and Francis Collins cloned and sequenced the CF gene in 1989. Investigators soon determined that the protein product of the CF gene is a transmembrane conductance regulatory protein, at which point the gene acquired its CFTR designation.

One mutation known to delete three consecutive DNA base pairs and alter one amino acid of the CFTR protein accounts for almost 50% of the known CFTR mutant alleles. Numerous other CFTR mutant alleles have also been identified, but none of these has a frequency of more than a few percent. The various CFTR mutant alleles produce different levels of functionality in the transmembrane protein, to some extent allowing clinical variation in CF patients to be attributed to particular mutant alleles. Knowing the frequency of the one common mutation and having identified many other CFTR mutations, medical geneticists are able to offer prenatal genetic testing to CF families and to accurately identify the mutant alleles and probable disease severity in patients.

The process of first mapping, then cloning, then sequencing CFTR to identify its function is a genetic strategy known as positional cloning or reverse genetic analysis. We discuss this investigative strategy more completely in Chapter 14.
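As a quick consistency check on the population frequencies quoted in this case study, the sketch below estimates carrier frequency from disease incidence. It assumes Hardy-Weinberg proportions, an assumption the case study does not state: incidence of an autosomal recessive disorder is taken as q², and carrier frequency as 2pq. The function name is hypothetical.

```python
import math


def carrier_frequency(disease_incidence):
    """Estimate heterozygous carrier frequency for an autosomal recessive
    disorder, assuming Hardy-Weinberg proportions (incidence = q**2)."""
    q = math.sqrt(disease_incidence)  # frequency of the recessive allele
    p = 1 - q                         # frequency of the dominant allele
    return 2 * p * q


# Incidence figures quoted in the case study.
incidences = {
    "Caucasians": 1 / 2500,
    "Hispanics": 1 / 15000,
    "African Americans": 1 / 30000,
    "Native Pacific Islanders": 1 / 100000,
}
for population, incidence in incidences.items():
    print(f"{population}: carrier frequency ~ {carrier_frequency(incidence):.4f}")
```

For an incidence of 1 in 2500, q is 0.02 and 2pq is about 0.039, which agrees with the carrier frequency of approximately 4% quoted for Caucasians.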
SUMMARY

5.1 Linked Genes Do Not Assort Independently

❚❚ Genetic linkage identifies genes that are so close to one another on a chromosome that their alleles do not assort independently.
❚❚ With genetic linkage, parental combinations occur at frequencies that are significantly greater than those predicted by chance, and nonparental combinations are much less frequent than expected.
❚❚ William Bateson and Reginald Punnett first observed genetic linkage when they noticed high numbers of parental phenotypes in F2 progeny.
❚❚ Thomas Hunt Morgan performed test-cross analysis of linked genes to demonstrate that linkage violates independent assortment and that crossover between homologous chromosomes is responsible for the production of recombinant gametes.
❚❚ Crossover frequency between linked genes is correlated with the distance between genes on a chromosome. Crossover occurs less often between genes that are close together than between genes that are farther apart.
❚❚ In crosses involving linked genes, the two parental phenotypes are observed in progeny in approximately equal frequencies. The two recombinant phenotypes also occur at approximately equal frequency.
❚❚ Studies correlating genetic recombination with the visible recombination of distinctive physical structures on chromosomes support the idea that crossing over causes recombination.

5.2 Genetic Linkage Mapping Is Based on Recombination Frequency between Genes

❚❚ The correlation between physical map distance and recombination frequency permits gene mapping based on recombination frequency.

5.3 Three-Point Test-Cross Analysis Maps Genes

❚❚ Three or more genes can be mapped by test-cross analysis. In a three-point cross, parental phenotypes are most frequent, double recombinants are least frequent, and the four phenotypes resulting from two single-recombination events are of intermediate frequency that depends on the actual distance between genes.
❚❚ Genetic linkage maps are constructed in five steps:
1. Find significantly higher proportions of parental phenotypes than predicted by chance.
2. Identify the alleles on parental chromosomes (the most common classes).
3. Identify double recombinants (the least frequent classes), comparing them with parental chromosomes to determine gene order.
4. Calculate recombination frequencies between genes.
5. Calculate interference in the occurrence of double crossovers.
❚❚ Recombination frequency usually underestimates the physical distance between genes. Mapping functions are used to correct these estimates.

5.4 Multiple Factors Cause Recombination to Vary

❚❚ Several biological properties of organisms affect recombination. In animals, the heterogametic sex experiences less recombination genome-wide than the homogametic sex.
❚❚ Recombination between homologs adds substantially to the genetic diversity produced through sexual reproduction.
❚❚ Hotspots and coldspots of recombination are found in many genomes, reflecting the uneven distribution of homologous recombination.
❚❚ Mammalian genome analysis reveals potential sequences and mechanisms associated with recombinational hotspots.

5.5 Human Genes Are Mapped Using Specialized Methods

❚❚ Statistical approaches such as lod score analysis detect evidence of linkage in small families.
❚❚ Lod score analysis determines the likelihood of genetic linkage between genes at specified recombination values (θ values). A cumulative lod score of +3.0 or more is statistically significant evidence in favor of genetic linkage between two genes. Lod scores of −2.0 or less represent significant evidence against genetic linkage.
❚❚ Genome-wide association studies (GWAS) locate genes affecting phenotypes that are the result of the action of several genes.
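Steps 2 through 5 of the five-step mapping procedure summarized above can be illustrated with a short sketch. The progeny counts below are hypothetical, the function name is invented for this example, and interference is computed in the usual way as 1 minus the ratio of observed to expected double recombinants. Step 1 (testing for an excess of parental classes) is not covered here.

```python
def three_point_analysis(counts):
    """Analyze three-point test-cross counts.

    counts maps a three-letter progeny class (one letter per gene, upper or
    lower case for the two alleles) to the number of progeny in that class.
    """
    total = sum(counts.values())
    ranked = sorted(counts, key=counts.get, reverse=True)
    parental, double = ranked[0], ranked[-1]  # most and least frequent classes

    # The middle gene is the one whose allele is switched relative to the
    # other two genes when a parental and a double-recombinant class are compared.
    differs = [parental[i] != double[i] for i in range(3)]
    middle = differs.index(differs.count(True) == 1)
    left, right = [i for i in range(3) if i != middle]

    def is_recombinant(cls, i, j):
        # Recombinant for genes i and j if exactly one of the two alleles
        # matches the reference parental class.
        return (cls[i] == parental[i]) != (cls[j] == parental[j])

    def fraction(test):
        return sum(n for cls, n in counts.items() if test(cls)) / total

    rf_left = fraction(lambda c: is_recombinant(c, left, middle))
    rf_right = fraction(lambda c: is_recombinant(c, middle, right))
    observed_double = fraction(
        lambda c: is_recombinant(c, left, middle) and is_recombinant(c, middle, right)
    )
    expected_double = rf_left * rf_right
    return {
        "middle_gene_index": middle,
        "rf_left": rf_left,
        "rf_right": rf_right,
        "interference": 1 - observed_double / expected_double,
    }


# Hypothetical counts for 1000 test-cross progeny (not data from the chapter).
hypothetical_counts = {
    "ABC": 370, "abc": 370,   # parental classes
    "Abc": 45, "aBC": 45,     # single recombinants, first interval
    "ABc": 80, "abC": 80,     # single recombinants, second interval
    "AbC": 5, "aBc": 5,       # double recombinants
}
result = three_point_analysis(hypothetical_counts)
for name, value in result.items():
    print(name, value)
```

Double recombinants are counted in both intervals when recombination frequencies are calculated, which is why each frequency is larger than the single-recombinant classes alone would suggest.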
PREPARING FOR PROBLEM SOLVING

In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help in solving problems.

1. Be sure you have a clear understanding of the rules, computation, and expected outcomes of crosses involving independently assorting genes. You cannot assess genetic linkage without understanding what is expected as a result of independent assortment.
2. Be prepared to evaluate and interpret genetic maps by understanding the relationship between recombination frequencies and the distance between genes on a map.
3. Be prepared to deduce genetic maps from genetic-cross data by identifying the occurrence of genetic linkage and calculating recombination frequencies.
4. Practice solving three-point test-cross analysis using the five steps illustrated in the chapter in the order in which they are presented.
5. Be ready to propose and construct genetic tests based on a hypothesis of genetic linkage.
6. Understand the interpretation of lod scores in the assessment of human genetic linkage analysis.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.

1. For parts (a), (b), and (c) of this problem, draw a diagram illustrating the alleles on homologous chromosomes for the genotypes given, assuming in each case that the genes reside on the chromosome in the order written. For parts (d) and (e), give the information requested.
a. AB/ab
b. aBc/abC
c. DFg/DFG
d. the gametes produced by an organism with the genotype Rt/rT
e. progeny of the cross Rt/rT × rt/rt

2. In a diploid species of plant, the genes for plant height and fruit shape are syntenic and separated by 18 m.u. Allele D produces tall plants and is dominant to d for short plants, and allele R produces round fruit and is dominant to r for oval fruit.
a. A plant with the genotype DR/dr produces gametes. Identify gamete genotypes, label parental and recombinant gametes, and give the frequency of each gamete genotype.
b. Give the same information for a plant with the genotype Dr/dR.

3. A pure-breeding tall plant producing oval fruit as described in Problem 2 is crossed to a pure-breeding short plant producing round fruit.
a. The F1 are crossed to short plants producing oval fruit. What are the expected proportions of progeny phenotypes?
b. If the F1 identified in part (a) are crossed to one another, what proportion of the F2 are expected to be short and produce round fruit? What proportion are expected to be tall and produce round fruit?
4. Genes E and H are syntenic in an experimental organism with the genotype EH/eh. Assume that during each meiosis, one crossover occurs between these genes. No homologous chromosomes escape crossover, and none undergo double crossover. Are genes E and H genetically linked? Why or why not? What is the proportion of parental gametes produced by meiosis?

5. In tomato plants, purple leaf color is controlled by a dominant allele A, and green leaf by a recessive allele a. At another locus, hairy leaf H is dominant to hairless leaf h. The genes for leaf color and leaf texture are separated by 16 m.u. on chromosome 5. On chromosome 4, a gene controlling leaf shape has two alleles: a dominant allele C that produces cut-leaf shape and a recessive allele c that produces potato-shaped leaf.
a. The cross of a purple, hairy, cut plant heterozygous at each gene to a green, hairless, potato plant produces the following progeny:

Phenotype                   Frequency (%)
Purple, hairy, cut          21
Purple, hairy, potato       21
Green, hairless, cut        21
Green, hairless, potato     21
Purple, hairless, cut       4
Purple, hairless, potato    4
Green, hairy, cut           4
Green, hairy, potato        4
Total                       100

Give the genotypes of parental and progeny plants in this experiment.
b. Fully explain the number and frequency of each phenotype class.

d. Explain how each of the predicted progeny classes is produced.

7. Genes A, B, and C are linked on a chromosome and found in the order A-B-C. Genes A and B recombine with a frequency of 8%, and genes B and C recombine at a frequency of 24%. For the cross a⁺b⁺c/abc⁺ × abc/abc, predict the frequency of progeny genotypes. Assume interference is zero.

8. Gene G recombines with gene T at a frequency of 7%, and gene G recombines with gene R at a frequency of 4%.
a. Draw two possible genetic maps for these three genes, and identify the recombination frequencies predicted for each map.
b. Assuming that organisms with any desired genotype are available, propose a genetic cross whose result could be used to determine which of the proposed genetic maps is correct.

9. Genes A, B, C, D, and E are linked on a chromosome and occur in the order given.
a. The test cross Ae/aE × ae/ae indicates the genes recombine with a frequency of 28%. If 1000 progeny are produced by this test cross, determine the number of progeny in each outcome class.
b. Previous genetic linkage crosses have determined that recombination frequencies are 6% for genes A and B, 4% for genes B and C, 10% for genes C and D, and 11% for genes D and E. The sum of these frequencies between genes A and E is 31%. Why does the recombination distance between these genes as determined by adding the intervals between adjacent linked genes differ from the distance determined by the test cross?

10. Syntenic genes can assort independently. Explain this observation.

11. Define linkage disequilibrium. What is the physical basis of linkage, and what causes linkage equilibrium? Explain how crossing over eliminates linkage disequilibrium.
6. In Drosophila, the map positions of genes are given in 12. On the Drosophila X chromosome, the dominant allele y +
map units numbering from one end of a chromosome to produces gray body color, and the recessive allele y pro-
the other. The X chromosome of Drosophila is 66 m.u. duces yellow body. This gene is linked to one controlling
long. The X-linked gene for body color—with two full eye shape by a dominant allele lz + and lozenge eye
alleles, y + for gray body and y for yellow body— shape with a recessive allele lz. These genes recombine
resides at one end of the chromosome at map position with a frequency of approximately 28%. The Lz gene
0.0. A nearby locus for eye color, with alleles w + for is linked to gene F controlling bristle form, where the
red eye and w for white eye, is located at map position dominant phenotype is long bristles and the recessive one
1.5. A third X-linked gene, controlling bristle form, is forked bristles. The Lz and F genes recombine with a
with f + for normal bristles and f for forked bristles, is frequency of approximately 32%.
located at map position 56.7. At each locus the wild- a. Using any genotypes you choose, design two sepa-
type allele is dominant over the mutant allele. rate crosses, one to test recombination between genes
a. In a cross involving these three X-linked genes, do Y and Lz and the second between genes Lz and F.
you expect any gene pair(s) to show genetic linkage? Assume 1000 progeny are produced by each cross, and
Explain your reasoning. give the number of progeny in each outcome category.
b. Do you expect any of these gene pair(s) to assort inde- (In setting up your crosses, remember that Drosophila
pendently? Explain your reasoning. males do not undergo recombination.)
c. A wild-type female fruit fly with the genotype b. Can any cross reveal genetic linkage between gene Y
y +w +f/ywf + is crossed to a male fruit fly that has and gene F? Why or why not?
yellow body, white eye, and forked bristles. Predict c. Why is “independent assortment” the genetic term
the frequency of each progeny phenotype class that best describes the observations of a genetic cross
produced by this mating. between gene Y and gene F?
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
13. Researchers cross a corn plant that is pure-breeding for the dominant traits colored aleurone (C1), full kernel (Sh), and waxy endosperm (Wx) to a pure-breeding plant with the recessive traits colorless aleurone (c1), shrunken kernel (sh), and starchy (wx). The resulting F1 plants were crossed to pure-breeding colorless, shrunken, starchy plants. Counting the kernels from about 30 ears of corn yields the following data.

Kernel Phenotype                 Number
Colored, shrunken, starchy       116
Colored, full, starchy           601
Colored, full, waxy              2538
Colored, shrunken, waxy          4
Colorless, shrunken, starchy     2708
Colorless, full, starchy         2
Colorless, full, waxy            113
Colorless, shrunken, waxy        626
Total                            6708

a. Why are these data consistent with genetic linkage among the three genes?
b. Perform a chi-square test to determine if these data show significant deviation from the expected phenotype distribution.
c. What is the order of these genes in corn?
d. Calculate the recombination frequencies between the gene pairs.
e. What is the interference value for this data set?
14. Nail–patella syndrome is an autosomal disorder affecting the shape of nails on fingers and toes as well as the structure of kneecaps. The pedigree below shows the transmission of nail–patella syndrome in a family along with ABO blood type.

[Pedigree: three generations with ABO blood types. Generation I (individuals 1–2): O, A. Generation II (individuals 1–10): A, O, B, A, O, A, O, A, A, A. Generation III (individuals 1–13): A, O, A, AB, B, O, A, O, A, A, A, A, O.]

a. Is nail–patella syndrome a dominant or a recessive condition? Explain your reasoning.
b. Does this family give evidence of genetic linkage between nail–patella syndrome and ABO blood group? Why or why not?
c. Using N and n to represent alleles at the nail–patella locus and I^A, I^B, and i to represent ABO alleles, write the genotypes of I-1 and I-2 as well as their five children in generation II.
d. Explain why III-6 has nail–patella syndrome and III-8 does not. Give genotypes for these two individuals.
e. Explain why III-11 has nail–patella syndrome and III-12 does not. Give genotypes for these two individuals.
15. Three dominant traits of corn seedlings, tunicate seed (T-), glossy appearance (G-), and liguled stem (L-), are studied along with their recessive counterparts, nontunicate (tt), nonglossy (gg), and liguleless (ll). A trihybrid plant with the three dominant traits is crossed to a nontunicate, nonglossy, liguleless plant. Kernels on ears of progeny plants are scored for the traits, with the following results:

Phenotype                             Number
Tunicate, glossy, liguled             102
Tunicate, glossy, liguleless          106
Tunicate, nonglossy, liguled          18
Tunicate, nonglossy, liguleless       20
Nontunicate, glossy, liguled          22
Nontunicate, glossy, liguleless       23
Nontunicate, nonglossy, liguled       99
Nontunicate, nonglossy, liguleless    110
Total                                 500

a. Is there evidence of genetic linkage among any of these gene pairs? If so, identify the evidence.
b. Is there evidence of independent assortment among any of these gene pairs? If so, identify the evidence.
c. Using the gene symbols given above, write the genotypes of F1 and F2 plants.
d. If evidence of linkage is present, calculate the recombination frequency or frequencies from the data presented.
e. Could all three genes be carried on the same chromosome? Discuss why or why not.
16. In a diploid plant species, an F1 with the genotype Gg Ll Tt is test-crossed to a pure-breeding recessive plant with the genotype gg ll tt. The offspring genotypes are as follows:

Genotype     Number
Gg Ll Tt     621
Gg Ll tt     3
Gg ll Tt     64
Gg ll tt     109
gg Ll Tt     103
gg Ll tt     67
gg ll Tt     7
gg ll tt     626
Total        1600

a. What is the order of these three linked genes?
b. Calculate the recombination frequency between each pair of genes.
c. Why is the recombination frequency for the outside pair of genes not equal to the sum of recombination frequencies between the adjacent gene pairs?
d. What is the interference value for this data set?
e. Explain the meaning of this I value.
17. The table given here lists the arrangement of alleles of linked genes in dihybrid organisms, the recombination frequency between the genes, and specific gamete genotypes. Using the information provided, determine the expected frequency of the listed gametes. Assume one map unit equals 1% recombination and, when three genes are involved, interference is zero.

     Dihybrid     Recombination     Gamete
     Genotype     Frequency         Genotype
A.   DE/de        8%                De
B.   AD/ad        28%               ad
C.   DEF/def      E–F 24%           dEf
                  D–E 8%
D.   BdE/bDe      B–D 18%           Bde

[Figure: plot of lod score (vertical axis, 0 to –3 shown) against θ value (horizontal axis, 0.05 to 0.50).]

a. From these data, can you conclude that Rh and elliptocytosis loci are genetically linked in this family? Why or why not?
b. What is Zmax for this family?
c. Over what range of θ do lod scores indicate significant evidence in favor of genetic linkage?
19. Genetic linkage mapping for a large number of families identifies 4% recombination between the genes for Rh blood type and elliptocytosis (see Problem 18). At the Rh locus, alleles R and r control Rh⁺ and Rh⁻ blood types. Allele E producing elliptocytosis is dominant to the wild-type recessive allele e. Tom and Terri each have elliptocytosis, and each is Rh⁺. Tom’s mother has elliptocytosis and is Rh⁻ while his father is healthy and has Rh⁺. Terri’s father is Rh⁺ and has elliptocytosis; Terri’s mother is Rh⁻ and is healthy.
a. What is the probability that the first child of Tom and Terri will be Rh⁻ and have elliptocytosis?
b. What is the probability that a child of Tom and Terri who is Rh⁺ will have elliptocytosis?
20. A group of families in which an autosomal dominant condition is present are studied to determine lod scores for possible genetic linkage between three RFLP markers (R1, R2, and R3) and the disease gene. The chart shows lod scores at each of the recombination distances (θ values) tested. Provide an interpretation of the lod score results for each RFLP. Be specific about any significant evidence of genetic linkage.

RFLP     θ values
         0.05   0.10   0.15   0.20   0.25   0.30   0.35   0.40
R1       0.5    0.8    1.8    2.2    1.9    0.7    0.2    0.1

and show that the results are significantly different from the expectation under the assumption of independent assortment.
23. A wild-type trihybrid soybean plant is crossed to a pure-breeding soybean plant with the recessive phenotypes pale leaf (l), oval seed (r), and short height (t). The results of the three-point test cross are shown below. Traits not listed are wild type.

Phenotype             Number
Pale                  648
Pale, oval            64
Pale, short           10
Pale, oval, short     102
Oval                  6
Oval, short           618
Short                 84
Wild type             98
Total                 1630
a. What are the alleles on each homologous chromosome of the parental wild-type trihybrid soybean plant? Place the alleles in their correct gene order. Use L, R, and T to represent dominant alleles and l, r, and t for recessive alleles.
b. Calculate the recombination frequencies between the adjacent genes.
c. Calculate the interference value for these data.
24. The boss in your laboratory has just heard of a proposal by another laboratory that genes for eye color and the length of body bristles may be linked in Drosophila. Your lab has numerous pure-breeding stocks of Drosophila that could be used to verify or refute genetic linkage. In Drosophila, red eyes (c⁺) are dominant to brown eyes (c), and long bristles (d⁺) are dominant to short bristles (d). Your lab boss asks you to design an experiment to test the genetic linkage of eye color and bristle-length genes, and to begin by crossing a pure-breeding line homozygous for red eyes and short bristles to a pure-breeding line that has brown eyes and long bristles.
a. Give the genotypes of the pure-breeding parental flies, and the genotype(s) and phenotype(s) of the F1 progeny they produce.
b. In your experimental design, what are the genotype and phenotype of the line you propose to cross to the F1 to obtain the most useful information about genetic linkage between the eye color and bristle-length genes? Explain why you make this choice.
c. Assume the eye color and bristle-length genes are separated by 28 m.u. What are the approximate frequencies of phenotypes expected from the cross you proposed in part (b)?
d. How would the results of the cross differ if the genes are not linked?
25. In rabbits, chocolate-colored fur (w⁺) is dominant to white fur (w), straight fur (c⁺) is dominant to curly fur (c), and long ear (s⁺) is dominant to short ear (s). The cross of a trihybrid rabbit with straight, chocolate-colored fur and long ears to a rabbit that has white, curly fur and short ears produces the following results:

Phenotype                      Number
White, short, straight         13
Chocolate, long, straight      165
Chocolate, long, curly         13
White, long, straight          82
Chocolate, short, straight     436
Chocolate, short, curly        79
White, short, curly            162
White, long, curly             450
Total                          1400

a. Determine the order of the genes on the chromosome, and identify the alleles that are present on each of the homologous chromosomes in the trihybrid rabbits.
b. Calculate the recombination frequencies between each of the adjacent pairs of genes.
c. Determine the interference value for this cross.
26. The following progeny are obtained from a test cross of a trihybrid wild-type plant to a plant with the recessive phenotypes compound leaves (c), intercalary leaflets (i), and green fruits (g). (Traits not listed are wild type.) The test-cross progeny are as follows:

Phenotype                                              Number
Compound leaves                                        324
Compound leaves, intercalary leaflets                  32
Compound leaves, green fruits                          5
Compound leaves, intercalary leaflets, green fruits    51
Intercalary leaflets                                   3
Intercalary leaflets, green fruits                     309
Green fruits                                           42
Wild type                                              49
Total                                                  815

a. Determine the order of the three genes, and construct a genetic map that identifies the correct order and the alleles carried on each chromosome in the trihybrid parental plant.
b. Calculate the frequencies of recombination between the adjacent genes in the map.
c. How many double-crossover progeny are expected among the test-cross progeny? Calculate the interference for this cross.
27. In tomatoes, the allele T for tall plant height is dominant to dwarf allele t, the P allele for smooth skin is dominant to the p allele for peach fuzz skin, and the allele R for round fruit is dominant to the recessive r allele for oblong fruit. The genes controlling these traits are linked on chromosome 1 in the tomato genome, and the genes are arranged in the order and with the recombination frequencies shown.

Gene:                        T          P          R
Recombination frequency:         0.04       0.18

a. A pure-breeding tall, peach fuzz, round plant is crossed to a pure-breeding plant that is dwarf, smooth, oblong. What are the gamete genotypes produced by each of these plants?
b. What are the genotype and phenotype of the F1 progeny of this cross?
c. What are the genotypes of gametes produced by the F1, and what is the predicted frequency of each gamete?
d. The F1 are test-crossed to dwarf, peach fuzz, oblong plants, and 1000 test-cross progeny are produced. What are the phenotypes of test-cross progeny, and what number of progeny is expected in each class?
28. Neurofibromatosis 1 (NF1) is an autosomal dominant disorder inherited on human chromosome 17. Part of the analysis mapping the NF1 gene to chromosome 17 came from genetic linkage studies testing segregation of NF1 and DNA genetic markers on various chromosomes.
Wx) were mated, and their F1 were test-crossed to colorless, waxy plants. The test-cross progeny were as follows:

Phenotype              Number
Colored, waxy          340
Colored, starchy       115
Colorless, waxy        92
Colorless, starchy     298
Total                  845

a. For each set of test-cross progeny, determine whether genetic linkage or independent assortment is more strongly supported by the data. Explain the rationale for your answer.
b. Calculate the recombination frequency for each of the progeny groups.
c. Taken together, are the results of these two experiments compatible with the hypothesis of genetic linkage? Explain why or why not.
d. Merge the two sets of progeny data and determine the combined recombination frequency.
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.

[Pedigree: generation I (individuals 1–2), generation II (individuals 1–10), generation III (individuals 1–6), with marker rows R1, R2, R3, and R4.]

36. Divide a clean sheet of paper into four quadrants and draw one pair of homologous chromosomes in each quadrant. Draw the chromosomes with two sister chromatids each. The four sets of homologous pairs are identical. Label one chromosome of each pair with alleles A1 and B1 and the other member of each pair with the alleles A2 and B2. You are to illustrate a single crossover between the homologs in each quadrant, and list the parental and recombinant chromosomes, but you are to illustrate four different ways the crossover can occur by involving different chromatids in each illustration.
37. For six genes known to be linked on chromosome 10 of corn (Zea mays), the recombination frequencies between various pairs have been determined in a series of genetic crosses. Use the recombination frequency data in the table below to determine the order of and distance between the genes on a genetic map. The gene lc1 is known to be closest to the telomere of the chromosome.

GENE du1 mgs1 ms10 tp2 wsm3 lc1
du1 7 19 41
mgs1 5 34
ms10 14 19
tp2 12 7 12 22
wsm3 31 24
lc1 29 10
6 Genetic Analysis and Mapping in Bacteria and Bacteriophages

CHAPTER OUTLINE
6.1 Specialized Methods Are Used for Genetic Analysis of Bacteria
6.2 Bacteria Transfer Genes by Conjugation
6.3 Bacterial Transformation Produces Genetic Recombination
6.4 Bacterial Transduction Is Mediated by Bacteriophages
6.5 Bacteriophage Chromosomes Are Mapped by Fine-Structure Analysis
6.6 Lateral Gene Transfer Alters Genomes
ESSENTIAL IDEAS
❚❚ Bacteria are propagated in liquid growth media or on semisolid growth plates.
❚❚ Bacterial genotypes are identified by ability to grow on plates containing various compounds.
❚❚ Bacterial conjugation is a one-way transfer of genetic material from a donor cell to a recipient cell. Three types of donor cells can conjugate with recipient cells to transfer DNA.
❚❚ Donor bacterial genetic maps are derived from conjugation analysis.
❚❚ A particular type of bacterial conjugation can produce bacteria with genomes that are partially diploid.
❚❚ Transformation is the absorption of extracellular DNA across the cell wall and membrane of a recipient bacterial cell, and its analysis leads to mapping of donor bacterial genes.
❚❚ Transduction, mediated by bacteriophages, transfers DNA from a donor bacterial cell to a recipient cell, and its analysis leads to mapping of donor bacterial genes.
❚❚ Fine-structure genetic analysis of a bacteriophage genome demonstrated that DNA nucleotide base pairs are the fundamental units of mutation and recombination.
❚❚ Lateral gene transfer is a prevalent mechanism for the exchange of genes among bacteria and for the evolution of genomes.

Bacterial conjugation is a process by which genetic material is transferred from one bacterial cell (the donor) to another bacterial cell (the recipient) by way of a hair-like pilus shown in the center of the photo.

Here’s a surprising little secret of human life: Your body contains approximately 100 trillion cells, but only about 10 trillion of them are yours! The other 90% of the cells you carry around are bacteria, fungi, and other forms of microscopic life. Many of these biological hitchhikers perform useful, even essential, functions. For example, you carry hundreds of species of bacteria in your gut that collectively have a mass of more than 3 pounds. Without these intestinal bacteria, your digestion of carbohydrates would be impaired, and your ability to manufacture essential nutrients such as vitamin B12 and vitamin K would be disabled. The bacteria teeming in your digestive tract also help keep potentially harmful bacteria at bay by vigorously competing for available nutrients. Similarly,
the millions of bacteria that currently reside on your skin (yes, even though you showered recently!) help keep your skin healthy by competing with infectious bacteria. Despite this normal and healthy competition, harmful bacteria can gain access to our bodies. Occasionally even our normally helpful microbial passengers turn against us and cause illness, infection, or, in extreme cases, death.

Given the biological, medical, and technological importance of bacteria and other microorganisms, it is no wonder they are studied intensively in modern genetics, using the bacterium Escherichia coli and the yeast Saccharomyces cerevisiae as model genetic organisms. The relative ease of studying microorganisms fueled revolutionary change in genetics in the latter half of the 20th century. Much of the initial knowledge of molecular genetics and many of the methods of genetic analysis were acquired in the study of bacteria and have proven valuable in the study of more complex organisms.

In this chapter, we investigate how genetic analysis is applied to the study of gene transfer and mapping in bacterial and bacteriophage genomes. We take a historical genetic approach in our discussion, focusing on the applications of genetic analysis that were used to map genes in bacterial genomes in the decades before genome sequencing was developed. Genome sequences of thousands of bacterial species are now published, and their analysis verifies the accuracy and validity of the conclusions reached through use of the approaches described in this chapter.

6.1 Specialized Methods Are Used for Genetic Analysis of Bacteria

Bacteria are a highly diverse taxonomic group essential for genetic study. Among the features that make bacteria so useful to geneticists are the following:

❚❚ Relative genomic simplicity. Most bacterial genomes contain fewer genes and fewer base pairs in their haploid genomes than do the genomes of eukaryotes, making bacterial genomes less complex by comparison.
❚❚ Haploid genomes. The haploid genomes of most bacteria allow all mutations to be observed directly, without interference from dominance interactions between alleles.
❚❚ Short generation times. Bacteria can reproduce rapidly, with generation times measured in minutes. Rapid doubling of the number of bacterial cells can produce millions of cells from a few dozen original cells within hours.
❚❚ Large numbers of progeny. Enormous numbers of clonal progeny can be examined, increasing the likelihood that statistically rare events will be observed.
❚❚ Ease of propagation. Microbes may be grown either in liquid culture or on culture plates. The cultures are easy and inexpensive to maintain, and they require little laboratory space.
❚❚ Numerous heritable differences. Mutants are easily created, identified, isolated, and manipulated for examination.

The techniques used to study bacteria are essentially the same as those used to study all single-celled organisms, whether bacteria, archaea, yeast, or fungi. We briefly outline these methods in this section and introduce essential terminology for discussing them.

Bacterial Culture and Growth Analysis

Bacteria are haploid organisms that have one copy of each gene. These genes are usually carried on a single bacterial chromosome. A few bacterial species have their genome divided into more than one chromosome, but no bacteria (or archaea, for that matter) have homologous chromosome pairs.

Bacteria propagate by binary fission, a process in which the bacterial chromosome replicates and a copy is distributed to each of the progeny cells. During rapid growth, each fission cycle lasts 20 to 30 minutes and more than one copy of the chromosome may be present. Bacterial fissioning is clonal, meaning that the two daughter cells of an original bacterial cell are genetically identical to one another and to the original cell. In a matter of hours, this growth can generate a bacterial colony, a cluster of millions of bacterial cells all derived from a single cell.

Bacteria can be grown in either a liquid growth medium or on a growth plate containing a semisolid growth medium (Figure 6.1). Both kinds of growth media contain the same nutritional ingredients. The difference is that the medium in growth plates contains agar that congeals when cooled.

[Figure 6.1 illustration. (a) Bacterial loop; step 2: incubate tube. (b) Pipette; step 1: add a small amount of a concentrated bacterial solution to liquid growth medium to make a dilute bacterial solution; step 2: place a few drops of dilute bacteria on a growth plate; step 3: spread the dilute solution evenly on the growth plate; step 4: incubate plate and observe bacterial colonies.]

Figure 6.1 Bacterial growth methods. (a) Bacteria can be grown in a liquid medium inoculated with cells from another culture. In liquid, dense growth occurs, making the medium appear cloudy. (b) Bacteria can be grown on a plate of semisolid growth medium on which a few drops of a dilute bacterial-cell solution have been spread. On the semisolid medium, bacteria grow as colonies.

Because they are haploid organisms with only one copy of each gene, wild-type bacteria rely on the normal functioning of all their genes that are essential for growth. With these genes functioning properly, the bacteria are able to synthesize all the compounds they require from elements and compounds in their growth environment. The most important of these are a carbon source (usually the simple sugar glucose) and sources of nitrogen and certain other elements (usually supplied in inorganic salts). They also need water, which is present in the growth medium, and oxygen, which is readily available from the atmosphere. Glucose is the raw material that supplies
the important energy-producing process known as glycolysis that operates in most organisms, including humans.

A minimal medium is one containing glucose as the sugar source along with a nitrogen source, some inorganic material, and water. Wild-type cells of many bacterial species are able to grow in minimal medium and are called prototrophs, or prototrophic strains. Prototrophic bacteria produce all the compounds required for their metabolism, growth, and reproduction using the energy provided by glycolysis. Another way of saying this is that prototrophs do not carry any mutations that block their ability to produce a compound that is required for growth. For this reason, a prototrophic strain is defined by its ability to grow in a minimal medium.

Bacteria that are mutant for one or more genes lack the ability to produce an essential compound or perform a required growth function. These bacteria are unable to grow in a minimal medium. Mutant bacteria are called auxotrophs or auxotrophic strains. Auxotrophic species also include many bacterial species with more complex growth requirements, such as those living symbiotically with other organisms. These auxotrophs lack some of the genes required to grow on minimal medium and instead obtain essential nutrients from their hosts.

Auxotrophs are able to grow on a complete medium. This is a medium containing glucose and a nitrogen source along with all the other compounds required for growth and reproduction, such as amino acids and DNA and RNA nucleotides. An auxotroph will also be able to grow on the right supplemented minimal medium. This is a minimal medium to which has been added the specific compound the auxotrophic strain is unable to produce on its own. Say, for example, that an auxotrophic bacterial strain is unable to synthesize the amino acid leucine. Such a strain is designated as being leu⁻ (spoken “leucine minus”). If all the other essential compounds can be produced by this strain, then supplementing a minimal medium with leucine will permit the growth of a leu⁻ strain on that medium.

Certain bacterial strains are able to grow in a growth medium that does not contain glucose but rather a sugar that is more complex than glucose, or a sugar that requires
metabolism to generate glucose. These sugars are alternatives to glucose. Lactose is one example of an alternative sugar. As we discuss in more detail later in the chapter, lactose is broken down into the sugars glucose and galactose. Glucose is used to drive energy production through glycolysis, and once galactose is broken down, it too drives glycolysis.

The ability of a bacterial strain to grow in a medium containing lactose is tested by preparing a medium that contains lactose instead of glucose. Strains that grow in a lactose-containing medium are designated lac⁺ (spoken “lack plus”). Such strains are prototrophic, as they can grow on minimal medium and they do not carry mutations of any required genes. A prototrophic bacterial strain that does not grow in a lactose-containing medium is designated lac⁻ (“lack minus”).

Complete, minimal, supplemented minimal media, and media prepared using an alternative sugar are instrumental in bacterial genetic analysis, where a frequent goal is to determine the genotypes of strains by observing whether they grow or fail to grow on various media. An important technique in these investigations is replica plating, a simple process of transferring some cells from each of the bacterial colonies on an original growth plate to one or more other growth plates. Figure 6.2 illustrates an example of replica plating in which two auxotrophs are identified by their growth on the original plate with complete medium but their absence from the replica minimal medium plate where only prototrophs will grow. A key feature of replica plating is that it transfers bacterial colonies from the original growth plate to the new growth plate in the same relative positions. This allows direct comparisons of growth between plates to assess the ability or inability of each particular colony to grow on a plate with a specific medium. Research Technique 6.1 introduces you to the interpretation of microbial growth results to discover bacterial genotypes. You can also review Experimental Insight 4.1 for a related discussion of the identification of bacterial genotypes.

Characteristics of Bacterial Genomes

Bacterial genomes are highly variable in size, ranging from several hundred thousand base pairs to several million base pairs. The number of genes encoded by bacterial chromosomes also varies, ranging from a couple of hundred to several thousand. Bacterial genomes are usually composed of a single chromosome that is a covalently closed circular structure. This chromosome is called the bacterial chromosome, and it carries genes that are essential to the species’ metabolic and growth activities (Figure 6.3). Bacterial chromosomes are also characterized by having a high proportion of the DNA sequence of bacterial genomes coding for proteins.

The bacterial species Escherichia coli is one of a handful of so-called model genetic organisms that are so designated because their biology, reproduction, metabolism, and genetics are well characterized and suited to scientific research. These model genetic organisms have been used in biology and genetics experiments for decades. The E. coli genome is typical of the most common characteristics of bacterial genomes. It contains a single, circular chromosome, more than 90% of which encodes proteins.
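
The growth rules described in this section can be summarized in a few lines of code: a strain grows only if it can use the sugar in the medium and if the medium supplies every compound the strain cannot make. The Python sketch below is illustrative only. The class names `Strain` and `Medium`, the function `grows`, and the example strains and media are hypothetical names introduced here, and the model assumes that a complete medium supplies every required compound and that a lac⁻ strain cannot use lactose as its carbon source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    name: str
    # Compounds the strain cannot synthesize (empty for a prototroph).
    cannot_make: frozenset = frozenset()
    # Sugars the strain can use as a carbon source (assumed model).
    usable_sugars: frozenset = frozenset({"glucose", "lactose"})


@dataclass(frozen=True)
class Medium:
    name: str
    sugar: str = "glucose"
    # Compounds added to an otherwise minimal medium.
    supplements: frozenset = frozenset()
    # Assumption: a complete medium supplies every required compound.
    complete: bool = False


def grows(strain: Strain, medium: Medium) -> bool:
    """Predict growth under the simplified rules stated in the lead-in."""
    if medium.sugar not in strain.usable_sugars:
        return False
    if medium.complete:
        return True
    # Every compound the strain cannot make must be supplied.
    return strain.cannot_make <= medium.supplements


# Hypothetical example strains and media.
wild_type = Strain("wild type")
leu_minus = Strain("leu-", cannot_make=frozenset({"leucine"}))
lac_minus = Strain("lac-", usable_sugars=frozenset({"glucose"}))

minimal = Medium("minimal")
minimal_leu = Medium("minimal + leucine", supplements=frozenset({"leucine"}))
complete = Medium("complete", complete=True)
lactose_minimal = Medium("minimal with lactose", sugar="lactose")

for strain in (wild_type, leu_minus, lac_minus):
    for medium in (minimal, minimal_leu, complete, lactose_minimal):
        result = "grows" if grows(strain, medium) else "no growth"
        print(f"{strain.name:10s} on {medium.name:22s}: {result}")
```

Read in the other direction, the same table of growth and no-growth results is what allows a genotype to be inferred from observed growth on a panel of media.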
[Figure 6.2 illustration. Labels: block, sterile velvet. Step 1: stamp sterile velvet on the original complete medium plate to transfer cells from bacterial colonies. Step 2: stamp velvet onto minimal medium to make a replica plate; incubate plate. Step 3: compare replica and original plates to identify auxotrophs that do not grow on minimal medium.]

Figure 6.2 Replica plating. Sterile velvet is used as a stamp that is first pressed upon the colonies on the original, complete-medium plate and then pressed onto a new, minimal-medium plate, transferring cells from all the colonies of the original plate to the new plate. After an interval to allow continued growth, the original and replica plates are compared. The absence of growth of a colony on the replica plate indicates auxotrophy.

Q Consider a growth plate containing complete medium with 200 growing colonies on it. If you wanted to determine which of these colonies were auxotrophic, what would you do? How does replica plating make this task easier?
6.1 Specialized Methods Are Used for Genetic Analysis of Bacteria 189
Genotyping Using Microbial Growth

The results of the experiments on microbes described in this chapter have shaped our understanding of how genes work, including how they are organized and how they are expressed. A basic set of common laboratory techniques and analyses assessing growth or failure to grow in liquid or semisolid media made up of different components can be used to determine the genetic makeup of microorganisms. Proper interpretation of the genotype of a microbe based on its pattern of growth on different media is an essential skill of genetic analysis that is easy to master once you understand a few key concepts.

ANABOLIC AND CATABOLIC PATHWAYS Compounds that influence the growth of microbes on growth media fall into two broad categories. In the first are compounds synthesized by prototrophic (wild-type) microbes in biosynthetic pathways that are often described as anabolic pathways. In anabolic pathways, energy is used to synthesize complex compounds from simpler ones through sequential reaction steps. Figure 4.17 and the accompanying discussion of the anabolic pathway that synthesizes the amino acid methionine (pages 123–124) provide an example. In contrast, catabolic pathways are pathways through which energy is produced by the breakdown of complex compounds into simpler ones. Catabolic pathways also follow sequential steps. Our discussion of phenylketonuria (PKU; pages 122–123) highlights the catabolic pathway that breaks down the amino acid phenylalanine. Similarly, sugars such as the disaccharide lactose and other carbohydrates are broken down in catabolic pathways.

VISUALIZING MICROBIAL GROWTH When microbial growth occurs on a semisolid growth plate in a petri dish, individual colonies may appear on the plate. Each colony is actually hundreds of thousands to millions of individual microbes that are all descended from a single microbial cell among those originally spread on the plate in a very dilute solution. Depending on microbe genotypes and the composition of the growth medium, it is possible that more than one microbial genotype is growing on a particular plate. In addition, although each bacterial colony on a growth plate consists of cells with virtual genetic identity, a colony of millions of cells can be expected to contain some cells with mutations. In a liquid growth medium, microbial growth produces cloudiness, the result of the presence of so many living cells in the growth vessel that they impede the passage of light through the medium. There are no colonies in liquid media.

Identifying the genotype of a microbe often requires assessing the growth of a particular colony on different growth media. This is accomplished by the replica plating technique described in Figure 6.2. An alternative method of replica plating is to simply touch a colony growing on one growth medium with a sterile toothpick or a similar instrument to gather some cells of the colony and then touch a spot on a different growth plate. Systematic use of a grid pattern on the new plate and care in the recording of growth results permit comparison of growth results on these different plates so as to identify colony genotypes.

ALLELIC IDENTIFICATION Distinguishing between compounds produced by anabolic pathways (anabolism builds compounds from elemental building blocks) and those broken down in catabolic pathways (catabolism breaks down compounds into elements) is a critical aspect of interpreting microbial growth and identifying microbial genotype that requires knowledge of growth media and their constituents.

In a convention you saw employed in Experimental Insight 4.1, the ability to synthesize an essential compound by completion of an anabolic pathway is indicated in genetic notation by a superscript + (plus) symbol and identifies a wild-type allele; thus, a microbe capable of biosynthesizing the amino acid methionine is identified as met+ (spoken “met plus”). In contrast, the superscript − (minus) symbol indicates the organism is an auxotroph (mutant) that is unable to synthesize a particular compound due to mutation. The control prototroph shown in Figure 4.18 (page 127) is met+, whereas the four other strains are each met−.

The convention is similar for catabolic pathways: allelic symbols identify the ability of a strain to complete a catabolic pathway with a superscript + and the inability to complete a catabolic pathway with the superscript − symbol. For example, microbes that are able to grow on a medium containing the milk sugar lactose instead of glucose are lac+. The ability to grow on lactose requires production of the enzymes that break lactose down into simpler compounds. In contrast, microbes that are unable to grow on lactose-containing media are lac−. These strains are unable to produce one or more of the enzymes required for lactose metabolism.

The accompanying figure guides you through the identification of prototrophs and auxotrophs for the amino acids alanine (ala) and proline (pro) among 10 microbial colonies and also for the ability of the colonies to break down lactose. Genotype identification is accomplished by comparing growth on plates of media containing different constituents. The accompanying table summarizes the genotype of each colony and the reasoning used to identify the genotype.
190 CHAPTER 6 Genetic Analysis and Mapping in Bacteria and Bacteriophages
(a) Replica plating from complete medium to minimal medium.
Complete medium: colonies 1 through 10 grow.
Minimal medium: colonies 1, 4, 5, 7, 9, and 10 grow.
Compare complete and minimal medium plates. Conclusion: colonies 1, 4, 5, 7, 9, and 10 are prototrophs, and colonies 2, 3, 6, and 8 are auxotrophs.

(b) Replica plating to supplemented minimal media. Compare each plate to the minimal medium plate.
Minimal plus alanine (Ala): colonies 1, 3, 4, 5, 7, 9, and 10 grow. Conclusion: colony 3 is ala−.
Minimal plus proline (Pro): colonies 1, 4, 5, 6, 7, 9, and 10 grow. Conclusion: colony 6 is pro−.
Minimal plus alanine and proline: colonies 1, 2, 3, 4, 5, 6, 7, 9, and 10 grow. Conclusion: colony 2 is ala−, pro−.
Comparing the results of the three supplemented minimal media to minimal medium identifies colony 8 as an auxotroph with an unknown genotype.

Colony | Genotype | Reasoning
1, 5, 7, and 9 | ala+ pro+ lac+ | These are prototrophs: they grow on minimal (glucose-containing) medium. Also grow on lactose (lactose-containing) medium.
2 | ala− pro− lac+ | Auxotroph: does not grow on minimal medium. Grows on minimal medium supplemented with both alanine and proline. Also grows on lactose medium supplemented with alanine and proline.
3 | ala− pro+ lac− | Auxotroph: does not grow on minimal medium. Grows on minimal medium supplemented with alanine. Does not grow on lactose medium supplemented with alanine and proline.
4 and 10 | ala+ pro+ lac− | Prototroph: grows on minimal medium. Does not grow on lactose medium.
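The plate-by-plate reasoning used for the alanine and proline markers can be sketched in a few lines of Python. This is only an illustrative sketch: the colony sets are the growth results as read from the figure panels, and the names `PLATES` and `classify` are hypothetical helpers introduced here, not part of the text. Lactose utilization is left out because the lactose plates are not shown in the figure.

```python
# Colonies that grew on each plate, as read from the figure (colony numbers 1-10).
PLATES = {
    "complete": set(range(1, 11)),
    "minimal": {1, 4, 5, 7, 9, 10},
    "minimal+ala": {1, 3, 4, 5, 7, 9, 10},
    "minimal+pro": {1, 4, 5, 6, 7, 9, 10},
    "minimal+ala+pro": {1, 2, 3, 4, 5, 6, 7, 9, 10},
}


def classify(colony, plates=PLATES):
    """Return the ala/pro genotype implied by a colony's growth pattern."""
    if colony in plates["minimal"]:
        return "ala+ pro+"  # prototroph: needs no supplement
    if colony in plates["minimal+ala"]:
        return "ala- pro+"  # rescued by alanine alone
    if colony in plates["minimal+pro"]:
        return "ala+ pro-"  # rescued by proline alone
    if colony in plates["minimal+ala+pro"]:
        return "ala- pro-"  # rescued only when both are supplied
    return "auxotroph, requirement unknown"


genotypes = {colony: classify(colony) for colony in sorted(PLATES["complete"])}
for colony, genotype in genotypes.items():
    print(colony, genotype)
```

The order of the tests matters: a colony is assigned the simplest requirement that explains its growth, which mirrors comparing each supplemented plate against the minimal medium plate.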
are generally unable to replicate on their own because their DNA replication is tied to that of the bacterial chromosome. These plasmids are present in one or two copies per bacterial cell.
Figure 6.4 Gene-transfer processes in bacteria. (a) Conjugation. A single DNA strand transferred dur-
ing DNA replication in the donor is used to replicate a second strand in the recipient. Subsequent crossing
over recombines DNA to form the exconjugant cell. (b) Transformation. A single strand of donor DNA taken
across the membrane of the recipient cell recombines with recipient DNA to form the transformant cell.
(c) Transduction. A donor DNA fragment encapsulated in a transducing phage is injected into the recipient
cell, where it recombines to form the transductant cell.
6.2 Bacteria Transfer Genes by Conjugation 193
donor in this process) can be inserted into a new bacterial cell (the recipient) where it can recombine into the recipient chromosome. Transduction is illustrated in Figure 6.4c, and it is discussed in the following section.

Each of these processes is an example of lateral gene transfer, a nonreproductive process through which bacteria and archaea actively exchange genetic material. Lateral gene transfer also takes place between bacteria and eukaryotes. The impact of these events on genomes and on the evolution of life are topics for discussion later in this chapter.

Conjugation Identified

Conjugation was first identified by Joshua Lederberg and Edward Tatum in 1946. They used two triple-auxotrophic strains of E. coli that had different nutritional requirements for growth. The researchers first established three separate bacterial cultures growing, initially, in a complete medium (Figure 6.5). In culture 1, they grew an auxotrophic strain called Y-24, which has the genotype met− bio− leu+ cys− phe− thr+ thi+. Because of its genotype, the Y-24 strain requires addition of the vitamin biotin (bio) and the amino acids methionine (met), cysteine (cys) and phenylalanine (phe) to a minimal medium for growth. In culture 2, they placed an auxotrophic strain called Y-10, which has the genotype met+ bio+ leu− cys+ phe+ thr− thi−. The Y-10 strain requires addition of the vitamin thiamine (thi) and the amino acids leucine (leu) and threonine (thr) for growth. Culture 3 contained an equal mixture of both Y-10 and Y-24.

Each culture was allowed to grow, and cells from each culture were plated on minimal medium. Lederberg and Tatum saw no growth on Plates 1 and 2, which contained cells transferred from culture 1 and culture 2, respectively. These results were consistent with the nutritional requirements of Y-24 and Y-10 and indicated that all the cells transferred to those plates were auxotrophs. Plate 3, however, developed about 100 growing colonies! These colonies grew from bacterial cells that had somehow acquired the prototrophic genotype (met+ bio+ leu+ cys+ phe+ thr+ thi+).

Figure 6.5 Lederberg and Tatum’s detection of recombination between auxotrophic E. coli cells. Auxotrophic bacterial strains 1 (Y-24) and 2 (Y-10) each contain multiple mutations and grow on complete medium but not on minimal medium. 3 Mixing the strains leads to the formation of prototrophic bacteria that grow on minimal medium.
Figure labels: Culture 1, Y-24 (met− bio− leu+ cys− phe− thr+ thi+); Culture 2, Y-10 (met+ bio+ leu− cys+ phe+ thr− thi−); Culture 3, Y-24 and Y-10. Each culture is grown in complete medium and then transferred to minimal medium (plates 1, 2, and 3).

Q Why do you think it is highly unlikely that the prototrophic colonies detected in this experiment came about by mutation?

Figure 6.6 Davis’s U-tube experiment, showing that genetic recombination requires cell-to-cell contact. Auxotrophic bacterial strains Y-10 and 58-161 are unable to grow on minimal medium but produce some prototrophs that grow on minimal medium when they make contact following mixing. Prototrophs are not produced when the auxotrophs are placed in a U-tube, indicating that direct contact is required to generate prototrophic bacteria.
Figure labels: Control experiments with a pure culture of Y-10 (thr− leu− thi− met+), a pure culture of 58-161 (thr+ leu+ thi+ met−), and a mix of Y-10 and 58-161, each transferred to minimal medium. U-tube experiment with Y-10 and 58-161 separated by a glass filter, with a cotton plug and alternating suction and pressure.

Lederberg and Tatum were certain that this outcome did not result from the reversion (reverse mutation) of auxotrophs to prototrophs. Instead of reversion, the researchers proposed the transfer of genetic information. Lederberg and Tatum hypothesized that physical contact between bacteria was necessary for gene transfer, but their original experiment did not provide direct evidence that this might be so. Four years later, Bernard Davis replicated the work and showed the necessity of contact between bacterial cells for gene transfer to take place. For his experiment, Davis constructed a U-tube with a fine glass filter separating one arm from the other (Figure 6.6). The filter was a glass disk with very small
pores that allowed passage of small molecules such as nutrients but not bacterial cells. A cotton ball plugging one end of the U-tube and a rubber stopper connected to an air line at the other allowed Davis to move the material in the tube by alternating suction and pressure. The tube contained a culture of E. coli strain Y-10 on one side of the glass disk and a culture of strain 58-161, auxotrophic for methionine synthesis (met−), on the other side of the disk, and the glass disk prevented direct contact between the two bacterial strains.

Based on Lederberg and Tatum’s experiments, Davis hypothesized that direct contact between the auxotrophic strains was needed to produce prototrophic bacteria. After alternating suction and pressure for several hours, Davis plated bacterial samples from each side of the U-tube onto minimal medium and found no growth from either side of the U-tube. This lack of growth was an indication that cells on either side of the disk retained their auxotrophy. Davis concluded that physical contact between bacterial cells is required for gene transfer to take place.

Lederberg, Tatum, and Davis were correct in their proposal that direct contact between bacteria is required for conjugation. The genetic information is conveyed by way of a hollow tube known as a conjugation pilus that physically connects donor and recipient. Conjugation is pictured in the chapter-opening photo on page 185. The conjugation pilus is the thread-like structure in the center of the photo, connecting the donor and recipient bacterial cells.

Transfer of the F Factor

In 1953, William Hayes discovered that the bacteria interacting in Lederberg and Tatum’s and in Davis’s experiments did not contribute equally to the genetic outcome as do parent organisms in a genetic cross between eukaryotes. Instead, the process was unequal, leading Hayes to conclude that a one-way transfer of genetic information takes place between donors and recipients.

Hayes further proposed that the ability to act as a donor was hereditary and was determined by a “fertility factor” (F factor) that was transferable from donors to recipients. Donors are designated as F+ (F+ cells) to indicate their possession of an F factor, and recipients are identified as F− (F− cells) and lack the F factor. In the years after Hayes proposed the existence of the F factor, microbiologists identified the F factor as the F plasmid, or fertility plasmid.

Microbiologists today know that conjugation is controlled by coupling and exporter proteins produced from genes carried on the F plasmid. As a consequence, only donor cells initiate conjugation. Recipient cells (F− cells) are unable to initiate conjugation. Furthermore, conjugation occurs between a donor cell and a recipient but not between two donor cells. F factor genes direct the construction of an exporter structure formed from coupling proteins that link the donor and recipient cells and from exporter proteins that form the bridge through which a single strand of F factor DNA will pass from the donor to the recipient. A protein complex known as the relaxosome is responsible for cutting one strand of F factor DNA. This leads to DNA replication that along with proteins moves one strand of F factor DNA into the recipient cell, where separate DNA replication forms a double-stranded F factor.

Three kinds of cells are seen in conjugation: a donor cell that contains an F plasmid and donates genetic information, a recipient cell that receives DNA from a donor cell but does not contain a functional F factor, and the exconjugant cell that is produced by conjugation. An exconjugant cell is essentially a recipient cell that has had its genetic content modified by receiving DNA from a donor cell.

The F factor of the E. coli K-12 strain is the most extensively mapped F plasmid. It consists of some 100 kb of DNA, and about 35% of its sequence is devoted to 36 genes that control conjugation and gene transfer (Figure 6.7). The F plasmid genes that play a role in E. coli conjugation are given four-letter designations consisting of the prefix tra or trb followed by a capital letter. Much of the remainder of the F factor consists of four insertion sequence (IS) elements. IS elements are DNA sequences that when shared by an F plasmid and a bacterial chromosome are locations for recombination between the two, as we discuss momentarily. The F plasmid of E. coli K-12 contains one copy of IS-2, two copies of IS-3, and one copy of Tn-1000.

Figure 6.7 F plasmid structure. (a) Selected genes important in donor–recipient cell conjugation and F factor transfer are shown along with the origin of transfer (oriT) and four insertion sequences (IS) around the 100-kb map of the F plasmid of the E. coli K-12 strain. (b) The 36-base sequence of oriT, including the cleavage site on the T strand.
Figure labels: (a) F factor map, a circular map numbered 0 to 100, marking oriT, the insertion sequences IS-2, IS-3, and Tn-1000, the genes traD, traG, traI, traK, and traA, and the protein labels relaxosome, coupling, relaxase, exporter, and pilin. (b) oriT strand sequence, base pairs 1 to 36, with the cleavage site marked:
5′ CCA GTT TCT CGA AGA AAC CGG TAA ATG CGC CCT CCC 3′

Conjugation between an F+ donor and an F− recipient transfers a copy of the F factor and produces exconjugants that are F+ donors, as illustrated in Figure 6.8, where the principal events at each step are described. The most important elements of the process are as follows:

1. DNA transfer always begins at a specialized F factor sequence called the origin of transfer (oriT). The oriT sequence directs the cleavage of one phosphodiester bond on one DNA strand, called the T (transfer) strand.
2. A protein complex composed of exporter proteins and pilin protein forms a conjugation pilus between the donor and recipient cells. The conjugation pilus contains a narrow channel that only allows passage of a single DNA strand.

3. The protein complex called the relaxosome binds at oriT and makes a single-stranded cut to the T strand. The relaxosome then partially degenerates, leaving relaxase attached to the free 5′ end of the T strand.

4. Facilitated by the action of relaxase, the T strand enters the conjugation pilus and passes into the recipient cell. Inside the recipient it is used as a template to produce a second plasmid DNA strand and thus generate a double-stranded F factor. The F− recipient is converted to an F+ donor by this process.

5. Within the donor, T strand transfer is accompanied by a specialized process of DNA replication, known as rolling circle replication, that uses the remaining strand as a template. Rolling circle replication is a specialized unidirectional process different from the more common process of bidirectional replication that we describe in Chapter 7.

Figure 6.8 Conjugation of F+ and F− cells. Rolling circle replication transfers a single strand of the F factor, beginning at oriT, from a donor cell to a recipient cell, where it is replicated to convert the recipient cell (F−) to an F+ donor.
Figure steps: The relaxosome complex binds the F factor at oriT and cleaves the T strand of the DNA. The relaxosome partially degrades, leaving relaxase bound at the 5′ end of the T strand. The relaxase–T strand complex binds to a coupling factor to prepare for export. Rolling circle DNA replication begins in the donor. The completion of replication in both cells leaves the donor (F+) unchanged and converts the recipient cell to an F+ donor state.
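Conjugating cells | Exconjugant converted to donor state? | Donor chromosomal genes transferred?
F+ × F− | Yes, F+ → F− | No
Hfr × F− | No | Yes
F′ × F− | Yes, F′ → F− | Yes

[Figure label: Recombination of bacterial chromosome and F factor at an IS element]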
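The three donor types differ in what an F− recipient ends up with, and that summary can be written as a small lookup table. The following is a minimal sketch under stated assumptions: the class `Outcome`, the dictionary `OUTCOMES`, and the function `conjugate` are hypothetical names introduced for illustration, and the entries simply restate the donor-type outcomes described in this section.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    exconjugant_state: str              # donor state of the exconjugant cell
    can_transfer_chromosomal_genes: bool  # can donor chromosomal genes be received?


# Outcomes of conjugation with an F- recipient, keyed by donor type.
OUTCOMES = {
    "F+": Outcome("F+", False),   # whole F factor transferred, no chromosomal genes
    "Hfr": Outcome("F-", True),   # partial F factor; recipient stays F-
    "F'": Outcome("F'", True),    # F' factor carries some chromosomal genes
}


def conjugate(donor: str, recipient: str = "F-") -> Outcome:
    """Look up the expected result of a donor x recipient conjugation."""
    if recipient != "F-":
        raise ValueError("conjugation requires an F- recipient")
    if donor not in OUTCOMES:
        raise ValueError(f"unknown donor type: {donor}")
    return OUTCOMES[donor]


for donor_type in OUTCOMES:
    print(donor_type, "x F- ->", conjugate(donor_type))
```

The recipient check reflects the point made earlier that conjugation takes place between a donor and a recipient, not between two donors.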
T strand transfer and replication cease. Figure 6.10 illustrates conjugation between an Hfr with the genotype thr+ leu− strS and an F− with the genotype thr− leu+ strR (the function of strR and strS is explained momentarily). Within the recipient cell, the donor DNA is a linear double-stranded DNA fragment containing a portion of the F factor and a segment of donor bacterial DNA that was adjacent to oriT. Without the complete oriT sequence, the linear DNA cannot circularize; and since only a portion of the F factor is transferred, Hfr donors cannot convert F− recipient cells to a donor state (see Table 6.1). However, before the linear segment of donated donor DNA undergoes enzymatic degradation in the recipient cell, it can undergo homologous recombination with the recipient chromosome. The new exconjugant cell, formerly the recipient cell, may thus acquire one or more genes from the donor bacterial chromosome.

Conjugation experiments mix one strain of donor bacteria in a culture vessel with a different strain of recipient bacteria. Exconjugants produced within the vessel can be identified by their acquisition of donor genes that give them genotypes distinct from those of either the donor strain or recipient strain. These exconjugants can be recognized by their growth on a selective growth medium, a medium containing compounds that permit only exconjugants with specific genotypes to grow and that also prevent the growth of donor cells and recipient cells.

In experiments of this kind, antibiotic sensitivity and resistance are used as tools to control growth of bacteria. In the recipient cells, resistance to the antibiotic streptomycin (strR) comes from a gene carried on an extrachromosomal R plasmid. The donor cell is streptomycin sensitive (strS), but this is due to the absence of an R plasmid, not to the presence of an allele for streptomycin sensitivity. Streptomycin resistance is therefore a genotypic attribute of recipient and exconjugant cells but not of donor cells, and the presence of streptomycin in the selective growth medium will kill donor cells so they do not grow and potentially confuse the analysis.

As an example, consider again a conjugation experiment involving an Hfr strain that is susceptible to streptomycin (strS) and carries the alleles thr+ and leu− (for biosynthesis of the amino acid threonine and the inability to synthesize leucine). Imagine that the F− strain is unable to synthesize threonine (thr−) but capable of leucine synthesis (leu+) and resistant to streptomycin (strR). The selective medium necessary to grow and isolate exconjugants in this case is a minimal medium plate with added streptomycin. The streptomycin in the selective medium kills strS donor cells, and the absence of threonine prevents growth of nonrecombinant recipient cells. All growing cells on the selection plate are thr+ leu+ strR, a genotype that could occur only in exconjugants.

In Figure 6.10, a segment of donor DNA containing thr+ leu− is shown aligning with its homologous counterpart in the recipient bacterial chromosome, containing thr− leu+. Homologous recombination can replace a segment of the recipient chromosome with a homologous segment of DNA from the donor chromosome. In the case shown here, two crossovers transfer thr+ from the donor DNA into the recipient chromosome, so that the resulting exconjugants have the genotype thr+ leu+ strR. Only these cells are able to grow on the plate containing minimal medium plus streptomycin shown in Figure 6.10, since donors are killed by

Figure 6.10 Hfr conjugation and exconjugant detection. An Hfr chromosome fragment transferred during interrupted mating between an Hfr donor cell and an F− recipient cell can undergo homologous recombination with the recipient chromosome. Exconjugants are detected on selective growth media, such as the minimal medium shown here.
Figure steps: Mix in conjugation culture an Hfr donor (thr+ leu− strS) and an F− recipient (thr− leu+ strR, carrying an R plasmid). Conjugation and partial T strand transfer due to interrupted mating. Homologous recombination at the crossover sites between the donor chromosomal fragment and the recipient chromosome, followed by enzymatic degradation of the remaining fragment. One kind of exconjugant cell: thr+ leu+ strR. On minimal medium plus streptomycin, only thr+ leu+ strR exconjugants grow.

Q Explain why the statement in the last message box of this figure that “only thr+ leu+ strR exconjugants grow” is correct.
Evaluate

1. Determine the topic this problem addresses and the nature of the required answer.
   The problem concerns conjugation between an Hfr donor and an F− recipient. Answer (a) requires identification of growth medium constituents for a his+, strR exconjugant; answer (b) requires a map of the donor genes based on their time of entry.

2. Identify the critical information given in the problem.
   Donor and recipient genotypes are given. A time-of-entry profile identifies the minutes of conjugation needed to transfer each donor gene to the recipient.

Deduce

3. Consider the significance of the very early transfer of his+ in the context of developing a time-of-entry map.
   Very early transfer of his+ indicates the gene is close to oriT and for this reason is the first of the genes in the experiment to cross the conjugation tube.

   TIP: Genes that are closer to oriT have earlier and more frequent opportunities to transfer to the recipient and to appear as recombinants in exconjugants than do genes that are distant from oriT.

Solve

Answer a

4. Identify the compounds needed to allow growth of exconjugants with the selected markers his+ and strR, irrespective of the genotypes for the other genes.
   The growth plate used to select these markers would contain streptomycin and the amino acids threonine, leucine, glutamic acid, and alanine. The plate would lack histidine, thus requiring the growing strain to be his+.

   TIP: To select exconjugants that are his+ and strR, growth plates must provide conditions in which only the exconjugants that are resistant to streptomycin and able to synthesize histidine can grow.

Answer b

5. Construct a time-of-entry map based on the conjugation data.
   Given that his transfers first, and that gene order and distances are identified by the time at which recombinants appear in exconjugants, the Hfr map for this strain is as follows:

   Map (from the origin of transfer): his, glu, thr, ala, leu
   Minutes scale: 0, 8, 16, 29, 42

For more practice, see Problems 17, 18, and 19.
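The reasoning behind Answer a generalizes: supply every compound whose marker is not being selected, leave out the compound for each selected marker, and add the antibiotic that kills the donor. The sketch below illustrates that rule; `selective_plate` is a hypothetical helper written for this example, and the marker names are the gene abbreviations used in the problem.

```python
def selective_plate(markers, selected, antibiotic=None):
    """Describe a minimal-medium plate that selects for the given markers.

    markers  -- all biosynthetic markers that differ between the strains
    selected -- markers the exconjugant must carry as "+" alleles
    """
    selected = set(selected)
    return {
        # Supplied so that growth does not depend on these unselected genes.
        "supplements": sorted(m for m in markers if m not in selected),
        # Withheld so that only cells able to synthesize them can grow.
        "omitted": sorted(selected),
        # Kills antibiotic-sensitive donor cells.
        "antibiotic": antibiotic,
    }


plate = selective_plate(
    markers=["his", "glu", "thr", "ala", "leu"],
    selected={"his"},
    antibiotic="streptomycin",
)
print(plate)
```

For the his+ strR selection this gives a plate supplemented with alanine, glutamic acid, leucine, and threonine, lacking histidine, and containing streptomycin, in agreement with Answer a.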
Figure 6.11 Time-of-entry mapping. (a) Recombinants are identified by screening exconjugants for donor allele acquisition at regular intervals and plotting their time of entry into the exconjugant chromosome. (b) Donor alleles leu+ and thr+ appear
Figure panels: (a) Donor allele appearance, plotting the frequency of Hfr markers among thr+ leu+ strR recombinants (%) against conjugation time (minutes, 0 to 60), with curves labeled lac and gal. (b) Conjugation progression between an Hfr cell and an F− cell, with the markers azi, ton, lac+, gal+, thr+, and leu+ shown entering over conjugation time (minutes).

Consolidation of Hfr Maps

Time-of-entry mapping is an effective approach for mapping genes near the 5′ end of the T strand. However, the genetic mapping information obtainable from a single Hfr strain is limited. First, because the conjugation pilus is soon broken, causing mating to be interrupted, the likelihood of gene transfer drops off quickly with distance from oriT. Second, each Hfr strain can transfer genes in just one direction. On the other hand, different Hfr strains, having F factors integrated at different insertion sequences, have different orders of gene transfer. Furthermore, an F factor can be integrated into an Hfr chromosome in either of two orientations, creating the possibility that two Hfr strains will transfer the same genes but in opposite orders. In other words, one Hfr might transfer genes in the order A–B–C, and a different Hfr might transfer the same genes in the opposite order C–B–A. These two distinctive features of each Hfr strain (the starting point of gene transfer and the orientation of gene transfer) are used to join the information of multiple Hfr strains together to produce consolidated gene maps of entire bacterial chromosomes.
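One way to picture the consolidation step is to treat each Hfr transfer order as a chain of neighboring genes and then stitch the chains together wherever they share genes. The sketch below is an illustration only, not the procedure used to build the E. coli map: the function `consolidate` and the gene names A through F are hypothetical, and it assumes the Hfr orders overlap enough to form a single connected map.

```python
from collections import defaultdict


def consolidate(hfr_orders):
    """Merge gene-transfer orders from several Hfr strains into one map.

    Returns (gene_order, is_circular). Orientation is ignored, because two
    Hfr strains may transfer the same genes in opposite orders.
    """
    neighbors = defaultdict(set)
    for order in hfr_orders:
        for left, right in zip(order, order[1:]):
            neighbors[left].add(right)
            neighbors[right].add(left)
    if any(len(adjacent) > 2 for adjacent in neighbors.values()):
        raise ValueError("gene orders are inconsistent with a single map")

    # A gene with one neighbor is the end of a linear map; a circle has none.
    ends = sorted(g for g, adjacent in neighbors.items() if len(adjacent) == 1)
    start = ends[0] if ends else min(neighbors)

    order, previous, current = [start], None, start
    while True:
        options = sorted(g for g in neighbors[current] if g != previous)
        if not options or options[0] == start:
            break  # reached the end of the line or closed the circle
        previous, current = current, options[0]
        order.append(current)
    return order, not ends


hfr_orders = [
    ["A", "B", "C"],   # hypothetical Hfr1
    ["D", "C", "B"],   # hypothetical Hfr2, same region in the opposite orientation
    ["D", "E", "F"],   # hypothetical Hfr3
    ["F", "A"],        # hypothetical Hfr4, closes the circle
]
gene_order, is_circular = consolidate(hfr_orders)
print(gene_order, "circular" if is_circular else "linear")
```

Because adjacency is stored without direction, the A–B–C and C–B–A orders contribute the same neighbor pairs, which is what lets strains with opposite orientations reinforce one another.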
Next, the maps are arranged in partial concentric circles by overlapping the segments that have the same genes. Placing the maps one by one into such an arrangement will gradually reveal the organization of the circular E. coli chromosome of the donor strain. The Hfr gene map arrangement shown here indicates the location of each integrated fragment on the circular chromosome, its orientation, and the gene order and distances in minutes:

[Circular map showing the overlapping segments transferred by Hfr1, Hfr2, Hfr3, and Hfr4, with the genes pheR, serR, leuY, cysE, asaB, serC, leuU, tyrT, nadB, proL, and fumC.]

Conjugation with F′ Strains Produces Partial Diploids

Table 6.1 lists a third configuration of the F factor in donor bacteria, that of the so-called F′ (“F prime”) donor, which contains a functional but altered F factor derived from imperfect excision of the F factor out of the Hfr chromosome. The integration event that creates an Hfr chromosome depends on interactions between matching IS elements of the F factor and of the bacterial chromosome, and when this process is reversed, the F factor can once again become an extrachromosomal F+ factor. Occasionally, however, the excision event is imprecise, and the excised F factor (in this case called an F′ factor) contains all of its own DNA plus a segment of bacterial chromosomal DNA from the region adjacent to the integration site (Figure 6.13a). An F′ factor can carry a variable length of bacterial DNA. Donor cells carrying an F′ factor are called F′ cells.

Like the other forms of conjugation described above, conjugation between an F′ donor and an F− recipient follows the by-now-familiar process of cleavage of the T strand at oriT and movement of the T strand across the conjugation pilus with its 5′ end leading the way. DNA replication using the transferred strand takes place inside the recipient cell.
[Figure 6.12b map labels, in the order extracted: azaA, asnU, asnV, acpS, asnT, alkA, amn, serU, chiA, sbcB, dcm, rcsA, gnd, non, udk, ara, cps, rfb, flu, his, fli. Time axis in minutes: 43, 44, 45.]
Figure 6.12 Consolidated Hfr map of E. coli. (a) The 100-minute genetic map of E. coli. Genes
of bacterial operons (discussed in Section 12.2) are boxed. The origin of replication (oriC) is seen at
84 minutes. (b) A 2.5-minute segment (minutes 42.5–45) of the E. coli time-of-entry map in comparison
with a segment of approximately 500,000 base pairs of the E. coli genome derived from E. coli genomic
sequencing. Selected genes between 42.5 minutes and 45 minutes on the time-of-entry map (upper) are
aligned with their positions in the genome sequence map (lower) to illustrate the compatibility of the two
mapping approaches.
Q About how many nucleotide base pairs are there in a DNA segment that spans one minute of
conjugation time?
If the entire F′ chromosome is transferred, both parts of oriT are transferred, allowing the F′ factor to circularize in the recipient cell. At the completion of F′ factor transfer in such cases, the exconjugant cell, now containing a complete F′ factor, is converted to an F′ donor (see Table 6.1). In this process the exconjugant has also acquired copies of the donor chromosomal genes carried on the F′ factor. Because the newly received chromosomal genes are homologs of genes already present on the recipient bacterial chromosome, the resulting exconjugants are partial diploids (Figure 6.13b). In other words, the exconjugant is now diploid for (i.e., it has two copies of) the genes transferred to it on the F′ plasmid.

Figure 6.13b illustrates the creation of a partial diploid exconjugant carrying two alleles of the lac gene. The lac+ allele on the F′ factor enables the cell to use lactose for growth, whereas the mutant lac− allele on the exconjugant chromosome is unable to function in lactose utilization. In
6.3 Bacterial Transformation Produces Genetic Recombination 203
(a) Hfr chromosome this partial diploid, the lac + allele is dominant over the lac -
allele. Partial diploids of this type have been used in genetic
oriT
studies to examine the mode of action of genes in bacteria
and to dissect the regulation of coordinated gene action in
bacterial metabolism and growth (see Section 12.3).
Bacterial F factor
chromosome Genetic Analysis 6.2 guides you through an analysis of
lac+
donor and recipient bacterial strains and the identification
Normal excision Aberrant excision of donor types through the analysis of three conjugation
experiments.
A segment of
the bacterial lac+
lac+
DNA loops out
during excision. Plasmids and Conjugation in Archaea
Formation of F+ factor Formation of F¿ factor Research on archaea species is still in its infancy in compari-
son with the many decades of research that exist on bacteria.
Despite this short research history, a number of significant
lac+ observations have been made with regard to archaeal plas-
lac + oriT oriT
mids and conjugation among archaeal cells.
Like bacteria, archaea are single-celled haploid organ-
isms, usually with a single chromosome and various plas-
Bacterial F+ plasmid Bacterial F¿ plasmid mids. All of the genes that are essential for the normal
chromosome chromosome
metabolic and physiologic activities of the cell are carried on
The F¿ factor contains the the archaeal chromosome. Ongoing research on archaea plas-
donor lac+ in addition to a mids that began in the early 1990s has identified dozens of
full set of F factor genes.
different plasmids among archaeal species. Although much
more study is needed, the information available at present
(b) F¿ cell F– cell indicates that most archaeal plasmids replicate by rolling cir-
cle replication. The data further identify numerous instances
lac+ of plasmid-driven conjugation between archaeal donor and
oriT × lac – recipient cells. The genetic composition of archaeal conjuga-
tive plasmids has not been well characterized, nor is there
enough information to be able to describe the details of the
Bacterial F¿ factor Bacterial archaeal conjugation apparatus. To date there is evidence of
chromosome chromosome some similarities to bacterial conjugation, but there is also
Grows on a lactose medium Unable to grow on evidence that some aspects of archaeal conjugation may be
a lactose medium substantially different from bacterial conjugation.
lac +
6.3 Bacterial Transformation
lac – Produces Genetic Recombination
Transformation occurs when a recipient cell takes up a
Transfer complete fragment of donor cell DNA from the surrounding growth
F¿ cell F¿ exconjugant medium. The DNA fragment passes through the wall and
membrane of the recipient cell and is incorporated into the
lac + lac + recipient cell chromosome by homologous recombination.
lac –
A recipient cell that is able to take up transforming DNA is
described as “competent.”
Transformation is a naturally occurring mechanism that
The exconjugant is a lac +/lac – partial diploid and has acquired the can be used to produce accurate maps of bacterial genes,
ability to grow on a lactose medium. Because F¿ plasmid transfer including those that are closely linked and not readily
was complete, the exconjugant can act as an F¿ donor.
mapped by conjugation experiments. Transformation is also
Figure 6.13 F factor excision from Hfr integration. (a) Normal used as a laboratory technique by molecular biologists seek-
excision (left) restores an Hfr to an F+, whereas aberrant excision ing to introduce DNA into microbial cells, plant cells, or
(right) forms an F′ plasmid in an F′ donor cell. (b) F′ * F- conjuga- animal cells as part of the process of creating recombinant
tion produces an exconjugant that is a partial diploid lac +/lac -. DNA or transgenic organisms (see Sections 15.1 and 15.2).
GENETIC ANALYSIS 6.2
PROBLEM In E. coli, the abilities to utilize the sugar lactose, synthesize the amino acid
methionine, and resist the antibiotic streptomycin are conferred by alleles lac+ and met+ and
the R plasmid, respectively. Bacteria without the R plasmid are susceptible to streptomycin
(strS), and mutant alleles lac- and met- produce bacteria that are unable to grow on media
containing lactose as the only sugar and require methionine supplementation for growth,
respectively. E. coli strains are identified as donors or recipients in the first table
presented here, which also contains information on their ability to grow under various
conditions. The second table contains growth information for the exconjugants of mating between
donor and recipient strains. In each table, “+” indicates growth and “-” indicates no growth.
“Min” signifies a minimal medium, and supplemented minimal medium plates are indicated by, for
example, “Min + met” (minimal medium plus methionine). “Lac” indicates a plate containing only
lactose as the sugar.

Strain Growth
Strain  Type       Min  Lac  Min + met  Min + met + str  Lac + met + str
A       Donor      +    +    +          -                -
B       Donor      +    +    +          -                -
C       Donor      +    +    +          -                -
D       Recipient  -    -    +          +                -

Exconjugant Growth
Mating  Min + str  Min + met + str  Lac + str  Lac + met + str  Are the Exconjugants Donors?
A × D   +          +                -          -                Yes
B × D   -          +                -          -                Yes
C × D   -          +                -          +                No

a. Use the growth information in the first table to determine the genotype of each strain at
the lac and met genes and for resistance or susceptibility to streptomycin.
BREAK IT DOWN: Anabolic and catabolic pathways and the determination of genotypes for alleles
in these pathways are described in Research Technique 6.1, pp. 189–190.
b. Use the growth information in the second table to determine the genotypes of exconjugants
produced by each mating.
c. Compare the genotypes and mating behavior of donors, recipient, and exconjugants to
determine whether each donor is F+, Hfr, or F′. Explain your rationale for each donor
identification.
BREAK IT DOWN: Table 6.1, p. 196, summarizes the potential conversion of and bacterial gene
transfer to exconjugants by donors.
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This is a conjugation problem in which genotypes of donors and a recipient are determined by
   growth characteristics. Donor types (F+, Hfr, F′) are to be identified by growth
   characteristics of exconjugants. The answers require identifying genotypes for lac, met, and
   str for the recipient and each donor and exconjugant.
2. Identify the critical information given in the problem.
   The two tables identify growth characteristics. The first table contains growth information
   on three donors (A, B, and C) and a recipient (D). The second table contains growth
   information on the exconjugants of mating between each donor and the recipient.
Deduce
3. Compare the growth characteristics of donors and the recipient in the first table, and
   deduce which genotypes are likely the same.
   The growth characteristics of the three donor strains (A, B, and C) are identical on each
   kind of medium. These three strains have the same genotype. The recipient, strain D, has a
   different set of growth characteristics and therefore a different genotype.
4. Examine the exconjugants in the second table and determine which have been converted from
   recipients to donors.
   Donor A and donor B transfer a complete F sequence to the recipient and convert the
   exconjugant to a donor. Donor C does not transfer the complete F sequence, so the C × D
   exconjugant is not converted to a donor.
   TIP: When an exconjugant has been converted to a donor state, we know it has received a
   complete copy of the F factor.
Solve
Answer a
5. Determine the genotypes of the donor and recipient strains from growth information in the
   first table.
   The genotype shared by donor strains A, B, and C is met+ lac+ strS. The minimal medium
   contains glucose. Growth of donor strains in this medium indicates their prototrophy for
   methionine (met+). Growth in the lactose-containing medium indicates they are lac+. The
   inability of donors to grow in media containing streptomycin indicates they are strS.
   The recipient genotype is met- lac- strR. It is unable to grow on the minimal
   (glucose-containing) medium, but it can grow on glucose plus methionine, indicating it is
   met-. It also grows on the minimal medium plus methionine and streptomycin, indicating that
   it is strR. Lactose utilization is tested on the medium containing lactose plus methionine
   and streptomycin. Here it fails to grow, indicating it is lac-.
Answer b
6. Determine the genotypes of exconjugants from growth information in the second table.
   Using analysis similar to that employed above, we conclude that the exconjugant genotypes
   are
   A × D  met+ lac- strR, conversion to donor
   B × D  met- lac- strR, conversion to donor
   C × D  met- lac+ strR, no conversion
   TIP: Compare the genotypes of exconjugants to the recipient genotype to determine if one or
   more donor alleles have been transferred during conjugation. Use Table 6.1 for help in
   categorizing each donor.
Answer c
7. Identify each donor by donor type and explain the rationale for each identification.
   A × D exconjugants have acquired met+ and have undergone conversion to a donor state. F′
   donors can transfer an allele and convert the recipient, so we conclude that strain A is an
   F′ donor. Exconjugants of the B × D mating retain the recipient genotype, but they are
   converted to a donor state. F+ donors produce this result, so strain B is an F+ donor. The
   C × D conjugation produces exconjugants that have acquired lac+ but have not undergone
   conversion. This is a characteristic of Hfr donors, so we conclude that strain C is Hfr.
For more practice, see Problems 19 and 23. Visit the Study Area to access study tools. Mastering Genetics
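The reasoning in Genetic Analysis 6.2 can also be traced in a few lines of Python. This is only an illustrative sketch: the function names (strain_genotype, exconjugant_genotype, donor_type) are hypothetical, growth is encoded as True or False, and the exconjugant growth values are assumed to agree with the genotypes stated in Answer b.

```python
# Illustrative sketch; all names here are hypothetical, not from the text.
STRAIN_MEDIA = ["Min", "Lac", "Min + met", "Min + met + str", "Lac + met + str"]
EXCONJ_MEDIA = ["Min + str", "Min + met + str", "Lac + str", "Lac + met + str"]

def strain_genotype(growth):
    """Infer (met, lac, str) alleles from first-table growth (True = growth)."""
    met = "met+" if growth["Min"] else "met-"
    strep = "strR" if growth["Min + met + str"] else "strS"
    # Caveat: a met- strS lac+ strain would fail on both lactose plates,
    # so this rule is only adequate for the strains in this problem.
    lac = "lac+" if (growth["Lac"] or growth["Lac + met + str"]) else "lac-"
    return (met, lac, strep)

def exconjugant_genotype(growth):
    """Infer alleles from second-table growth; every plate contains str."""
    met = "met+" if growth["Min + str"] else "met-"
    lac = "lac+" if growth["Lac + met + str"] else "lac-"
    strep = "strR" if growth["Min + met + str"] else "strS"
    return (met, lac, strep)

def donor_type(exconjugant, recipient, converted):
    """Apply the Table 6.1 logic: allele transfer and conversion to donor."""
    gained = [a for a, r in zip(exconjugant, recipient) if a != r]
    if converted:
        return "F'" if gained else "F+"
    return "Hfr" if gained else "undetermined"

DONOR = dict(zip(STRAIN_MEDIA, [True, True, True, False, False]))
RECIPIENT = dict(zip(STRAIN_MEDIA, [False, False, True, True, False]))
# Donor -> (exconjugant growth on EXCONJ_MEDIA, are exconjugants donors?)
MATINGS = {
    "A": ([True, True, False, False], True),
    "B": ([False, True, False, False], True),
    "C": ([False, True, False, True], False),
}

recipient_genotype = strain_genotype(RECIPIENT)
results = {}
for donor, (growth, converted) in MATINGS.items():
    genotype = exconjugant_genotype(dict(zip(EXCONJ_MEDIA, growth)))
    results[donor] = donor_type(genotype, recipient_genotype, converted)
    print(donor, "x D:", " ".join(genotype), "->", results[donor])
```

The design mirrors the problem's two questions: first read alleles from growth, then compare each exconjugant with the recipient to see whether an allele was gained and whether conversion occurred.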
Steps in Transformation
Transformation is a four-step process, as illustrated in Figure 6.14. It is preceded by the
lysis, or breakage, of a donor cell and the release of fragmented DNA from the donor
chromosome. The transforming DNA is double-stranded and can be taken up by a competent
recipient bacterial cell.
The passage of double-stranded transforming DNA across the recipient cell wall and cell
membrane is accompanied by degradation of one of the strands (step 1). The remaining strand of
transforming DNA aligns with, or “invades,” a complementary region of the recipient chromosome
(step 2). The alignment triggers the action of several enzymes that excise one strand of the
recipient chromosome and replace it with the transforming strand. This recombination event
forms heteroduplex DNA: One strand is derived from the recipient cell, and the approximately
complementary transforming strand is derived from the bacterial donor (step 3). After the
subsequent DNA replication and cell-division cycle (step 4), one daughter cell is a transformed
cell, also called the transformant. It contains a chromosome carrying the transforming strand
and its newly synthesized complementary strand. The other daughter cell retains the recipient
chromosome and is not genetically altered.
geneticists look for two or more genes that are transferred into the recipient on the same
fragment of transforming DNA. Thus, genetic analysis focuses on cotransformation, the
simultaneous transformation of two or more genes. For cotransformation to occur, the crossover
events must incorporate closely linked genes on a single fragment of transforming DNA.
6.4 Bacterial Transduction Is Mediated by Bacteriophages
In transduction, the transfer of genetic material from a donor bacterial cell to a recipient
cell occurs by means of a bacteriophage (bacterial virus) acting as a vector to carry donor DNA
to the recipient cell. A transductant is formed when the donated DNA is integrated into the
recipient cell’s chromosome by homologous recombination.
In this section, we review the life cycles of bacteriophages (phages, for short) that infect
E. coli. We then consider cotransduction mapping, a powerful technique for mapping bacterial
genomes, and the role of generalized transduction in this process. We conclude the section with
a discussion of specialized transduction.
Figure 6.14 Transformation of a competent bacterium (a-) by donor DNA (a+).
[Figure 6.14 panels: 1. Donor DNA binds at the receptor site. One strand is degraded as it
enters the recipient cell. 2. The transforming strand pairs with the homologous region of the
recipient chromosome. 3. The transforming strand displaces a recipient strand, forming
complementary heteroduplex DNA (a-/a+). The excess strand degrades. 4. DNA replication and cell
division produce one transformant and one nontransformant.]
Figure 6.15 T4 bacteriophage and λ phage structures. Bacteriophages consist of a proteinaceous
head filled with DNA, a sheath, and, in some phages, tail fibers.
Q Bacteriophages, like other viruses, cannot replicate autonomously, do not have an energy
metabolism, and produce no waste products. They can only reproduce and express their genetic
content by invading host cells and using numerous host proteins and other host compounds and
components. Most biologists classify viruses as nonliving acellular particles. Do you agree or
disagree?
numerous enzymes and other compounds found in the host bacterial cells, as bacteriophages are
incapable of autonomous DNA replication, transcription, and translation.
Bacteriophages employ a variety of mechanisms to attack bacteria. By whatever specific
mechanism they may use, however, bacteriophages actively seek out and attach to host cells,
commencing a six-step process called the lytic cycle, in which infection by a bacteriophage
leads to the lysis (rupture) of the host cell and the release of up to 200 new progeny phage
particles. The steps composing the lytic cycle are depicted in Figure 6.16.
1. Attachment of the phage particle to the host cell.
2. Injection of the phage chromosome into the host cell. Injection is quickly followed by
   circularization of the phage chromosome, to protect it from enzymatic degradation.
3. Replication of phage DNA, using numerous host enzymes and other proteins. A copy of the
   phage chromosome is required for each of the eventual
   progeny phage particles, which generally number between 50 and 200.
4. Transcription and translation of phage genes, using numerous host enzymes, other proteins,
   and ribosomes. Heads, sheaths, and tail fibers for all progeny particles must be synthesized
   and assembled.
5. Packaging of phage chromosomes into phage heads. This step is commonly accompanied by
   fragmentation of the host chromosome. Occasional mispackaging of a fragment of the host
   chromosome into a phage head can follow chromosome fragmentation.
6. Lysis of the host cell, resulting in the death of the host and the release of progeny phage
   particles.
Figure 6.16 The lytic and lysogenic life cycles of a temperate bacteriophage. The lytic cycle
progresses directly from infection through phage reproduction to lysis. The lysogenic cycle
features the integration of the phage into the host chromosome, where it resides until excision
and resumption of the lytic cycle.
Bacteriophages called temperate phages are capable of a temporary alternative life cycle that
leads to the temporary integration of the phage chromosome into the bacterial host chromosome.
The integration process is termed lysogeny. Environmental and growth conditions are largely
what initiate a lysogenic cycle. Lysogeny can persist for many bacterial replication and
division cycles, but it eventually comes to an end, and the lytic cycle resumes. (We discuss
the details and genetic regulation of this alternation between life cycles in Section 12.6.)
Five steps characterizing the lysogenic cycle are shown in Figure 6.16.
1. Attachment of the phage particle to the host cell.
2. Injection of the phage chromosome into the host cell, followed by phage-chromosome
   circularization.
3. Integration of the phage chromosome into the host chromosome. This process is site specific,
   meaning that it occurs at a specific DNA sequence found in both the phage and bacterial
   chromosomes. Once integrated into the host chromosome, the phage DNA is termed the prophage.
   The prophage remains stably integrated at the same location for multiple cycles of bacterial
   chromosome replication and cell division.
4. Excision of the prophage. In response to an environmental signal, such as a high dose of
   ultraviolet irradiation, the prophage reverses its integration and is excised intact. This
   event is usually an exact reversal of the site-specific integration, but rare mistakes in
   prophage excision lead to a specific kind of abnormal phage that may contain host genetic
   material.
5. Resumption of the lytic cycle, beginning with phage-chromosome replication.
Generalized Transduction
In the decades since the 1952 discovery and description of generalized transduction by Norman
Zinder and Joshua Lederberg, numerous kinds of generalized transducing phages have been
identified. Generalized transducing phages are formed when a random piece of donor bacterial
DNA of the appropriate length is mistakenly packed into the phage head instead of a similarly
sized length of phage DNA. This occasional error in DNA packaging occurs because the packing
mechanism that inserts DNA into the phage head discriminates DNA by its length (in base pairs)
rather than by sequence. Generalized transducing phages can carry any segment of donor DNA,
since the process of mistaken packaging is random.
The phage P1 is a well-studied bacteriophage that infects E. coli and is a prolific producer of
generalized transducing phages. This phage was initially chosen for intensive study of its
transduction ability because it has a large genome of nearly 100,000 bp (100 kb). To produce
progeny that are generalized transducing phages, P1 must capture segments of donor bacterial
DNA that are almost exactly 100 kb, a length that is about 2% of the E. coli chromosome.
Analysis of P1 infections tells us that about 1 in 50 progeny of a P1 infection are generalized
transducing phages.
Figure 6.17 An example of transduction by P1 phage. Transducing phages are generated by the
mistaken packaging of a fragment of the donor bacterium’s DNA into a phage head. Transductant
bacteria are produced by homologous recombination between the introduced fragment of donor DNA
and the recipient bacterial chromosome.
Figure 6.17 illustrates generalized transduction in seven steps (combining attachment and
injection into a single first step). The outcome of transduction, as noted at the start
of this section, is the production of a transductant, a bacterium that has acquired one or more
donor genes through transduction:
1. A normal P1 phage attaches to a donor bacterial cell and injects its chromosome into the
   cell.
2. Replication of the phage chromosome is followed by transcription and translation to produce
   phage proteins. Fragmentation of the bacterial chromosome precedes the packaging of phage
   chromosomes into phage heads.
3. Assembly of progeny phages, including packing of phage heads, is largely normal, but a few
   progeny phages receive a random fragment of the donor bacterial chromosome that is
   approximately the same length as the phage chromosome. These abnormal progeny phages are
   generalized transducing phages.
4. Host-cell lysis releases normal and generalized transducing phages.
5. Generalized transducing phages attach to new recipient cells and inject the fragment of
   donor DNA.
6. In each recipient cell, homologous recombination occurs between the fragment of donor DNA
   and the recipient chromosome. Pairs of crossover events are required to splice the donor
   fragment into the recipient chromosome and excise a homologous segment of the chromosome.
   The excised chromosome fragment is degraded by enzymes.
7. A stable transductant strain results.
Cotransduction
The donor cell in the transduction experiment shown in Figure 6.17 has the genotype met+ his+,
and the recipient is met- his-. The bacterial culture in which this experiment takes place will
contain millions of bacteria, most of which are not transduced. In addition, many cells may be
transduced with donor alleles that are not tested for in the experiment. The transductants
detected in this particular experiment are those in which either the met+ or his+ allele or
both are transduced.
Transductants having either the genotype met+ his- or the genotype met- his+ offer evidence
that each allele can be individually transduced. In addition, a certain number of transductants
will undergo simultaneous transduction of both genes to produce met+ his+ transductants. These
cells have undergone cotransduction of both donor alleles. The frequency of cotransduction,
called cotransduction frequency, depends on how close the two genes are to one another on the
donor chromosome. The closer the genes are, the higher the probability of cotransduction (thus,
the higher the cotransduction frequency), and the farther apart the genes are, the lower the
cotransduction probability. If, for example, an experimenter carried out the transduction cross
in Figure 6.17 and identified 200 transductants for met+, the experimenter could determine the
frequency of cotransduction by then identifying how many of those met+ transductants were also
transduced (i.e., were cotransduced) for his+. If the analysis determined that 28 of the 200
met+ transductants were also transduced for his+, the cotransduction frequency for those genes
would be 14% (28/200).
To succeed in finding cotransductants in an experiment, researchers may have to genotype large
numbers of colonies. To reduce the number of colonies that must be genotyped in such
experiments, a two-step strategy is used that first identifies cells transduced with one donor
allele and then screens those transductants for the acquisition of additional donor alleles.
The first step employs a selected marker screen, or selection, to identify transductants for
one of the donor alleles of interest. Transductants that are selected are then screened a
second time, for a second donor allele, in an unselected marker screen. The goal is to
determine the percentage of transductants for the selected marker that are also transduced for
the unselected marker, while reducing unnecessary colony genotyping.
Cotransduction Mapping
Genetic map construction in bacteria uses cotransduction frequencies to determine the relative
order of three or more genes. Cotransduction mapping makes use of the fact, described above,
that the frequency of cotransduction is greater for genes that are close together and is lower
for genes that are farther apart. Any two genes on the donor chromosome have two chances to be
separated by a chromosomal event. The first separation chance comes when the donor chromosome
is broken into fragments. Genes that are close together are more likely to end up on the same
donor chromosome fragment than genes that are far apart. The second chance for separation comes
during homologous recombination, when genes that are close together on the donor fragment are
less likely to be separated by a crossover event than genes that are far apart on the fragment.
Let’s look at two studies that test the order of the same four genes in E. coli. Figure 6.18
provides cotransduction

(a) Cotransduction frequencies
Donor        Recipient    Selected  Unselected  Percent cotransduction of
genotype     genotype     marker    marker      unselected marker with cys+
cys+ trpE+   cys- trpE-   cys+      trpE+       63
cys+ trpC+   cys- trpC-   cys+      trpC+       53
cys+ trpB+   cys- trpB-   cys+      trpB+       47
cys+ trpA+   cys- trpA-   cys+      trpA+       46

(b) trp operon map
cys   trpE   trpC   trpB   trpA

Figure 6.18 Yanofsky’s cotransduction frequency analysis and mapping of trp operon genes in
E. coli. (a) Cotransduction frequencies of cys+ and a gene of the trp operon are determined in
separate selected marker–unselected marker experiments. (b) Yanofsky’s proposed map of the trp
operon.
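The arithmetic behind a cotransduction frequency, and the way the frequencies in Figure 6.18a translate into a gene order, can be sketched in Python. The function name and the ordering rule are illustrative assumptions rather than the authors' procedure: the rule simply ranks the trp genes by decreasing cotransduction with cys+ and assumes that all four genes lie on the same side of cys.

```python
# Illustrative sketch; names are hypothetical.
def cotransduction_frequency(cotransduced, selected_total):
    """Percent of selected-marker transductants that also carry the unselected marker."""
    return 100.0 * cotransduced / selected_total

# Example from the text: 28 of 200 met+ transductants are also his+.
met_his = cotransduction_frequency(28, 200)
print(f"met-his cotransduction frequency: {met_his:.0f}%")

# Percent cotransduction with cys+ (selected marker), from Figure 6.18a.
with_cys = {"trpE": 63, "trpC": 53, "trpB": 47, "trpA": 46}

# Assumption: all trp genes lie on one side of cys, so a higher frequency
# means a shorter distance from cys.
gene_order = ["cys"] + sorted(with_cys, key=with_cys.get, reverse=True)
print(" - ".join(gene_order))
```

Ranking by frequency gives only relative order, not physical distance, which is why the map in Figure 6.18b is drawn without a scale.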
distance between cys, which is outside the operon, and trpC within the operon; at point 3, a
very small space in the operon between trpC and trpB; or at point 4, a large region to the
right of trpB. Three different double-crossover combinations generate transductant Classes 1,
2, and 3, respectively, and transductant Class 4 is produced by a quadruple recombination
requiring crossover at all four points. The quadruple crossover is expected to be the least
frequent of the combinations producing cotransductants. This study verifies Yanofsky’s proposed
trp operon map for two reasons. First, cotransduction frequencies for cys–trpC and for cys–trpB
are almost identical in the two studies (53% versus 52% for cys–trpC, and 46% versus 47% for
cys–trpB), placing trpC closer than trpB to cys in both. Second, the quadruple recombination
event is expected to occur less frequently than any of the double crossover events.
Genetic Analysis 6.3 guides you through an analysis of a transduction to determine gene order
in a donor strain.
Specialized Transduction
As described above, temperate bacteriophages have the ability to lysogenize their host by
integrating into the host chromosome to create a prophage. The site of integration is a DNA
sequence called the att site (for “attachment”) that is identical in the bacterial chromosome
and the phage chromosome. The shared 15-bp sequences are called attP in temperate bacteriophage
(the P stands for phage) and attB (B for bacteria) in its host E. coli bacterium. There is just
one att site in each bacterial genome possessing one. A specialized phage enzyme recognizes the
att sites and makes a staggered cut there. The complementary single-stranded ends of cleaved
att DNA reanneal as the prophage integrates, to create an att sequence at each end of the
integrated prophage.
Because the attB and attP sequences are identical, the excision of a prophage is almost always
the exact reversal of prophage integration. Occasionally, however, excision is inaccurate:
Aberrant excision removes much of the integrated prophage but along with it a small segment of
the host chromosome that is immediately adjacent to the att site of integration. Aberrant
excision of a prophage forms what is called a specialized transducing phage because the
chromosomal material of the host that is removed in error is limited to regions immediately to
the right or immediately to the left of the att site. Thus, rather than transductants carrying
random pieces of donor DNA, as in generalized transduction, specialized transductants can only
carry donor DNA located immediately around the att site.
from Mendel’s original description of “particulate inheritance” of traits. Before knowing the
molecular structure of DNA, biologists had difficulty describing how recombination within a
gene could occur. Geneticists knew that different mutations could affect a single gene, and had
data showing that different mutations can occupy unique locations within a gene. But what
remained lacking was a refined understanding of the internal structure, or fine structure, of
genes.
Beginning in the early 1950s, Seymour Benzer helped define how biologists view the structure of
genes with a series of experiments that revealed the existence of a genetic fine structure, a
phrase referring to the composition of genes at the level of their molecular building blocks.
Benzer demonstrated that the building blocks of genes (later determined to be DNA nucleotide
base pairs) were responsible for both mutation and recombination. The publication of his
principal conclusions coincided with the identification of the molecular structure of DNA. When
the functional subunits of DNA were revealed to be nucleotides, it was impossible to miss the
connection between them and Benzer’s fine structure.
Benzer focused on two questions. First, was the gene the fundamental unit of mutation, or could
components of genes be mutated? Second, was recombination a process occurring only between
genes, or did recombination also occur between the components of genes? Benzer studied these
questions using the rII region of the T4 bacteriophage. Genes in the rII region determine
whether and how the phage will lyse its E. coli host.
Lysis is examined using a bacterial lawn, a solid coating of bacteria on the surface of a
growth medium. If the growing bacteria are exposed to a bacteriophage, infected cells lyse and
progeny phages are released. Progeny phages infect new host cells, and as the
infection-lysis-infection cycle continues, a bacteria-free spot called a plaque (a hole in the
bacterial lawn) appears on the growth medium.
Benzer showed that two genes, rIIA and rIIB, control the ability of T4 phages to lyse E. coli
host cells. Those T4 phages carrying wild-type copies of rIIA and rIIB lyse multiple strains of
E. coli, leading to the production of small plaques (Figure 6.20). On the other hand, phages
with
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This is a cotransduction problem in which cotransduction frequencies are to be used to
   determine the order of three genes in the donor.
2. Identify the critical information given in the problem.
   The results of three transduction experiments are given. Each experiment has a different
   gene or a gene combination as the selected marker(s).
Deduce
3. Keep in mind the advantage of using the selected–unselected marker experimental approach.
   Selecting for transduction of one of the genes of interest and then evaluating transductants
   for the other gene(s) reduces the number of plates that must be evaluated and simplifies the
   experimental analysis.
4. Interpret the results of each experiment.
   Experiment 1 indicates close proximity of leu and azi, and a greater distance between leu
   and thr. Experiment 2 suggests the same more-distant relationship between thr and leu but
   also shows no cotransduction between thr and azi. Experiment 3 informs us that
   cotransduction of all three donor alleles occurs, though at a low frequency. We can
   interpret this to mean that the segment of chromosome containing these genes is small enough
   to form a single fragment for transduction.
   TIP: Cotransduction frequencies are highest for genes that are closest together on the
   bacterial chromosome.
Solve
5. Combine your observations to identify the order of these three genes.
   Putting the results of these experiments together, we can identify cotransduction of thr and
   azi (shown at 0% in Experiment 2) as the quadruple-crossover cotransductant. All other
   events are a result of double crossover. The quadruple crossover event is expected to be
   least frequent among the cotransductants. On this basis, leu can be identified as the middle
   gene of the three tested. The gene map is shown below, and the four crossover intervals are
   identified.
   TIP: Crossovers occur in pairs during the homologous recombination that accompanies
   transduction. When three genes are involved, a quadruple crossover is less frequent than any
   of the double crossovers.

   Donor:       aziR     leu+     thr+
   Intervals:  1      2        3       4
   Recipient:   aziS     leu-     thr-
For more practice, see Problems 9, 20, and 24. Visit the Study Area for a Video Tutor solution. Mastering Genetics
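The reasoning in the Solve step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `middle_gene` is hypothetical, and the leu–azi and leu–thr frequencies below are invented placeholders chosen to match the qualitative description (close versus more distant). Only the 0% value for thr and azi comes from the worked problem.

```python
def middle_gene(cotransduction):
    """Infer which of three genes lies between the other two.

    `cotransduction` maps a frozenset of two gene names to the observed
    cotransduction frequency of that pair. The pair that is cotransduced
    least often is taken to be the two outer genes, because bringing in
    both outer markers without the middle one requires the rarest event.
    """
    genes = set().union(*cotransduction)
    if len(genes) != 3 or len(cotransduction) != 3:
        raise ValueError("expected exactly three genes and three gene pairs")
    outer_pair = min(cotransduction, key=cotransduction.get)
    (middle,) = genes - outer_pair
    return middle


# Hypothetical frequencies (assumed for illustration), except thr-azi = 0.
observed = {
    frozenset({"leu", "azi"}): 0.50,  # assumed: leu and azi are close
    frozenset({"leu", "thr"}): 0.02,  # assumed: leu and thr are farther apart
    frozenset({"thr", "azi"}): 0.00,  # from the problem: no cotransduction
}
print(middle_gene(observed))
```

With these inputs the pair thr and azi has the lowest frequency, so the sketch reports leu as the middle gene, in agreement with the map above.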
mutation of either rIIA or rIIB form large, irregularly shaped plaques on E. coli strain B, but they are unable to form any plaques on E. coli K-12 (λ).
Benzer used several different mutagens to produce almost 20,000 rII mutants that he studied in three ways. First, he used genetic complementation analysis, which showed that there are two genes in the rII region. Second, he mapped different mutations of rIIA and different mutations of rIIB, thus showing that intragenic recombination (within the gene) was possible and could be used to establish the locations of different mutations in each gene. Finally, Benzer developed deletion mapping to refine the genetic map. The following discussions explain each of these achievements individually.

[Figure panel (a) Complementation of mutations in different genes. Two rII mutants coinfect an E. coli K-12 (λ) lawn: one carries a mutation in gene A (defective A product, functional B product), and the other carries a mutation in gene B (functional A product, defective B product). Wild-type T4 plaques form. During simultaneous infection, complementation occurs because functional forms of both A and B proteins are present.]
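The logic of a complementation test can be written as a short Python sketch. It is a simplified model, not Benzer's procedure: each mutant phage is represented only by the set of rII genes in which it is defective, and the function name `complements` is a hypothetical label chosen for this illustration.

```python
def complements(mutant_1, mutant_2, genes=("rIIA", "rIIB")):
    """Return True if coinfection is expected to give wild-type lysis.

    Each mutant is a set naming the genes whose product is defective.
    Complementation occurs when, for every gene, at least one of the
    two coinfecting phages supplies a functional product.
    """
    return all(gene not in mutant_1 or gene not in mutant_2 for gene in genes)


# Mutations in different genes complement; mutations in the same gene do not.
print(complements({"rIIA"}, {"rIIB"}))  # A from one phage, B from the other
print(complements({"rIIA"}, {"rIIA"}))  # neither phage makes functional A
```

The sketch deliberately ignores recombination, which can also produce wild-type phages but does so at a much lower frequency than complementation.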
Figure 6.23 Deletion mapping of mutants in the rII region. Wild-type recombinants form if the site of point mutation does not overlap the site of deletion, but if the two mutation sites overlap, no wild-type recombinants are possible.
[Panel (a) Nonoverlapping mutations, wild-type recombination: coinfection by a deletion mutant and a point mutant, followed by recombination, yields a double mutant and a wild type. Panel (b) Overlapping mutations, no wild-type recombinants: coinfection and recombination yield only the deletion mutant and the point mutant.]
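The rule illustrated in Figure 6.23 lends itself to a small Python sketch. The coordinates, deletion names, and function names below are hypothetical and serve only to show how a set of deletion crosses narrows down the position of a point mutation; they are not Benzer's actual deletions.

```python
def overlaps(deletion, site):
    """True if a point-mutation site falls inside a deletion (inclusive)."""
    start, end = deletion
    return start <= site <= end


def candidate_sites(deletions, wild_type_seen, map_length):
    """List the map positions consistent with a set of deletion crosses.

    deletions:       name -> (start, end) on an arbitrary integer map
    wild_type_seen:  name -> True if the cross gave wild-type recombinants
    A position is consistent when it lies outside every deletion that gave
    wild-type recombinants and inside every deletion that gave none.
    """
    return [
        site
        for site in range(map_length)
        if all(
            overlaps(deletions[name], site) != wild_type_seen[name]
            for name in deletions
        )
    ]


# Hypothetical deletions on a map of 100 arbitrary units.
deletions = {"d1": (0, 49), "d2": (30, 79), "d3": (60, 99)}
crosses = {"d1": False, "d2": False, "d3": True}
sites = candidate_sites(deletions, crosses, map_length=100)
print(sites[0], sites[-1])
```

In this example the point mutation must lie within both d1 and d2 but outside d3, so the consistent positions are units 30 through 49.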
Lateral Gene Transfer and Genome Evolution
Lateral gene transfer (LGT), also known as horizontal gene transfer (HGT), is the transfer of genetic material between individual bacteria or archaea and other organisms. The participating organisms are sometimes members of the same species, but they can also be members of different species or even distinct taxonomic groups. Common examples of LGT are the three bacterial transfer processes discussed in this chapter: conjugation, transformation, and transduction. Each of these processes occurs readily in and between species. Extensive studies of LGT across a wide range of bacterial and archaeal species find that on average more than 12% of the genes in a genome are the result of LGT. The range in the amount acquired by LGT is quite wide, from a high of more than 25% in the genome of the archaeal organism Methanosarcina acetivorans to less than 2% of the genome in the bacterium Mycoplasma genitalium, although the small size of this genome may play a role in its low rate of LGT. E. coli is relatively high on the LGT percentage-transfer list, with about 17% of the genome transferred by LGT. Studies of LGT detect a substantial bias in the biological function of laterally transferred genes. Genes whose protein products are expressed at the cell surface, genes encoding DNA-binding proteins, and genes whose products have pathogenicity-related functions are much more likely to undergo LGT.
LGT between bacteria is prevalent, but in addition, there has long been evidence of limited LGT between bacteria and eukaryotes. Prior to the availability of genome sequence information, LGT between bacteria and eukaryotes was thought to be limited to the transfer of a very small number of genes. From an evolutionary perspective, the most prominent of these earlier known examples of bacteria–eukaryote LGT are the presence of mitochondria in eukaryotic cells and the presence of chloroplasts in plant cells. Mitochondria and chloroplasts are essential organelles in eukaryotic cells. Millennia ago, ancient bacteria invaded ancient eukaryotic cells, and through a process of coevolution on the part of both cells, mitochondria and chloroplasts established endosymbiotic relationships with eukaryotic cells. Both organelles carry their own chromosomes that contain unique genetic information. In animal cells, mitochondrial gene products work with nuclear gene products to produce adenosine triphosphate (ATP) used for energy; and in plant cells, chloroplast gene products are responsible for photosynthesis. The inheritance of mitochondrial and chloroplast genes differs from that of nuclear genes because the organelles are cytoplasmic, not nuclear. We discuss the details of cytoplasmic heredity and the evolution of mitochondria and chloroplasts in Chapter 17.
Another well-known example of bacteria–eukaryote LGT is the transfer of DNA from the bacterium Agrobacterium tumefaciens to plants. Agrobacterium transfers about 10,000 to 30,000 base pairs of DNA from its much larger tumor-inducing (Ti) plasmid to plant cells. In plants, this DNA causes crown gall disease, a type of cancerous tumor. The natural tendency of Ti plasmid to be transferred into plant cells is utilized in the research laboratory to produce transgenic plants, as we discuss in Chapter 15.
In 2007, genome sequencing information demonstrated extensive LGT between the bacterium Wolbachia and a large number of insects. The data indicate that roughly one-third of all arthropod genomes contain Wolbachia DNA transferred by LGT. Researchers speculate that LGT between bacteria and animals may be much more common than previously thought. Only some of the transferred genes appear to actually enter the germ line, where they can be transmitted during reproduction. There is, however, recent speculation that DNA transferred by LGT from bacteria could become inserted into the genomes of somatic cells, where it could induce mutations. If such insertional mutagenesis does in fact occur, it could possibly be a cause of abnormalities, including the development of cancer. More information will emerge about this topic in the near future.

Identifying Lateral Gene Transfer in Genomes
LGT is identified by the presence of DNA-sequence features that make certain portions of a genome distinct from the rest of the genome. These distinctive genome regions are called genomic islands because they occur within a confined portion of the genome. Genomic islands typically are large segments that span 10–200 kb and often include multiple genes that may have related functions. Two common ways to identify a genomic island acquired by LGT are (1) by determining that a group of genes are much more similar to genes of a distantly related species than to those of a closely related species, and (2) by detecting a region of genome that has a ratio of G–C base pairs to A–T base pairs that is substantially higher or lower than the average in the rest of the genome.
Recent evidence points to a significant role for LGT in the evolution of genomes. Moreover, in two particular ways, some LGT-driven events are of profound medical importance to humans. First, LGT has allowed many organisms to adapt rapidly to changing environmental conditions by acquiring the ability to resist one or more antibiotic compounds. With this ability, drug-resistant bacteria can proliferate in the presence of the antibiotics. LGT within and between bacterial species is a common route for the rapid dissemination of antibiotic resistance, and medical practitioners today routinely encounter patients with infections produced by bacterial strains resistant to one or more of the commonly used antibiotics. The U.S. Centers for Disease Control and Prevention (CDC) issued a report in late 2013 highlighting the seriousness of antibiotic resistance as a prevalent medical problem. The report stated that each year in the United States more than 2 million people are infected with antibiotic-resistant bacteria and that the annual death rate from these infections is nearly 25,000.
Not only is antibiotic resistance readily transferred between bacteria by LGT, but the prevalence of resistance genes is increased by the extensive use, and misuse, of antibiotics. The 2013 CDC report attributes a substantial portion of the increase in antibiotic-resistant strains to the pervasive use of antibiotics in animal agriculture, where they are often used to promote growth in animals with no signs of infection. These circumstances and the impact of this phenomenon on the practice of medicine are the subject of the Case Study at the end of this chapter.
The second medically relevant consequence of LGT in bacteria is the acquisition of pathogenicity islands, a subtype of genomic islands, containing multiple genes for proteins that promote the ability of the bacteria to invade the body of a host and also containing genes that produce toxic compounds.
Among the various strains of the common, and usually friendly, intestinal bacterium E. coli are some strains that are pathogenic. The most common strains of E. coli are commensal bacteria that inhabit our intestinal tract and provide benefits without doing harm. Certain strains, however, have acquired pathogenicity islands and cause illnesses such as diarrhea and meningitis. The recently identified pathogenic strain E. coli O157:H7 contains a pathogenicity island acquired by transduction. E. coli O157:H7 is found in some contaminated beef and on some fresh produce, including lettuce. Thorough rinsing can, but does not always, remove the pathogen from lettuce, and undercooking contaminated beef does not raise its temperature high enough to kill pathogens that may be present. The pathogenicity island in E. coli O157:H7 contains genes that promote the adhesion of the pathogen to intestinal cells and a toxin gene that acts similarly to, although not as dramatically as, the Vibrio cholerae toxin. Infection with E. coli O157:H7 produces diarrhea that can be severe in immune-compromised individuals or in infants and the elderly. The island also contains a gene producing a toxin that blocks translation in cells. This toxin particularly affects kidney and intestinal cells and contributes to bloody diarrhea.
CASE STUDY
The Evolution of Antibiotic Resistance and Its Impact on Medical Practice
Alexander Fleming got a little sloppy with his sterile technique one day in 1928 and made a mistake that has since saved millions of lives. Fleming was working with Staphylococcus, a common bacterial strain that causes a serious and potentially fatal “staph” infection when it enters the body through a cut or abrasion. On the fateful day, Fleming unknowingly contaminated his Staphylococcus culture with a fungus.
Normally, fungal cells reproduce in culture along with bacterial cells and are noticed when the culture is spread on plates. Fleming’s contaminating fungus was different, however, because when Fleming spread his contaminated culture on plates, only fungal colonies grew; there were no bacterial colonies! The fungus had killed the bacterial cells in the culture. Recognizing this as an important, if inadvertent, discovery, Fleming quickly identified the fungus as Penicillium and gave the compound that killed Staphylococcus the name penicillin.
In the 1930s, Howard Florey showed that penicillin was an effective antibiotic against a broad spectrum of infectious bacteria. At the beginning of World War II, Florey directed a major “scale-up” project to put penicillin into mass production. Penicillin proved tremendously effective at preventing what otherwise might have been fatal bacterial infections.
Today, although penicillin and other antibiotics continue to save lives, antibiotic-resistant strains of bacteria are increasingly the cause of difficult-to-treat infections and even death. This is quickly becoming an acute problem in modern medicine. For example, at present more than 95% of Staphylococcus strains found in hospitals are resistant to penicillin, and some strains carry resistance alleles to multiple antibiotics. One such strain is methicillin-resistant Staphylococcus aureus (MRSA). What happened to bring about this shift? The answer has two parts. One component we have already mentioned: the evolution of antibiotic resistance and the acquisition of pathogenicity by bacteria through lateral gene transfer. Antibiotic resistance can be readily transferred within a species and between bacterial species by conjugation, transduction, or transformation, all of which are forms of LGT.
The second factor is the use and misuse of antibiotics themselves, which establishes an environment in which resistant strains proliferate at the expense of sensitive strains. Exposing bacteria to antibiotics, which generally leads to killing antibiotic-sensitive bacteria, can at the same time allow the survival of antibiotic-resistant bacteria. Even when they are properly used, antibiotics can act as an agent of artificial selection that fosters the survival of resistant strains at the expense of sensitive strains. When antibiotics are misused (as when they are used to encourage growth in livestock), when they are not taken for the prescribed period of time by a patient, or when they are used to treat nonbacterial infections, they eliminate great numbers of antibiotic-sensitive bacteria and promote the proliferation of resistant bacteria.
A part of the challenge to physicians dealing with these changing circumstances is that resistance and sensitivity to antibiotics are not absolute characteristics. A “resistant” strain is just that: resistant to an antibiotic but not necessarily impervious to it. It takes more antibiotic to kill a resistant strain than to kill a sensitive strain. With regard to treating an infected person or animal, the medical question is: At what dosage is the benefit of the antibiotic outweighed by the harm that might be done to the patient by toxicity of the antibiotic or by too many of the body’s beneficial bacteria being destroyed? Antibiotic resistance is a rapidly growing problem that has already changed practices in medical treatment of infectious disease. The future holds more changes, both in patient treatment and in other uses of antibiotics.
At present, and increasingly in the future, physicians must be acutely aware of the events and behaviors that can lead to bacterial infection, be hypervigilant in spotting potential infections by resistant strains, and be prepared to quickly adapt medical treatments and protocols to manage resistant strains of bacteria. Physicians must understand how and why antibiotic resistance has evolved if they are going to be successful in dealing with its ramifications for their patients.
SUMMARY
6.1 Specialized Methods Are Used for Genetic Analysis of Bacteria
❚❚ Bacteria can be propagated in liquid growth media or on semisolid growth media.
❚❚ Replica plating allows the bacterial colonies on one plate to be transferred to additional plates in the same relative positions, thus facilitating genetic analysis of individual colonies.
6.3 Bacterial Transformation Produces Genetic Recombination
❚❚ Extracellular fragments of DNA released when a donor bacterial cell lyses can be absorbed across the cell membrane of a competent recipient cell as transforming DNA.
❚❚ Transforming DNA undergoes homologous recombination with the recipient chromosome to produce transformants that have acquired donor DNA.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help in solving problems.
1. Be able to describe or diagram the chromosomes and plasmids of F+, Hfr, and F′ bacteria.
2. Be able to describe the differences between conjugation, transformation, and transduction.
3. Be familiar with basic microbiological laboratory methods for growing and replica plating bacterial cells.
4. Be familiar with Table 6.1 (p. 196) and be ready to use the outcomes listed in it to identify types of donor bacterial strains.
5. Be prepared to assess bacterial genotypes based on growth ability in media of various compositions and to apply those assessments in analyzing conjugation and transduction experiments.
6. Be prepared to use the results of time-of-entry experiments to determine gene order and map distance in donor bacterial strains.
7. Be prepared to calculate cotransduction frequencies and to apply those calculations to gene order determination.
8. Understand genetic complementation analysis of bacteriophages and how to distinguish the results of genetic complementation from those of recombination.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.
1. For bacteria that are F+, Hfr, F′, and F-, perform or answer the following.
   a. Describe the state of the F factor.
   b. Which of these cells are donors? Which is the recipient?
   c. Which of these donors can convert exconjugants to a donor state?
   d. Which of these donors can transfer a donor gene to exconjugants?
   e. Describe the results of conjugation (i.e., changes in the recipient and the exconjugant) that allow detection of the state of the F factor in a donor strain.
   f. Describe a “partial diploid” and how it originates.
2. The flow diagram identifies relationships between bacterial strains in various F factor states. For each of the four arrows in the diagram, provide a description of the events involved in the transition.
   [Flow diagram: F- → F+ ⇄ Hfr → F′, with the four arrows numbered 1 to 4]
3. Conjugation between an Hfr cell and an F- cell does not usually result in conversion of exconjugants to the donor state. Occasionally however, the result of this conjugation is two Hfr cells. Explain how this occurs.
4. Bacteria transfer genes by conjugation, transduction, and transformation. Compare and contrast these mechanisms. In your answer, identify which if any processes involve homologous recombination and which if any do not.
5. Explain the importance of the following features in conjugating donor bacteria:
   a. the origin of transfer
   b. the conjugation pilus
   c. homologous recombination
   d. the relaxosome
   e. relaxase
   f. T strand DNA
   g. pilin protein
6. Describe the difference between the bacteriophage lytic cycle and lysogenic cycle.
7. Describe what is meant by the term site-specific recombination as used in identifying the processes that lead to the integration of temperate bacteriophages into host bacterial chromosomes during lysogeny or to the formation of specialized transducing phage.
8. What is a prophage, and how is a prophage formed?
9. How is the frequency of cotransduction related to the relative positions of genes on a bacterial chromosome? Draw a map of three genes and describe the expected relationship of cotransduction frequencies to the map.
10. Describe the differences between genetic complementation and recombination as they relate to the detection of wild-type lysis by a mutant bacteriophage.
11. Among the mechanisms of gene transfer in bacteria, which one is capable of transferring the largest chromosome segment from donor to recipient? Which process generally transfers the smallest donor segments to the recipient? Explain your reasoning for both answers.
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
12. What is lateral gene transfer? How might it take place between two bacterial cells?
13. Lateral gene transfer is thought to have played a major role in the evolution of bacterial genomes. Describe the impact of LGT on bacterial genome evolution.
14. Seven deletion mutations (1 to 7 in the table below) are tested for their ability to form wild-type recombinants with five point mutations (a to e). The symbol “+” indicates that wild-type recombination occurs, and “-” indicates that wild types are not formed. Use the data to construct a genetic map of the order of point mutations, and indicate the segment deleted by each deletion mutation.

    Point        Deletion Mutation
    Mutation     1   2   3   4   5   6   7
    a            -   +   -   -   +   +   -
    b            +   +   +   -   +   -   -
    c            +   +   +   +   -   -   -
    d            -   +   +   -   +   -   -
    e            +   -   -   -   +   +   -

15. A 2013 CDC report identified the practice of routinely adding antibiotic compounds to animal feed as a major culprit in the rapid increase in the number of antibiotic-resistant strains. Agricultural practice in recent decades has encouraged the addition of antibiotics to animal feed to promote growth rather than to treat disease.
   a. Speculate about the process by which feeding antibiotics to animals such as cattle might lead to an increase in the number of antibiotic-resistant strains of bacteria.
   b. How might the increase in antibiotic-resistant strains of bacteria in cattle be a threat to human health?
16. Hfr strains that differ in integrated F factor orientation and site of integration are used to construct consolidated bacterial chromosome maps. The data below show the order of gene transfer for five strains.

    Hfr Strain   Order of Gene Transfer (First → Last)
    Hfr A        oriT – thr – leu – azi – ton – pro – lac – ade
    Hfr B        oriT – mtl – xyl – mal – str – his
    Hfr C        oriT – ile – met – thi – thr – leu – azi – ton
    Hfr D        oriT – his – trp – gal – ade – lac – pro – ton
    Hfr E        oriT – thi – met – ile – mtl – xyl – mal – str

   a. Identify the overlaps between Hfr strains. Identify the orientations of integrated F factors relative to one another.
   b. Draw a consolidated map of the bacterial chromosome. (Hint: Begin by placing the insertion site for Hfr A at the 2 o’clock position and arranging the genes thr-leu-azi- . . . in clockwise order.)
17. Five Hfr strains from the same bacterial species are analyzed for their ability to transfer genes to F- recipient bacteria. The data shown below list the origin of transfer (oriT) for each strain and give the order of genes, with the first gene on the left and the last gene on the right. Use the data to construct a circular map of the bacterium.

    Hfr Strain   Genes Transferred
    Hfr 1        oriT  met  ala  lac  gal
    Hfr 2        oriT  met  leu  thr  azi
    Hfr 3        oriT  gal  pro  trp  azi
    Hfr 4        oriT  leu  met  ala  lac
    Hfr 5        oriT  trp  azi  thr  leu  met

18. An interrupted mating study is carried out on Hfr strains 1, 2, and 3 identified in Problem 17. After conjugation is established, a small sample of the mixture is collected every minute for 20 minutes to determine the distance between genes on the chromosome. Results for each of the three Hfr strains are shown below. The total duration of conjugation (in minutes) is given for each transferred gene.

    Hfr strain 1      oriT  met  ala  lac  gal
    Duration (min)    0     2    8    13   17
    Hfr strain 2      oriT  met  leu  thr  azi
    Duration (min)    0     2    7    10   17
    Hfr strain 3      oriT  gal  pro  trp  azi
    Duration (min)    0     3    8    14   19

   a. For each Hfr strain, draw a time-of-entry profile like the one in Figure 6.11a.
   b. Using the chromosome map you prepared in answer to Problem 17, determine the distance in minutes between each gene on the map.
   c. Explain why azi is the last gene of strain 2 to transfer in the 20 minutes of conjugation time. How many minutes of conjugation time would be needed to allow the next gene on the map to transfer from Hfr strain 2?
   d. Write out the interrupted mating results you would expect after 20 minutes of conjugation for Hfr strains 4 and 5. Use the format shown at the beginning of this problem.
   e. In minutes, what is the total length of the chromosome in the donor species?
19. An Hfr strain with the genotype cys+ leu+ met+ strS is mated with an F- strain carrying the genotype cys- leu- met- strR. In an interrupted mating experiment, small samples of the conjugating bacteria are withdrawn every 3 minutes for 30 minutes. The withdrawn cells are shaken vigorously to stop conjugation and then placed on three different selection media, composed as follows:
   Medium 1: Minimal medium plus leucine, methionine, and streptomycin
   Medium 2: Minimal medium plus cysteine, methionine, and streptomycin
   Medium 3: Minimal medium plus cysteine, leucine, and streptomycin
   a. What donor gene is the selected marker in each medium?
   b. List all possible bacterial genotypes growing on each medium.
   c. What is the purpose of adding streptomycin to each selection medium?
   The following table shows the number of colonies growing on each selection medium. The sampling time indicates how many minutes have passed since conjugation began.

    Sampling Time    Number of Colonies
    (minutes)        Plate 1   Plate 2   Plate 3
    3                0         0         0
    6                0         0         0
    9                0         62        0
    12               0         87        0
    15               51        124       0
    18               79        210       62
    21               109       250       85
    24               144       250       111
    27               152       250       122
    30               152       250       122

   d. Determine the order of donor genes cys, leu, and met from the interrupted mating data.
   e. Suppose a fourth selection medium containing leucine and streptomycin is prepared. At what sampling time do you expect the first-growing colonies to appear? Explain your reasoning.
21. Penicillin was first used in the 1940s to treat gonorrhea infections produced by the bacterium Neisseria gonorrhoeae. In 1984, according to the CDC, fewer than 1% of gonorrhea infections were caused by penicillin-resistant N. gonorrhoeae. By 1990, more than 10% of cases were penicillin-resistant, and a few years later the level of resistance was at greater than 95%. Almost every year the CDC issues new treatment guidelines for gonorrhea that identify the recommended antibiotic drugs and dosages.
   a. Why is the CDC so active in making these recommendations?
   b. What are the short-term implications of these frequent changes for physicians and clinics that treat sexually transmitted diseases like gonorrhea and for individuals infected with gonorrhea?
   c. What are the long-term implications of these frequent changes in treatment recommendations for the patient population?
22. An attribute of growth behavior of eight bacteriophage mutants (1 to 8) is investigated in experiments that establish coinfection by pairs of mutants. The experiments determine whether the mutants complement one another (+) or fail to complement (-). These eight mutants are known to result from point mutation. The results of the complementation tests are shown below.

         Mutations
         1   2   3   4   5   6   7   8
    1    -   +   +   +   -   +   +   -
    2        -   +   +   +   +   +   +
    3            -   +   +   +   -   +
    4                -   +   -   +   +
    5                    -   +   +   -
   f. Gene mapping information identifies mutations 2 and 3 as the flanking markers in this group of genes. Assuming these mutations are on opposite ends of the gene map, determine the order of mutations in the region of the chromosome.
23. Synthesis of the amino acid histidine is a multistep anabolic pathway that uses the products of 13 genes (hisA to hisM) in E. coli. Two independently isolated his- E. coli mutants, designated his1- and his2-, are studied in a conjugation experiment. A his+ F′ donor strain that carries a copy of the hisJ gene on the plasmid is mated with a his1- recipient strain in Experiment 1 and with a his2- recipient in Experiment 2. The exconjugants are grown on plates lacking histidine. Growth is observed among the exconjugants of Experiment 2 but not among those of Experiment 1.
   a. Why is growth observed in Experiment 2 but not in Experiment 1?
   b. What is the genotype of exconjugants in Experiment 2?
24. The phage P1 is used as a generalized transducing phage in an experiment combining a donor strain of E. coli of genotype leu+ phe+ ala+ and a recipient strain that is leu- phe- ala-. In separate experiments, transductants are selected for leu+ (Experiment A), for phe+ (Experiment B), and for ala+ (Experiment C). Following selection, transductant genotypes for the unselected markers are identified. The selection experiment results below show the frequency of each genotype.

    Experiment A        Experiment B        Experiment C
    phe- ala-   26%     leu- ala-   65%     leu- phe-   71%
    phe+ ala-   50%     leu+ ala-   48%     leu+ phe-   21%
    phe- ala+   19%     leu- ala+    0%     leu- phe+    0%
    phe+ ala+    3%     leu+ ala+    4%     leu+ phe+    3%

   a. What compound or compounds are added to the minimal medium to select for transductants in Experiments A, B, and C?
   b. Determine the order of genes on the donor chromosome.
   c. Diagram the crossover events that form each of the transductants in Experiment A.
   d. In Experiment B, why are there no transductants with the genotype leu- ala+?
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
25. Define the term genetic complementation.
   a. Describe how the term applies to an experiment in which two lysis-defective bacteriophages are able to coinfect a bacterial cell and produce lysis.
   b. Locate another example of genetic complementation in this book and describe how genetic complementation works in that case.
   c. Does the term genetic complementation have the same meaning in both cases? Explain.
26. Devise an experiment to identify bacteria that are auxotrophic and unable to produce two amino acids, lysine (lys) and valine (val). The auxotrophic bacteria are in a pool of bacteria in which all the other bacteria are prototrophic. The genotype of the auxotrophs is lys- val-. Describe each step in the experiment, identify the constituents in any growth media or growth plates you propose, and identify the results that will conclusively identify bacteria that are lys- val-.
27. Look closely at the consolidated Hfr map and the data used to build the map on page 201. Suppose a fifth Hfr strain had the F factor inserted exactly halfway between cysE and leuU and had an orientation that was the same as that of Hfr 1. List the order of gene transfer for the first six genes transferred by this Hfr and the number of minutes of conjugation at which each gene is expected to be seen.
28. Fifty bacterial colonies are on a complete-medium growth plate. The colonies are replica plated to a minimal medium plate, and 46 colonies grow. What can you say about the bacteria from the four colonies that do not grow? Design an experiment and describe the methods you would use to determine if any of these four colonies are leu-, arg-, or val-.
APPLICATION A
Human Hereditary Disease and Genetic Counseling
Genetic counseling, a central activity in medical genetics, seeks to provide individuals, cou-
ples, and families with medical and genetic information they can use to make informed deci-
sions about genetic testing and medical treatment, in person-to-person meetings involving
physicians, genetic counselors, and consultands.
never heard of ARG and were naturally very upset to learn that their first child
had a genetic disease. In a rapid series of meetings over the next two days, B.K.’s
parents met with a pediatrician, a medical geneticist, a dietician, and a genetic
counselor. What they learned brought them considerable relief and assurance that
with diligent effort they could manage B.K.’s ARG and that there was a strategy
for monitoring future pregnancies for the risk of ARG.
Over those first days, B.K.’s parents learned that ARG is a very rare autoso-
mal recessive condition. Only 1 in 350,000 to 1 in 1,000,000 newborns have the
disease. It is caused by a deficiency of the enzyme arginase that helps break down
the amino acid arginine during the digestion of dietary protein. The main problem
for those with ARG is a buildup of ammonia, a by-product of protein breakdown,
in the blood because of the inability to efficiently break down arginine. At high
levels, ammonia is toxic, especially to the nervous system. Without treatment, B.K.
would experience poor growth that would be evident in his first year or two, poor
muscle control and balance, and significant learning delays.
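The incidence range quoted above can be related to how common carriers are in the population. The following is a minimal sketch, assuming Hardy-Weinberg proportions (random mating, no selection); that assumption is not stated in this case, and the function name `carrier_frequency` is illustrative only.

```python
import math

def carrier_frequency(incidence: float) -> float:
    """Estimate the heterozygote (carrier) frequency 2pq from the incidence q^2
    of an autosomal recessive condition, assuming Hardy-Weinberg proportions."""
    q = math.sqrt(incidence)   # frequency of the disease allele
    p = 1.0 - q                # frequency of the normal allele
    return 2.0 * p * q

# Incidence bounds quoted in the text for ARG
for births in (350_000, 1_000_000):
    freq = carrier_frequency(1 / births)
    print(f"incidence 1 in {births:,}: about 1 carrier in {round(1 / freq)}")
```

Under these assumptions, carriers are on the order of 1 in 300 to 1 in 500 people, which illustrates why a rare recessive disease can appear in a family with no prior history of it.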
Fortunately, as B.K.’s parents also learned, ARG is treatable with a combina-
tion of a very low protein diet, specially prepared foods, medication that helps
clear excess ammonia from the blood, and regular blood testing to monitor B.K.’s
blood ammonia level. If these treatments were applied from birth, B.K. would
almost certainly grow up normal and healthy. And if he maintained his treatment
throughout his life, he was likely to live a normal lifespan. In addition, B.K.’s par-
ents learned that because ARG is autosomal recessive, both of them were hetero-
zygous carriers of a mutation of the ARG1 gene and could transmit the disease to
a future child. Their medical geneticist and genetic counselor told them that the
risk of this occurring was 25% but that prenatal testing and monitoring were avail-
able to identify any future children with ARG before birth.
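The 25% figure follows from a cross between two heterozygous carriers. A minimal sketch of that Punnett-square count is shown here, using the hypothetical allele symbols `A` (functional ARG1 allele) and `a` (mutant allele); the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Enumerate the equally likely gamete pairings (a Punnett square) and
    return the probability of each offspring genotype."""
    # Sorting each pair makes "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Both parents are heterozygous carriers
risks = offspring_genotypes("Aa", "Aa")
print(risks)
```

The `aa` entry is the recurrence risk for an affected child in each pregnancy, and the `Aa` entry is the chance that a child is an unaffected carrier.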
Since B.K.’s birth, his parents have worked hard to maintain his diet, admin-
ister his medication, and monitor his blood ammonia level. They received a great
deal of counseling and referral assistance from their genetic counselor, and they
have been in regular contact with their medical geneticist, their pediatrician (who
has had to learn about ARG herself), and their dietician. They also found a support
group for parents of children with medical conditions. To date, B.K. has hit all of
his motor and mental milestones. He began to babble and form words on sched-
ule and likes to go to the park where he can run on the grass.
first is the identification and classification of various types of hereditary disease, and the second is the role genetic counseling plays in medical genetics.
A.1 Hereditary Disease and Disease Genes
In some cases there is a one-to-one correlation between a hereditary disease and a gene whose mutation causes the disease. In other cases, the disease phenotype can be caused by a mutation of any one of multiple genes; and in still other cases, disease onset is influenced by genes along with environmental or developmental factors.
Typically, the first job of the medical geneticist is to correctly diagnose the condition so that genetic information can be used appropriately. But correctly diagnosing a genetic disease can be challenging. Hereditary diseases vary widely in their onset, their severity, and the frequencies with which they occur in populations. This means that the likelihood of encountering certain genetic diseases can be influenced by the population frequency of gene mutations, the population of origin of the patient, the degree of genetic relationship between the parents, and the occurrence of other factors that contribute to or modify the appearance of a disease.
The list of hereditary diseases and of genes whose mutations cause or contribute to hereditary disease grows almost by the day. In the Case Study at the end of Chapter 2, we discussed the cataloging of hereditary diseases and genes in the online human genetic database known as Online Mendelian Inheritance in Man (OMIM) and also looked at estimated rates of human gene mutations. You may refer to that case study for details on OMIM and its contents when OMIM is mentioned in the discussion below.
Types of Hereditary Disease
Hereditary disease has three major classifications. Within each of these classifications the conditions differ widely in onset, diagnosis, and management, and also in the probabilities of their recurrence in families.
Mendelian Conditions
Conditions that are caused by the mutation of a single gene are Mendelian conditions. Among them, six patterns of inheritance are observed: autosomal dominant, autosomal recessive, X-linked dominant, X-linked recessive, Y-linked inheritance, and mitochondrial inheritance. Up to this point in the text, we have discussed the first five of these inheritance patterns (see Sections 2.6 and 3.5 for a review). Mitochondrial inheritance and human mitochondrial diseases are discussed in Chapter 17, and we will not address them here.
Mendelian conditions either can be inherited through alleles carried by one or both parents or can be the result of a new mutation. Whether the mutation is new or inherited has no effect on the condition itself; but it does influence the risk that the condition could recur in a subsequent child. We discuss this difference momentarily.
When the population as a whole is considered, it is common to find that a disease-producing gene has more than one mutant allele. Since these alleles differ from one another, they may have different effects on the phenotype. In other words, the phenotypic abnormalities or complications that develop may differ somewhat from case to case as the result of different mutations of a particular gene. For this reason, it is common to refer to Mendelian conditions as “syndromes,” a term referring to a set of abnormalities, some or all of which may appear in a specific patient.
Chromosomal Conditions
The presence of an extra chromosome, the absence of a chromosome, the duplication or deletion of a chromosome segment, and certain structural rearrangements of chromosomes can each lead to developmental and physical abnormalities. These conditions and their production are topics of discussion in Chapter 10. Humans are especially sensitive to changes in the number of copies of their genes, requiring two copies of each autosomal gene and one expressed copy of each X-linked, and in males, each Y-linked gene for normal development. The presence of three copies of genes, as happens in chromosome trisomy, when there are three copies of a chromosome instead of the normal homologous pair of chromosomes, or as occurs when a portion of a chromosome is duplicated, disrupts normal development and can produce substantial abnormalities.
Most of the conditions associated with chromosome numerical or structural changes are also classified as syndromes, since the specific characteristics can vary somewhat in different patients. For example, individuals with trisomy 21, or Down syndrome, collectively display a wide range of intellectual deficits and physical complications. One syndrome caused by chromosomal insufficiency is cri-du-chat syndrome (the French term means “cat’s cry”), resulting from the deletion of a small segment of one copy of chromosome 5. The deletion creates a partial monosomy (one copy of a portion of a chromosome pair). Cri-du-chat syndrome, like most chromosomal conditions, produces a number of abnormalities, but it is recognized by the cat-like cry of newborn infants with the condition.
Similar to autosomal gains and losses, the gain or loss of all or part of the X or the Y chromosome also results in abnormal development. Changes in the number of sex chromosomes, such as sex chromosome trisomies XXY and XXX or the sex chromosome monosomy XO (one X chromosome and no second sex chromosome), each produce their own unique set of developmental anomalies.
For the most part, chromosomal conditions are the result of new mutations. Aberrations of chromosome number are most often caused by errors during meiotic cell division that lead to sperm or eggs whose nuclei contain the wrong number of chromosomes. Recall from Chapter 3 that normal human sperm and egg cells carry one chromosome from each
homologous chromosome pair. The normal chromosome content of sperm or egg is the human haploid chromosome number n = 23. Errors during meiotic cell division can, for example, generate sperm or egg cells with an extra copy of a chromosome, such as n + 1 = 24 chromosomes. When such a cell is united at fertilization with a cell containing 23 chromosomes, the result is n + n + 1 = 47 chromosomes. This is the usual way trisomy 21 is produced.
Because chromosome aberrations are usually the result of spontaneous errors during meiotic cell division, their recurrence risk is generally low. There are certain exceptions, however. As we discuss in Chapter 10, maternal age at conception (or, in in vitro fertilization techniques, the age of the ovum) affects the likelihood of the occurrence of a meiotic cell division error that is strongly correlated with the chance of having a child with trisomy 21. This means that determining the recurrence risk of trisomy 21 must take the mother’s age (or condition of the ovum) into account.
Multifactorial Conditions
A number of human diseases and conditions occur through the influence of multiple genes along with nongenetic (environmental) factors. Conditions of this type, which include diabetes, many kinds of heart disease, certain types of cancer, and several other conditions, are called multifactorial conditions. This name indicates that inherited genetic variation may play a role in making some individuals more likely than others to develop particular conditions or diseases. It is common to refer to an “inherited susceptibility” to certain diseases in referring to individuals whose genotype puts them at higher risk for a disease than an average member of the general population.
Inherited susceptibilities to certain diseases vary between populations. This is often due to different frequencies of certain mutant alleles in different populations. Determining the risk of a multifactorial disease recurring in a family must take into account the incidence of diseases associated with particular susceptibility genotypes. A more detailed discussion of multifactorial disease is presented in Chapter 17.
Genetic Testing and Diagnosis
Clinical observations and examinations are the information-gathering step for diagnosing any condition, including genetic disease. Many genetic diseases first manifest symptoms in infants or children, so it is common for an obstetrician or a pediatrician to be the first clinician to note an unusual finding. On the other hand, a number of genetic conditions are very subtle and symptoms may not appear until later in life; some are even delayed until adulthood. In these cases, internists and general practitioners may be the medical personnel who first notice an abnormality. Often, owing to the large number of different genetic conditions and the relative rarity of many of them, personal physicians feel the need to refer their patients to larger regional hospitals where medical staff members include clinical geneticists, genetic scientists, and genetic counselors. Even clinicians with a specialty in genetics may need assistance from medical personnel with expertise in particular body systems to accurately diagnose and counsel individuals who may have inherited certain rare genetic conditions.
Today, accurate clinical diagnosis of genetic diseases and conditions is greatly aided by the availability of molecularly based genetic tests that use blood and tissues as the source of DNA and RNA for the identification of gene mutations. These tissues can also be sources of proteins for biochemical analysis and of material used to culture cells and examine chromosomes. There are three general categories of genetic testing: molecular analysis, biochemical analysis, and chromosome analysis. Molecular analysis of DNA or RNA is very useful when there is the suspicion of a specific Mendelian condition. This type of analysis is best applied when specific genes known to be involved in the condition can be tested, and especially when specific gene mutations can be targeted for examination. Molecular analysis can be effective in identifying both newly occurring mutations and mutations passed from parents to affected children.
Biochemical analysis takes blood or tissues from affected body organs or systems and assays them for the absence or presence of particular proteins, studies them to determine whether levels of particular proteins are within normal ranges, or examines them for the presence of protein variants associated with disease. This approach is used in diagnosing many kinds of disease, including many genetic diseases.
Chromosome analysis can be used in the diagnosis of a fetus or a newborn infant affected by malformations associated with a chromosomal condition. It also aids in diagnosing patients of any age who have otherwise unexplained mental impairment or physical abnormalities and in cases of long-term infertility in adults.
Beyond their diagnostic applications, these three approaches to genetic analysis have other uses. The molecular details of these testing approaches are discussed in Application Chapter B: Human Genetic Screening. All three forms of testing can be used in the following assessment strategies.
Carrier Testing
Carrier testing is the use of a molecular, biochemical, or chromosomal analysis to identify individuals who do not have a genetic condition but who are heterozygous and carry recessive alleles for autosomal or X-linked conditions in their genotype that might be passed to a child, or who carry a chromosome abnormality that could produce a chromosome condition in a future child. Carrier testing can be done as part of a genetic assessment of a family or it can be done on a population or community-wide basis as part of a public health effort to identify carriers.
Population-based carrier testing, or community-based carrier testing, usually focuses on individuals of specific backgrounds in which the frequency of a certain genetic disease is high and where a large proportion of the population
are carriers. An example is Tay–Sachs disease (OMIM 272800), a fatal autosomal recessive condition that manifests in infants. Tay–Sachs disease, caused by the absence of the enzyme hexosaminidase A (hexA), is a progressive neuromuscular disorder that is usually fatal in childhood. The mutant allele of hexA is particularly frequent in Ashkenazi Jewish populations originating in Eastern Europe. Carrier testing programs targeting teenagers and young adults of Ashkenazi descent are designed to identify carriers and to provide information about the disease and about reproductive options if two prospective parents are both carriers of the mutant allele.
Presymptomatic Testing
Presymptomatic testing is carried out for genetic conditions that have a late age of onset. Huntington disease (HD), discussed in Chapter 4 (see Figure 4.11), is one example of a genetic disease that appears later in life and has a variable age of onset. By age 40, only about 50% of people who carry the autosomal dominant mutant allele for HD have symptoms of the disease. Many people who have a parent with HD wish to know definitively whether or not they carry the mutation. Presymptomatic testing can make that determination by testing DNA to identify a mutation of the affected gene.
Newborn Testing
Newborn testing consists of a set of mandated genetic tests that together require only a few drops of blood taken by “heel stick” from a newborn infant. (A heel stick is, quite literally, the pricking of a newborn’s heel with a small lancet to collect a small amount of blood.) Every state in the United States, and many foreign countries as well, requires this collection of a newborn’s blood, which is then tested for three dozen or more rare genetic diseases that are preventable or can have their symptoms greatly ameliorated by early and ongoing treatment, as the case of ARG described at the beginning of this chapter exemplifies. Treatment regimens include replacing missing or defective enzymes or other substances, dietary supplementation or dietary restriction, removal of toxic byproducts, or blocking of a pathogenic process. (See Application Chapter B: Human Genetic Screening for more details on newborn genetic testing.)
Prenatal Testing
Prenatal testing is performed during pregnancy for the purpose of determining whether a fetus has a particular condition or disorder. Prenatal testing most commonly either collects a biopsy of tissue from the placental chorion through chorionic villus sampling (CVS) or collects a small amount of amniotic fluid from the amniotic sac through amniocentesis. Other methods of collecting either fetal tissue for DNA analysis or fluids for biochemical analysis can also be used.
In decades past, the principal focus of prenatal testing was to identify chromosome abnormalities. Chromosome analyses are performed by culturing cells collected by CVS or amniocentesis and then visualizing the chromosomes using microscopy. Conditions such as trisomy 21 and other chromosomal conditions can be identified in utero by these methods. More frequently today, prenatal testing is performed to determine whether a fetus has a particular genetic disorder. This can involve tissue collection by CVS, amniocentesis, or another method, isolation of DNA from the collected tissues, and testing the DNA for mutations of specific genes involved in producing a genetic disease or condition.
A.2 Genetic Counseling
Genetic counseling is an integral part of medical genetics. It is provided by specially trained professionals who have strong practical skills and knowledge both in genetics and in counseling. A genetic counselor may be the first point of contact a patient or family has with clinical genetics services, and there may be multiple contacts with the genetic counselor during and after the process of genetic testing. This individual is responsible for communicating all relevant information about upcoming genetic tests or the results of genetic tests. Genetic counselors in the United States and Canada complete a training program accredited, in the United States, by the American Board of Genetic Counseling, and in Canada, by the Canadian Board of Genetic Counselling. Europe also has a number of genetic counseling organizations that certify counselors in various countries. At the end of 2016 there were three dozen accredited genetic counseling programs in the United States and three additional programs in Canada. At that time there were more than 4000 certified genetic counselors in the United States and smaller numbers in Canada, European countries, and other countries around the world. The field is expected to grow over the next decade as the need for genetic counselors increases.
Genetic counselors are, most frequently, part of a large medical group that provides genetic services or are on the staff of hospitals that offer clinical genetics. Increasingly, however, genetic counselors are sole practitioners or work in small groups with business models and structures similar to those of psychologists in private practice. The daily work of genetic counselors includes a great deal of counseling to help individuals, identified as consultands, manage their concerns and the personal, familial, and social issues related to the genetic condition in question. In contrast to the large amount of time invested in counseling, it is fair to say that the genetic component of genetic counseling is an important but secondary activity.
Indicators and Goals of Genetic Counseling
There are many reasons to seek genetic counseling. Table A.1 lists the most common situations in which genetic counseling may be sought or recommended. Typically, the consultand
is an adult, couple, or family who either has a child with a genetic or chromosomal condition or has a family history of a condition. The genetic counselor will be asked to provide detailed, complete, and understandable information about the case and will be called upon to provide nondirective counseling that permits and encourages the consultand to understand and review the possible courses of action, and to facilitate the decision-making process. Providing genetic counseling is rarely a one-time event. Rather, genetic counseling is an ongoing process of communication designed to help the consultand address the complex personal, familial, and social issues associated with a genetic or chromosomal condition.
Table A.1 Common Indicators for Genetic Counseling Referral
1. A previous child with a genetic or chromosome condition
2. A family history of a genetic or chromosome condition
3. Advanced maternal age or other indicator of elevated risk in pregnancy
4. Fetal exposure to a toxic or harmful compound
5. Prolonged infertility or repeated pregnancy loss
6. New diagnosis of a genetic or chromosome condition
7. Consultation for risk assessment before or after a genetic or chromosome test
Table A.2 Goals of Genetic Counseling
1. Provide comprehensive information before or after a genetic or chromosomal test or diagnosis, including test results, available particulars about the course of the condition, and available medical management options.
2. Explain risk recurrence, the meaning of the recurrence risk estimate, and the role genetics plays in the condition.
3. Identify the beliefs, values, and relationships that are affected by the presence of a current or future genetic or chromosomal condition.
4. Identify and determine the course of action most appropriate for the consultand given the information available.
5. Provide referrals to support groups or services.
Genetic counseling has several goals as enumerated in Table A.2. Achieving these goals is aided by the active participation of the genetic counselor in the clinical team managing the case. For this reason, genetic counselors usually work closely with medical geneticists, treating physicians, medical laboratory personnel, and social service agencies or groups to coordinate a comprehensive, team-based plan for aiding the consultand in the near and long term.
Assessing and Communicating Risks and Options
A principal goal of genetic counseling is to provide the consultand with comprehensive, understandable medical information about a current condition or, where appropriate, about the risk of recurrence of a condition in a future pregnancy. With this information in hand, the consultand and the genetic counselor can talk through the consultand’s options, and the consultand can reach a decision regarding immediate needs or begin to prepare for situations that may arise in the future.
Immediate Decision Making
To illustrate the kind of information and discussion that might occur in a case requiring immediate decision making, let’s look at a hypothetical situation.
Example Case 1: The consultand in this case is C.R., who is 40 years old. She has two healthy children, ages 8 and 12, and is in her 14th week of an unplanned pregnancy. As a consequence of her age, C.R. has been offered and has taken a maternal serum screen (MSS), in which maternal blood is drawn and tested to establish the circulating levels of four compounds. These can indicate the possibility of elevated risk for chromosome conditions, including trisomy 21 (Down syndrome) and trisomy 18 (Edwards syndrome), as well as two neural tube defects, spina bifida (a serious condition that causes permanent paralysis) and anencephaly (a fatal condition of abnormal brain development). An MSS result indicating the possibility of any of these conditions can be followed up with ultrasound to examine the fetus for visual evidence of a neural tube defect, or with amniocentesis or CVS to collect fetal cells for chromosome inspection.
At age 40, the risk of Down syndrome is the highest of the four conditions. It increases with maternal age, especially after age 35, and at age 40 is nearly 1 in 100. This risk of Down syndrome was the reason for recommending MSS testing to C.R. We discuss this risk and its possible causes in Chapter 10.
Unfortunately, the MSS result indicates an increased possibility of Down syndrome, and C.R. is referred to a genetic counselor for follow-up discussion. When MSS indicates an increased chance of Down syndrome, it is correct in about 80% of cases. The false-positive rate, the rate at which the MSS results indicate the possibility of Down syndrome when the condition is not present, is about 9%.
The medical information collected by C.R.’s team would include her medical history and information on her current pregnancy, along with her MSS test results. At the meeting with the genetic counselor, C.R. and her partner hear about the results of MSS, are told about the test’s predictive accuracy and the rate of false-positive results, and are informed that they have the option of additional follow-up with either amniocentesis or CVS. The genetic counselor explains that the results of these tests take about 2 weeks to complete and that counseling services would be available to C.R. and her partner during the wait. The results would be explained as soon as they become available, and the options at that point would be discussed. The couple are also told that they can opt not to have additional follow-up testing.
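One way to see how the figures in this case interact is to combine them with Bayes' rule. The sketch below rests on an assumption that should be read carefully: it treats the "about 80%" figure as the screen's detection rate (the fraction of affected pregnancies that screen positive), which is one possible reading of the text. The prior of 1 in 100 and the 9% false-positive rate are the values quoted in the case; the function and parameter names are illustrative.

```python
def positive_predictive_value(prior: float, sensitivity: float,
                              false_positive_rate: float) -> float:
    """Probability the condition is present given a screen-positive result,
    computed with Bayes' rule."""
    true_positives = prior * sensitivity
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Assumed inputs: age-40 risk of about 1 in 100, detection rate 80%,
# false-positive rate 9%
ppv = positive_predictive_value(prior=1 / 100, sensitivity=0.80,
                                false_positive_rate=0.09)
print(f"Chance of Down syndrome given a screen-positive MSS: {ppv:.1%}")
```

Under these assumptions most screen-positive results come from unaffected pregnancies, which is consistent with the counselor presenting amniocentesis or CVS as a follow-up option rather than treating the MSS result as a diagnosis.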
Figure A.2 The family described in Example Case 3.
of the funding for the Human Genome Project was earmarked for an initiative supporting research and education concerning the Ethical, Legal, and Social Implications (ELSI) of the project. The workshops, research, pre- and postdoctoral support, and yearly conferences sponsored by ELSI focused on four specific areas of investigation that are affected by the collection of personal genetic or genomic information:
1. Genetic Research: Examining the design, conduct, and analysis of research and the dissemination of personal genetic or genomic information, especially with regard to detailed health information.
2. Genetic Health Care: Studying the uses of genetic or genomic information and the influence this information has on health care. In addition, examining the implications of this use for individuals, families, and society.
3. Societal Issues of Genetics and Genomics: Investigation of the beliefs, practices, and policies surrounding the collection and use of genetic and genomic information. Additionally, studying how information can be understood with respect to health, disease, and individual responsibility.
4. Legal, Regulatory, and Public Policy Issues: Examining the impact of current policies and regulations on genetic and genomic information collection and use, and recommending new regulations and policies as needed.
Beyond this effort, United States federal public policy has addressed some elements of the use and sharing of personal genetic information through two laws. In 2008, Congress passed the Genetic Information Nondiscrimination Act (GINA), which severely restricts the use of personal genetic information in issuing health insurance and in employment decisions. This protection was bolstered in 2010 with passage of the Affordable Care Act (ACA), which excluded the use of personal genetic information in issuing health insurance as part of the clause eliminating preexisting conditions as a basis for rejection.
Genetic Counseling and Ethical Issues
These GINA and ACA regulations help ease the concern that personal genetic or genomic information might be used by external entities to make decisions about an individual’s employment or insurance coverage, but they do not address other ethical issues stemming from medical genetics. Three fundamental principles guide medical genetics, and ethical dilemmas often arise when these principles are perceived to be in conflict. One important role for genetic counseling is to help individuals and families make decisions in situations where these principles seem to be at odds. The three guiding principles for the use of genetics in medicine are:
1. The likely benefit of medical genetics: Does genetic study benefit the patient?
2. Respect for individual autonomy: Does genetic study allow the patient to retain control over decisions regarding his or her health care and to be free from coercion in making health care decisions?
3. Justice: Does the use of genetics preserve the fair and equitable treatment of all individuals?
Ethical dilemmas presented by genetic testing are a frequent topic of discussion between genetic counselors and consultands. Three areas where dilemmas commonly arise are prenatal genetic and chromosome testing, newborn genetic screening, and testing for genetic predisposition.
Prenatal Genetic and Chromosome Testing
Prenatal genetic testing is performed for a variety of reasons and for a wide range of genetic and chromosome conditions. Results pointing to the absence of a condition are a welcome outcome, although if the condition runs in the family, counseling may be indicated to help individuals deal with “survivor guilt.” Chromosome conditions are very unlikely to be treatable, so a result indicating the presence of such a condition in a fetus may well trigger ethical conflicts about keeping or terminating an affected pregnancy. Or the degree of physical or mental impairment or the prognosis may be variable, which can also trigger ethical concerns and present difficult choices that require sorting out and discussing with a counselor.
A number of genetic conditions are amenable to treatment that can dramatically prolong and improve the quality of life. Cystic fibrosis, an autosomal recessive condition that affects respiration and causes chronic and serious respiratory infections, and severe combined immunodeficiency syndrome are examples of genetic diseases for which effective symptomatic treatments are routine. It is an open debate as to whether or not such conditions are good candidates for prenatal genetic testing. Detecting one of these conditions is very unlikely to lead to pregnancy termination. Aside from facilitating treatment early in infancy, there may be little to gain from performing such tests. On the other hand, with a genetic condition such as Tay–Sachs disease that is invariably fatal and for which no effective treatment is known, a good case can be made for prenatal genetic testing.
Newborn Genetic Screening
Much like mandated vaccination programs, newborn genetic screening is a public policy intended to save lives, reduce medical costs to society, and support the well-being of families. Identification of a targeted genetic condition prompts immediate
initiation of treatment, such as permanent dietary restrictions or dietary supplements, administration of medication, or other kinds of biochemical or physical therapies. Few if any ethical discussions take place around newborn testing, but consultands and families often benefit from counseling to manage the personal and family dynamics affected by a sick child.

Testing for Genetic Predisposition Genetic testing may also be done to detect the presence of a genetic variant that is likely to produce disease in the future (presymptomatic genetic testing) or the presence of a variant that confers additional risk of disease under particular conditions (testing for inherited susceptibility to a disease). The ethical issues surrounding both of these types of testing can be profound.

The autosomal dominant condition Huntington disease (HD) is an example of a condition with a delayed age of onset that is virtually certain to manifest devastating symptoms during the person’s life. The average age of onset is nearly 40 years (see Section 4.1 and Figure 4.11). Currently there is no effective treatment for HD. Among the challenges HD presents is that, often, because of its dominant nature, a person inheriting the disease has dealt with the disease in a parent. Further, the average age of onset is old enough that a person can pass the mutant allele to his or her child before beginning to experience any of the symptoms. Both of these can be important factors in the decision to undergo presymptomatic genetic testing. For young people, the implications of finding out that the disease allele is present can have a profound effect on life choices. Beyond the concern about when symptoms might appear and how rapidly they will progress are lifestyle choices such as decisions about going to college, marrying, and saving for retirement. An additional issue raised by genetic testing for a condition like HD is that if the disease allele is detected in a consultand, then the consultand’s siblings, too, each have a 50% risk of having inherited the mutant allele, and they may not yet be aware or may not wish to be informed of their risk.

A thorough consideration of all of the implications of presymptomatic test results, both personal and familial, by both the consultand and by family members, is required before the decision is made to go forward with the testing. Counseling is provided again by a genetic counselor both after the test is performed (to allow a reconsideration of the consequences of the result) and after the results are made available (to facilitate decision making in response to the test result).

Conditions such as familial breast and ovarian cancer present a different set of challenges. Certain mutations of the BRCA1 or BRCA2 genes can increase a woman’s lifetime risk of breast or ovarian cancer. For the general population, this risk is about 11% (about 1 in 9). For women who carry certain BRCA1 mutations, the lifetime risk of breast cancer can be 60 to 70%. Not all mutations of these genes produce the same level of increased risk, and some appear not to increase the lifetime risk of cancer at all. In this case, even the most severe mutation in the most at-risk population leaves about a 30 to 40% chance of no cancer developing.

Cases of breast or ovarian cancer that result from BRCA1 or BRCA2 mutations tend to cluster in families and to have an age of onset in the 30s or 40s. Women in these families are frequently aware of the potential risk and may pursue genetic testing to discover if they have such a mutation. The decision to undergo genetic testing requires careful consideration under the guidance of a genetic counselor who will explain the meaning of a positive result (indicating a mutation is present) and a negative (no mutation) test result. The counselor will also lead the consultand through the medical options should the test result be positive for a mutation, both before testing and after the result is known. These options include regular and intensive monitoring to detect cancer at its early stages, and prophylactic mastectomy (surgical removal of the breasts) or oophorectomy (surgical removal of the ovaries) to eliminate the risk of disease in those organs. We discuss more about cancer genetics in Application Chapter C: The Genetics of Cancer.

In Closing

The explosion of knowledge in genetics over the past 50 years has had profound impacts on the understanding of human heredity and genetic diseases and on the practice of medical genetics. In particular, the Human Genome Project has been instrumental in identifying and locating human genes, including genes whose mutations lead to genetic disease. Human medicine relies much more heavily on genetics and genomics today than it did just a decade or so ago, and the level of reliance is bound to increase in the next decade. The concept of “personalized medicine,” in which the genetic profile of a patient or of the disease in a patient is used to help select the most effective treatment regimen, is already beginning to affect the practice of medicine. The scope of this personalization of medicine will likely reduce the notion that “one size fits all” when it comes to the treatment of many diseases and conditions.

As in recent decades, genetic counselors will continue to play a pivotal role in genetic medicine. In fact, most experts in the field expect the need for genetic counselors to increase in the future. Despite the vast increase in knowledge about human genetics, the goal of genetic medicine is not to acquire knowledge for its own sake but instead to use the acquired knowledge to improve the health and well-being of patients, to relieve suffering, and to ensure the fair treatment and dignity of all individuals who come into contact with the field of genetic medicine and its practitioners.
234 APPLICATION A Human Hereditary Disease and Genetic Counseling
1. Match each statement (a–e) with the best answer from the following list: consultand, 50%, prior probability, 66.7%, obligate carrier, 100%.
a. The Mendelian risk that a person is a heterozygous carrier of a recessive condition.
b. A person who on the basis of family history must be a heterozygous carrier of a recessive mutant allele.
c. The probability that the healthy brother of a woman with an autosomal recessive condition is a heterozygous carrier.
d. The person receiving genetic counseling.
e. The probability that the son of a woman with an autosomal recessive condition is a heterozygous carrier.

2. Go online to the Online Mendelian Inheritance in Man (OMIM) website. Look up the following genetic conditions and answer the questions posed about them.
a. Look up Tay–Sachs disease (TSD), OMIM number 272800, and give the name and abbreviation of the affected gene and the chromosome location of the gene.
b. Go to the “Population Genetics” section discussing the TSD gene. In a few sentences, summarize the human population in which TSD is most frequently found and give the approximate frequency of heterozygous carriers for the TSD mutation in North American Jews.
c. Look up cystic fibrosis (CF), OMIM 602421, and give the gene name and abbreviation and the chromosome location of the gene.
d. Go to the “Molecular Genetics” section and describe the most common mutation of the CF gene.

3. A couple comes into your genetic counseling practice with a question about the chance a future child of theirs might have a genetic disease. Three or four men in the woman’s family, including her father, had a condition that might be genetic. Although her father is still alive, she has had little contact with him for much of her life and cannot describe or name the condition. Her partner is a healthy man whose family has no history indicating the presence of a genetic condition. To provide more information about this possible genetic condition for the couple, what is the first step you recommend?

4. A man, J.B., has a sister with autosomal recessive galactosemia (OMIM 230400), and his partner, S.B., has a brother with galactosemia. Galactosemia is a serious condition caused by an enzyme deficiency that prevents the metabolism of the sugar galactose. Neither J.B. nor S.B. has galactosemia, but they are concerned about the risk that a future child of theirs will have the condition. What is the probability their first child will have galactosemia?

5. A woman, S.R., had a maternal grandfather with hemophilia A (OMIM 306700), an X-linked recessive condition that reduces blood clotting. S.R.’s maternal grandmother and paternal grandparents are free of the condition, as are her partner, his parents, and his grandparents. S.R. has no siblings. She wants to know the chance that a son of hers will have the condition. What is that probability?

6. A 40-year-old woman whose father had Huntington disease currently shows no symptoms of the disease. She is newly pregnant with her first child and seeks your best estimate of the chance her child will inherit the disease. What is your estimate and how did you arrive at it? (Hint: See Figure 4.11)
7 DNA Structure and Replication

CHAPTER OUTLINE
7.1 DNA Is the Hereditary Molecule of Life
7.2 The DNA Double Helix Consists of Two Complementary and Antiparallel Strands
7.3 DNA Replication Is Semiconservative and Bidirectional
7.4 DNA Replication Precisely Duplicates the Genetic Material
7.5 Methods of Molecular Genetic Analysis Make Use of DNA Replication Processes

Rosalind Franklin used the X-ray diffraction method to produce this image of double-stranded DNA that is known as Photo 51. Photo 51 is the first visual experimental evidence supporting the model that DNA contains two strands twisted around one another.

ESSENTIAL IDEAS
❚❚ Seventy-five years of observations and analysis culminated in the identification of DNA as the hereditary molecule.
❚❚ DNA is a double-stranded molecule consisting of four kinds of nucleotides, abbreviated A, T, C, and G, that is held together by a mechanism of complementary base pairing.
❚❚ DNA replication faithfully duplicates the genome by a semiconservative process that progresses bidirectionally from each origin of replication.
❚❚ Origins of replication are defined by their nucleotide sequence. Numerous proteins and enzymes act in concert to produce two identical DNA duplexes.
❚❚ Laboratory techniques based on a molecular understanding of DNA replication perform targeted replication of short DNA sequences and sequence DNA.

The central dogma of biology identifies DNA as the repository of genomic information for organisms and describes its key role in the production of RNA transcripts of genes leading to the production of polypeptides (see Figure 1.9). DNA’s ongoing role in these processes requires its faithful replication in each cell cycle, and that is the subject of this chapter.

In Chapter 1, we reviewed the primary and secondary structures of DNA and RNA and the fundamentals of DNA replication. In this chapter, we discuss the structure of DNA in greater detail and extend the earlier description to include the molecular processes occurring in DNA replication. We also examine two analytical methodologies, polymerase chain reaction (PCR) and DNA sequencing techniques, that were
developed as an outcome of the understanding of replication. The Case Study at the end of the chapter discusses some human hereditary conditions caused by mutations of genes for critically important proteins involved in DNA replication.

7.1 DNA Is the Hereditary Molecule of Life

DNA (deoxyribonucleic acid) is the hereditary molecule of life. Our contemporary understanding of hereditary transmission and the evolution of species is rooted in this fact. Long before the hereditary role of DNA was established, however, research had identified five essential characteristics of hereditary material. The hereditary material must be

1. Localized to the nucleus, and a component of chromosomes
2. Present in a stable form in cells
3. Sufficiently complex to contain the genetic information required to direct the structure, function, development, and reproduction of organisms
4. Able to accurately replicate itself so that daughter cells contain the same information as parental cells
5. Mutable, undergoing mutation at a low rate that introduces genetic variation and serves as a foundation for evolutionary change

Chromosomes Contain DNA

The weakly acidic substance known today as DNA was first noticed in 1869, when Friedrich Miescher isolated it from the nuclei of white blood cells in a mixture of nucleic acids and proteins he called “nuclein.” At the same time Miescher was isolating nuclein, microscopic studies were identifying the fusion of male and female nuclei during reproduction. In addition, microscopic analysis of cells and reproduction identified chromosomes in cell nuclei and also determined that the nuclei of different species contain different numbers of chromosomes. Furthermore, biologists determined that the chromosome contributions of males and females to fertilization were equal in terms of chromosome number. These and other observations led to the earliest suggestion that DNA was the hereditary material. The suggestion came from Edmund Beecher Wilson in 1895. After accurately documenting that sperm and egg cells contribute the same number of chromosomes during reproduction, Wilson speculated,

The precise equivalence of the chromosomes contributed by the sexes is a physical correlative of the fact that the two sexes play, on the whole, equal parts in hereditary transmission, and it seems to show that the chromosomal substance, the chromatin, is to be regarded as the physical basis of inheritance. Now, chromatin is known to be closely similar to, if not identical with[,] a substance known as nuclein (C29H49N9P3O22, according to Miescher), which analysis shows to be a tolerably definite chemical composed of nucleic acid (a complex organic acid rich in phosphorus) and albumin. And thus we reach the remarkable conclusion that inheritance may, perhaps, be effected by the physical transmission of a particular chemical compound from parent to offspring.

In 1900, Mendel’s hereditary principles were rediscovered, and their predictions were widely disseminated in biology (see Section 2.3). Shortly thereafter, in 1903, Wilson’s student Walter Sutton and, independently, Theodor Boveri accurately described the parallels between, on the one hand, homologous chromosome and sister-chromatid separation during meiotic cell division and, on the other hand, the inheritance of genes.

Over the next 20 years, the nucleus and chromosomes were a focus of biological investigations of heredity. By 1920, the principal constituent of nuclein was identified as DNA, and the basic chemistry of DNA was deciphered. The molecule was determined to be a polynucleotide consisting of four repeating subunits (the four DNA nucleotides) held together in a series by covalent bonds. The four DNA nucleotides are adenine (A), thymine (T), cytosine (C), and guanine (G).

In 1923, conclusive evidence that DNA resides in chromosomes made DNA a candidate for the hereditary material. However, DNA is not the sole constituent of chromosomes. Proteins are in high concentration in chromosomes, and RNA is present in the nucleus and around chromosomes, along with lipids and carbohydrates. The presence of all these compounds meant that they each had to be considered potential candidates for the hereditary material. In fact, some early researchers, including, eventually, Edmund B. Wilson himself, thought protein was potentially a better candidate for the hereditary material than DNA. The proponents of this idea pointed out that protein is composed of 20 different amino acids, whereas DNA has only 4 kinds of nucleotides. The protein proponents suggested that the “20-letter alphabet” of protein could contain more information than the “4-letter alphabet” of DNA. It was against this backdrop that the results of three experiments conducted between 1928 and 1952 combined to identify DNA, rather than RNA, protein, or another chemical constituent of cells, as the hereditary material of organisms.

A Transformation Factor Responsible for Heredity

Frederick Griffith, a British physician, studied pneumonia infection in mice and published a lengthy research report in 1928 describing his findings. Modern biology focuses on just the last few pages of Griffith’s long report, where he describes infecting mice with different combinations of treated and untreated pneumonia bacteria. Through his
Conclusion: Hereditary molecule transformed RII bacteria into SIII bacteria.
pneumonia. His tests of blood cultures from the dead mice revealed living SIII bacteria. Knowing that this outcome could not have been the result of a simple mutational event, Griffith proposed that a molecular component he called the “transformation factor” was responsible for transforming RII into SIII. In Griffith’s proposal, the transforming factor was a compound that carried hereditary information. He was unable to identify his transformation factor, but today we know that it is DNA. Today biologists also know that the biological process responsible for the conversion of living RII bacteria by heat-killed SIII is the process of transformation that we describe as a mechanism for gene transfer in bacteria in Section 6.4.

DNA Is the Transformation Factor

Shortly after Griffith published his report on the transformation factor, Martin Dawson, working with Oswald Avery, developed an in vitro transformation procedure to mix living R cells with a purified extract of cellular material derived from heat-killed SIII cells containing the transformation factor. Translated from Latin, in vitro means “in glass.” Commonly, this means either an experiment conducted in a test tube or a procedure that takes place outside the body of an organism. Biochemical assays indicated that the SIII extract used in the Dawson–Avery in vitro transformation consisted mostly of DNA, along with a small amount of RNA and trace amounts of proteins, lipids, and polysaccharides.

Direct evidence that DNA was the transformation factor came from an experiment performed by Avery and his colleagues Colin MacLeod and Maclyn McCarty in 1944 (Figure 7.3). This experiment identified the role of DNA in transformation by eliminating lipids, polysaccharides, protein, RNA, and DNA one at a time from the SIII extract. In each experimental trial, the SIII extract was treated to remove a different component or set of components, and the treated extract was then mixed with RII cells. After time was allowed for an in vitro transformation reaction to take place, the occurrence or absence of transformation was assessed.

Figure 7.3 shows that in vitro transformation takes place in the control experiment (1) and when lipids and polysaccharides (2), proteins (3), or RNA (4) is removed from the extract. In contrast to the other results, the fifth experiment, which uses DNase to specifically degrade DNA, does not result in transformation (5), a clear indication that transformation is blocked by the destruction of DNA. Based on these observations, Avery, MacLeod, and McCarty correctly concluded that DNA is the transformation factor and the probable hereditary material.

DNA Is the Hereditary Molecule

Avery, MacLeod, and McCarty’s work convinced most biologists that DNA was the long-sought hereditary material, and a great deal of research in the late 1940s and early 1950s was devoted to deducing the physical structure of DNA. Biologists
realized that once the structure of DNA was known, the chemical nature of genes would be identified, and biological research would move into the realm of genetic molecular biology. As clear and convincing as the work of Avery and his colleagues seems in retrospect, however, there were several unanswered questions about the role of DNA in heredity. There was also a need to demonstrate directly that the presence of a specific DNA molecule induces the appearance of a particular phenotype. That evidence came in a 1952 report by Alfred Hershey and Martha Chase, who showed that DNA, but not protein, is responsible for bacteriophage infection of bacterial cells.

Bacteriophages, also known as phages, are viruses that infect bacteria. Phages such as T2, for example, consist of a protein shell with a tail segment that attaches to a host bacterial cell and a head segment that contains DNA. T2 phages are among the many bacteriophages that do not carry any RNA. Like other phages, T2 must infect host bacterial cells to reproduce. Infection by a phage proceeds as illustrated in Figure 6.17 and culminates in the lysis of the host cell and the release of dozens of progeny phages.

In their experiment, Hershey and Chase took advantage of an essential difference between the chemical composition of DNA and protein to confirm the hereditary role of DNA (Figure 7.4). Proteins contain large amounts of sulfur but almost no phosphorus; conversely, DNA contains a large amount of phosphorus but no sulfur. Hershey and Chase initially grew phage cultures in different growth media. One growth medium contained ³⁵S, the radioactive form of sulfur, to label protein (step 1); the other contained radioactive phosphorus, ³²P, to label DNA (step 1). In parallel experiments, the researchers used radioactively labeled phages (from the radioactive sulfur medium in one experiment and from the radioactive phosphorus medium in the other) to infect unlabeled host bacterial cells (step 2).
Growth media shown in the figure: ³²P-containing medium and ³⁵S-containing medium.
Figure 7.4 The Hershey-Chase experiment. Experimental results show that DNA is the molecule in bac-
teriophages that is transferred by infection of bacterial cells.
After a short time, each mixture was agitated in a blender to separate bacterial cells from the now empty phage shells. Such empty phage shells are called “ghosts” (step 3). The relatively large bacterial cells were easily separated from the ghosts by centrifugation. The heavier bacteria collect in a pellet at the bottom of the centrifuge tube, while the lighter ghosts remain suspended in the supernatant. Testing each fraction for radioactivity revealed that virtually all the ³²P label was associated with newly infected bacterial cells and almost none with ghost particles (step 4). On the other hand, the ³⁵S label was found in the ghost-particle fraction, and only trace amounts were found associated with the bacterial pellet (step 4). This result demonstrates that phage DNA, but not phage protein, is transferred to host bacterial cells and directs the synthesis of phage DNA and proteins, the assembly of progeny phage particles, and ultimately the lysis of infected cells. The experiment demonstrated that the transformation factor identified previously by Griffith was DNA; it also showed that Avery, MacLeod, and McCarty were correct in concluding that DNA is the hereditary material.

7.2 The DNA Double Helix Consists of Two Complementary and Antiparallel Strands

Figure panels: purine nucleotides, deoxyadenosine 5′-monophosphate (dAMP) and deoxyguanosine 5′-monophosphate (dGMP); pyrimidine nucleotides carrying T and C. Each structure is labeled with its phosphate, deoxyribose, and nitrogenous base.
designated, respectively, deoxyadenosine 5′-monophosphate (dAMP) and deoxyguanosine 5′-monophosphate (dGMP); and the nucleotides that carry the pyrimidine bases cytosine and thymine are deoxycytidine 5′-monophosphate (dCMP) and deoxythymidine 5′-monophosphate (dTMP). Collectively, these are identified as the deoxynucleotide monophosphates, or dNMPs, where N can refer to any of the four nucleotide bases. In contrast, free (reactive) DNA nucleotides in their triphosphate configurations are identified as dATP, dGTP, dCTP, and dTTP. Collectively, these are the deoxynucleotide triphosphates (dNTPs).

DNA strand formation is catalyzed by the enzyme DNA polymerase. The enzyme catalyzes the formation of a phosphodiester bond between the 3′ hydroxyl group of one nucleotide and the 5′ triphosphate group of an adjacent nucleotide (Figure 7.6). Two of the three phosphates of the dNTP are removed during phosphodiester bond formation, leaving the nucleotides of a polynucleotide chain in their monophosphate form. The two discarded phosphates are called the pyrophosphate group. As mentioned before, the resulting strand is a polynucleotide chain composed of nucleotides joined by covalent bonds. The pattern of phosphodiester bond formation gives each strand a sugar-phosphate backbone consisting of alternating sugar and phosphate groups along its length.

The DNA Duplex

DNA is stable as a double helix. The two polynucleotide strands that make up the duplex have a specific relationship that follows two rules: (1) the arrangement of the nucleotides is such that the nucleotide bases of one strand are complementary to the corresponding nucleotide bases on the second strand (A pairs with T, and G pairs with C), and (2) the two strands are antiparallel in orientation (see the opposite-pointing arrows on each side of the diagrams in Figure 7.6). If one strand is, for example, 5′-ATCG-3′, then the complementary strand is 3′-TAGC-5′.

Complementary base pairing joins a purine nucleotide on one strand to its complementary pyrimidine nucleotide on the other. The chemical basis of such pairing is the formation of a stable number of hydrogen (H) bonds between the bases of the different strands. Hydrogen bonds are noncovalent bonds that form between the partial charges that are associated with hydrogen, oxygen, and nitrogen atoms of the nucleotide bases. As Figure 7.6 shows, two stable hydrogen bonds form for each A-T base pair, and three hydrogen bonds are formed by each G-C base pair (see also Figure 1.6).

Antiparallel strand orientation is essential to the formation of stable hydrogen bonds. In Figure 7.6, notice that the nucleotides in one strand are oriented with their 3′ carbon toward the top and their 5′ carbon toward the bottom. The complementary nucleotides in the other strand are antiparallel to these; that is, their 5′-to-3′ orientations run in the opposite direction.

A key observation made from Franklin’s research was the recognition of two slightly different forms of DNA. These were designated A-form DNA and B-form DNA. B-form DNA was much more common than A-form DNA, and it is now known to predominate in all organisms. A third type of DNA is also known, as we describe at the end of this section.

The molecular dimensions of DNA are measured using the unit called an angstrom (Å) or in nanometers (nm). One angstrom is equal to 10⁻¹⁰ meters, or 1 ten-billionth of a meter, and 1 nm equals one-billionth of a meter, or 10⁻⁹ meters. In B-form DNA, the distance from the axis of symmetry to the outer edge of either sugar-phosphate backbone is 10 Å (1 nm), and the molecular diameter is 20 Å (2 nm) at any point along the length of the helix (Figure 7.7a). The 20-Å molecular symmetry of the double helix was the key observation that told Watson and Crick that DNA structure results from pairing of a purine (A or G) with its complementary pyrimidine (T or C). The purine–pyrimidine base-pair pattern gives each base pair the same dimension.

A second key observation derived from Franklin’s Photo 51 is that nucleotide base pairs are spaced at intervals of 3.4 Å along DNA duplexes. This tight packing of DNA bases in the duplex leads to base stacking, the slight rotation of adjacent base pairs around the axis of symmetry so that their planes are parallel, imparting a twist to the double helix. Figure 7.7a shows that one complete helical turn spans 34 Å. This span is occupied by approximately 10.5 base pairs. Figure 7.7b is a space-filling model that illustrates base-pair stacking and the twisting of the sugar-phosphate backbones. Figure 7.7c is a ball-and-stick model illustrating how base pairs twist around the axis of symmetry to create the helical spiral.

Base-pair stacking creates two grooves in the double helix, gaps between the spiraling sugar-phosphate backbones that partially expose the nucleotides. The alternating grooves, known as the major groove and minor groove, are highlighted in Figures 7.7b and 7.7c. The major groove is approximately 12 Å wide, and the minor groove is approximately 6 Å wide. The major and minor grooves are regions where DNA-binding proteins can most easily make direct contact with nucleotides along one or both strands of the double helix. In this chapter and in later chapters, we discuss many of the important functions DNA-binding proteins perform, such as regulating the initiation of transcription and controlling the onset and progression of DNA replication. Most of these functions depend on the presence of characteristic sequences of DNA nucleotides. DNA-binding proteins gain access to DNA nucleotides in major and minor grooves of the molecule.

B-form DNA, overwhelmingly the most common DNA structure in organisms, has a right-handed twisting of the sugar-phosphate backbone. A-form DNA also has a right-handed twist to the helix. It is more compact than B-form DNA, with about 11 base pairs per complete helical twist, although its diameter is a little greater than that of B-form DNA (Table 7.1). A-form DNA is occasionally detected in cells, and it appears to be particularly common in bacteriophage, where its more compact size makes it functional for packaging of bacteriophage DNA. A-form DNA may be less amenable to binding by DNA-binding regulatory proteins, due to alterations of the major and minor grooves in comparison
(a) New strand and template strand. Labels: hydrogen bond, phosphodiester bonds, dATP recruited by DNA polymerase. In a reaction catalyzed by DNA polymerase, and using thymine on the template strand (right) as a guide, the activated 3′ OH of the deoxycytidine in the growing strand (left) attacks the triphosphate group of the incoming dATP.
(b) Labels: new phosphodiester bond, sugar-phosphate backbone, pyrophosphate group (discarded). DNA polymerase catalyzes formation of a new phosphodiester bond attaching adenosine monophosphate to the 3′ end of the new strand.
Figure 7.6 DNA strand elongation. (a) Complementary nucleotides form hydrogen bonds by the attrac-
tion of positive and negative charges. The nucleotide triphosphate complementary to the template strand
nucleotide is recruited by DNA polymerase. (b) DNA polymerase catalyzes the addition of the new nucleo-
tide to the 3′ end of the growing strand by removing two phosphates (the pyrophosphate group) and form-
ing a new phosphodiester bond.
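The pairing and bonding rules described in this section (A pairs with T through two hydrogen bonds, G pairs with C through three, the strands are antiparallel, and one phosphodiester bond joins each pair of adjacent nucleotides) can be checked with a few lines of code. The following is a minimal Python sketch added for illustration. The function names are hypothetical, and the code assumes a linear duplex whose first strand is written 5′ to 3′ using only the uppercase letters A, T, G, and C.

```python
# Illustrative sketch only; the function names are hypothetical.
# Assumes a linear duplex and a strand written 5'-to-3' in uppercase A/T/G/C.

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}   # complementary bases
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}        # hydrogen bonds per base pair


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Complementary strand read 3'-to-5', aligned base by base with the input."""
    return "".join(PAIR[base] for base in strand_5_to_3)


def reverse_complement(strand_5_to_3: str) -> str:
    """The same complementary strand, rewritten in the 5'-to-3' direction."""
    return complement_3_to_5(strand_5_to_3)[::-1]


def hydrogen_bonds(strand_5_to_3: str) -> int:
    """Total hydrogen bonds between the two strands of the duplex."""
    return sum(H_BONDS[base] for base in strand_5_to_3)


def phosphodiester_bonds(strand_5_to_3: str) -> int:
    """Phosphodiester bonds in both strands: (n - 1) per strand of n nucleotides."""
    return 2 * max(len(strand_5_to_3) - 1, 0)


print(complement_3_to_5("ATCG"))   # TAGC, i.e. 3'-TAGC-5'
```

Applied to the nine-nucleotide strand 5′-ACGACGCTA-3′, whose complement is 3′-TGCTGCGAT-5′, these functions return 23 hydrogen bonds and 16 phosphodiester bonds.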
7.2 The DNA Double Helix Consists of Two Complementary and Antiparallel Strands 243
20 Å (2 nm) 20 Å 20 Å
Figure 7.7 The B-form DNA double helix. (a) Ribbon diagram, (b) space-filling diagram, and (c ) ball-
and-stick diagram show the sugar-phosphate backbones, base pairs, major and minor grooves, and dimen-
sions of the DNA duplex.
Evaluate
1. Identify the topic this problem addresses, and the nature of the required answer.
   Solution: The question concerns a DNA sequence. It asks for the sequence and polarity of the complementary strand and the number of phosphodiester and hydrogen bonds present in the fragment.
2. Identify the critical information given in the problem.
   Solution: The sequence and polarity are given for one strand of the DNA fragment.
Deduce
3. Review the general structure of a DNA duplex and the complementarity of specific nucleotides.
   Solution: DNA is a double helix composed of single strands that contain complementary base pairs (A pairs with T, and G with C). The complementary strands are antiparallel (i.e., one strand is 5′ to 3′, and its complement is 3′ to 5′).
4. Review the patterns of phosphodiester bond and hydrogen bond formation in DNA.
   Solution: One phosphodiester bond forms between adjacent nucleotides on each strand of DNA. A-T base pairs (joining the two strands) contain 2 hydrogen bonds, and G-C base pairs contain 3 hydrogen bonds.
Solve
5. Identify the sequence of the complementary strand.
   Solution: The complementary sequence is TGCTGCGAT.
6. Give the polarity of the complementary strand.
   Solution: The polarity of the complementary strand is 3′-TGCTGCGAT-5′.
7. Count the number of phosphodiester bonds in this DNA fragment.
   Solution: Between the adjacent nucleotides of this fragment there are eight phosphodiester bonds per strand for a total of 16 phosphodiester bonds.
8. Count the number of hydrogen bonds between the two strands of this DNA fragment.
   Solution: There are four A-T base pairs containing 2 hydrogen bonds each, and five G-C base pairs containing 3 hydrogen bonds each, for a total of 8 + 15 = 23 hydrogen bonds in this DNA fragment.
For more practice, see Problems 5, 8, 9, 16, and 17. Visit the Study Area to access study tools. Mastering Genetics
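The counting in steps 5 through 8 can be checked with a short Python sketch. The given strand is not restated on this page, so the sketch assumes it is 5′-ACGACGCTA-3′, inferred as the complement of the answer in step 5. The function names are illustrative only.

```python
# Watson-Crick pairing and hydrogen bonds per base pair, as stated in step 4.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement_3_to_5(strand_5_to_3):
    """Return the complementary strand written 3' to 5', base by base."""
    return "".join(PAIR[base] for base in strand_5_to_3)


def phosphodiester_bonds(strand):
    """One bond joins each pair of adjacent nucleotides, on each of two strands."""
    return 2 * (len(strand) - 1)


def hydrogen_bonds(strand):
    """Sum 2 bonds for every A-T pair and 3 for every G-C pair."""
    return sum(H_BONDS[base] for base in strand)


# Assumed given strand (inferred from the solution, not quoted from the problem).
given = "ACGACGCTA"
print("3'-" + complement_3_to_5(given) + "-5'")
print(phosphodiester_bonds(given), "phosphodiester bonds")
print(hydrogen_bonds(given), "hydrogen bonds")
```

Under that assumption the sketch reproduces the worked answer: the complement 3′-TGCTGCGAT-5′, 16 phosphodiester bonds, and 23 hydrogen bonds.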
to form, the negative charge of an oxygen or nitrogen must occur opposite the positive charge of a hydrogen. This occurs when complementary base pairs align in antiparallel strands. If a purine and a pyrimidine were aligned in parallel strands, positively charged hydrogens would be opposite one another, as would negatively charged nitrogens and oxygens. These repelling forces would prevent hydrogen bond formation.

Review Genetic Analysis 7.1 to explore complementary base pairing and the formation of bonds creating single and double strands of DNA.

7.3 DNA Replication Is Semiconservative and Bidirectional

Given the role of DNA as an information repository and an information transmitter, the integrity of the nucleotide sequence of DNA is of paramount importance. Each time DNA is copied, the new version must be a precise duplicate of the original version. The high fidelity of DNA replication is essential to reproduction and to the normal development of biological structures and functions. Without faithful DNA replication, the information of life would become hopelessly garbled by rapidly accumulating mutations that would threaten survival.

Considering the importance of DNA throughout the biological world, it was no surprise to discover that the general mechanism of DNA replication is the same in all organisms. This universal process evolved in the earliest life-forms and has been retained for billions of years. As organisms diverged and became more complex, however, an array of differences did develop among DNA replication proteins and enzymes. Despite the diversification of these specific components of DNA replication, three attributes of DNA replication are shared by all organisms:

1. Each strand of the parental DNA molecule remains intact during replication.
2. Each parental strand serves as a template directing the synthesis of a complementary, antiparallel daughter strand.
3. Completion of DNA replication results in the formation of two identical daughter duplexes, each composed of one parental strand and one daughter strand.

As we describe DNA replication in bacteria, archaea, and eukaryotes in this and the following section, we will point out similarities and differences in DNA replication among the domains. The three domains share the features they do because all life evolved from a common origin. At the same time, the differences in DNA replication between the domains are also the result of evolution, which favored specific adaptations.

(Figure 7.8). The (1) semiconservative DNA replication model, which proved to be correct, proposed that each daughter duplex contains one original parental strand of DNA and one complementary, newly synthesized daughter strand. The (2) conservative DNA replication model predicts that one daughter duplex contains the two strands of the parental molecule and the other contains two newly synthesized daughter strands. Lastly, the (3) dispersive DNA replication model predicts that each daughter duplex is a composite of interspersed parental duplex segments and daughter duplex segments.
Figure 7.8 Three proposed mechanisms of DNA replication tested by Meselson and Stahl. The results
expected for two cycles of DNA replication are shown for each model.
Q Using the same red and blue color scheme, draw a third cycle of DNA replication for the
semiconservative replication model.
246 CHAPTER 7 DNA Structure and Replication
the nitrogen in these DNA duplexes is 15N. The duplexes are designated 15N/15N to signify the incorporation of 15N throughout both strands. (By the same token, a DNA duplex composed of two strands containing only 14N, the normal isotope of nitrogen, is designated 14N/14N, and a duplex with one strand containing each isotope is designated 15N/14N.) DNA collected for CsCl gradient analysis from this starting generation, designated generation 0, was exclusively 15N/15N.

Next, some of these 15N-labeled E. coli were transferred to a new growth medium containing only the normal light isotope of nitrogen, 14N. Growth in this medium leads to the incorporation of DNA nucleotides containing the light isotope into newly synthesized strands. At the end of each successive DNA replication cycle, DNA was collected from a few cells growing on the 14N medium and was subjected to CsCl analysis.

Figure 7.9 shows the results of CsCl gradient analysis of DNA collected from three replication cycles, beginning
[Figure 7.9 artwork. Columns: 15N growth medium (generation 0), followed by three replication cycles in 14N growth medium. Rows: DNA samples, DNA analysis (DNA band and densitometric band). Band positions: light (14N/14N), hybrid (15N/14N), heavy (15N/15N). Results, in order: all heavy DNA; all hybrid DNA; 1:1 light to hybrid DNA; 3:1 light to hybrid DNA.]
Figure 7.9 The Meselson–Stahl experimental results. The semiconservative replication process is illus-
trated for three replication cycles (upper rows). Photographs show the DNA bands in centrifuge tubes along
with densitometry scans in which the amplitudes of peaks indicate the relative concentrations of material in
each centrifuge band (lower rows). These results are consistent only with semiconservative DNA replication.
Q If a fourth cycle of DNA replication takes place, what is the expected ratio of light to hybrid DNA
duplexes?
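The band ratios reported in Figure 7.9 follow from simple bookkeeping of parental and new strands. The Python sketch below is an illustrative model only: it tracks each duplex as a pair of isotope labels and applies the semiconservative rule that every parental strand pairs with a newly made 14N strand. The function names are hypothetical.

```python
from collections import Counter


def replicate(duplexes):
    """One semiconservative cycle in 14N medium.

    Each duplex separates, and each parental strand is paired with a
    newly synthesized strand carrying the light isotope.
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "14N"))
        daughters.append((strand_b, "14N"))
    return daughters


def classify(duplex):
    """Name the CsCl band for a duplex from its count of 15N strands."""
    return {2: "heavy", 1: "hybrid", 0: "light"}[duplex.count("15N")]


def band_counts(cycles):
    """Count duplexes per band after the given number of cycles."""
    population = [("15N", "15N")]  # generation 0: fully heavy DNA
    for _ in range(cycles):
        population = replicate(population)
    return Counter(classify(duplex) for duplex in population)


for cycle in range(5):
    print(cycle, dict(band_counts(cycle)))
```

For cycles 0 through 3 the counts match the figure: all heavy, all hybrid, equal light and hybrid, then three light duplexes for each hybrid. Running one more cycle gives the model's prediction for the question posed in the caption.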
[Figure artwork. Labels: terminus of replication, new DNA, old DNA, replication forks.]
Huberman and Riggs’s results, depicted in Figure 7.12, show exactly what was predicted for bidirectional DNA replication. Dark regions indicating incorporation of radioactivity during a pulse alternate with light regions indicating DNA replication during a chase. The alternation is symmetrical on both sides of replication origins, demonstrating that replication moves away from replication origins in both directions at once.

Multiple Replication Origins in Eukaryotes

Replication evidence from Cairns and from Rodriguez and colleagues demonstrates that the E. coli chromosome has a single origin of replication, and studies of archaeal species generally indicate a single replication origin, but what about the chromosomes of eukaryotes? Certainly each eukaryotic chromosome must have its own origin or origins of replication, but are there one, two, dozens, or thousands of DNA replication origins on each chromosome? Electron micrograph evidence shown in Figure 7.13 shows multiple DNA replication origins in a single Drosophila melanogaster chromosome. The best evidence indicates hundreds to thousands of replication origins in eukaryotic species. Yeast genomes contain about 400 origins, Drosophila genomes about 10,000, and the human genome may have as many as 50,000 origins of replication.

Figure 7.13a is a snapshot of a moment during DNA replication, but notice that the replication bubbles in the micrograph are of different sizes. This indicates that replication was initiated in them at different times. Large replication bubbles appear to extend from origins that started replication earlier than those belonging to the smaller replication bubbles in this micrograph. Cell biologists have determined that among different types of cells, the length of S phase is variable. This means that the rate of progression of DNA replication varies among cells of different types. Rapidly dividing cells replicate their DNA more quickly (i.e., have a shorter S phase) than do slowly dividing cells. In addition, experimental evidence identifies “early-replicating” (i.e., early in S phase) and “late-replicating” (late in S phase) segments of large eukaryotic genomes. Early-replicating genome segments appear to contain many expressed genes, whereas late-replicating regions contain many fewer expressed genes. In Drosophila, for example, late-replicating regions include chromosome segments immediately surrounding centromeres, where few expressed genes are located.

Regardless of differences in the timing of initiation at the multiple origins of replication on a eukaryotic chromosome, each of the replication bubbles emanating from an origin of replication expands toward the others to eventually
merge, resulting in the replication of all of the DNA in each eukaryotic nucleus by the end of S phase (Figure 7.13b). The end products of replication of each eukaryotic chromosome are a pair of identical DNA duplexes that are sister chromatids. The sister chromatids will remain joined through G2 and will be separated at anaphase of the upcoming M phase.

[Figure 7.12 artwork. (a) Result of pulse-labeling experiment, showing origin of replication A and origin of replication B with label concentration along the DNA in the order chase, pulse, pulse, chase, pulse, pulse, chase. High label concentration (darker) results from the highly radioactive pulse, and low concentration (lighter) results from the weakly radioactive chase. (b) Interpretation according to bidirectional model. The symmetry of the pattern on both sides of the two origins of replication shown indicates that replication is proceeding bidirectionally outward from each replication origin.]
Figure 7.12 Pulse–chase labeling evidence of bidirectional DNA replication in mammalian chromosomes. (a) Alternating dark and light banding of replicating DNA and (b) diagram illustrating pulse–chase results are consistent with the bidirectional model of replication.

7.4 DNA Replication Precisely Duplicates the Genetic Material

A great deal of what molecular biologists know about DNA replication comes from the study of bacteria, particularly E. coli, but increasingly the processes of DNA replication in archaeal and eukaryotic genomes are also becoming clear. Chapter 1 presents a general overview of some of the basic steps of DNA replication, gleaned primarily from bacterial species. The present section provides additional details of this process and also offers comparative information on DNA replication in archaea and eukaryotes. What is revealed by comparisons of DNA replication between species representative of the three domains is the overall similarity of the process in all the domains, combined with differences that belong uniquely to each one. These observations conform to a common theme in evolutionary biology: the presence of a high proportion of shared genes and functions as a result of the common ancestry of life, along with a great deal of modification and specialization that accumulates over the millennia of evolution and diversification.

Foundation Figure 7.14 serves as a starting point and as a touchstone for this discussion by providing an overview of the major steps in bacterial DNA replication. At each step, the activities of the principal molecular players are identified. You can refer back to this foundation figure as you make your way through the following pages.

DNA Sequences at Replication Origins

Origins of DNA replication contain sequences that attract replication enzymes. The best-characterized origin-of-replication sequence is from E. coli and is designated oriC. This sequence, which contains approximately 245 bp of DNA, is AT-rich (i.e., has a preponderance of adenine and thymine base pairs). DNA regions containing AT richness require less energy for their denaturation, a process we will see happening at oriC early in the initiation of replication. OriC is subdivided by three 13-bp sequences, so-called 13-mers, followed by four 9-bp sequences, called 9-mers (Figure 7.15). Other bacterial species have origin-of-replication sequences that are similar to oriC. This similarity is a product of common ancestry and strong evolutionary conservation of the function of these DNA sequences. Natural selection has acted to maintain sequence similarity because the function of the conserved sequence region is essential to the survival of the organism.

Comparisons of evolutionarily conserved sequences within and among related species can lead to the identification of consensus sequences. Consensus sequences have similar functions, similar overall length, and similarity of the pattern of base pairs. They feature nucleotides occurring frequently at the same positions in the DNA sequences of many species. Consensus sequences are not, however, identical to one another. Instead, consensus sequences are defined by the nucleotides that occur most often at particular positions in the sequence. The sequence making up a consensus sequence is determined by recognizing the similar sequences in several related species and identifying the most common nucleotide at each position. Table 7.2 illustrates this process for the 9-mer segment of the origin of replication for eight bacterial species. Notice the overall sequence similarity and that the nucleotides at six positions are identical among the species whereas the nucleotides at three positions (2, 3, and 5) vary among the species. Based on the 9-mer sequences for the bacterial species listed in Table 7.2, the consensus sequence is TTATCCACA. Consensus sequences are not unique to DNA replication. They are common features identified by comparative genomics in the study of numerous regulatory processes. You will see the term used again in subsequent chapters.

Some archaeal species have single origins of replication, but others have up to four origins. The DNA sequences at archaeal origins are termed origin recognition boxes
[Figure 7.13 artwork. (a) Electron micrograph. (b) Diagram labels: replication fork, replication bubble, replication origin, old DNA, new DNA. Replication is bidirectional from each replication origin.]
Figure 7.13 Multiple origins of replication on a single chromosome from Drosophila melanogaster.
(a) The arrows indicate replication bubbles, which are expanding bidirectionally. Different-sized replica-
tion bubbles indicate different replication start times. (b) Replication bubbles from multiple origins (upper)
expand bidirectionally (middle) and merge, ultimately forming two sister chromatids (lower).
(ORBs), and they are of two types. Long ORB sequences are 22 to 35 nucleotides in length and may be present at two or more origins in species with multiple replication origins. Shorter, so-called miniORB, sequences are 12 to 13 nucleotides in length and may occur one or more times in archaeal genomes. Long ORBs and miniORBs may also be found in the same genome.

Among eukaryotic organisms, the yeast Saccharomyces cerevisiae has the most fully characterized origin-of-replication sequences. In yeast, the multiple origins of replication are known as autonomously replicating sequences (ARSs). There is general conservation of DNA sequence in ARSs, and their organization is similar throughout the genome of S. cerevisiae. ARS1 in yeast has been fully sequenced (Figure 7.16). Within the 95 bp of ARS1 is an 11-bp consensus sequence and three other regions (B1, B2, and B3) of conserved DNA sequences that differ somewhat from one another and from the 11-bp consensus sequence region.
FOUNDATION FIGURE 7.14
DNA replication in bacteria

1 Helicase breaks hydrogen bonds. Topoisomerase relaxes supercoiling.
2 Single-stranded binding (SSB) protein prevents reannealing.
3 Primase synthesizes RNA primers.
4 DNA polymerase III synthesizes daughter strand. (SSB has been deleted for clarity.)
5 DNA polymerase III elongates the leading strand continuously and the lagging strand discontinuously.
6 DNA polymerase I removes and replaces nucleotides of the RNA primer.
7 DNA ligase joins Okazaki fragments.

[Artwork labels: origin of replication, helicase, topoisomerase, SSB, primase, RNA primer, DNA polymerase III (pol III), DNA polymerase I (pol I), DNA ligase, leading strand, lagging strand, Okazaki fragments 1, 2, and 3.]

Protein              Role
DNA topoisomerase    Relaxes supercoiling
Helicase (DnaB)      Unwinds the double helix
SSB                  Prevents reannealing of separated strands
Primase              Synthesizes RNA primers
DNA pol III          Synthesizes DNA
DNA pol I            Removes and replaces RNA primer with DNA
DNA ligase           Joins DNA segments
[Figure 7.15 artwork. OriC on the E. coli chromosome, 245 bp, containing in order: 13-mer, 13-mer, 13-mer, 9-mer, 9-mer, 9-mer, 9-mer.]
Figure 7.15 Origin of replication sequence in E. coli. OriC in E. coli contains three 13-mer and four
9-mer consensus sequences in a region of 245 base pairs of conserved sequence.
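The rule for deriving a consensus sequence, taking the most common nucleotide at each aligned position, can be sketched in a few lines of Python. The six 9-mers below are hypothetical sequences made up for illustration; they are not the Table 7.2 entries. They were chosen so that only positions 2, 3, and 5 vary, matching the pattern described in the text.

```python
from collections import Counter

# Hypothetical aligned 9-mers (illustration only, not data from Table 7.2).
HYPOTHETICAL_9MERS = [
    "TTATCCACA",
    "TTATCCACA",
    "TCATCCACA",
    "TTGTCCACA",
    "TTATACACA",
    "TTATCCACA",
]


def consensus(aligned):
    """Return the most common nucleotide at each aligned position."""
    columns = zip(*aligned)
    return "".join(Counter(column).most_common(1)[0][0] for column in columns)


def variable_positions(aligned):
    """Return the 1-based positions where the sequences are not all identical."""
    return [
        index
        for index, column in enumerate(zip(*aligned), start=1)
        if len(set(column)) > 1
    ]


print(consensus(HYPOTHETICAL_9MERS))
print(variable_positions(HYPOTHETICAL_9MERS))
```

With these made-up inputs the sketch returns TTATCCACA and flags positions 2, 3, and 5. A tie at any position would need an explicit rule, such as the solidus notation used in Figure 7.16, which this sketch does not implement.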
[Figure 7.16 artwork. A 95-bp region containing, in order, regions B3, B2, and B1 and the 11-bp consensus sequence.]
5′ CAAATTTCGTCAAAAATGCTAAGAAATAGGTTATTACTTTTATTTAAGTATTGTTTGTGCCTTTTGAAAAGCAAGCATAAAAGATCTAAACATAAAATCTGTAAAATAAC 3′
3′ GTTTAAAGCAGTTTTTACGATTCTTTATCCAATAATGAAAATAAATTCATAACAAACACGGAAAACTTTTCGTTCGTATTTTCTAGATTTGTATTTTAGACATTTTATTG 5′
Figure 7.16 The yeast ARS1 origin of replication. The origin of replication in yeast contains a
consensus 11-bp segment and regions B1, B2, and B3, spanning 95 base pairs of conserved sequence.
A solidus (/) between nucleotides of consensus sequences (e.g., A/T) indicates that the two nucleotides
are equally common at this position.
Molecular Biology of Replication Initiation

DNA replication in E. coli requires that replication-initiating enzymes locate and bind to the consensus sequences in oriC. In E. coli, three enzymes, DnaA, DnaB, and DnaC, bind at oriC and initiate DNA replication (Figure 7.17). The protein DnaA binds to the 9-mer components of oriC and bends the DNA, breaking hydrogen bonds in the AT-rich 13-mer region of oriC. This creates an open origin complex, a short region where the DNA strands are separated. DnaB then binds to oriC, and replication initiates. DnaB is a helicase protein that breaks hydrogen bonds to separate the DNA strands and unwinds the double helix ahead of advancing DNA replication. The unwound strands are bound by single-stranded binding protein (SSB), which prevents them from reforming a DNA duplex and thus keeps them available to serve as templates for new DNA synthesis (see Figure 7.14, steps 1 and 2).

[Figure 7.17 artwork. Labels: oriC, 13-mer repeats, 9-mer repeats, DnaA protein, open complex, DnaC, DnaB protein (helicase), single-stranded binding protein (SSB). DnaA protein binds to 9-mer region, forcing unwinding of the 13-mer region to form an open complex. DnaC delivers DnaB protein to the open complex to initiate helicase activity.]

The first steps in DNA replication initiation are similar in archaea and eukaryotes. In archaea, a protein complex identified as Orc1/Cdc6 binds to ORB and miniORB sequences. Orc1/Cdc6 has helicase activity that separates the DNA strands at those sequences. The protein Mcm then binds to the separated strands, followed by additional proteins and enzymes that bind to the region, and synthesis begins. In eukaryotes, helicase recruitment and activity is best understood in yeast, where four protein subcomplexes are involved. At eukaryotic replication origins, a prereplication complex (preRC) of 14 proteins assembles. An aggregation of six of these proteins forms a subunit identified as the origin replication complex (ORC) that acts as the initiator of eukaryotic DNA replication by identifying the origin site. ORC is then bound by the proteins Cdc6 and Cdt1. This is followed by binding of another eight proteins. The resulting complex separates the DNA strands at the replication origin, and DNA replication gets under way.

In all organisms, DNA polymerase enzymes that are responsible for synthesizing new DNA strands use a template strand to direct the addition of nucleotides to daughter strands in a complementary and antiparallel manner. These new nucleotides are added to the 3′ end of the growing daughter strand, and the overall direction of daughter strand elongation is 5′ to 3′. Curiously, however, DNA polymerases are unable to initiate DNA strand synthesis on their own. To perform its catalytic activity, a DNA polymerase requires the presence of a primer sequence, a short single-stranded segment that begins a daughter strand and provides a 3′ OH end to which a new DNA nucleotide can be added by DNA polymerase. To satisfy the requirement for a primer, DNA replication in bacteria is initiated by primase, a specialized enzyme that synthesizes a short RNA primer (see step 3 of Figure 7.14). Measuring just one dozen to two dozen nucleotides in length, RNA primers provide the 3′ OH needed for DNA polymerase activity. RNA primers contain the nucleotide base uracil (U) in place of thymine. Consequently, RNA primers cannot remain as part of fully replicated DNA. Thus, although they are essential for allowing DNA polymerase to begin its DNA synthesis, RNA primers are temporary and are removed from newly synthesized DNA strands before replication is completed. Primase enzymes also operate in the initial stages of archaeal and eukaryotic DNA replication. As in bacterial replication, primases in archaea and eukaryotes synthesize a short RNA primer that functions identically to bacterial RNA primers.
DNA polymerase III (DNA pol III), the principal DNA-synthesizing enzyme (see Figure 7.14, step 4). DNA pol III begins its work at the 3′-OH end of an RNA primer and rapidly synthesizes new DNA by adding one nucleotide at a time in a sequence that is complementary and antiparallel to the template-strand nucleotides. Pol III requires a template nucleotide to add a new nucleotide to a daughter strand. Enzymes with functions identical to DNA pol III are found in archaea and eukaryotes.

Experimental evidence indicates that most of the enzymes participating in DNA replication are part of a large protein complex called a replisome. There is one replisome at each replication fork. Replisomes have numerous components, including, in each replisome, two complete molecules of DNA pol III. One of these DNA pol III molecules carries out the 5′-to-3′ synthesis of one daughter strand continuously, in the same direction in which the replication fork progresses. The second pol III in the replisome carries out synthesis of the other daughter strand. The continuously elongated daughter strand is called the leading strand (Figure 7.18). Notice that Figure 7.18 divides the replication bubble into four quadrants. The upper right and lower left quadrants contain leading strands.

The daughter strands in the upper left and lower right quadrants shown in Figure 7.18 have a 5′-to-3′ direction of elongation that runs opposite to the direction of movement of the replication fork. These daughter strands are elongated discontinuously, in short segments, each of which is initiated by an RNA primer. The discontinuously synthesized daughter strand is called the lagging strand. Thus in Figure 7.18, the lower right and upper left quadrants of the replication bubble contain lagging strands (see also step 5 of Figure 7.14).

Reiji Okazaki detected the synthesis of short fragments of DNA in the replication of the lagging strand. He observed that early in bacterial replication, newly synthesized DNA segments on one strand are 1000 to 2000 nucleotides long, while later in replication those newly synthesized segments have become much longer. Okazaki’s discovery suggested that short segments of DNA are synthesized and then, as replication progresses, joined together. The short segments of newly replicated DNA are called Okazaki fragments, and they are the result of discontinuous synthesis of DNA on the lagging strand. Okazaki fragments in eukaryotes are much shorter than those in bacteria, 100 to 200 nucleotides in length. Similarly, archaeal Okazaki fragments are short.

RNA Primer Removal and Okazaki Fragment Ligation

To complete DNA replication, RNA primers must be removed and replaced with DNA, and Okazaki fragments must be joined together to form complete DNA strands. In E. coli these tasks are accomplished by the enzymes DNA polymerase I and DNA ligase that are each part of the replisome complex at each replication fork.

When DNA pol III on the lagging strand reaches an RNA primer, thus running out of template, it leaves a single-stranded gap between the last DNA nucleotide of the newly synthesized daughter strand and the first nucleotide of the RNA primer (Figure 7.19). The pol III, having very low affinity for these DNA–RNA single-stranded gaps, is then replaced by DNA polymerase I (DNA pol I), which has high affinity for such gaps (Figure 7.19, 1). The DNA pol I removes nucleotides of the RNA primer one by one and replaces them with DNA nucleotides, beginning with the 5′ nucleotide of the RNA primer and progressing in the 3′ direction until all the RNA nucleotides in the primer have been replaced by DNA nucleotides complementary to the template strand.

[Figure label: bidirectional expansion of bubble.]
[Figure 7.19 artwork. Labels: single-stranded gap (DNA–RNA), RNA primer, DNA, DNA polymerase I, daughter strand, template strand.]
1 DNA pol I binds to a single-stranded gap between DNA and an RNA primer.
  5′ …GGAUCUGCGGATG… 3′ Daughter strand
  3′ …CCTAGACGCCTAC… 5′ Template strand
2 Pol I removes an RNA primer nucleotide (U) using its 5′-to-3′ exonuclease capability...
  5′ …GGA CUGCGGATG… 3′
  3′ …CCTAGACGCCTAC… 5′
3 …and fills the gap with a DNA nucleotide using its 5′-to-3′ polymerase capability.
  5′ …GGA CUGCGGATG… 3′
  3′ …CCTAGACGCCTAC… 5′
4 Pol I removes each RNA primer nucleotide… (here C)
  5′ …GGAT UGCGGATG… 3′
  3′ …CCTAGACGCCTAC… 5′

The pol I enzyme possesses two activities that accomplish the removal of RNA nucleotides and their replacement by DNA nucleotides. DNA pol I first uses its 5′-to-3′ exonuclease activity to remove the 5′-most nucleotide from the RNA primer (see step 6 in Figure 7.14). This creates one open space opposite the template, which is then filled with the correct DNA nucleotide by the 5′-to-3′ polymerase activity of DNA pol I. As DNA pol I removes each RNA primer nucleotide and replaces it with a DNA nucleotide, the pol I continually pushes the single-stranded gap in the 3′ direction.

Once the entire RNA primer is replaced, a remaining single-stranded gap sits between two DNA nucleotides. At this point, DNA ligase, having exclusive and very high affinity for DNA–DNA single-stranded gaps, is attracted to the gap and there performs its single task of forming a phosphodiester bond between the two DNA nucleotides that joins two Okazaki fragments (see step 7 in Figure 7.14). Both pol I and DNA ligase are active on leading and lagging strands. The level of activity is greater on lagging strands, however, where every 1000 to 2000 nucleotides, they are needed to join Okazaki fragments during replication of E. coli DNA.

Overall, the pattern of DNA replication involving a leading strand and a lagging strand is similar in bacteria, eukaryotes, and archaea. For each domain, Table 7.3 lists three DNA polymerases that are principally responsible for carrying out the synthesis of RNA primers, DNA synthesis, and RNA primer removal and replacement.
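The one-nucleotide-at-a-time action of DNA pol I described above can be mimicked with a small Python sketch. This is a toy model under stated assumptions: the primer is taken to be the seven RNA nucleotides UCUGCGG paired with the template segment AGACGCC (read 3′ to 5′), an extent chosen for illustration from the Figure 7.19 sequences. The function name is hypothetical, and the final sealing step by DNA ligase is not modeled.

```python
# DNA complement of each template base.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def replace_primer(rna_primer, template_3_to_5):
    """Replace an RNA primer with DNA, one nucleotide per step.

    Each step removes the 5'-most remaining RNA nucleotide (exonuclease
    activity) and fills the open position with the DNA nucleotide that
    is complementary to the template (polymerase activity).
    """
    if len(rna_primer) != len(template_3_to_5):
        raise ValueError("primer and template segment must be the same length")
    dna = ""
    remaining = rna_primer
    snapshots = []
    while remaining:
        remaining = remaining[1:]                  # remove 5'-most RNA nucleotide
        dna += DNA_PAIR[template_3_to_5[len(dna)]]  # fill with complementary DNA
        snapshots.append(dna + remaining)           # strand after this step
    return dna, snapshots


# Assumed primer extent, for illustration only.
dna, snapshots = replace_primer("UCUGCGG", "AGACGCC")
for step, strand in enumerate(snapshots, start=1):
    print(step, strand)
print("DNA replacement:", dna)
```

Each snapshot shows the boundary between DNA and remaining RNA moving one position in the 3′ direction, which is the gap migration the text describes.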
5′-to-3′ polymerase activity is centered (Figure 7.21). When a replication error occurs, the mismatched DNA bases of the template and daughter strands are unable to hydrogen bond properly. As a result, the 3′ OH end of the daughter strand becomes displaced, blocking the further addition of nucleotides and inducing rotation of the daughter strand into the 3′-to-5′ exonuclease site at the “heel” of the hand. Several nucleotides, including the mismatched one, are then removed from the 3′ end of the daughter strand, after which the daughter strand rotates back to the polymerase site in the palm and replication resumes. Like their counterparts in bacteria, the principal DNA replication polymerases in eukaryotes and archaea also have proofreading ability to help ensure the accuracy of DNA replication.

Genetic Analysis 7.2 checks your understanding and analysis of molecular events at the replication fork.

[Figure 7.21 artwork. (a) DNA polymerase error. Labels: mismatched base pair, “thumb,” “palm,” “fingers,” polymerase active site, exonuclease active site, 3′ OH, daughter strand, template strand. (b) Exonuclease removal of mismatched base pair. Daughter strand rotates out of the polymerase site and into the exonuclease site. Label: exonuclease cleavage. (c) Daughter strand resumes DNA synthesis.]
Figure 7.21 DNA polymerase proofreading activity. (a) A replication error by polymerase. (b) Newly synthesized 3′ end of daughter strand shifts into exonuclease site, where nucleotides are removed. (c) The polymerase resumes 5′-to-3′ synthesis.

Supercoiling and Topoisomerases

During DNA replication, the molecule undergoes superhelical twisting, referred to as supercoiling. This is a form of twisting that goes beyond the double helical twists already present. Supercoiling occurs because the unwinding of portions of the helix to permit replication communicates torsional strain to other parts of the molecule (Figure 7.22a). It is like holding one side of a rubber band stationary while twisting the other side. Phosphodiester bonds are under particular stress, but random breaks are prevented from occurring by a process that provides controlled relief of this stress. Circular chromosomes like those in many bacteria and archaea are particularly prone to supercoiling during DNA replication, as the figure shows. Linear chromosomes of eukaryotes manage this extra twisting more easily, but they also require controlled relief of torsional stress.

Enzymes known as DNA topoisomerases catalyze a controlled cleavage and rejoining of DNA to allow overwound DNA strands to unwind (Figure 7.22b). Relief of supercoiling is accomplished by different topoisomerases in different ways. Some topoisomerases break a phosphodiester bond in just one DNA strand, while others break both strands of DNA. Either mechanism allows the supercoiled strands to unwind superhelical twists. After unwinding is complete, broken phosphodiester bonds re-form.

Replication at the Ends of Linear Chromosomes

Linear chromosomes, like those in the nuclei of your cells, have an altogether different problem presented by replication. Whereas the replication of circular chromosomes generates two complete copies of the original parental chromosome, linear chromosomes are unable to replicate fully and completely all the way to their ends. Instead, linear chromosome replication falls a little short of reaching the chromosome ends, and as a result, linear chromosomes get progressively shorter with each replication cycle.

The incomplete replication process occurs at the ends of chromosomes due to the presence of RNA primers very near the end of the lagging strand template (Figure 7.23). A segment of DNA at the end of the lagging strand template is not replicated, shortening the chromosome with each replication cycle.

Although the loss of DNA with each replication cycle sounds potentially disastrous, the problem is solved by the
GENETIC ANALYSIS 7.2
PROBLEM Two strains of E. coli have temperature-sensitive mutations that hamper
their ability to complete DNA replication. At 25°C, both strains are able to
complete replication, but neither is able to complete replication at 40°C. At
40°C, temperature-sensitive mutant 1 is able to synthesize DNA by DNA
polymerase III activity, and it is able to remove RNA primers and replace them
with DNA, but it accumulates many short segments of DNA (Okazaki fragments)
that are not joined together. At 40°C, temperature-sensitive mutant 2 also
synthesizes DNA by polymerase III activity, but it is unable to remove RNA
primers and replace them with DNA. For each of these mutants, use the
information provided here to identify the molecule that is most likely carrying
the temperature-sensitive mutation. Identify which normal major events of DNA
replication each mutant can complete at 40°C and which normal events are
altered in each mutant.
BREAK IT DOWN: Temperature-sensitive mutations are the result of proteins that
have full function at a lower temperature but denature and lose function at
higher temperatures (see Section 4.1).
Evaluate
1. Identify the topic this problem addresses and the nature of the required
   answer.
   This problem addresses DNA replication and asks you to identify the function
   of particular proteins and enzymes that are active at different stages of
   replication.
2. Identify the critical information given in the problem.
   Two E. coli strains with different temperature-sensitive mutations of DNA
   replication are described. Mutant strain 1 accumulates Okazaki fragments
   that cannot be joined together, and mutant strain 2 is unable to remove RNA
   primers.
Deduce
3. Review the molecular events and principal molecules that are involved in RNA
   primer removal and RNA primer replacement.
   TIP: The function of principal proteins and enzymes in E. coli DNA
   replication is discussed in this section.
   A review of Foundation Figure 7.14 and of Section 7.4 shows that in E. coli,
   DNA polymerase I is responsible for the removal of RNA primer nucleotides
   and their replacement with DNA nucleotides, and that DNA ligase joins
   Okazaki fragments together.
Solve
4. Identify the molecule affected by mutation in mutant 1.
   Mutant 1 is most likely to have a defect in DNA ligase.
5. Identify the molecule affected by mutation in mutant 2.
   Mutant 2 is most likely to have a defect in DNA polymerase I.
6. Identify which parts of DNA replication are completed at 40°C and which are
   affected by each mutation.
   Mutant 1 is able to synthesize RNA primers by DnaG activity and is able to
   synthesize DNA with polymerase III activity. It is also able to remove RNA
   primers and replace the RNA nucleotides with DNA through polymerase I
   activity. However, mutant 1 is defective in its ability to ligate Okazaki
   fragments together by DNA ligase activity, and these fragments remain
   unconnected. Mutant 2 has fully functional DnaG and polymerase III to
   synthesize RNA primers and most DNA. It lacks active DNA pol I, however, and
   is therefore unable to remove RNA primers and replace them with DNA.
For more practice, see Problems 14, 15, and 18.
presence of hundreds to thousands of copies of repetitive DNA sequences called
telomeres at the ends of linear chromosomes. Telomeres do not contain
protein-coding genes. Instead, telomeres are made up of hundreds to thousands
of end-to-end 6-bp repeats at the ends of vertebrate chromosomes and of longer
repeats, up to about 12 bp each, in plants, yeast, and other eukaryotes. Total
telomere length on each chromosome ranges from 2 to 20 kb at the birth of an
organism, and decreases with age after that. Since telomere sequences are
repetitive and contain no genetic information, a portion of a telomere can
safely be lost in each replication cycle, without consequence to the organism.
Gel-based and genome sequence analysis of telomeric DNA documents the
progressive, age-dependent shortening of telomere length.
Chromosome shortening occurs in the nuclei of most of your somatic cells, and
those of many other organisms, but it does not occur in cells of the germ line,
from which sperm and eggs are derived. This feature of germ-line cells ensures
that full-length chromosomes are transmitted during reproduction. The
protection against chromosome shortening in germ-line cells (and selected other
cells in the body) is afforded by the DNA-synthesizing ability of the
ribonucleoprotein telomerase, a complex consisting of several proteins and a
molecule of RNA. The RNA in telomerase acts as a template for synthesizing a
repetitive telomeric DNA sequence. Elizabeth Blackburn and Carol Greider
discovered both telomeric repeat sequences and telomerase in 1987, in the
ciliated protozoan Tetrahymena. Along with Jack Szostak, who described critical
elements of the biochemistry of telomerase activity, they were awarded the 2009
Nobel Prize in Physiology or Medicine for their work. Telomerase is a reverse
transcriptase enzyme, meaning that it synthesizes DNA from an RNA template. Its
reverse transcriptase subunit is encoded by the TERT (telomerase reverse
transcriptase) gene.

Figure 7.22 DNA supercoiling in bacteria (a) and its cutting and release by
topoisomerase (b).

Figure 7.24 depicts the repetitive telomeric sequence in Tetrahymena and
illustrates the mechanism of telomerase synthesis of a telomere (see steps 1
through 5). The repetitive sequence 5′-TTGGGG-3′ is the characteristic
telomeric repeat sequence of Tetrahymena. The template RNA in the Tetrahymena
telomerase contains the repeat AACCCC that is used to elongate the telomere of
one strand enough to allow new DNA replication to fill out the chromosome ends.
All eukaryotes follow a similar scheme for telomere production, although the
repetitive telomere sequence differs along species lines. Humans and other
vertebrates, for example, have the telomeric repeat sequence 5′-TTAGGG-3′.
Yeast, plants, and other eukaryotes have their own telomere sequences. At
birth, the average human chromosome has about 2000 to 2500 TTAGGG repeats
comprising the telomeres at each chromosome end. These repeats initially span
12 to 15 kb at each telomere.
In the decades since Blackburn, Greider, and Szostak identified telomere
structure and the mechanism for their maintenance, the picture of telomeres has
become more complex. In addition to repetitive DNA sequence, most eukaryotic
telomeres are also characterized by the presence of a DNA sequence that forms a
knotted fold known as the T loop. The T loop protects the telomere from
enzymatic degradation by joining with a protein complex known as shelterin (see
Figure 7.24 step 6). The combination of telomeric repeats and
shelterin-protected T loops preserves telomeres for several dozen cycles of DNA
replication. Inevitably, however, telomere length shortens, and when it becomes
too short, it triggers apoptosis (programmed cell death).
Apoptosis induced by telomere shortening is associated with an observation in
cell biology called the Hayflick limit, the apparent limit to the length of a
cell’s life span. Leonard Hayflick first described this limitation in 1965,
pointing out that vertebrate cells live an average of 50 to 70 division cycles
before dying. The Hayflick limit appears to be explained by the progressive
loss of telomere length as cells age.
Some research has suggested that preserving or lengthening telomeres, perhaps
by activating telomerase activity in somatic cells, may be an avenue to longer
life spans. For example, some research in humans indicates that physical
activity, which is associated with prolonged healthy living, may lengthen
telomeres. Complicating this idea of generating telomerase activity to
stabilize telomere length and potentially prolong life, however, is the finding
that the activation of telomerase activity in somatic cells is a characteristic
of cancerous cells in 80 to 90% of all cancers. The most likely functional role
of telomerase activation in cancer development is the prevention of programmed
cell death. The ability of cancer cells to evade normal cell death by
preserving telomere length confers an element of immortality on cancer cells
that is not possessed by normal somatic cells. We discuss this idea more fully
in Application Chapter C: The Genetics of Cancer.
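The telomere figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes nothing beyond the repeat sequences and repeat counts given in the text. It computes the span of 2000 to 2500 six-base repeats and shows that the Tetrahymena template repeat AACCCC is the base-by-base partner of 5′-TTGGGG-3′ when the template is read 3′ to 5′.

```python
# Illustrative check of telomere numbers from the text (names are hypothetical).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def telomere_span_kb(repeat: str, n_repeats: int) -> float:
    """Length in kilobases of n_repeats end-to-end copies of a repeat."""
    return len(repeat) * n_repeats / 1000


def paired_strand_3_to_5(dna_5_to_3: str) -> str:
    """Bases that pair with dna_5_to_3, listed in the same left-to-right order.

    Because the two strands are antiparallel, the returned string reads
    3' to 5'. The Tetrahymena template contains only A and C, so the
    DNA/RNA difference (T versus U) does not arise in this example.
    """
    return "".join(PAIR[base] for base in dna_5_to_3)


human_repeat = "TTAGGG"
print(telomere_span_kb(human_repeat, 2000))  # 12.0 kb
print(telomere_span_kb(human_repeat, 2500))  # 15.0 kb

tetrahymena_repeat = "TTGGGG"
print(paired_strand_3_to_5(tetrahymena_repeat))  # AACCCC
```

The result agrees with the 12 to 15 kb range stated for human telomeres at birth, and with the template repeat shown for telomerase RNA.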
Figure 7.23 Loss of DNA at telomeres. Leading strands are synthesized to the
ends of linear chromosomes, but lagging strands are shortened at each
replication cycle, when the RNA primer sequence at the end of the template
strand is removed but not replaced with DNA nucleotides.
Q Looking at the results of replication cycle 2, and examining the DNA duplex
at the right-hand side, would you call the red DNA strand a leading strand or a
lagging strand?

Figure 7.24 Steps shown in the figure:
1 Attachment of telomerase at the gap left by RNA primer removal.
  DNA 5′ …TTGGGGTTGGGGTTGGGG 3′; telomerase RNA template 3′ …AACCCCAAC… 5′
2 Elongation of DNA.
  5′ …TTGGGGTTGGGGTTGGGGTT 3′
3 Translocation of telomerase.
4 Elongation of DNA.
  5′ …TTGGGGTTGGGGTTGGGGTTGGGGTTG 3′
5 Telomere completion (by DNA polymerase).
  3′ …AACCCCAACCCCAACCCCAACCCCAACC 5′
  5′ …TTGGGGTTGGGGTTGGGGTTGGGGTTGG 3′
6 Shelterin.

7.5 Methods of Molecular Genetic Analysis Make Use of DNA Replication Processes
Molecular biologists have used their understanding of the enzymes and processes
of DNA replication to develop new laboratory methods for molecular genetic
analysis. Two widely used methods that developed directly from this knowledge
are the polymerase chain reaction (PCR) and a method for dideoxynucleotide DNA
sequencing
The Polymerase Chain Reaction
Developed in 1983 by Kary Mullis, the polymerase chain reaction (PCR) is an
automated version of DNA replication that takes place in a test tube containing
a total reaction volume of 20 to 50 microliters (one microliter is
one-millionth of a liter). Despite this very small total reaction volume, a
typical PCR reaction, beginning with just a few copies of a short, targeted DNA
sequence, produces millions of copies of the sequence in a few hours.
Reproduction of DNA through PCR has innumerable uses in modern biological
research, including the evolutionary study of extinct species; the comparison
of DNA among living species; forensic genetic applications such as paternity
testing, crime scene analysis, and individual identification; and production of
DNA segments for genome sequencing projects.
Polymerase chain reactions are in vitro DNA-replication reactions performed
using (1) double-stranded DNA containing the target sequence that is to be
copied, (2) a supply of the four DNA nucleotides, (3) a heat-stable DNA
polymerase, and (4) two different single-stranded DNA primers (described in the
list of steps below). These PCR components are mixed with a buffer solution,
and then the automated reaction is run through 30 to 35 three-step “cycles.”
Each cycle doubles the number of copies of the targeted DNA sequence. The PCR
process is generally identified as “amplification,” and it is common to speak
of “PCR amplification” in reference to the process and of “amplified DNA” as
the product of the reaction.
PCR reactions are carried out in a device known as a PCR thermal cycler.
Thermal cyclers are programmable, allowing the length and temperature of each
cycle step to be adjusted to meet the needs of the experimenter. The thermal
cycler takes just a few seconds to change temperature between steps.

Figure 7.25 The three-step cycle of PCR. Amplification by PCR doubles the
number of copies of the targeted DNA sequence each cycle. Steps shown, starting
from genomic DNA: 1 Denaturation of DNA by heating (95°C). 2 Primer annealing
(45°–68°C) of primer A and primer B on either side of the target region.
3 Primer extension (72°C), producing newly synthesized DNA. First cycle
completed. Up to 35 additional cycles double the amount of replicated DNA from
the target region in each cycle.

Figure 7.25 illustrates the three steps of a PCR reaction. The steps and
functions of each PCR cycle are as follows:
1 Denaturation. The reaction mixture is heated to approximately 95°C, causing
  double-stranded DNA to denature into single strands as the hydrogen bonds
  between complementary strands break down. The step duration is usually 1 to
  2 minutes.
2 Primer annealing. The reaction temperature is reduced to between about 45°C
  and 68°C to allow primer annealing, the hybridization of the two short,
  single-stranded DNA primers to complementary sequences bracketing the target
  sequence. These primers have the same function as RNA primers in DNA
  replication. They are, as mentioned, short (12 to 24 nucleotides), and one
  primer binds to each of the denatured DNA strands. The step duration is
  usually 1 to 2 minutes.
3 Primer extension. Raising the temperature of the reaction to 72°C allows
  primer extension, during which a specialized DNA polymerase known as Taq
  polymerase synthesizes DNA, beginning at the 3′ end of each primer. Taq
  polymerase, described in more detail below, synthesizes new DNA at the rate
  of about 1000 bp per minute. This step duration is usually 3 to 5 minutes.
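The three-step cycle listed above lends itself to a small numerical sketch. The code below is a minimal illustration under stated assumptions, not a thermal cycler protocol: the `Step` class, the 55°C annealing temperature (chosen from within the 45°C to 68°C range), the step durations (taken from the low end of the quoted ranges), and the `efficiency` parameter are all hypothetical choices made for the example. With perfect doubling, the number of copies after n cycles is the starting number times 2^n.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float   # temperature in degrees Celsius
    minutes: float  # step duration (assumed values within the quoted ranges)


# One PCR cycle; the annealing temperature and durations are illustrative.
CYCLE = (
    Step("denaturation", 95, 1),
    Step("primer annealing", 55, 1),
    Step("primer extension", 72, 3),
)


def copies_after(cycles: int, start_copies: int = 1, efficiency: float = 1.0) -> float:
    """Idealized yield: each cycle multiplies the copy number by (1 + efficiency).

    efficiency=1.0 is perfect doubling; real reactions fall short of this.
    """
    return start_copies * (1 + efficiency) ** cycles


def run_time_minutes(cycles: int) -> float:
    """Total time in the three steps, ignoring temperature ramping."""
    return cycles * sum(step.minutes for step in CYCLE)


print(f"{copies_after(30):,.0f}")  # 1,073,741,824 copies from one template
print(f"{copies_after(36):,.0f}")  # 68,719,476,736
print(run_time_minutes(30))        # 150 minutes with the assumed durations
```

Thirty cycles of perfect doubling give just over 1 billion copies per starting molecule, and the assumed step durations put a 30-cycle run at a few hours, in line with the description above.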
Figure 7.26 shows the two important features of the locations of PCR-primer
binding. First, the primers bind just outside of the target region for
amplification, and, second, the primers bind to opposite complementary strands.
This primer binding pattern ensures that the target region will be copied
during the PCR procedure, and it establishes the 5′ and 3′ boundaries of the
amplified PCR products that will be produced by the procedure. Each complete
PCR cycle doubles the number of copies of the target DNA sequence, so beginning
with a single copy of double-stranded target sequence, completion of the first
PCR cycle produces two copies of the target sequence, completion of the second
cycle produces four copies, completion of the third cycle eight copies, and so
on. After 30 PCR cycles the yield would be 2^30, or more than 1 billion copies
of the target sequence, and completion of 36 cycles could yield more than
68 billion copies of the target sequence.
Taq polymerase is named for the thermophilic bacterial species Thermus
aquaticus. This bacterium lives in hot springs at near-boiling temperatures and
has evolved heat-stable proteins that remain active at these temperatures. The
heat stability of Taq DNA polymerase is important to the efficiency of PCR,
since step 1 of a PCR cycle raises the reaction temperature to near boiling.
The DNA polymerases of most organisms are not heat stable, and they denature
and become inactive at temperatures above about 45°C.
The first sample of Thermus aquaticus was collected from hot springs in
Yellowstone National Park by Thomas Brock and Louise Brock in 1965. Brock was a
microbiologist, and his attention was drawn to some brown scum on the hot
spring surface. Brock thought the scum looked like bacteria that live in other
bodies of water, so he transported a sample back to his laboratory and managed
to grow it. What he discovered was a new bacterial species, and in the process
he opened new avenues of research on “extremophiles” (organisms that live in
extreme environments) and helped pave the way for the use of Taq polymerase in
PCR.
PCR has an enormous variety of applications, but it also has limitations, the
most important of which are (1) the necessity of having some knowledge of the
sequences needed for primers and (2) the difficulty of producing amplification
products longer than 10 to 15 kb. In most cases, the length limitations on PCR
restrict its use to the study of selected DNA segments or individual genes. The
requirement for primer sequence information can be satisfied by informed
guesses about the sequences likely to occur at primer binding sites or by using
primers from one species to amplify similar sequences in another species. For
example, a biologist wanting to study DNA-sequence similarity between species
could use a pair of primers that amplify a Drosophila gene to examine the human
genome for a related gene. There may be one or more base-pair mismatches
between the Drosophila primers and the human DNA sequences they bind to, but
the mismatches need not prevent primer annealing if the temperature of the PCR
reaction is lowered during step 2 of the reaction. The lower temperature can
increase the stability of hybridization of the primers and their target
sequences enough to allow the former to prime the PCR amplification.

Separation of PCR Products
The PCR process selectively amplifies DNA fragments ranging in size from a few
dozen base pairs to several thousand base pairs in length. The fragments
generated are almost all of the same double-stranded target region. So highly
concentrated are the results of PCR amplification that they can be analyzed
directly using gel electrophoresis (see Chapter 1 for a discussion of this
method).
Gel electrophoresis separates fragments of DNA by their sizes in base pairs.
Recall from our discussion in Chapter 1 that DNA fragments containing fewer
base pairs move more quickly in the electrical separation field than fragments
with more base pairs. This means that smaller fragments, with higher
electrophoretic mobility, migrate farther from the origin of migration than do
fragments with more base pairs. The use of molecular-weight size markers in gel
electrophoresis (DNA fragments containing known numbers of base pairs) allows
researchers to determine the size of DNA fragments of unknown length by
comparing their migration with that of the known size markers.
Figure 7.26a shows four hypothetical VNTR (variable number tandem repeat)
alleles of a gene (V1 to V4) that consist of different numbers of repeats of
the same short DNA sequence (see Section 5.5). Genetic markers of this type are
commonly used in genetic studies, and they are especially common in forensic
genetic analysis applications. We discuss the analysis of VNTR markers in
forensic genetic settings in Application Chapter E: Forensic Genetics.
For an autosomal VNTR gene like the one illustrated in Figure 7.26a, the four
alleles can form 10 different genotypes that each have their own distinctive
set of one or two DNA fragment lengths (Figure 7.26b). Each homozygous genotype
has a single band and each heterozygous genotype has two bands. The bands are
identified by their repeat number.
The inheritance of the VNTR alleles follows a codominant pattern in which both
alleles are detected in heterozygous genotypes. In the family represented in
Figure 7.26c, the two parents have completely different heterozygous genotypes,
and each parent transmits one allele to each child. As a consequence of the
parents’ completely different genotypes, each allele in each child can be
traced to one of the parents, and each child has a heterozygous genotype.
Notice that there are two DNA bands for each person in this family: VNTRs and
other similar DNA genetic markers display codominant inheritance, and
heterozygous individuals display DNA bands corresponding to each allele (see
Section 4.1).
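The genotype count and band patterns described for the four VNTR alleles can be enumerated directly. The sketch below is an illustration only: the function names are hypothetical, the repeat numbers (5, 7, 9, and 12) are those shown for alleles V1 to V4 in Figure 7.26a, and the parental genotypes used in the example cross are assumed for demonstration rather than read from Figure 7.26c.

```python
from itertools import combinations_with_replacement, product

# Repeat numbers for the four hypothetical alleles in Figure 7.26a.
ALLELES = {"V1": 5, "V2": 7, "V3": 9, "V4": 12}


def genotypes(alleles):
    """All unordered pairs of alleles, homozygotes included."""
    return list(combinations_with_replacement(sorted(alleles), 2))


def bands(genotype):
    """Repeat numbers of the gel bands: one for homozygotes, two for heterozygotes."""
    return sorted({ALLELES[allele] for allele in genotype})


def offspring(parent1, parent2):
    """Possible child genotypes when each parent transmits one allele."""
    return sorted({tuple(sorted(pair)) for pair in product(parent1, parent2)})


all_genotypes = genotypes(ALLELES)
print(len(all_genotypes))            # 10
print(bands(("V1", "V1")))           # [5]
print(bands(("V1", "V4")))           # [5, 12]

# Assumed example cross between two completely different heterozygotes.
print(offspring(("V1", "V2"), ("V3", "V4")))
# [('V1', 'V3'), ('V1', 'V4'), ('V2', 'V3'), ('V2', 'V4')]
```

With n alleles there are n(n + 1)/2 genotypes, which gives 10 for n = 4. In the assumed cross every child is heterozygous, and each band can be traced to one parent because the parents share no alleles.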
Figure 7.26 (a) Each allele produces a PCR fragment of a different length. Each
numbered block represents one copy of a short DNA sequence, and primer A and
primer B bind on either side of the repeats.
  Allele   VNTR repeats
  V1       5
  V2       7
  V3       9
  V4       12
(b) VNTR band patterns, labeled by number of repeats, for the genotypes V1V1,
V1V2, V1V3, V1V4, V2V2, V2V3, V2V4, V3V3, V3V4, and V4V4.

Dideoxynucleotide DNA Sequencing
The ultimate description of any DNA molecule is its sequence of bases.
Depending on the purpose of the analysis or application, DNA sequence
information may be sought for any-length sequence of DNA, from a small series
of base pairs to a single chromosome to the genome as a whole. In addition, the
phrase “genome sequence” can encompass all coding and regulatory sequences of
genes along with all the other sequences in the DNA, including repetitive
sequences, or it can be more limited. Most commonly, a “genome sequence”
includes only those portions of the genome that are transcribed into RNA. We
discuss approaches to creating and analyzing genomic sequence data in
Chapter 16.
In addition to genetics research, DNA sequencing technology has found broad
application in fields like agriculture, medicine, and evolutionary biology. And
at the same time as its uses have broadened, laboratory and computer
technologies have combined to make DNA sequencing faster and cheaper by orders
of magnitude.
The first DNA sequencing protocols were developed in 1977, one by Allan Maxam
and Walter Gilbert and another by Sanger. Of the two methods, Sanger’s was more
amenable to automation, and it is the basis for the high-throughput
Figure 7.27 Nucleotides used in DNA sequencing reactions.
(a) Dideoxynucleotides (ddNTPs) are deoxygenated at both the 2′ and 3′ carbons
and cannot form a phosphodiester bond for the further elongation of DNA.
(b) The incorporation of a dideoxynucleotide of cytosine (ddCTP) terminates the
replication reaction. Labels in part (b): ddCTP cannot form a phosphodiester
bond; dTTP recruited by DNA polymerase; incorporation of ddNTP is a
chain-termination reaction that stops replication.
Q Circle the feature of the dideoxynucleotide in part (b) that prevents it from
forming a phosphodiester bond.

Figure 7.28 DNA sequencing reactions. (a) A target region of DNA is located by
binding a single-stranded primer of 18 nucleotides (an “18-mer”) that carries a
5′ label. Replication products terminated by ddCTP each have a different
length. (b) Replication products terminated by ddGTP. (c) Termination products
generated by ddTTP. (d) Termination products generated by ddATP.

(b) ddGTP reaction
  Length of synthesized fragment   Partial replication products
  24                               5′ 18-mer AATGCG
  27                               5′ 18-mer AATGCGCTG
  32                               5′ 18-mer AATGCGCTGCATCG
  35                               5′ 18-mer AATGCGCTGCATCGTAG

(c) ddTTP reaction (“T” reaction)
  Length of synthesized fragment   Partial replication products
  21                               5′ 18-mer AAT
  26                               5′ 18-mer AATGCGCT
  30                               5′ 18-mer AATGCGCTGCAT
  33                               5′ 18-mer AATGCGCTGCATCGT
  37                               5′ 18-mer AATGCGCTGCATCGTAGCT

(d) ddATP reaction (“A” reaction)
  Length of synthesized fragment   Partial replication products
  19                               5′ 18-mer A
  20                               5′ 18-mer AA
  29                               5′ 18-mer AATGCGCTGCA
  34                               5′ 18-mer AATGCGCTGCATCGTA
  38                               5′ 18-mer AATGCGCTGCATCGTAGCTA
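The fragment lengths in Figure 7.28 follow from the position of each base downstream of the 18-mer primer, and the sequence can be recovered by ordering the fragments from shortest to longest. The sketch below illustrates that logic under stated assumptions: the function names are hypothetical, the 20-nucleotide newly synthesized sequence is taken from the longest product listed for the ddATP reaction, and every possible termination product is assumed to be present in its lane.

```python
PRIMER_LEN = 18
# Newly synthesized sequence (5'->3'), from the longest ddATP product in Figure 7.28d.
NEW_STRAND = "AATGCGCTGCATCGTAGCTA"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_lengths(new_strand, base, primer_len=PRIMER_LEN):
    """Fragment lengths produced when the ddNTP of `base` stops synthesis.

    A fragment ends wherever `base` occurs, so its length is the primer
    length plus the 1-based position of that occurrence.
    """
    return [primer_len + i for i, b in enumerate(new_strand, start=1) if b == base]


def read_gel(lanes):
    """Read the sequence 5'->3' by taking fragments from shortest to longest."""
    calls = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in calls)


def template_strand(new_strand):
    """Complementary (template) strand, written 5'->3' (antiparallel)."""
    return "".join(COMPLEMENT[b] for b in reversed(new_strand))


# One lane per ddNTP reaction.
lanes = {base: termination_lengths(NEW_STRAND, base) for base in "ACGT"}

print(lanes["A"])                    # [19, 20, 29, 34, 38]
print(len(lanes["C"]))               # 5 fragment lengths in the C reaction
print(read_gel(lanes))               # AATGCGCTGCATCGTAGCTA
print(template_strand(NEW_STRAND))   # TAGCTACGATGCAGCGCATT
```

The A lane reproduces the lengths listed for the ddATP reaction, and the C lane contains five fragment lengths. Because every length from 19 to 38 occurs in exactly one lane, sorting by length recovers the sequence without ambiguity.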
Dideoxy DNA sequencing is carried out in four separate reaction mixtures, one
for each of the four ddNTPs. Each reaction mixture contains the DNA strand to
be sequenced, a single-stranded DNA primer, DNA polymerase, large amounts of
each of the four standard nucleotides (dATP, dGTP, dCTP, and dTTP), and a small
amount of one dideoxynucleotide: that of adenine (ddATP), thymine (ddTTP),
cytosine (ddCTP), or guanine (ddGTP).
Figure 7.28 shows that in each reaction mixture, DNA synthesis terminates at
each site where a ddNTP is incorporated into the newly synthesized molecule.
Figure 7.28a shows the DNA fragment being sequenced at the top, annealed to the
18-mer primer used to initiate DNA synthesis (18-mer means the primer is
18 nucleotides in length). It also shows that for the “C reaction mixture” (the
mixture that includes ddCTP), each location at which a cytosine can be
incorporated into the growing chain generates some DNA replication fragments
that terminate at that location. Keep in mind that most of the cytosine in the
C reaction mixture is the more highly concentrated dCTP, so it is most likely
that this nucleotide will be incorporated into the growing chain. If so,
replication continues. If, on the other hand, the less concentrated ddCTP is
incorporated, as it will be in a small proportion of the replicating molecules,
replication terminates. Figure 7.28a shows five different DNA fragment lengths
generated by ddCTP incorporation into C reaction mixture products.
The same process ensues in the three reaction mixtures containing,
respectively, ddGTP, ddTTP, and ddATP (Figures 7.28b–d). Upon the completion of
the four parallel sequencing reactions, there will be, for every nucleotide in
the sequence, some partial replication DNA fragments terminating at that
nucleotide.
Following completion of the ddNTP reactions, the contents of each reaction are
loaded into separate lanes of a DNA electrophoresis gel, and the contents
undergo separation by their length in base pairs (Figure 7.29a). Each DNA
fragment in the gel can be radioactively labeled for visualization, allowing
the sequence of the newly synthesized strand to be “read” off the gel. Knowing
that the smallest DNA fragment migrates the farthest from the origin of
migration, and knowing that newly synthesized DNA is elongated in the 5′-to-3′
direction and that the primer is located at the 5′ end of the sequenced strand,
we can identify the consecutive nucleotides by the gel lane in which
successively longer DNA fragments are located. Thus, the first incorporated
nucleotide after the primer is A (i.e., ddATP is incorporated and terminates
replication), followed by another A (ddATP incorporated), followed by T (ddTTP
incorporated), and so on. Once the sequenced strand is determined, the
complementary strand can be determined, using the knowledge that DNA strands
are antiparallel and display complementary
base pairing. An example of a dideoxy DNA sequencing gel management, and assembly of genome sequencing data
is shown in Figure 7.29b, and a portion of the sequence is generated by NGS.
given. NGS procedures begin with the fragmentation of
Dideoxy sequencing is a slow and labor-intensive genomic DNA. Figure 7.30 illustrates this process as
process that has been supplanted by high-throughput, the first step of one version of NGS known as Illumina
automated DNA sequencing. When manual dideoxy sequencing. In Illumina sequencing, the DNA is frag-
DNA sequencing was used, it could generate 100 to 200 mented 1 , tagged with adaptor molecules attached
base pairs of a sequence per gel. A laboratory technician to both ends of each strand 2 , and then denatured for
could hope to generate sequences for at most a few hun- analysis 3 . The adaptor molecules anchor the strand
dred base pairs in a day’s work. Modern automated DNA in a later step and may contain a PCR primer. The sin-
sequencers, once they are loaded with DNA samples to gle-strand fragments are next placed in a flow cell and
be sequenced and with reaction ingredients, can run 24 amplified to produce clusters of identical strands 4 5 .
hours a day, 365 days a year, and assemble genomic The mixture used for amplification contains DNA poly-
sequence at the rate of 10,000 to 20,000 bp per hour! merase, the four dNTPs, and other necessary compounds.
Genetic Analysis 7.3 tests your skills at interpreting dide- The dNTPs of A, T, G, and C are tagged with different
oxy sequencing results. fluorescent compounds that emit light in specific wave-
lengths when excited 6 . After each new nucleotide is
New Generations of DNA Sequencing Technology

New generations of DNA sequencing technologies are continuing to be developed. These technologies sequence DNA fragments in parallel, meaning that hundreds of thousands to millions of DNA fragments are sequenced simultaneously. This brings the cost of DNA sequencing down so far that the goal of the “thousand dollar genome sequence” (that is, the availability of genome sequencing as an affordable component of everyday medicine) is within reach. It is likely that most readers of this book will have the opportunity to have their genomes sequenced.

Next-Generation Sequencing In the 40 years since Sanger introduced dideoxy DNA sequencing, the process has gotten both faster and cheaper by many orders of magnitude. The first human genome sequence, copublished in the scientific journals Science and Nature in 2001, was the result of nearly 15 years of work and represented a total investment of approximately $3 billion. Today, the most rapid automated DNA sequencers can produce nearly 50 human genome sequences a day at a cost approaching $1000 per genome. These advances have been made possible by methods that sequence hundreds of thousands to millions of DNA fragments simultaneously in a reaction, in a process characterized as “massively parallel sequencing.”
There are several different versions of methods that take a massively parallel approach, but they are all based on a similar elaboration of dideoxy DNA sequencing. Collectively, these advanced methods are identified as next-generation sequencing, or NGS. Enormous computing power and sophisticated ways of reconstructing whole genomes from the DNA sequences of fragments are an essential part of NGS. Advances in NGS have brought into being the field of bioinformatics to deal with the gathering,

incorporated into a growing strand, a laser light excites the fluorescent compound attached to the base, and a photoreceptor records the emission wavelength to identify the intensity (steps 7 and 8). Software records this information and converts it to identify the nucleotide as either A, T, C, or G (step 9). This process repeats itself very rapidly as nucleotides are added to the strands. The result is a sequence for the fragments in each cluster. In this manner, next-generation sequencing identifies the sequence of a DNA strand “by synthesis” rather than “by chain termination” (the approach in dideoxy sequencing).

Third-Generation Sequencing Dideoxy DNA sequencing can be thought of as the first generation of DNA sequencing and NGS methods as the second generation. Inevitably, a new generation of sequencing methods, known as third-generation sequencing (TGS), has now been developed. TGS and NGS methods differ from first-generation methods in two ways that make them even faster and cheaper. First, TGS and NGS methods sequence long stretches of single DNA molecules that are generated by PCR amplification rather than by the cloning that is used to produce DNA fragments for dideoxy DNA sequencing. In NGS and TGS, DNA is first PCR amplified and then sequenced. This allows sequencing of repetitive DNA that can be difficult to clone. Second, TGS and NGS are “massively parallel,” meaning that millions of sequencing reads of short DNA fragments can be undertaken in each sequencing run.
In both NGS and TGS methods, the key task is to compile the sequences of DNA fragments into a complete genomic sequence, and it requires managing a great deal of raw sequence data: aligning sequences, assembling them into complete chromosome and genome sequences, and annotating the sequences to identify genes and regulatory sequences. A discussion of these procedures is beyond the scope of this chapter, but we describe them in detail in Chapter 16, which presents a broader discussion of genomics.
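To put “many orders of magnitude” into numbers, the short sketch below works only from the figures quoted in this section (about $3 billion and nearly 15 years for the first human genome, versus about $1000 per genome and nearly 50 genomes a day). The variable names are illustrative, and treating 15 years as 15 × 365 days is a simplifying assumption.

```python
import math

# Figures quoted in the text (approximate).
first_genome_cost = 3e9      # dollars, first human genome sequence
current_cost = 1_000         # dollars per genome, "approaching $1000"
first_genome_days = 15 * 365 # assumption: 15 years counted as 365-day years
genomes_per_day = 50         # "nearly 50 human genome sequences a day"

# Fold reduction in cost, and the same value as orders of magnitude.
cost_fold = first_genome_cost / current_cost
cost_orders = math.log10(cost_fold)

# Fold increase in throughput: genomes finished in the time the first one took.
throughput_fold = first_genome_days * genomes_per_day

print(f"Cost reduction: {cost_fold:,.0f}-fold ({cost_orders:.2f} orders of magnitude)")
print(f"Throughput increase: {throughput_fold:,}-fold")
```

On these rough inputs the cost has fallen by between six and seven orders of magnitude, and throughput has risen by more than five.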
GENETIC ANALYSIS 7.3
PROBLEM From the dideoxy DNA sequencing gel shown here, deduce the sequence and strand
polarities of the DNA duplex fragment.
BREAK IT DOWN: Chain termination, caused by the incorporation of a dideoxynucleotide, produces the partially replicated DNA fragments detected in a DNA sequencing gel (p. 263).
[Figure: dideoxy DNA sequencing gel with lanes ddATP, ddGTP, ddTTP, and ddCTP; negative pole (–) at the top]

Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This question concerns dideoxynucleotide DNA sequencing. The answer requires interpretation of a DNA sequencing gel to determine the double-stranded sequence of a fragment of DNA, including strand polarities.
2. Identify the critical information given in the problem.
   A dideoxynucleotide DNA sequencing gel is shown.

Deduce
3. Review the essential steps of dideoxynucleotide DNA sequencing.
   DNA polymerase incorporates nucleotides in four parallel reactions. Each reaction mixture includes the four normal DNA nucleotides (dNTPs) and one labeled dideoxynucleotide (ddNTP). Incorporation of a dNTP allows continued strand synthesis, but incorporation of a ddNTP terminates synthesis.
4. Examine the gel and identify the “beginning” of DNA synthesis.
   The 3′ end of the primer is used to initiate DNA synthesis. The first nucleotide incorporated during synthesis is cytosine, as determined by identifying the location of the smallest synthesized fragment: the “C” lane. The second and third nucleotides are both adenine. The first three nucleotides are therefore 5′-CAA-3′.
   TIP: DNA fragments toward the bottom of the gel (nearer the positive pole) are shorter than fragments higher up in the gel. The sequence of the synthesized strand shown in the gel is 5′ at the bottom and 3′ at the top.

Solve
5. Write the rest of the sequence (along with the polarity) of the synthesized strand shown in the gel.
   The synthesized strand is
   5′-[primer]-CAATAGCTGAGGAGTCGATTCATGCCGATA-3′
6. Determine the sequence and polarity of the template strand used for DNA synthesis.
   The template DNA strand is
   3′-GTTATCGACTCCTCAGCTAAGTACGGCTAT-5′
For more practice, see Problems 28, 29, 30, and 34. Visit the Study Area to access study tools. Mastering Genetics
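The reasoning in Genetic Analysis 7.3 can be sketched in a few lines of Python: order the gel bands from shortest to longest to read the synthesized strand 5′ to 3′, then pair each base with its complement to obtain the template strand 3′ to 5′. This is a minimal illustration, not part of the original solution. The function names and the (fragment length, lane) representation of a band are assumptions made for the example.

```python
# Watson-Crick complements for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_gel(bands):
    """Read a dideoxy gel given as (fragment_length, lane) pairs.

    Shorter fragments sit nearer the positive pole, so sorting by length
    gives the synthesized strand in the 5'-to-3' direction.
    """
    return "".join(lane for _, lane in sorted(bands))

def template_strand(synthesized_5_to_3):
    """Return the template strand written 3'-to-5', base-for-base aligned."""
    return "".join(COMPLEMENT[base] for base in synthesized_5_to_3)

# First three bands from the worked problem: C is the shortest fragment.
print(read_gel([(2, "A"), (1, "C"), (3, "A")]))

# Full synthesized strand from step 5 and its template from step 6.
synthesized = "CAATAGCTGAGGAGTCGATTCATGCCGATA"
print("5'-" + synthesized + "-3'")
print("3'-" + template_strand(synthesized) + "-5'")
```

Because the template is written 3′ to 5′ under the synthesized strand, no reversal is needed. Writing the template 5′ to 3′ would require reversing the string as well.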
268 CHAPTER 7 DNA Structure and Replication
[Figure (continued): genomic DNA duplex with 5′ and 3′ ends labeled; example read A A T G C G C T G C A T C G T A C C T A. Step 9: ...software converts the emissions to identify each new nucleotide.]
CASE STUDY
DNA Helicase Gene Mutations and Human Progeroid Syndrome
At the latest count, the human genome contains 95 genes that each produce a different helicase enzyme. Most of these genes (64 of them) produce helicases that operate on RNA. The remaining 31 helicase genes produce DNA helicases. DNA helicases have a number of specific functions. Collectively, they are active in DNA replication, transcription, translation, recombination, damage repair, and other processes. Any process requiring the separation of two nucleic acid strands will involve helicase.
The helicase discussed in the body of this chapter belongs to a class of DNA helicases that are active in initiating DNA replication. In this Case Study, we discuss a different class of DNA helicase, one identified as the RECQ class, but some of them do take part in DNA replication and repair. Humans produce five RECQ helicases from five different autosomal genes. RECQ helicases are primarily active in meiotic crossing over and recombination. The designation “REC” for these helicases is short for “recombination.” During meiotic recombination, RECQ helicases participate in the unwinding of DNA strands and work along with other proteins and enzymes to efficiently and accurately achieve reciprocal recombination of the type we discuss in Chapter 5. Rather than focus on the normal activities of RECQ helicases, however, we will now consider the mutations of three RECQ helicase genes that lead to three different hereditary conditions. In each case, mutation of a RECQ gene inherited in a recessive homozygous genotype is the cause of the condition.

Table 7.4 Human Progeroid Conditions
Disorder                                  Mutated Gene(s)
RECQ helicase gene mutations
  Bloom syndrome (BS)                     BLM (RECQL2)
  Rothmund–Thomson syndrome (RTS)         RECQL4
  Werner syndrome (WS)                    WRN (RECQL2)
DNA repair-gene mutations
  Cockayne syndrome (two types)           ERCC6 and ERCC8
  Trichothiodystrophy (three types)       ERCC2, ERCC3, GTF2H5 (three genes)
  Xeroderma pigmentosum (seven types)     XPA–XPG (seven genes)
Lamin A (nuclear structure) mutation
  Hutchinson–Gilford progeria syndrome    LMNA
Unknown mutation
  Wiedemann–Rautenstrauch syndrome        Unknown
to inactivate the activity of the helicase. Normally this helicase interacts with several other proteins to carry out and regulate specific steps of recombination. There is evidence that BLM mutations lead to defective homologous recombination, and also that BLM mutations lead to an excessive level of recombination between sister chromatids. Both these abnormalities contribute to chromosome defects that accumulate up to 100 times faster than average. The accumulated defects include the loss of chromosomal material, gene mutations, and chromosome instability. These gene and chromosome defects account for the elevated cancer risk associated with BS.

ROTHMUND–THOMSON SYNDROME Rothmund–Thomson syndrome (RTS) is a very rare autosomal recessive condition caused by mutation of the RECQL4 gene. Only about 300 cases of RTS have been reported to date in the medical literature. Moreover, mutations of RECQL4 have been identified in only about two-thirds of RTS patients, with no mutation of the gene detected in the other one-third of cases. Individuals with a RECQL4 gene mutation experience difficulty initiating DNA replication and have errors in homologous recombination.
RTS symptoms first appear in infancy and include a skin rash that occurs in response to sun exposure. Abnormalities of bones and teeth are also present in infancy. Often, cataracts appear in childhood. RTS patients have short stature, gastrointestinal abnormalities, and an elevated risk of cancer, particularly the bone cancer osteosarcoma. Most of these abnormalities are manageable with intensive medical treatment, and unless cancer occurs, a life span approaching normal is possible.

WERNER SYNDROME Werner syndrome (WS) is a rare autosomal recessive condition that occurs in about 1 in 100,000 live births worldwide. Fewer than 2000 cases of WS are currently known in the world. WS is sometimes called an “adult onset progeria” because symptoms are not usually apparent until puberty. The usual growth spurt that occurs in most people during puberty does not occur in individuals with WS. This leads to short stature, and is followed by premature graying of the hair, hair loss, wrinkling and atrophy of the skin, loss of body fat, changes in facial shape, metabolic abnormalities, and a strongly elevated risk of cancer. Due to its onset around puberty, WS is usually diagnosed in the early 20s, and life expectancy is about 50 years, on average.
The WRN gene, also known as RECQL2, produces a DNA helicase that functions primarily during DNA replication and during DNA damage repair. As a DNA helicase, its function is localized to the nucleus, where it separates the strands of double-stranded DNA. More than 20 different mutations of WRN have been identified. These occur throughout the gene, and they have a range of effects on the production and function of the RECQL2 helicase protein. Some mutations completely block production of the helicase, whereas others severely reduce the level of function of the helicase. The RECQL2 helicase interacts with numerous other proteins as it carries out its normal activities, and these interactions are altered or prevented in WS. The consequent accumulation of gene and chromosome mutations and DNA damage leads to the disease symptoms.
SUMMARY
Mastering Genetics For activities, animations, and review quizzes, go to the Study Area.

7.1 DNA Is the Hereditary Molecule of Life
❚❚ Griffith determined in 1928 that a molecular transformation factor was responsible for transformation of living R bacteria into an S form.
❚❚ In 1944, Avery, MacLeod, and McCarty’s study of in vitro transformation caused by an S-cell extract identified DNA as the transformation factor and strongly suggested it is the hereditary material.
❚❚ Hershey and Chase determined in 1952 that bacteriophage T2 uses DNA, not protein, to reproduce within host E. coli cells.

7.2 The DNA Double Helix Consists of Two Complementary and Antiparallel Strands
❚❚ The DNA nucleotides consist of the five-carbon sugar deoxyribose, a phosphate group, and one of four nitrogen-containing nucleotide bases.
❚❚ The DNA nucleotide bases are the purines adenine and guanine, and the pyrimidines cytosine and thymine.
❚❚ Phosphodiester bonds form between 5′ phosphate and 3′ OH groups to join nucleotides into polynucleotide chains.
❚❚ Complementary base pairs consist of a purine and a pyrimidine. In DNA, A and T form two stable hydrogen bonds, whereas G and C form three stable hydrogen bonds.
❚❚ Complementary nucleic acid strands are antiparallel.
❚❚ The stacking of base pairs in DNA imparts helical twisting that creates major grooves and minor grooves in the duplex.

7.3 DNA Replication Is Semiconservative and Bidirectional
❚❚ Experimental evidence demonstrates that DNA replication is semiconservative, meaning each daughter molecule receives one parental strand and one newly synthesized strand that was produced using the parental strand as a template.
❚❚ Most DNA replication is bidirectional. A replication bubble with replication forks at each end expands as replication progresses.
❚❚ Bacterial genomes have a single replication origin, whereas eukaryotic genomes have many origins of replication.
❚❚ Eukaryotic replication origins initiate asynchronously during S phase.
❚❚ Eukaryotic DNA replication produces sister chromatids.
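The bond counts implied by the Section 7.2 summary points (two hydrogen bonds per A–T pair, three per G–C pair, and one phosphodiester bond between each pair of adjacent nucleotides in a strand) can be tallied with a short sketch. The function names and the example sequence are illustrative, and the code assumes a linear, fully base-paired duplex.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}

def hydrogen_bonds(strand):
    """Total hydrogen bonds in a duplex, given one strand's sequence."""
    return sum(H_BONDS_PER_PAIR[base] for base in strand)

def phosphodiester_bonds(strand, strands=2):
    """Phosphodiester bonds in a linear molecule: (n - 1) per strand."""
    return strands * (len(strand) - 1)

example = "ATGCCA"  # arbitrary illustrative sequence
print(hydrogen_bonds(example))        # 2 + 2 + 3 + 3 + 3 + 2 = 15
print(phosphodiester_bonds(example))  # 2 strands x 5 bonds = 10
```

Only one strand is needed as input because each base fixes its partner, and therefore the number of hydrogen bonds in that pair.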
Problems 271
7.4 DNA Replication Precisely Duplicates the Genetic Material
❚❚ Bacterial, archaeal, and yeast DNA replication begins at specific locations that bind replication initiation proteins. Specific conserved sequences are found in bacteria, but other mechanisms direct replication initiation in eukaryotes.
❚❚ DNA replication begins with the synthesis of an RNA primer by primase, followed by synthesis of leading and lagging DNA strands by DNA polymerase operating in replisome complexes.
❚❚ To complete replication, RNA primers are removed by DNA polymerase, and DNA segments are joined by DNA ligase.
❚❚ DNA polymerases not only replicate DNA but also proofread newly synthesized DNA for accuracy.
❚❚ Eukaryotic chromosomes have repetitive sequences called telomeres at their ends that shorten with each replication in somatic cell cycles.
❚❚ Telomerase is a ribonucleoprotein that synthesizes telomeric repeat sequences to maintain telomere length in germ-line and stem cells.

7.5 Methods of Molecular Genetic Analysis Make Use of DNA Replication Processes
❚❚ The polymerase chain reaction (PCR) is a method for producing large numbers of copies of target DNA sequences.
❚❚ Dideoxynucleotide DNA sequencing is a method for discovering the sequence of DNA fragments.
❚❚ Next-generation and third-generation DNA sequencing are much faster and far cheaper methods that have paved the way for large numbers of genome sequencing projects and personal human genome sequencing.
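To see why PCR yields “large numbers of copies,” the sketch below models amplification as a doubling of double-stranded copies in every cycle. Perfect doubling is an idealizing assumption, since real reactions fall short of 100% efficiency, and the function name and cycle counts are illustrative.

```python
def pcr_copies(cycles, initial=1):
    """Idealized PCR yield: every cycle doubles each double-stranded copy."""
    return initial * 2 ** cycles

# Growth is exponential, so a modest number of cycles gives a very large yield.
for n in (1, 5, 15, 25):
    print(f"{n:>2} cycles: {pcr_copies(n):,} copies")
```

Each additional cycle doubles the total, which is why the last few cycles of a run account for most of the product.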
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Be familiar with and able to describe the structure of DNA.
2. Know the four DNA nucleotide bases and be able to describe complementary base pairing and the antiparallel alignment of strands. If required by your instructor, know the structure of the DNA bases.
3. Be able to describe the evidence that identified DNA as the hereditary material.
4. Understand the overall process of DNA replication and be able to diagram the general structure of a replication bubble.
5. Be able to identify the major enzymatic activities during DNA replication.
6. Be prepared to use an understanding of DNA replication processes and biochemical activities to analyze and predict the results of experiments involving DNA replication.
7. Understand the polymerase chain reaction (PCR) process and results.
8. Be able to describe dideoxy DNA sequencing and to analyze DNA sequencing results.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.
1. What results from the experiments of Frederick Griffith provided the strongest support for his conclusion that a transformation factor is responsible for heredity?
2. Explain why Avery, MacLeod, and McCarty’s in vitro transformation experiment showed that DNA, but not RNA or protein, is the hereditary molecule.
3. Hershey and Chase selected the bacteriophage T2 for their experiment assessing the role of DNA in heredity because T2 contains protein and DNA, but not RNA. Explain why T2 was a good choice for this experiment.
4. Explain how the Hershey and Chase experiment identified DNA as the hereditary molecule.
5. One strand of a fragment of duplex DNA has the sequence 5′-ATCGACCTGATC-3′.
   a. What is the sequence of the other strand in the duplex?
   b. What is the name of the bond that joins one nucleotide to another in the DNA strand?
   c. Is the bond in part (b) a covalent or a noncovalent bond?
   d. Which chemical groups of nucleotides react to form the bond in part (b)?
   e. What enzymes catalyze the reaction in part (d)?
   f. Identify the bond that joins one strand of a DNA duplex to the other strand.
   g. Is the bond in part (f) a covalent or a noncovalent bond?
   h. What term is used to describe the pattern of base pairing between one DNA strand and its partner in a duplex?
   i. What term is used to describe the polarity of two DNA strands in a duplex?
6. The principles of complementary base pairing and antiparallel polarity of nucleic acid strands in a duplex are universal for the formation of nucleic acid duplexes. What is the chemical basis for this universality?
7. For the following fragment of DNA, determine the number of hydrogen bonds and the number of phosphodiester bonds present:
   5′-ACGTAGAGTGCTC-3′
   3′-TGCATCTCACGAG-5′
8. Figure 1.6 presents simplified depictions of nucleotides containing deoxyribose, a nucleotide base, and a phosphate group. Use this simplified method of representation to illustrate the sequence 3′-AGTCGAT-5′ and its complementary partner in a DNA duplex.
   a. What kind of bond joins the C to the G within a single strand?
   b. What kind of bonds join the C in one strand to the G in the complementary strand?
   c. How many phosphodiester bonds are present in this DNA duplex?
   d. How many hydrogen bonds are present in this DNA duplex?
9. Consider the sequence 3′-ACGCTACGTC-5′.
   a. What is the double-stranded sequence?
   b. What is the total number of covalent bonds joining the nucleotides in each strand?
   c. What is the total number of noncovalent bonds joining the nucleotides of the complementary strands?
10. DNA polymerase III is the main DNA-synthesizing enzyme in bacteria. Describe how it carries out its role of elongating a strand of DNA.
11. There is a problem completing the replication of linear chromosomes at their ends.
   a. Describe the problem and identify why telomeres shorten in each replication cycle.
   b. What is the function of telomerase, and how does it operate to synthesize telomeres?
12. Explain how RNA participates in DNA replication.
13. A sample of double-stranded DNA is found to contain 20% cytosine. Determine the percentage of the three other DNA nucleotides in the sample.
14. Bacterial DNA polymerase I and DNA polymerase III perform different functions during DNA replication.
   a. Identify the principal functions of each molecule.
   b. If mutation inactivated DNA polymerase I in a strain of E. coli, would the cell be able to replicate its DNA? If so, what kind of abnormalities would you expect to find in the cell?
   c. If a strain of E. coli acquired a mutation that inactivated DNA polymerase III function, would the cell be able to replicate its DNA? Why or why not?
15. Diagram a replication fork in bacterial DNA and label the following structures or molecules.
   a. DNA pol III
   b. helicase
   c. RNA primer
   d. origin of replication
   e. leading strand (label its polarity)
   f. DNA pol I
   g. topoisomerase
   h. SSB protein
   i. lagging strand (label its polarity)
   j. primase
   k. Okazaki fragment
16. Which of the following equations are true for the percentages of nucleotides in double-stranded DNA?
   a. (A + G)/(C + T) = 1.0
   b. (A + T)/(G + C) = 1.0
   c. (A)/(T) = (G)/(C)
   d. (A)/(C) = (G)/(T)
   e. (A)/(G) = (T)/(C)
17. Which of the following equalities is not true for double-stranded DNA?
   a. (G + T) = (A + C)
   b. (G + C) = (A + T)
   c. (G + A) = (C + T)
18. List the order in which the following proteins and enzymes are active in E. coli DNA replication: DNA pol I, SSB, ligase, helicase, DNA pol III, and primase.
19. Two viral genomes are sequenced, and the following percentages of nucleotides are identified:
   Genome 1: A = 28%, C = 22%, G = 28%, T = 22%
   Genome 2: A = 22%, C = 28%, G = 28%, T = 22%
   Are the DNA molecules in each genome single-stranded or double-stranded?
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
20. Matthew Meselson and Franklin Stahl demonstrated that DNA replication is semiconservative in bacteria. Briefly outline their experiment and its results for two DNA replication cycles, and identify how the alternative models of DNA replication were excluded by the data.
21. Raymond Rodriguez and colleagues demonstrated conclusively that DNA replication in E. coli is bidirectional. Explain why locating the origin of replication on one side of the circular chromosome and the terminus of replication on the opposite side of the chromosome supported this conclusion.
22. Joel Huberman and Arthur Riggs used pulse labeling to examine the replication of DNA in mammalian cells. Briefly describe the Huberman–Riggs experiment, and identify how the results exclude a unidirectional model of DNA replication.
23. Why do the genomes of eukaryotes, such as Drosophila, need to have multiple origins of replication, whereas bacterial genomes, such as that of E. coli, have only a single origin?
24. Bloom syndrome (OMIM 210900) is an autosomal recessive disorder caused by mutation of a DNA helicase. Among the principal symptoms of the disease are chromosome instability and a propensity to develop cancer. Explain these symptoms on the basis of the helicase mutation.
25. How does rolling circle replication (see Section 6.2) differ from bidirectional replication?
26. Telomeres are found at the ends of eukaryotic chromosomes.
   a. What is the sequence composition of telomeres?
   b. How does telomerase assemble telomeres?
   c. What is the functional role of telomeres?
   d. Why is telomerase usually active in germ-line cells but not in somatic cells?
27. A family consisting of a mother (I-1), a father (I-2), and three children (II-1, II-2, and II-3) is genotyped by PCR for a region of an autosome containing repeats of a 10-bp sequence. The mother carries 16 repeats on one chromosome and 21 on the homologous chromosome. The father carries repeat numbers of 18 and 26.
   a. Following the layout of Figure 7.28c, which aligns members of a pedigree with their DNA fragments in a gel, draw a DNA gel containing the PCR fragments generated by amplification of DNA from the parents (I-1 and I-2). Label the size of each fragment.
   b. Identify all the possible genotypes of children of this couple by specifying PCR fragment lengths in each genotype.
   c. What genetic term best describes the pattern of inheritance of this DNA marker? Explain your choice.
28. In a dideoxy DNA sequencing experiment, four separate reactions are carried out to provide the replicated material for DNA sequencing gels. Reaction products are usually run in gel lanes labeled A, T, C, and G.
   a. Identify the nucleotides used in the dideoxy DNA sequencing reaction that produces molecules for the A lane of the sequencing gel.
   b. How does PCR play a role in dideoxy DNA sequencing?
   c. Why is incorporation of a dideoxynucleotide during DNA sequencing identified as a “replication-terminating” event?
29. The following dideoxy DNA sequencing gel is produced in a laboratory.
   [Figure: dideoxy sequencing gel with lanes G, C, T, and A; migration runs from the origin at the negative (–) pole toward the positive (+) pole]
   What is the double-stranded DNA sequence of this molecule? Label the polarity of each strand.
30. Using an illustration style and labeling similar to that in Problem 29, draw the electrophoresis gel containing dideoxy sequencing fragments for the DNA template strand 3′-AGACGATAGCAT-5′.
31. A PCR reaction begins with one double-stranded segment of DNA. How many double-stranded copies of DNA are present after the completion of 10 amplification cycles? After 20 cycles? After 30 cycles?
32. DNA replication in early Drosophila embryos occurs about every 5 minutes. The Drosophila genome contains approximately 1.8 × 10^8 base pairs. Eukaryotic DNA polymerases synthesize DNA at a rate of approximately 40 nucleotides per second. Approximately how many origins of replication are required for this rate of replication?
33. What would be the effects on DNA replication if mutation of DNA pol III caused it to lose each of the following activities:
   a. 5′ to 3′ polymerase activity
   b. 3′ to 5′ exonuclease activity
34. A sufficient amount of a small DNA fragment is available for dideoxy sequencing. The fragment to be sequenced contains 20 nucleotides following the site of primer binding:
   5′-ATCGCTCGACAGTGACTAGC-[primer site]-3′
   Dideoxy sequencing is carried out, and the products of the four sequencing reactions are separated by gel electrophoresis. Draw the bands you expect will appear on the gel from each of the sequencing reactions.
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
35. You are participating in a study group preparing for an upcoming genetics exam, and one member of the group proposes that each of you draw the structure of two DNA nucleotides joined in a single strand. The figures are drawn and exchanged for correction. You receive the accompanying diagram to correct:
   [Figure: the drawing to be corrected, showing two joined nucleotides, each with a phosphate group, a sugar ring, and a base]
   a. Identify and correct at least five things that are wrong in the depiction of each nucleotide.
   b. What is wrong with the way the nucleotides are joined?
   c. Draw this single-stranded segment correctly.
36. Suppose that future exploration of polar ice on Mars identifies a living microbe and that analysis indicates the organism carries double-stranded DNA as its genetic material. Suppose further that DNA replication analysis is performed by first growing the microbe in a growth medium containing the heavy isotope of nitrogen (15N), that the organism is then transferred to a growth medium containing the light isotope of nitrogen (14N), and that the nitrogen composition of the DNA is examined by CsCl ultracentrifugation and densitometry after the first, second, and third replication cycles in the 14N-containing medium. The results of the experiment are illustrated here for each cycle. The control shows the positioning of the three possible DNA densities. Based on the results shown, what can you conclude about the mechanism of DNA replication in this organism? (Hint: See the description of the Meselson and Stahl experiment on pp. 245–247.)
   [Figure: CsCl density-gradient results in lanes Control, Cycle 1, Cycle 2, and Cycle 3; band positions from lighter to heavier are N14/N14, N15/N14, and N15/N15]
37. The following diagram shows the parental strands of a DNA molecule undergoing replication. Draw the daughter strands present in the replication bubble, indicating
   a. the polarity of daughter strands
   b. the leading and lagging strands
   c. Okazaki fragments
   d. the locations of RNA primers
   [Figure: parental strands of a replication bubble, the upper strand running 3′ to 5′ and the lower strand 5′ to 3′ from left to right, with the origin marked and replication labeled on each side of it]
38. Go to the OMIM website (https://www.ncbi.nlm.nih.gov/omim) and type “dyskeratosis congenita autosomal dominant 1” (DKCA1) into the search bar. The result will include a clickable link to the disorder that has an OMIM number of 127550. Review the OMIM information you retrieve and notice that this disorder is caused by a mutation of a telomerase gene that results in abnormally rapid shortening of telomeres and the appearance of disease symptoms at progressively younger ages in successive generations of the affected families. Use this and other information on OMIM to assist with this problem.
   Go to reference number 15 at the bottom of the OMIM page for a link to a 2004 paper by Tom Vulliamy and colleagues that appeared in the journal Nature Genetics. Click on the “Full text” option and download a copy of the paper. Look at Table 1 of the paper on page 448. This table lists the lengths of telomeres measured in members of the families in this study. Telomeres shorten with age, and the telomere lengths in Table 1 are age-adjusted. The negative numbers for telomere lengths in the table indicate that telomeres are shorter than average for age, and the more negative the number, the shorter the telomere. Based on Table 1, discussion in the Vulliamy et al. (2004) paper, and information available on OMIM, answer the following:
   a. How do telomere lengths in children compare with telomere lengths of their parents?
   b. Why are telomeres of people with DKCA1 shorter than average?
8 Molecular Biology of Transcription and RNA Processing
CHAPTER OUTLINE
8.1 RNA Transcripts Carry the Messages of Genes
8.2 Bacterial Transcription Is a Four-Stage Process
8.3 Eukaryotic Transcription Is More Diversified and Complex than Bacterial Transcription
8.4 Posttranscriptional Processing Modifies RNA Molecules
The molecular basis of sex determination in fruit flies (Drosophila melanogaster) involves variations in splicing of the precursor mRNA transcript of the Tra gene. One pattern of splicing helps direct female sex development, and an alternative splicing pattern helps direct male sex development.

ESSENTIAL IDEAS
❚❚ Ribonucleic acid (RNA) molecules are transcribed from genes and are of several types. The most common types are messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA), but other types have important functions as well.

[Figure 8.1: structural diagrams of adenosine 5′-monophosphate (AMP), guanosine 5′-monophosphate (GMP), uridine 5′-monophosphate (UMP), and cytidine 5′-monophosphate (CMP), with the phosphate, ribose, and nucleotide base labeled; UMP and CMP are grouped as pyrimidine nucleotides]
Figure 8.1 The four RNA ribonucleotides. Shown in their monophosphate forms, each ribonucleotide consists of the sugar ribose, a phosphate group, and one of the RNA nucleotide bases adenine, guanine, cytosine, and uracil.
Q Examine these four RNA nucleotides in comparison to the four DNA nucleotides illustrated in Figure 7.5 and identify one chemical difference and one nucleotide base difference between the nucleotides making up DNA and those making up RNA.

The second chemical difference between RNA and DNA nucleotides is the presence of the sugar ribose in RNA rather than the deoxyribose occurring in DNA. The ribose gives RNA its name (ribonucleic acid). Compare the ribose molecules shown in Figure 8.1 with deoxyribose in Figure 7.5, and notice that ribose carries a hydroxyl group (OH) not found in deoxyribose at the 2′ carbon of the ring. Except for this difference, ribose and deoxyribose are identical, having a nucleotide base attached to the 1′ carbon and a hydroxyl group at the 3′ carbon.
The similarity of the sugars of RNA and DNA leads to the formation in RNA of phosphodiester bonds between nucleotides of a strand and to a sugar-phosphate backbone that is identical to that of DNA. RNA-strand phosphodiester bond formation takes place by the same general mechanism as found in DNA (Figure 8.2). RNA is synthesized from a DNA template strand using the same purine–pyrimidine

incoming ribonucleotide triphosphate in the process, just as in DNA synthesis. Compare Figure 8.2 to Figure 7.6 to see the similarity of these nucleic acid synthesis processes.

Experimental Discovery of Messenger RNA
In their search for the RNA molecule responsible for transmitting the genetic information content of DNA to the site of protein production, researchers utilized many techniques. Among the methods used was the pulse–chase technique (see Section 7.3) to follow the trail of newly synthesized RNA in cells. Recall that the “pulse” step of this technique exposes cells to radioactive nucleotides that become incorporated into newly synthesized nucleic acids. After a short incubation period to incorporate the labeled nucleotides, a “chase” step replaces any remaining unincorporated radioactive nucleotides by introducing an excess of unlabeled nucleotides. An experimenter can then observe the changing location of labeled nucleic acid to determine the pattern of its movement and its ultimate destination and fate.
In 1957, microbiologist Elliot Volkin and geneticist Lazarus Astrachan used the pulse–chase method to study transcription in bacteria immediately following infection by a bacteriophage. Exposing newly infected bacteria to radioactive uracil, they observed rapid incorporation of the label, indicating a burst of transcriptional activity. In the chase phase of the experiment, when radioactive uracil was removed, Volkin and Astrachan found that the radioactivity quickly dissipated, indicating that the newly synthesized RNA broke down rapidly. They concluded that the synthesis of a type of RNA with a very short life span is responsible for the production of phage proteins that drive progression of the infection.
Similar pulse–chase experiments were soon conducted with eukaryotic cells. In these experiments, radioactivity was concentrated in the nucleus immediately after the pulse. This indicated that RNA was synthesized in the nucleus. Over a short period of time, however, radioactive RNA migrated to the cytoplasm, where translation takes place. The radioactivity dissipated after lingering in the cytoplasm for a period of time. These experiments led researchers to conclude that the RNA synthesized in the nucleus was likely to act as an intermediary carrying the genetic message of DNA to the cytoplasm for translation into proteins.
The discovery of mRNA was capped in 1961 when an experiment by the biologists Sydney Brenner, François
Figure 8.2 RNA synthesis. RNA polymerase catalyzes the formation of a phosphodiester bond to join a new RNA nucleotide to the 3′ end of a growing RNA strand. Two phosphate molecules are cleaved as a pyrophosphate group. Panel (a): UTP recruited by RNA polymerase pairs with adenine on the DNA template strand. Panel (b): a new phosphodiester bond joins the nucleotide to the RNA transcript strand, and the pyrophosphate group is discarded.
Jacob, and Matthew Meselson identified an unstable form of RNA that acted as the genetic messenger. Brenner and his colleagues knew from experimental evidence presented by George Palade in 1958 that ribosomes are composed of RNA and protein and function as the site of protein synthesis. They designed an experiment that used bacteriophage infection of bacterial cells to determine whether new phage protein synthesis that is part of a bacterial infection required newly constructed ribosomes or whether phage proteins could be produced using existing bacterial ribosomes. The experiment found that newly synthesized phage RNA associates with existing bacterial ribosomes to produce phage proteins and that newly formed ribosomes are not responsible for phage protein synthesis. The RNA that directed phage protein synthesis formed and degraded quickly, leading the experimenters to conclude that a phage “messenger” RNA with a short half-life is responsible for protein synthesis during infection.

Categories of RNA

In addition to messenger RNA (mRNA), a wide variety of other RNAs are found in cells. These are RNA molecules that are not translated into proteins but perform their own particular functions. The two most prominent of them are ribosomal RNA (rRNA) and transfer RNA (tRNA). We discuss ribosomal and transfer RNA to some degree in this chapter and describe their functions in the following chapter. The major and best understood forms of RNA are listed and briefly described in Table 8.1. RNAs that are listed there but are not discussed in this chapter will be described in more detail in later chapters.
All types of RNA are generated by the transcription of genes. Genes whose transcription yields messenger RNA (mRNA), the short-lived intermediary form of RNA described by Brenner and his colleagues that conveys the genetic message of DNA to be translated, are protein-producing genes. The RNA transcripts of these genes direct protein synthesis by the process of translation that is described in the next chapter. Messenger RNA is the only form of RNA that undergoes translation. Transcription of mRNA and posttranscriptional processing of mRNA are principal areas of focus in this chapter.

Ribosomal RNA combines with numerous proteins to form the ribosome, the molecular machine responsible for translation. Specific segments of rRNA molecules interact with mRNA to initiate translation. Transfer RNA is the RNA that carries amino acids to the ribosomes for construction of proteins, and it is encoded in dozens of different forms in all genomes. Each tRNA is responsible for binding a particular amino acid that it carries to the ribosome. At the ribosome a group of nucleotides of a tRNA temporarily base pair with nucleotides of mRNA. The tRNA deposits its amino acid that is added to the protein chain being produced there.

Four types of RNA perform specialized functions in eukaryotic cells only. We discuss telomerase RNA in Section 7.4, where its role in providing a template for synthesis of the repeating DNA sequence composing telomeres is described. Small nuclear RNA (snRNA) of various types is found in the nucleus of eukaryotic cells, where it participates in mRNA processing and intron removal (Section 8.4). Micro RNA (miRNA) and small interfering RNA (siRNA) are recently recognized types of regulatory RNA that are particularly active in plant and animal cells. Micro RNAs and siRNAs have a widespread and important role in the posttranscriptional regulation of gene expression, controlling the stability or translatability of certain mRNAs. This component of regulated gene expression is described in Section 13.3.

8.2 Bacterial Transcription Is a Four-Stage Process

Transcription is the synthesis of a single-stranded RNA molecule by RNA polymerase. It is most clearly understood and described in bacteria, and E. coli is the model experimental organism from which the majority of our knowledge of bacterial transcription has been derived. In this section, we examine the four stages of transcription in bacteria: (1) promoter recognition and identification, (2) the initiation of transcript synthesis, (3) transcript elongation, and (4) transcription termination.

Like all RNA polymerases, bacterial RNA polymerase uses one strand of DNA, the template strand, to assemble the transcript by complementary and antiparallel base pairing of RNA nucleotides with DNA nucleotides of the template strand (see Figure 1.9 for a review). The coding strand of DNA, also known as the nontemplate strand, is complementary to the template strand. The gene (that is, the stretch of DNA regions that produces an RNA transcript) contains several segments with distinct functions (Figure 8.3). The promoter of the gene is immediately upstream, that is, within a few nucleotides of the 5′ start of transcription, which is identified as corresponding to the +1 nucleotide. The promoter is not transcribed. Instead, the promoter sequence is a transcription-regulating DNA sequence that controls the access of RNA polymerase to the gene. The coding region is the portion of the gene that is transcribed into mRNA and contains the information needed to
synthesize the protein product of the gene. The termination region is the portion of the gene that regulates the cessation of transcription. The termination region is located immediately downstream, that is, immediately 3′ to the coding segment of the gene.

Bacterial RNA Polymerase

A single type of E. coli RNA polymerase catalyzes transcription of all RNAs. The initial experimental evidence supporting this conclusion came from analysis of the effect of the antibiotic rifampicin on bacterial RNA synthesis. Rifampicin inhibits RNA synthesis by preventing RNA polymerase from catalyzing the formation of the first phosphodiester bond in the RNA chain. In rifampicin-sensitive (rif S) bacterial strains, synthesis of all three major types of RNA (mRNA, tRNA, and rRNA) is inhibited in the presence of rifampicin. In contrast, rifampicin-resistant (rif R) bacteria actively transcribe DNA into the three major RNAs when rifampicin is present. Molecular analysis identifies a single mutation of RNA polymerase in rif R strains that allows it to remain catalytically active when exposed to rifampicin. Subsequent molecular studies have confirmed the presence of a single bacterial RNA polymerase.

Bacterial RNA polymerase is composed of a pentameric (five-polypeptide) RNA polymerase core that binds to a sixth polypeptide, called the sigma subunit (σ), which induces a conformational change in the core enzyme that switches it to its active form. In its active form, the RNA polymerase is described as a holoenzyme, a term meaning an intact complex of multiple subunits, with full enzymatic capacity. Figure 8.4 shows a common type of sigma subunit known as σ70, but there are also other sigma subunits in E. coli.

The RNA polymerase core consists of two α subunits, designated αI and αII, two β subunits (β and β′), and an ω (omega) subunit. The molecular weight of the five-subunit core RNA polymerase is approximately 390 kD (kiloDaltons), and with the sigma subunit added, the holoenzyme has a molecular weight of 430 kD. Each of these subunits is evolutionarily conserved in archaea and in eukaryotes.

Figure 8.4 Bacterial RNA polymerase core plus a sigma (σ) subunit forms the fully active holoenzyme. The figure shows the RNA polymerase core enzyme (αI and αII, 36.5 kD each; β, 151 kD; β′, 155 kD; ω; 390 kD molecular weight), the sigma subunit (one of four kinds in E. coli; molecular weights are from 27 to 70 kD), and the RNA polymerase holoenzyme (430 kD molecular weight). Alternative sigma subunits give the holoenzyme specificity for different promoters.

By itself, the core RNA polymerase can transcribe DNA template-strand sequence into RNA sequence, but the core is unable to efficiently bind to a promoter or initiate RNA synthesis without a sigma subunit. The joining of the sigma subunit to the core enzyme to form a holoenzyme induces a conformational shift in the core segment that enables it to bind specifically to particular promoter consensus sequences.

Because this single RNA polymerase is responsible for all bacterial transcription, the bacterial RNA polymerase must recognize promoters for protein-coding genes as well as for genes that produce other RNAs, such as tRNA and rRNA. But not all promoters of bacterial genes are identical. There is great diversity among bacterial promoter sequences, permitting certain genes to be expressed only under special circumstances. Bacteria manage the recognition of the promoters of these specialized genes by producing several different types of sigma subunits that can join the core polymerase. These so-called alternative sigma subunits alter the specificity of the holoenzymes for promoter regions by imparting distinct conformational changes to the core. These differences enable transcription of specific genes under the appropriate conditions, or at the correct time.

Bacterial Promoters

Promoters are double-stranded regulatory DNA sequences that bind transcription proteins such as RNA polymerase
and direct the RNA polymerase to the nearby start of transcription. RNA polymerase is attracted to promoters by the presence of consensus sequences, short regions of DNA sequences that are highly similar, though not necessarily identical, to one another and are located in the same position relative to the start of transcription of different genes (see Section 7.4 for an introduction to consensus sequences).

Although promoters are double stranded, promoter consensus sequences are usually written in a single-stranded shorthand form that gives the 5′-to-3′ sequence of the coding (nontemplate) strand of DNA (Figure 8.5). The most commonly occurring bacterial promoter contains two consensus sequence regions that each play an important functional role in recognition by RNA polymerase and the subsequent initiation of transcription. These consensus sequences are located upstream from the +1 nucleotide (the start of transcription) in a region flanking the gene where the nucleotides are denoted by negative numbers and are not transcribed. At the −10 position of the E. coli promoter is the Pribnow box sequence, or the −10 consensus sequence, consisting of 6 bp having the consensus sequence 5′-TATAAT-3′. The Pribnow box is separated by about 25 bp from another 6-bp region, the −35 consensus sequence, identified by the nucleotides 5′-TTGACA-3′. The nucleotide sequences that occur upstream, downstream, and between these consensus sequences are highly variable and contain no other consensus sequences. Thus, in a functional sense, the −10 (Pribnow) and −35 consensus sequences are important because of their nucleotide content, their location relative to one another, and their location relative to the start of transcription. In contrast to the consensus sequences themselves, the nucleotides between −10 and −35 are important as spacers between the consensus elements, but their specific sequences are not critical. In the figure, untranslated mRNA at the 5′ end (5′ UTR) and at the 3′ end (3′ UTR) separate the 5′ mRNA end from the start codon and the stop codon from the rest of the mRNA, respectively.

Natural selection has operated to retain strong sequence similarity in consensus regions and to retain the position of the consensus regions relative to the start of transcription. The effectiveness of evolution in maintaining promoter consensus sequences is illustrated by comparison with the sequences between and around −10 and −35, which are not conserved and which exhibit considerable variation. In addition, the spacing between the sequences and their placement relative to the +1 nucleotide is stable. RNA polymerase is a large molecule that binds to −10 and −35 consensus sequences and occupies the space between and immediately around the sites. Crystal structure models show that the enzyme spans enough DNA to allow it to contact promoter consensus regions and reach the +1 nucleotide. Once bound at a promoter in this fashion, RNA polymerase can initiate transcription. Genetic Analysis 8.1 guides you through the identification of promoter consensus regions.

Transcription Initiation

RNA polymerase holoenzyme initiates transcription through a process involving two steps. In the first step, the holoenzyme makes an initial loose attachment to the double-stranded promoter sequence and then binds tightly to it to form the closed promoter complex (step 1 in Foundation Figure 8.6). In the second step, the bound holoenzyme unwinds approximately 18 bp of DNA around the −10 consensus sequence to form the open promoter complex (step 2). Following formation of the open promoter complex, the holoenzyme progresses downstream to initiate RNA synthesis at the +1 nucleotide on the template strand of DNA (step 3).

Bacterial promoters often differ from the consensus sequence by one or more nucleotides, and some are different at several nucleotides. Since considerable DNA-sequence
Figure 8.5 Bacterial promoter structure and consensus sequences. Two promoter consensus sequences, the Pribnow box at −10 and the −35 sequence, are essential promoter regulatory elements.
Coding strand    5′ ... TTGACA ... TATAAT ... 3′
Template strand  3′ ... AACTGT ... ATATTA ... 5′
The figure labels the promoter (−35 consensus sequence and −10 consensus sequence, or Pribnow box), the transcription start at +1, the RNA-coding region, and the transcription termination region of the gene, together with the mRNA (5′ UTR, start codon, stop codon, 3′ UTR).
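As a rough illustration of how the two consensus hexamers and the spacing between them jointly define a promoter, the following Python sketch scores a coding-strand sequence against the −35 and −10 consensus sequences given above. The function names, the allowed gap of 15 to 19 bp between the hexamers, and the example sequence are illustrative assumptions, not values taken from the text, and recognition by the holoenzyme is far more subtle than counting mismatches.

```python
MINUS_35 = "TTGACA"  # -35 consensus (coding strand, 5'->3')
MINUS_10 = "TATAAT"  # -10 consensus (Pribnow box)

def mismatches(site, consensus):
    """Count positions where a site differs from the consensus."""
    return sum(1 for s, c in zip(site, consensus) if s != c)

def best_promoter(seq, min_gap=15, max_gap=19):
    """Return (start_35, start_10, total_mismatches) for the best-scoring
    pair of hexamers, or None if the sequence is too short.
    The gap limits are assumed values chosen for illustration."""
    best = None
    n = len(MINUS_35)
    for i in range(len(seq) - 2 * n - min_gap + 1):
        m35 = mismatches(seq[i:i + n], MINUS_35)
        for gap in range(min_gap, max_gap + 1):
            j = i + n + gap
            if j + n > len(seq):
                break
            total = m35 + mismatches(seq[j:j + n], MINUS_10)
            if best is None or total < best[2]:
                best = (i, j, total)
    return best

# Hypothetical test sequence: perfect hexamers separated by a 17-bp spacer.
example = "GGCC" + MINUS_35 + "C" * 17 + MINUS_10 + "GGCCGG"
print(best_promoter(example))           # (4, 27, 0)
print(mismatches("TTGATA", MINUS_35))   # 1
```

A promoter that differs from the consensus at one or two positions would still be found by this sketch, only with a nonzero mismatch total.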
Foundation Figure 8.6 Bacterial Transcription. Step 1: The RNA polymerase core enzyme and sigma subunit bind to −10 and −35 promoter consensus sequences (closed promoter). The remaining panels show the open promoter, RNA synthesis beginning at the +1 start site, and the RNA transcript produced as transcription proceeds toward the termination sequence. Each panel labels the coding strand (5′ to 3′) and the template strand (3′ to 5′).
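The relationship among the coding strand, the template strand, and the transcript can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the short example segment are hypothetical, and the strand orientations assumed by each function are stated in the comments.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}  # DNA base -> RNA partner

def template_strand(coding_5to3):
    """Template strand written 3'->5', aligned under the coding strand."""
    return "".join(DNA_PAIR[b] for b in coding_5to3)

def transcribe(template_3to5):
    """RNA (5'->3') built by pairing RNA nucleotides with the template,
    which is read in the 3'->5' direction."""
    return "".join(RNA_PAIR[b] for b in template_3to5)

# Consensus hexamers from Figure 8.5, used only to check the pairing rule
# (promoter sequences themselves are not transcribed).
print(template_strand("TTGACA"))  # AACTGT
print(template_strand("TATAAT"))  # ATATTA

coding = "ATGGCATTCTAA"  # hypothetical coding-strand segment
rna = transcribe(template_strand(coding))
print(rna)                               # AUGGCAUUCUAA
print(rna == coding.replace("T", "U"))   # True
```

The last line shows why gene sequences can be written from the coding strand: the transcript matches it apart from uracil replacing thymine.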
GENETIC ANALYSIS 8.1
PROBLEM DNA sequences in the promoter region of 10 E. coli genes are shown. Sequences at the −35 and −10 sites are boxed.
a. For these 10 genes, what are the −35 and −10 consensus sequences?
b. What would be the expected effects of a mutation in a promoter consensus region versus a mutation in the sequence between consensus regions?
BREAK IT DOWN: Promoter consensus sequences are similar in different genes and bind transcriptionally active proteins (p. 281).
BREAK IT DOWN: Research methods directed at detecting promoters and assessing their functionality are described in Research Technique 8.1 and Figure 8.12.
Gene   −35 region   −10 region   +1
A2 AATGCTTGACTCTGTAGCGGGAAGGCG––TATAATGCACACC–CCGC
bio AAAACGTGTTTTTTGTTGTTAATTCGGTGTAGACTTGT–––AAACCT
his AGTTCTTGCTTTCTAACGTGAAAGTGGTTTAGGTTAAAAGAC–ATCA
lac CAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTG–TGG–AATT
lacI GAATGGCGCAAAACTTTTCGCGGTATGG–CATGATAGCGCCC–GGAA
leu AAAAGTTGACATCCGTTTTTGTATCCAG–TAACTCTAAAAGC–ATAT
recA AACACTTGATACTGTATGAGCATACAG––TATAATTGCTTC––AACA
trp AGCTGTTGACAATTAATCATCGAACTAG–TTAACTAGTACGC–AAGT
tRNA AACACTTTACAGCGGGCCGTCATTTGA––TATGATGCGCCCC–GCTT
X1 TCCGCTTGTCTTCCTAGGCCGACTCCC––TATAATGCGCCTCCATCG
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This question concerns bacterial promoters. The answer requires identification of consensus sequences for −35 and −10 regions of promoters and speculation about the consequences of promoter mutations.
2. Identify the critical information provided in the problem.
   The problem provides promoter sequence information for 10 E. coli genes and identifies the segment of each promoter containing the −10 and −35 regions.
Deduce
3. Examine the −10 and −35 sequences of these promoters, and look for common patterns.
   The −10 and −35 sites are the location of RNA polymerase binding during transcription initiation. Count the numbers of A, T, C, and G in each position in the boxed regions.
   TIP: A consensus sequence identifies the most common nucleotide at each position in a DNA segment.
Solve (Answer a)
4. Determine the consensus sequence at the −10 and −35 regions.
   TIP: Identify the most commonly occurring nucleotide in each position of each 6-nucleotide consensus region of these genes.
   At the −10 site, and moving left to right (toward +1), the most common nucleotides in each position in the consensus region, and the number of times they occur in that position, are
    T   A   T   A   A   T
   (9) (9) (6) (5) (5) (9)
   At the −35 site, also moving left to right (toward the +1), the most common nucleotides in each position, and the number of times they occur in that position, are
For more practice, see Problems 4, 7, and 16.
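The position-by-position tally used for the −10 site in Genetic Analysis 8.1 can be reproduced with a short Python sketch. The hexamers below are read from the ten aligned promoter sequences in the problem; because the boxes are not reproduced in the text, the exact box boundaries are an assumption, and the function name is hypothetical.

```python
from collections import Counter

# -10 hexamers as read from the ten aligned promoters (A2, bio, his, lac,
# lacI, leu, recA, trp, tRNA, X1). The box boundaries are an assumption.
MINUS_10_SITES = [
    "TATAAT", "TAGACT", "TAGGTT", "TATGTT", "CATGAT",
    "TAACTC", "TATAAT", "TTAACT", "TATGAT", "TATAAT",
]

def consensus(sites):
    """Return (consensus string, count of the winning base per position)."""
    letters, counts = [], []
    for column in zip(*sites):          # one column per position in the site
        base, count = Counter(column).most_common(1)[0]
        letters.append(base)
        counts.append(count)
    return "".join(letters), counts

seq, counts = consensus(MINUS_10_SITES)
print(seq, counts)  # TATAAT [9, 9, 6, 5, 5, 9]
```

With these assumed boundaries the result agrees with the counts given for the −10 site in the solution.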
variation occurs among promoters, it is reasonable to ask how RNA polymerase is able to recognize promoters and reliably initiate RNA synthesis. For an answer, we turn to the sigma subunits that confer promoter recognition and chain-initiation ability on RNA polymerase.

Four alternative sigma subunits identified in E. coli are named according to their molecular weight (Table 8.2). Each alternative sigma subunit leads to recognition of a different set of −10 and −35 consensus sequences by the holoenzyme. These different consensus sequence elements are found in promoters of different types of genes; thus, the sigma subunit attached to the core enzyme determines which gene promoters a holoenzyme will recognize.

The sigma subunit σ70 is the most common in bacteria. It recognizes promoters of “housekeeping genes,” the genes whose protein products are continuously needed by cells. Because of the constant need for their products, housekeeping genes are continuously expressed. Subunits σ54 and σ32 recognize, respectively, promoters of genes involved in nitrogen metabolism and genes expressed in response to environmental stress such as heat shock, and they are utilized when the action of these genes is required. The fourth sigma subunit, σ28, recognizes promoters for genes required for bacterial chemotaxis (chemical sensing and motility).

The specificity of each type of sigma subunit for different promoter consensus sequences produces RNA polymerase holoenzymes that have different DNA-binding specificities. Microbial geneticists estimate that each E. coli cell contains about 3000 RNA polymerase holoenzymes at any given time and that each of the four kinds of sigma subunits is represented to a differing degree among them. Because sigma subunits readily attach and detach from core enzymes in response to changes in environmental conditions, the organism is able to change its transcription patterns to adjust to different conditions.

Transcription Elongation and Termination

Upon reaching the +1 nucleotide, the holoenzyme begins RNA synthesis by using the template strand to direct RNA assembly. The holoenzyme remains intact until the first 8 to 10 RNA nucleotides have been joined. At that point, the sigma subunit dissociates from the core enzyme, which continues its downstream progression (step 3 in Foundation Figure 8.6). The sigma subunit itself remains intact and can associate with another core enzyme to transcribe another gene.

Downstream progression of the RNA polymerase core is accompanied by DNA unwinding ahead of the enzyme to maintain approximately 18 bp of unwound DNA (step 4). As the RNA polymerase passes, progressing at a rate of approximately 40 nucleotides per second, the DNA double helix re-forms in its wake. When transcription of the gene is completed, the 5′ end of the RNA trails off the core enzyme (step 5).

The end product of transcription is a single-stranded RNA that is complementary and antiparallel to the template DNA strand. The transcript has the same 5′-to-3′ polarity as the coding strand of DNA, the strand complementary to the template strand. The coding strand and the newly formed transcript also have identical nucleotide sequences, except for the presence of uracil in the transcript in place of thymine in the coding strand. For this reason, gene sequences are written in 5′-to-3′ orientation as single-stranded sequences based on the coding strand of DNA. This allows easy identification of the mRNA sequence of a gene by simply substituting U for T.

Gene transcription is not a one-time event, and shortly after one round of transcription is initiated, a second round begins with new RNA polymerase–promoter interaction. Following sigma subunit dissociation and core enzyme synthesis of 50 to 60 RNA nucleotides, a new holoenzyme can bind to the promoter and initiate a new round of transcription while the first core enzyme continues along the gene. In addition, if the transcript under construction is mRNA, the 5′ end is immediately available to begin translation (as we see in Section 9.2, this is only true of organisms that don’t possess a nucleus). In contrast, transcripts of other RNAs, such as transfer and ribosomal RNA, must await the completion of transcription before undergoing the folding into secondary structures that readies them for cellular action.

Transcription Termination Mechanisms

Termination of transcription in bacterial cells is signaled by a DNA termination sequence that usually contains a repeating sequence producing distinctive 3′ RNA sequences. Termination sequences are downstream of the stop codon; thus,
they are transcribed after the coding region of the mRNA and so are not translated. Two transcription termination mechanisms occur in bacteria. The most common is intrinsic termination, a mechanism dependent only on the occurrence of specialized repeat sequences in DNA that induce the formation in RNA of a secondary structure leading to transcription termination. Less frequently, bacterial gene transcription terminates by rho-dependent termination, a mechanism characterized by a different terminator sequence and requiring the action of a specialized protein called the rho protein.

Intrinsic Termination Most bacterial transcription termination occurs exclusively as a consequence of termination sequences encoded in DNA, that is, by intrinsic termination. Intrinsic termination sequences have two features. First, they are encoded by a DNA sequence containing an inverted repeat, a DNA sequence repeated in opposite directions but with the same 5′-to-3′ polarity. Figure 8.7 shows the inverted repeats (“inverted repeat 1” and “inverted repeat 2”) in a termination sequence, separated by a short spacer sequence that is not part of either repeat. The second feature of intrinsic termination sequences is a string of adenines on the template DNA strand that begins at the 5′ end of the inverted repeat 2 region (step 1). Transcription of inverted repeats produces mRNA with complementary segments that are able to fold into a short double-stranded stem ending with a single-stranded loop (step 2). This secondary structure is a stem-loop structure, also known as a hairpin (step 3). A string of uracils complementary to the adenines on the template strand immediately follows the stem-loop structure at the 3′ end of the RNA.

The formation of a stem-loop structure followed immediately by a poly-U sequence near the 3′ end of RNA causes the RNA polymerase to slow down and destabilize. In addition, the 3′ U-A region of the RNA–DNA duplex contains the least stable of the complementary base pairs. The instability created by RNA polymerase slowing and the U-A base pairs induces RNA polymerase to release the transcript and separate from the DNA (step 4). The behavior of RNA polymerase during intrinsic termination of transcription is like that of a bicycle rider at slow speed. Slow forward momentum creates instability and eventually the rider loses balance. In a similar way, RNA polymerase is destabilized as it slows while transcribing inverted repeat sequences, and it falls off DNA when the transcript is released where A-U base pairs form and then separate.

Figure 8.7 Intrinsic termination. Inverted repeat DNA sequences alone initiate transcription termination.
Termination sequence (inverted repeat 1, spacer sequence, inverted repeat 2, polyadenine sequence):
DNA  5′ TTATCGCCCGACTAAATACGGGCGATTTTTT 3′
     3′ AATAGCGGGCTGATTTATGCCCGCTAAAAAA 5′
Step 1: Intrinsic termination sequences contain inverted repeats separated by a spacer sequence and followed by a polyadenine sequence.
Step 2: Transcription of the template strand forms mRNA.
Step 3: Inverted repeat sequences in the transcript fold into a complementary stem ending in a single-stranded loop.
Step 4: Hydrogen bonds between A–U base pairs break, releasing the transcript and terminating transcription.
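The two features of an intrinsic terminator, an inverted repeat that folds into a stem-loop and a run of uracils that follows it, can be checked with a simple Python sketch. This is a deliberately simplified illustration: it considers only perfect Watson–Crick pairing, the function names are hypothetical, and the stem, loop, and poly-U size limits are assumed values rather than figures from the text.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(RNA_PAIR[b] for b in reversed(rna))

def find_hairpin(rna, min_stem=4, max_stem=10, min_loop=3, max_loop=8):
    """Return (stem_start, stem_len, loop_len) for the longest perfectly
    paired stem-loop, or None. All size limits are assumed values."""
    for stem_len in range(max_stem, min_stem - 1, -1):
        for i in range(len(rna) - 2 * stem_len - min_loop + 1):
            target = reverse_complement(rna[i:i + stem_len])
            for loop_len in range(min_loop, max_loop + 1):
                j = i + stem_len + loop_len
                if rna[j:j + stem_len] == target:
                    return i, stem_len, loop_len
    return None

def looks_like_intrinsic_terminator(rna, min_u=4):
    """True if a hairpin is immediately followed by a run of uracils."""
    hit = find_hairpin(rna)
    if hit is None:
        return False
    start, stem_len, loop_len = hit
    tail = rna[start + 2 * stem_len + loop_len:]
    return tail.startswith("U" * min_u)

coding = "TTATCGCCCGACTAAATACGGGCGATTTTTT"  # coding strand from Figure 8.7
rna = coding.replace("T", "U")              # transcript: U in place of T
print(find_hairpin(rna))                    # (2, 8, 8)
print(looks_like_intrinsic_terminator(rna)) # True
```

On the Figure 8.7 sequence the sketch reports an 8-bp stem because the A–U and U–A pairs flanking the six central pairs drawn in the figure are also complementary; the 8-nucleotide loop matches the one shown.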
Rho-Dependent Termination In contrast to the more common intrinsic termination, certain bacterial genes require the action of rho protein to bind to nascent mRNA and catalyze separation of mRNA from RNA polymerase to terminate transcription. Genes whose transcription is rho-dependent have termination sequences that are distinct from those in genes utilizing intrinsic termination. As the mRNA transcript grows, a segment of the gene known as the rho utilization site is transcribed. On mRNA this produces a segment of sequence known as the rut site (Figure 8.8 step 1). As RNA polymerase continues to elongate the mRNA in the 3′ direction, rho protein attaches to the rut site and quickly moves toward the RNA polymerase (step 2). When RNA polymerase reaches and transcribes the termination sequence containing inverted repeat sequences, a stem-loop forms in the mRNA, causing the RNA polymerase to pause so that the rho protein catches up to it (step 3). Rho protein then terminates transcription by catalyzing the release of mRNA from RNA polymerase and causing RNA polymerase to drop off the DNA (step 4).

Figure 8.8 Rho-dependent transcription termination. Rho protein binds to the rut sequence on mRNA and proceeds to the termination sequence, where it terminates transcription.
Step 1: Transcription across the rho utilization site produces the rut sequence in mRNA that is the recognition site for Rho protein.
Step 2: Rho protein binds to the rut sequence and moves toward the 3′ end of mRNA.
Step 3: RNA polymerase pauses at the termination sequence as a stem-loop forms.
Step 4: Rho protein catches up to the paused RNA polymerase and releases mRNA and RNA polymerase from DNA to terminate transcription.

8.3 Eukaryotic Transcription Is More Diversified and Complex than Bacterial Transcription

Bacteria use a single RNA polymerase core enzyme and several alternative sigma subunits to transcribe all genes. Eukaryotes, by contrast, each have three RNA polymerases that are specialized for the transcription of different genes. The eukaryotic RNA polymerase responsible for the transcription of most polypeptide-producing genes differs from the bacterial RNA polymerase, but eukaryotic transcription progresses through the same four stages we described for bacteria: promoter recognition, transcription initiation, transcript elongation, and transcription termination. Several structural and functional factors make transcription more complex in eukaryotes.

First, eukaryotic promoters and consensus sequences are considerably more diverse than in E. coli, and, as indicated above, the three different RNA polymerases in eukaryotes recognize different promoters, transcribe different genes, and produce different RNAs. Second, the
8.3 Eukaryotic Transcription Is More Diversified and Complex than Bacterial Transcription 287
molecular apparatus assembled at promoters to initiate and elongate transcription is more complex in eukaryotes. Third, eukaryotic genes contain introns and exons, requiring extensive posttranscriptional processing of mRNA. We describe this posttranscriptional processing in a later section. Finally, eukaryotic DNA is permanently associated with a large amount of protein to form a compound known as chromatin, the complex of DNA and proteins that makes up the eukaryotic chromosome and plays a central role in regulating eukaryotic transcription.

The three different RNA polymerases transcribing the major types of RNA coded by eukaryotic genomes are RNA polymerase I (RNA pol I), which transcribes several ribosomal RNA genes; RNA polymerase II (RNA pol II), which is primarily responsible for transcribing messenger RNAs that encode polypeptides, as well as for transcribing most small nuclear RNA genes; and RNA polymerase III (RNA pol III), which transcribes all transfer RNA genes as well as one small nuclear RNA gene and one ribosomal RNA gene. RNA pol II and RNA pol III are also responsible for miRNA and siRNA synthesis.

Polymerase II Transcription of mRNA in Eukaryotes

RNA pol II transcribes eukaryotic polypeptide-coding genes into mRNA. The promoters for these genes are numerous and highly diverse, with different overall lengths and differences in the number and type of consensus sequences prominent among the sources of promoter variation. RNA polymerase II (RNA pol II) is a molecule composed of a dozen or more protein subunits, making it much more complex than the bacterial RNA polymerase, with its five subunits. In comparison, archaeal RNA polymerase has at least 11 subunits, making it more similar to RNA pol II than to bacterial RNA polymerase. Given the function of RNA pol II, it is reasonable to ask how RNA polymerases locate promoter DNA for different genes and how researchers determine which regions of a genome function as promoters.

Three lines of investigation help researchers to identify and characterize promoters of different polypeptide-coding genes: (1) promoters are identified by determining which DNA sequences are bound by proteins associated with RNA pol II during transcription, (2) putative promoter sequences from different genes are compared to evaluate their similarities, and (3) mutations that alter gene transcription are examined to identify how DNA base-pair changes affect transcription. Research Technique 8.1 discusses the experimental identification and analysis of promoters.

The most common eukaryotic promoter consensus sequence, the TATA box, is shown in Figure 8.9 as part of a set of three consensus segments that were the first eukaryotic promoter elements to be identified. A TATA box, also known as a Goldberg–Hogness box, is located approximately at position -25 relative to the beginning of the transcriptional start site. Consisting of 6 bp with the consensus sequence TATAAA, it is the most strongly conserved promoter element in eukaryotes. The figure shows two additional consensus sequence elements that are more variable in their frequency in promoters. A 4-bp consensus sequence identified as the CAAT box is most commonly located near -80 when it is present in the promoter. An upstream GC-rich region called the GC-rich box, with the consensus sequence GGGCGG, is located at -90 or farther upstream of the transcription start and is less frequent than the CAAT box.

Figure 8.9 Three common eukaryotic promoter consensus sequence elements. The TATA box and the CAAT box are common; the presence of the upstream GC-rich box is more variable.
[Diagram: GC-rich box, 5′ GGGCGG 3′ (complement 3′ CCCGCC 5′), near -90; CAAT box, 5′ CAAT 3′ (complement 3′ GTTA 5′), near -80; TATA box, 5′ TATAAA 3′ (complement 3′ ATATTT 5′), near -25; transcription start at +1.]

Comparison of eukaryotic promoters reveals a high degree of variability in the type, number, and location of consensus sequence elements (Figure 8.10). Some promoters contain all three of the consensus sequences identified above, others contain one or two of these consensus elements, some contain none at all, and many contain other types of consensus sequence elements altogether. For example, the thymidine kinase gene contains TATA, CAAT, and GC-rich boxes along with an octamer (OCT) sequence, called an OCT box. The histone H2B gene contains two OCT boxes in addition to a TATA box and a pair of CAAT boxes. All of these consensus sequence elements play important roles in the binding of transcription factors, a group of transcriptional proteins described below.

Figure 8.10 Selected examples of variability in eukaryotic promoters.
[Diagram: promoters of the β-globin, thymidine kinase, and histone H2B genes and the SV40 early promoter, drawn on a scale from -160 to the transcription start at +1. Key: TATA box, GC box, CAAT box, octamer (OCT) box.]
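The second line of investigation, comparing putative promoter sequences, can be illustrated with a minimal sketch that scans a sequence for the three consensus elements of Figure 8.9 and reports their positions relative to +1. The function names and the demonstration sequence are hypothetical and chosen only for illustration. The sketch looks for exact matches, whereas real promoter elements often differ slightly from the consensus.

```python
import re

# Consensus elements from Figure 8.9 (coding strand, 5' to 3').
CONSENSUS = {"GC-rich box": "GGGCGG", "CAAT box": "CAAT", "TATA box": "TATAAA"}


def relative_position(index, tss_index):
    """Convert a 0-based index to promoter numbering (there is no position 0)."""
    return index - tss_index if index < tss_index else index - tss_index + 1


def find_elements(sequence, tss_index):
    """Map each element name to the promoter positions where its consensus starts."""
    hits = {}
    for name, motif in CONSENSUS.items():
        # A lookahead pattern reports overlapping matches as well.
        starts = [m.start() for m in re.finditer(f"(?={motif})", sequence)]
        hits[name] = [relative_position(s, tss_index) for s in starts]
    return hits


def build_demo_promoter():
    """Hypothetical 110-nt sequence: filler C's with the three elements placed by hand."""
    seq = ["C"] * 110
    tss_index = 100  # index of the +1 nucleotide
    for start, motif in ((10, "GGGCGG"), (20, "CAAT"), (75, "TATAAA")):
        seq[start:start + len(motif)] = motif
    return "".join(seq), tss_index


if __name__ == "__main__":
    promoter, tss = build_demo_promoter()
    print(find_elements(promoter, tss))
```

Allowing one or two mismatches per element would bring the sketch closer to the way consensus sequences are used in practice.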
288 CHAPTER 8 Molecular Biology of Transcription and RNA Processing
CONCLUSION The band shift assay results shown indicate different migration rates and therefore different molecular weights for the control and experimental DNA fragments. This is evidence that transcriptional proteins have bound to a sequence on the experimental DNA fragment, which would be consistent with the sequence being a consensus sequence and a potential promoter. However, the location of the bound sequence on the DNA fragment is not known from these results.

DNA Footprint Protection

MATERIALS AND PROCEDURES This experimental analysis begins with two identical samples of DNA fragments containing suspected consensus sequences as identified by band shift assay experiments. All fragments are end-labeled with ³²P to make their detection in gel electrophoresis easier. The experimental DNA sample is mixed with transcriptional proteins, but the control sample is not. Both samples are exposed to DNase I, which randomly cuts DNA that is not protected by protein. The samples are subjected to gel electrophoresis, and each end-labeled fragment produced is located by its radioactivity.

RESULTS In this DNA footprint protection assay, notice that the experimental DNA lane contains a gap in which no DNA fragments appear. The gap represents “footprint protection” for the portion of the fragment that is protected from DNase I digestion by bound transcriptional proteins. No such protection occurs for the control fragment, as there are no transcriptional proteins bound to any part of it.

CONCLUSION The gap created by footprint protection indicates that a DNA sequence on the experimental DNA fragment has been bound by transcriptional proteins, and the results provide information that can pinpoint where on the DNA fragment a protected DNA sequence is located.

The final piece of evidence that a DNA fragment contains a promoter comes from mutational analysis that identifies functional changes caused by mutations of specific nucleotides of promoter consensus sequences. This analysis is described momentarily and is illustrated in Figure 8.12.
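The logic of the footprint gap can be sketched in a few lines of code. This is a simplified model under stated assumptions, not a description of the actual laboratory analysis: each fragment is labeled at one end, each labeled product results from a single DNase I cut, and a cut after nucleotide k leaves a labeled product of length k. The function names and the 60-bp example are hypothetical.

```python
def labeled_fragment_lengths(fragment_length, protected=None):
    """Lengths of end-labeled products from single DNase I cuts.

    protected is an optional (start, end) range, 1-based and inclusive,
    counted from the labeled end; cuts after these nucleotides are blocked.
    """
    lengths = set(range(1, fragment_length))
    if protected is not None:
        start, end = protected
        lengths -= set(range(start, end + 1))
    return lengths


def footprint(control, experimental):
    """Return (first, last) product length missing from the experimental lane."""
    missing = sorted(control - experimental)
    return (missing[0], missing[-1]) if missing else None


if __name__ == "__main__":
    control_lane = labeled_fragment_lengths(60)
    experimental_lane = labeled_fragment_lengths(60, protected=(25, 34))
    print(footprint(control_lane, experimental_lane))
```

In this model the missing band sizes map directly onto the protected positions, which is why the gap pinpoints where the bound sequence lies on the fragment.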
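[Research Technique 8.1 diagram, DNA footprint protection: ³²P end-labeled DNA is separated by gel electrophoresis into ³²P-labeled fragments; the gap in the experimental lane marks the protein-protected region, a potential promoter region.]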
recognition. At the TATA box, a protein called TFIID, a multisubunit protein containing TATA-binding protein (TBP) and subunits of a protein called TBP-associated factor (TAF), binds the TATA box sequence. The assembled TFIID binds to the TATA box region to form the initial committed complex (Figure 8.11, step 1). Next, TFIIA, TFIIB, TFIIF, and RNA polymerase II join the initial committed complex (step 2), which in turn is joined by TFIIE and TFIIH to form the preinitiation complex (PIC) (step 3). This complex contains six proteins that are commonly identified as general transcription factors (GTFs). Once assembled, the complex directs RNA polymerase II to the +1 nucleotide on the template strand, where it begins the assembly of messenger RNA (step 4).

Figure 8.11 Eukaryotic transcription. Transcription factor proteins bind the promoter region to set the stage for eukaryotic transcription by RNA polymerase II.
1 TAF and TBP form TFIID and bind the TATA box, forming the initial committed complex.
2 The addition of TFIIA, TFIIB, RNA polymerase II, and TFIIF forms the minimal initiation complex.
3 TFIIE and TFIIH join to form the preinitiation complex. RNA polymerase II is poised to begin transcription.
4 RNA polymerase II is released from the GTFs in the preinitiation complex to begin transcription.

Although most of the eukaryotic genes that have been examined have a TATA box and undergo TBP binding, there is evidence that some metazoan genes may use a related factor called TLF (TBP-like factor). The complexity of TBP, TLF, and associated proteins is analogous to the different sigma factors in bacterial systems, thus allowing differential recognition of promoters in eukaryotes.

Detecting Promoter Consensus Elements

The diversity of eukaryotic promoters raises an important question: How do researchers verify that a segment of DNA is a functionally important component of a promoter? The research has two components; the first, outlined in Research Technique 8.1, is discovering the presence and location of DNA sequences that transcription factor proteins will bind to. The second component involves mutational analysis to confirm the functionality of the sequence. Researchers produce many different point mutations in the DNA sequence under study and then compare the level of transcription generated by each mutant promoter sequence with transcription generated by the wild-type sequence.

Figure 8.12 shows a synopsis of promoter mutation analysis from an experiment performed by the molecular biologist Richard Myers and colleagues on a mammalian β-globin gene promoter. These researchers produced mutations of individual base pairs in TATA box, CAAT box, and GC-rich sequences, and of nucleotides between the consensus sequences, to identify the effect of each individual mutation on the relative transcription level of the gene. The bars in the figure indicate the impact of base substitution mutations of individual base pairs in and around the consensus sequences of the promoter. A relative transcription level of 1.0 represents the wild-type promoter; thus, a bar that is lower than 1.0 indicates a decrease in transcription level, and a bar that is higher than 1.0 indicates an increased level of transcription. The dots at nucleotide positions along the sequence indicate that no data are available since no mutation was made.

The researchers found that most base-pair mutations in the three consensus regions significantly decreased the transcription level of the gene, and they found two base substitutions in the CAAT box region that significantly increased transcription. In contrast, mutations outside the consensus regions had nonsignificant effects on transcription level. These results show the functional importance of specific DNA sequences in promoting transcription and confirm a functional role in transcription for TATA box, CAAT box, and GC-rich sequences. Notice that the sequences of these regulatory regions in this particular gene differ slightly from the consensus sequences shown in Figure 8.9. This is
because the precise regulatory sequence of any gene may vary slightly from the consensus sequence.

Figure 8.12 Mutation analysis of the β-globin gene promoter. The bars indicate that mutations in regions containing TATA box, CAAT box, and upstream GC-rich box sequences substantially reduce the relative transcription level. Orange dots indicate sites where no mutations were made and for which no data are available.
[Diagram: relative transcription level (0 to 1.0) plotted against promoter positions -100 to +20 for the sequence
5′ CGTAGAGCCACACCCTGGTAAGGGCCAATCTGCTCACACAGGATAGAGAGGGCAGGAGCCAGGGCAGAGCATATAAGGTGAGGTAGGATCAGTTGCTCCTCACATTTGCTTCTGACATAGT 3′
with the GC-rich, CAAT box, and TATA box regions, position -37, and the transcription start (+1) marked.]
Q In this figure, the TATA box begins at -26 and ends at -30. In two or three sentences, describe the effect of mutations at positions -27 and -28 on relative transcription compared with mutations at -47 and -48. Explain the reason for the difference in mutation effect.

Other Regulatory Sequences and Chromatin-Based Regulation of RNA Pol II Transcription

Often, promoters alone, while necessary, are not sufficient to initiate transcription of eukaryotic genes. In such cases, additional regulatory sequences, and additional transcription-activating proteins, are needed to drive transcription. This is particularly the case for multicellular eukaryotes that have many different types of cells with distinctive patterns of gene expression, including patterns that change as the organisms grow and develop. This type of transcriptional regulation is discussed in Section 13.2.

Enhancer sequences are one important group of DNA regulatory sequences that increase the level of transcription of specific genes. Enhancer sequences bind specific proteins that interact with the proteins bound at gene promoters, and together promoters and enhancers drive transcription of certain genes. In many situations, enhancers are located upstream of the genes they regulate; but enhancers can be located downstream as well. Some enhancers are relatively close to the genes they regulate, but others are thousands to tens of thousands of base pairs away from their target genes. Thus, important questions for molecular biologists are: What proteins are bound to enhancers, and how do enhancer sequences regulate transcription of the gene given their different distances from the start of transcription?

One answer is that enhancers bind activator proteins and associated coactivator proteins to form a protein “bridge” that bends the DNA and links the transcription complex at the promoter to the activator–coactivator complex at the enhancer (Figure 8.13). The bend produced in the DNA may contain dozens to thousands of base pairs. The action of enhancers and the proteins they bind dramatically increases the efficiency of RNA pol II in initiating transcription, and as a result increases the level of transcription of genes regulated by enhancers.

At the other end of the transcription-regulating spectrum are silencer sequences, DNA elements that act to repress transcription of their target genes. Silencers bind proteins that bend DNA in such a way that genes become sequestered in the folded segment and thus are shielded from transcription activation by RNA pol II.

Overlying the operation of transcription-regulating DNA sequences and their interactions with DNA-binding proteins is the chromatin structure of eukaryotic DNA. “Chromatin,” as mentioned earlier, is the name applied to the mixture of DNA and proteins that constitutes eukaryotic chromosomes, and its structure is both integral to the chromosome and dynamic. Specifically, chromatin can change to become more compact or less compact, either permitting or blocking the access of RNA polymerase II and its transcription factors to promoters and thus controlling the accessibility of regions of DNA to transcription. Different patterns of chromatin state occur in different types of cells; moreover, chromatin state for
[RNA polymerase I promoter diagram; caption not recovered. The upstream control element spans about -150 to -100 and the core element about -45 to +20.]
1 The core element initiates transcription, and the upstream control element increases transcription efficiency.
2 UBF1 and SL1 bind to upstream control and core elements.
3 RNA pol I is recruited to the core element to initiate transcription.

Figure 8.15 An internal promoter for transcription by RNA polymerase III.
1 This internal promoter contains box A and box C from +55 to +80.
2 TFIIIA binds to box C and facilitates binding of TFIIIC to box A.
3 TFIIIB binds to TFIIIA and TFIIIC.
4 RNA polymerase III binds to TFs and is positioned at +1.
[5′ capping diagram; only truncated step labels are recoverable: “1 The 5′ (γ) phosphate of the …” and “3 Guanine monophosphate is joined to the …”.]
(3) facilitating subsequent intron splicing, and (4) enhancing translation efficiency by orienting the ribosome on mRNA.

Polyadenylation of 3′ Pre-mRNA

Termination of transcription by RNA pol II is not fully understood, but it appears to be tied to the processing and polyadenylation of the 3′ end of pre-mRNA. It is clear that the 3′ end of eukaryotic mRNA is not generated by a transcriptional terminating sequence as it is in bacteria. Rather, the 3′ end of the pre-mRNA is created by enzymatic action that removes a segment from the 3′ end of the transcript and replaces it with a string of adenine nucleotides, the poly-A tail. This step of pre-mRNA processing is thought to be associated with subsequent termination of transcription.

Figure 8.18 illustrates these steps. Polyadenylation begins with the binding of a factor called cleavage and polyadenylation specificity factor (CPSF) near a six-nucleotide mRNA sequence, AAUAAA, that is downstream of the stop codon and thus not part of the coding sequence of the gene. This six-nucleotide sequence is known as the polyadenylation signal sequence. The binding of cleavage-stimulating factor (CStF) to a uracil-rich sequence several dozen nucleotides downstream of the polyadenylation signal sequence quickly follows, and the binding of two other cleavage factors, CFI and CFII, and polyadenylate polymerase (PAP) enlarges the complex (step 1). The pre-mRNA is then cleaved 15 to 30 nucleotides downstream of the polyadenylation signal sequence (step 2). The cleavage releases a transcript fragment bound by CFI, CFII, and CStF, which is later degraded (step 3). Through the action of CPSF and PAP, the 3′ end of the cut pre-mRNA then undergoes the enzymatic addition of 20 to 200 adenine nucleotides that form the 3′ poly-A tail (step 4). After addition of the first 10 adenines, molecules of poly-A-binding protein II (PABII) join the elongating poly-A tail and increase the rate of adenine addition (step 5). The 3′ poly-A tail has several functions, including (1) facilitating transport of mature mRNA across the nuclear membrane, (2) protecting mRNA from degradation, and (3) enhancing translation by enabling ribosomal recognition of messenger RNA.

Certain eukaryotic mRNA transcripts do not undergo polyadenylation. The most prominent of these are transcripts of genes producing histone proteins, which are key components of chromatin (see Section 10.6). On these and other “tailless” mRNAs, the 3′ end contains a short stem-loop structure reminiscent of the ones seen in the intrinsic transcription termination mechanism of bacteria. There may be an evolutionary connection between bacterial transcription termination and stem-loop formation on “tailless” eukaryotic mRNAs.
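The positional rules in the polyadenylation steps can be sketched as a short calculation: find the first AAUAAA downstream of the stop codon, mark the window 15 to 30 nucleotides beyond it where cleavage is expected, then cut and add a tail of 20 to 200 adenines. The function names and the toy transcript are hypothetical. Measuring the window from the last nucleotide of the signal is an assumption of this sketch, and the protein factors (CPSF, CStF, CFI, CFII, PAP, PABII) are not modeled.

```python
SIGNAL = "AAUAAA"  # polyadenylation signal sequence


def cleavage_window(pre_mrna, stop_codon_end):
    """Return (signal_start, window_start, window_end) as 0-based indices, or None.

    The search begins at stop_codon_end, the index just past the stop codon.
    The window covers cut points 15 to 30 nt downstream of the signal
    (assumed here to be counted from the end of the signal).
    """
    signal_start = pre_mrna.find(SIGNAL, stop_codon_end)
    if signal_start == -1:
        return None
    signal_end = signal_start + len(SIGNAL)
    return signal_start, signal_end + 15, min(signal_end + 30, len(pre_mrna))


def polyadenylate(pre_mrna, cut_index, tail_length=200):
    """Cleave the transcript at cut_index and append a poly-A tail."""
    if not 20 <= tail_length <= 200:
        raise ValueError("tail_length should be between 20 and 200 adenines")
    return pre_mrna[:cut_index] + "A" * tail_length


if __name__ == "__main__":
    coding = "AUG" + "GCU" * 5 + "UAA"  # toy coding sequence ending in a stop codon
    pre = coding + "CCG" * 4 + SIGNAL + "C" * 40
    window = cleavage_window(pre, len(coding))
    print(window)
    print(polyadenylate(pre, window[1], tail_length=20))
```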
[Polyadenylation diagram; caption not recovered. Labels: polypeptide-coding sequence, polyadenylation signal (consensus) sequence, cleavage site, CPSF, PAP, PABII, and poly-A tail.]
The Torpedo Model of Transcription Termination

The connection between polyadenylation and transcription termination lies in the activity of a specialized RNase (an RNA-destroying enzyme) that attacks and digests the residual RNA transcript that has remained attached to RNA pol II after 3′ transcript cleavage. Following polyadenylation and 3′ cleavage, the residual segment still attached to RNA pol II has no cap protecting its 5′ end. This end is attacked by the specialized RNase that rapidly digests the remaining transcript.

The RNase is thought of as a “torpedo” aimed at the residual mRNA attached to RNA pol II (Figure 8.19). Studies have shown that the torpedo RNase is a highly processive enzyme, meaning that it rapidly carries out its enzymatic action. Once the RNase destroys the residual mRNA and catches up to RNA pol II, it triggers dissociation of the polymerase from template strand DNA to terminate transcription.

Introns

Most eukaryotic genes contain two kinds of segments. One kind, the exons, become part of mature mRNA and encode segments of proteins. The other kind, the introns, are intervening segments that separate exons. Introns are removed from pre-mRNA by processes that excise the introns and splice together the exons.

Introns are common in eukaryotic genes, are rare in bacterial genes, and are found occasionally in archaeal genes. There is also evidence of the presence of introns in a
8.4 Posttranscriptional Processing Modifies RNA Molecules 297
Figure 8.19 The torpedo model of eukaryotic transcription termination. Eukaryotic transcription (1) leads to torpedo RNase association with mRNA (2). Enzymatic cleavage near the poly-A-signal sequence releases the mature mRNA. The torpedo RNase attacks the uncapped 5′ end of the residual mRNA (3) and digests it (4), leading RNA polymerase II to dissociate from the DNA and the torpedo RNase (5).
[Diagram labels: RNA polymerase II, poly-A signal sequence, 3′ cleavage, uncapped residual transcript, torpedo RNase, torpedo digestion, separation.]
discovery reported independently by the molecular biologists Richard Roberts and Phillip Sharp in 1977. Nothing known about eukaryotic gene structure at the time suggested that most eukaryotic genes are subdivided into intron and exon elements. Roberts and Sharp shared the 1993 Nobel Prize in Physiology or Medicine for their codiscovery of “split genes” in the eukaryotic genome.

Sharp’s research group discovered the split nature of eukaryotic genes by using a technique known as R-looping. In this method, DNA encoding a gene is isolated, denatured to single-stranded form, and then mixed with the mature mRNA transcript from the gene. Regions of the gene that encode sequences in mature mRNA will be complementary to those sequences in the mRNA and will hybridize with them to form a DNA–mRNA duplex. However, DNA segments encoding introns will not find complementary sequences in mature mRNA and will remain single stranded, looping out from between the hybridized sequences.

Figure 8.21 shows a map of the hexon gene studied in R-looping experiments by Sharp and colleagues. The experimental results, photographed by electron microscopy, reveal four DNA–mRNA hybrid regions where exon DNA sequence pairs with mature mRNA sequence. Three single-stranded R-loop sequences are introns that do not pair with mRNA.

Figure 8.21 R-loop experimental analysis. (a) The hexon gene contains four exons (1 to 4) and three introns (A to C). (b) Electron micrographs show hybridization of mature hexon gene mRNA with denatured hexon DNA. Exon regions of DNA hybridize with mature mRNA, but intron sequences do not hybridize and appear as single-stranded loops.
Q The electron micrograph in part (b) has a pointer indicating an “exon.” Which specific exon does this pointer most likely indicate? Justify your answer.

Splicing Signal Sequences

Eukaryotic pre-mRNA contains specific short sequences that define the 5′ and 3′ junctions between introns and their neighboring exons. In addition, there is a consensus sequence near each intron end to assist in its accurate identification. The 5′ splice site is located at the 5′ intron end, where it abuts an exon (Figure 8.22). This site contains a consensus sequence with a nearly invariant GU dinucleotide forming the 5′-most end of the intron. The consensus sequence includes the last three nucleotides of the adjoining exon, as well as the four or five nucleotides that follow the GU in the intron. At the 3′ splice site on the opposite end of the intron, a consensus sequence of 11 nucleotides contains a pyrimidine-rich region and a nearly invariant AG dinucleotide at the 3′-most end of the intron. A third consensus sequence, called the branch site, is located 20 to 40 nucleotides upstream of the 3′ splice site. This consensus sequence is pyrimidine-rich and contains an invariant adenine, called the branch point adenine, near the 3′ end.

Mutation analysis shows that these consensus sequences are critical for accurate intron removal. Mutations altering nucleotides in any of the three consensus regions can produce abnormally spliced mature mRNA. The abnormal mRNAs (too short if exon sequence is mistakenly removed, too long if intron sequence is left behind, or altered in other ways that result in improper reading of mRNA sequence) produce proteins with incorrect sequences of amino acids (see Section 11.2).

Introns are removed from pre-mRNA by an snRNA–protein complex called the spliceosome. The spliceosome is something like a molecular workbench to which pre-mRNA is attached while spliceosome subunit components cut and splice it in a four-step process that, first, cleaves the 5′ splice site; second, forms a lariat intron structure that binds the 5′ intron end to the branch point adenine; third, cleaves the 3′ splice site; and finally, splices exons and releases the lariat intron to be degraded to its nucleotide components.

Figure 8.22 illustrates the steps of nuclear pre-mRNA splicing, beginning with the aggregation of five small nuclear ribonucleoproteins (snRNPs; pronounced “snurps”) to form a spliceosome. The snRNPs are snRNA–protein subunits designated U1, U2, and U4 to U6. The spliceosome is a large complex made up of multiple snRNPs, but its composition is dynamic; it changes throughout the different stages of splicing when individual snRNPs come and go as particular reaction steps are carried out.

A Gene Expression Machine Couples Transcription and Pre-mRNA Processing

Each intron–exon junction is subjected to the same spliceosome reactions, raising the question of whether there is a particular order in which introns are removed
from pre-mRNA, or whether U1 and U2 search more or less randomly for 5′ splice-site and branch-site consensus sequences, inducing spliceosome formation when they happen to encounter an intron. The answer is that introns appear to be removed one by one, but not necessarily in order along the pre-mRNA. A study of intron splicing of the mammalian ovomucoid gene demonstrates this feature of intron removal. The ovomucoid gene contains eight exons and seven introns. The pre-mRNA transcript is approximately 5.6 kb, and the mature mRNA is reduced to 1.1 kb. Analysis of ovomucoid pre-mRNAs at various stages of intron removal illustrates that each intron is removed separately, rather than all introns being removed at once, but the order of intron removal does not precisely match their 5′-to-3′ order in pre-mRNA.

The three steps of pre-mRNA processing are tightly coupled. In comprehensive models developed over the past decade or so, the carboxyl terminal domain (CTD) of RNA polymerase II plays an important role in this coupling by functioning as an assembly platform and regulator of pre-mRNA processing machinery. The CTD is located at the site of emergence of mRNA from the polymerase and contains multiple heptad (seven-member) repeats of amino acids that can be phosphorylated. Binding of processing proteins to the CTD allows the mRNA to be modified as it is transcribed.

Current models propose that “gene expression machines” consisting of RNA polymerase II and an array of pre-mRNA–processing proteins are responsible for the coupling of transcription and pre-mRNA processing. Foundation Figure 8.23 illustrates this gene expression

Figure 8.22 Intron removal from eukaryotic pre-mRNA by a spliceosome. The solidus between A and G in the 5′ splice site (A/G) indicates that these two nucleotides are about equally frequent in this consensus sequence. In the 3′ splice site, Py indicates either of the pyrimidines (C or U) and N indicates that any nucleotide can be present.
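The splicing signals described in this section can be turned into a minimal search for candidate introns: a span that begins with GU, ends with AG, and has an adenine 20 to 40 nucleotides upstream of the 3′ splice site. This is only a sketch. It checks the nearly invariant dinucleotides and the branch point window but none of the surrounding consensus nucleotides, and the function name, the minimum intron length, and the toy transcript are assumptions made for illustration.

```python
def candidate_introns(pre_mrna, min_len=50):
    """List (start, end, branch_positions) for GU...AG spans in pre_mrna.

    start is the index of the G in the 5' GU; end is the index just past
    the 3' AG, so pre_mrna[start:end] is the candidate intron.
    branch_positions holds indices of adenines lying 20 to 40 nt
    upstream of the 3' splice site.
    """
    results = []
    n = len(pre_mrna)
    for start in range(n - 1):
        if pre_mrna[start:start + 2] != "GU":
            continue
        for end in range(start + min_len, n + 1):
            if pre_mrna[end - 2:end] != "AG":
                continue
            lo, hi = max(start + 2, end - 40), end - 20
            branch = [i for i in range(lo, hi + 1) if pre_mrna[i] == "A"]
            if branch:
                results.append((start, end, branch))
    return results


if __name__ == "__main__":
    # Toy transcript: two 6-nt exons flanking a 60-nt intron with one branch adenine.
    intron = "GU" + "C" * 30 + "A" + "C" * 25 + "AG"
    pre = "CCCCCC" + intron + "CCCCCC"
    for start, end, branch in candidate_introns(pre):
        print(start, end, branch, pre[:start] + pre[end:])
```

A real transcript contains many GU and AG dinucleotides, so a search this permissive returns many false candidates; the extended consensus sequences and the spliceosome proteins are what supply the specificity.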
FOUNDATION FIGURE 8.23
5 Polyadenylation proteins identify the pA signal sequence and carry out polyadenylation. Transcription terminates. Splicing continues to completion. Torpedo RNase digests the residual mRNA.
6 Fully processed mature mRNA dissociates from RNA pol II, is released through nuclear pores, and is transported to cytoplasm for translation. RNA pol II dissociates from DNA.
[Diagram labels: RNA polymerase II, 5′ cap, TF, SF, pA, RNase, P (phosphate), nucleus, cytoplasm, mature mRNA, poly-A tail.]
given gene may lead to the production of several different mature mRNAs in different types of cells, and to their translation into distinct proteins in each of those cell types.

A comprehensive example of a single gene for which all three alternative mechanisms operate to produce distinct polypeptides in different cells is that of the rat α-tropomyosin (α-Tm) gene that produces nine different mature mRNAs and, correspondingly, nine different tropomyosin proteins from a single gene. Figure 8.25a shows a map of α-Tm. The gene contains 14 exons, including alternatives for exons 1, 2, 6, and 9. The gene has two promoters (identified as P1 and P2) as well as five alternative polyadenylation sites (identified as A1 to A5). The nine distinct mature mRNAs from α-Tm are produced in muscle cells (two forms), brain cells (three forms), and fibroblast cells (four forms; Figure 8.25b). Each different mature mRNA illustrates a unique pattern of promoter selection, intron splicing, and choice of polyadenylation site. All mature mRNAs, and their corresponding tropomyosin proteins, contain the genetic information of exons 3, 4, 5, 7, and 8; however, they may contain distinct information in the alternative exons that depends largely on the cell-type–specific selection of promoter and polyadenylation site.

In striated muscle cells, for example, promoter P1 and polyadenylation site A2 are used. The mature mRNA includes the alternative exons 1a, 2b, 6b, 9a, and 9b. In contrast, tropomyosin in smooth muscle cells utilizes promoter P1 and polyadenylation site A5, and its mature mRNA contains exons 1a, 2a, 6b, and 9d. Brain cells produce three different tropomyosin proteins, each of which is translated from differentially spliced pre-mRNAs that also utilize different polyadenylation sites. In addition, two forms of the brain cell tropomyosin proteins are translated from mRNAs that utilize promoter P2, and one from an mRNA utilizing P1. Among the four different tropomyosin proteins produced in fibroblasts, the mRNAs all use polyadenylation site A5, but they differ in selection of P1 versus P2, and alternative splicing occurs as well. Genetic Analysis 8.2 guides you through analysis of the results of alternative mRNA processing.

Self-Splicing Introns

In addition to introns that are excised by spliceosomes, certain other RNAs can contain introns that self-catalyze their own removal. Two categories of self-excising introns, designated group I introns and group II introns, have been identified. The molecular biologist Thomas Cech and his colleagues discovered group I introns in 1981, when they observed that a 413-nucleotide precursor of an rRNA gene from the protozoan Tetrahymena could excise itself
[Figure 8.25b diagram: mature α-Tm mRNAs, each drawn 5′ to 3′, for smooth muscle; brain (TMBr-1, TMBr-2, TMBr-3); and fibroblast (TM-2, TM-3, TM-5a, TM-5b).]
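The two muscle examples given in the text can be written out as a small data structure that combines the exons shared by every α-Tm mRNA with the cell-type-specific alternative exons. Only the striated and smooth muscle forms are included because the text lists complete exon sets for those two alone; the dictionary layout and function names are hypothetical.

```python
# Exons present in every mature alpha-Tm mRNA, per the text.
SHARED_EXONS = ["3", "4", "5", "7", "8"]

# Promoter, polyadenylation site, and alternative exons for the two muscle forms.
CELL_TYPES = {
    "striated muscle": {"promoter": "P1", "polyA": "A2",
                        "alternative": ["1a", "2b", "6b", "9a", "9b"]},
    "smooth muscle": {"promoter": "P1", "polyA": "A5",
                      "alternative": ["1a", "2a", "6b", "9d"]},
}


def exon_number(exon):
    """Numeric part of an exon label such as '9a'."""
    return int("".join(ch for ch in exon if ch.isdigit()))


def mature_exons(cell_type):
    """Exons of the mature mRNA in 5'-to-3' order for the given cell type."""
    exons = SHARED_EXONS + CELL_TYPES[cell_type]["alternative"]
    return sorted(exons, key=lambda e: (exon_number(e), e))


if __name__ == "__main__":
    for cell_type, info in CELL_TYPES.items():
        print(cell_type, info["promoter"], info["polyA"], mature_exons(cell_type))
    differing = set(mature_exons("striated muscle")) ^ set(mature_exons("smooth muscle"))
    print(sorted(differing))
```

The symmetric difference printed at the end shows that the two muscle mRNAs differ only in their choice of exon 2 and exon 9 alternatives, even though both use promoter P1.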
GENETIC ANALYSIS 8.2
PROBLEM The JLB-1 gene, expressed in several human organs, contains seven exons (1 to 7) and six introns (A to F).
Three labeled oligonucleotide (i.e., small polynucleotide) probes (I to III), hybridizing to exons 2, 4, and 7, respectively, are
indicated by asterisks below the gene map:
Mature mRNA is isolated from three tissues expressing the JLB-1 gene—blood, liver, and kidney—and examined by gel
electrophoresis using the three oligonucleotide probes indicated above. The probes bind to complementary sequences in
mRNA. Probe I and probe II bind to blood cell mRNA, but probe III does not. Probes II and III bind to liver cell mRNA, but
probe I does not. And, probes I and III bind to kidney cell mRNA, but probe II does not. Use the information on these distinct
probe-binding patterns to answer the following questions.
a. Thinking about pre-mRNA versus mature mRNA in these cells, explain the meaning of the different probe-hybridization patterns.
b. Identify the biological process or processes accounting for the observed patterns of probe hybridization.

BREAK IT DOWN: Molecular probes bind only to their target sequences. A band appears in the gel only if the exon target of a probe is present in the mRNA (p. 17).
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns the production of mature mRNAs from a single human gene expressed in different organs. The answer requires identification of the specific mechanisms responsible for the data obtained from each organ.
2. Identify the critical information provided in the problem.
   The problem gives gene structure, the binding location of each of three molecular probes hybridizing the gene, and the results of three electrophoretic gel analyses of mature mRNA from different organs.
Deduce
3. Identify the regions of JLB-1 that are anticipated to be part of the pre-mRNA.
   Pre-mRNA from this gene is anticipated to include all intron and exon sequences.
4. Identify the regions expected to be found in mature mRNA.
   Some or all of the exon segments are expected in mature mRNA, along with modification at the 5′ mRNA end (capping) and the 3′ end (poly-A tailing).
Solve
Answer a
5. Interpret the hybridization pattern of molecular probes in each tissue.
   Blood: Probes I and II hybridize, but probe III does not. This result indicates that exons 2 and 4 are present in the mature mRNA in blood, but exon 7 is not.
   Liver: Probe I fails to hybridize to mRNA from liver, indicating that exon 2 is missing from the liver mRNA. Probes II and III hybridize liver mRNA, indicating that exons 4 and 7 are included in the mature transcript.
   Kidney: Probe II does not hybridize the kidney mRNA, indicating that exon 4 is missing from it. Probes I and III find hybridization targets, indicating that exons 2 and 7 are present in the transcript.
TIP: Hybridization of a probe occurs when the probe finds its target sequence. The absence of hybridization indicates that the target sequence for a probe is not present.
Answer b
6. Interpret the hybridization patterns in each tissue and identify the process or processes that reasonably account for the observed patterns.
   Blood: The absence of exon 7 is most likely due either to the use of an alternative polyadenylation site that generates 3′ cleavage of pre-mRNA ahead of exon 7 or to differential splicing that removes exon 7 from pre-mRNA during intron splicing.
   Liver: The absence of exon 2 is most likely due either to use of an alternative promoter that initiates transcription at a point past exon 2 or to differential splicing of liver pre-mRNA.
   Kidney: The absence of exon 4 is most likely the result of differential splicing of pre-mRNA.
TIP: Alternative promoters, alternative polyadenylation sites, and alternative splicing are three mechanisms that lead eukaryotic genomes to generate distinct proteins from the same gene.
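The probe reasoning in Genetic Analysis 8.2 can be sketched in a few lines of Python. This is only an illustrative encoding of the data given in the problem; the dictionary and function names are hypothetical, and the sketch can say nothing about exons that no probe targets (exons 1, 3, 5, and 6).

```python
# Hypothetical encoding of the Genetic Analysis 8.2 data (names are illustrative).
PROBE_TARGETS = {"I": 2, "II": 4, "III": 7}  # probe -> exon it hybridizes to

# Probes that hybridized to mature mRNA from each tissue, as stated in the problem.
BINDING = {
    "blood": {"I", "II"},
    "liver": {"II", "III"},
    "kidney": {"I", "III"},
}


def exon_calls(bound_probes):
    """Return (present, absent) lists of probed exons implied by the binding pattern."""
    present = sorted(exon for probe, exon in PROBE_TARGETS.items() if probe in bound_probes)
    absent = sorted(exon for probe, exon in PROBE_TARGETS.items() if probe not in bound_probes)
    return present, absent


for tissue, probes in BINDING.items():
    present, absent = exon_calls(probes)
    print(f"{tissue}: probed exons present {present}, probed exons absent {absent}")
```

The sketch only restates Answer a; deciding which mechanism removed an exon (alternative promoter, alternative polyadenylation, or alternative splicing) still depends on where the missing exon lies in the gene, as discussed in Answer b.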
For more practice, see Problems 2, 3, and 8. Visit the Study Area for a VideoTutor solution.
304 CHAPTER 8 Molecular Biology of Transcription and RNA Processing
without the presence of any protein. Following up on this initial observation, Cech and others have shown that group I introns are large, self-splicing ribozymes (catalytically active RNAs) that catalyze their own excision from certain mRNAs and also from tRNA and rRNA precursors in bacteria, simple eukaryotes, and plants. Self-splicing of introns takes place by way of a two-step process that excises the intron and allows exons to ligate (Figure 8.26). Cech and Sidney Altman shared the 1989 Nobel Prize in Chemistry for their contributions to the discovery and description of the catalytic properties of RNA.
Group II introns, which are also self-splicing ribozymes, are found in transcripts of archaea and bacteria, and in the transcripts of genes in the eukaryotic organelles mitochondria and chloroplasts. Group II introns form highly complex secondary structures containing many stem-loop arrangements. Their self-excision takes place in a lariat-like manner utilizing a branch point nucleotide that in many cases is adenine. It is thought that nuclear pre-mRNA splicing may have evolved from group II self-excising introns.
Beyond these three major types of intron splicing, several others have been identified, including those associated with the transcripts of ribosomal RNA and transfer RNA that are processed to produce the nucleic acids that function in translation.
Figure 8.26 Self-splicing of group I introns. Step 1, exon–intron base pairing: The G-binding site nucleotide attacks the UpA bond, bonding to the adenine and cleaving exon A. Labeled features: G-binding site, exon A, exon B, spliced exons.
Ribosomal RNA Processing
In bacteria, archaea, and eukaryotes, rRNAs are transcribed as large precursor molecules that are cleaved into smaller RNA molecules by removal and discarding of spacer sequences intervening between the sequences of the different RNAs. The E. coli genome, for example, contains seven copies of an rRNA gene. Each gene copy is transcribed into a single 30S precursor RNA that is processed by the removal of intervening sequences to yield 5S, 16S, and 23S rRNAs, along with several tRNA molecules (Figure 8.27a; RNA molecules and subunits are described in Svedberg units, abbreviated S, which give an idea of their size). All seven gene copies produce the same three rRNAs, but each gene generates a different set of tRNAs. There is
Figure 8.27 The processing of ribosomal and transfer RNA. (a) A large transcript is cleaved to produce rRNA and tRNA in E. coli: transcription of the RNA-coding gene (16S rRNA, tRNA, 23S rRNA, 5S rRNA, tRNA) produces a 30S pre-RNA, and removal of intervening sequences yields the 16S, 23S, and 5S ribosomal RNAs and the transfer RNAs. (b) Human rRNA genes are part of 40-kb repeating sequences that each produce three rRNAs: each repeat consists of a 13-kb rRNA transcriptional unit (ETS, 18S, ITS1, 5.8S, ITS2, 28S) and a ~27-kb intergenic spacer; transcription synthesizes a 45S pre-rRNA transcript, and pre-RNA cleavage produces the 18S, 5.8S, and 28S rRNAs.
Transfer RNA Processing
The production of tRNA, whether in bacteria, archaea, or eukaryotes, also requires posttranscriptional processing. Each type of tRNA has distinctive nucleotides and a specific pattern of folding, but all tRNAs have similar structures and functions (Figure 8.28). Some bacterial transfer RNA molecules are produced simultaneously with rRNAs, as described above (see Figure 8.27a). Other tRNAs are transcribed as part of a large pre-tRNA transcript that is then cleaved to yield multiple tRNA molecules. In eukaryotes, tRNA genes occur in clusters on specific chromosomes. Each eukaryotic tRNA gene is individually transcribed by RNA polymerase III, and a single pre-tRNA is produced from each gene.
The number of different tRNAs produced depends on the type of organism. In bacteria, the exact number of different tRNAs varies, but it is usually substantially less than 61, the number of codons found in mRNA. At a minimum, each species must have at least 20 different tRNAs, one for each amino acid, but most produce at least 30 to 40 different tRNAs. The low number of different tRNAs (compared with the number of codons) results from a phenomenon called third-base wobble, a relaxation of the “rules” of complementary base pairing at the third base of codons (see Section 9.4). Although third-base wobble plays a role in reducing the number of distinct tRNA genes needed in eukaryotic genomes, eukaryotes nevertheless produce a larger number of different tRNAs than bacteria do. Some
eukaryotic genomes contain a full complement of 61 different tRNA genes, one corresponding to each codon of the genetic code.
Bacterial tRNAs require processing before they are ready to assume their functional role of transporting amino acids to the ribosome. The precise processing events differ somewhat among tRNAs, but several features are common. First, many tRNAs are cleaved from large precursor tRNA transcripts to produce several individual tRNA molecules. Second, nucleotides are trimmed off the 5′ and 3′ ends of tRNA transcripts to prepare the mature molecule. Third, certain individual nucleotides in different tRNAs are chemically modified to produce a distinctive molecule. Fourth, tRNAs fold into a precise three-dimensional structure that includes four double-stranded stems, three of which are capped by single-stranded loops; each stem and loop constitutes an “arm” of the tRNA molecule. Fifth, tRNAs undergo post-transcriptional addition of bases. The most common addition is three nucleotides, CCA, at the 3′ end of the molecule. This region is the binding site for the amino acid the tRNA molecule transports to the ribosome. Figure 8.28 shows tRNA^Ala, which carries alanine. The CCA terminus is indicated, along with chemically modified nucleotides in each arm that are characteristic of this tRNA. Both a two-dimensional and a three-dimensional representation are shown.
Figure 8.28 Transfer RNA structure. Each tRNA has a similar but distinctive structure. The tRNA carrying alanine is illustrated in two dimensions (a) and three dimensions (b). Four double-stranded stems, three of them with single-stranded loops, form the secondary structure of tRNA molecules. Labeled features: amino acid (alanine) at the amino acid attachment site, 3′ end (CCA terminus), 5′ end, TψC arm and loop, D arm and loop, extra arm, anticodon arm, and anticodon.
Eukaryotic and archaeal tRNAs undergo processing modifications similar to those of bacterial tRNAs. In addition, however, eukaryotic pre-tRNAs may contain small
introns that are removed during processing. For example, an intron 14 nucleotides in length is removed from the precursor molecule by a specialized nuclease enzyme that cleaves the 5′ and 3′ splice sites of tRNA introns. The cleaved tRNA then folds into its functional form.
RNA Editing
A firmly established tenet in the central dogma of biology is the role of DNA as the repository and purveyor of genetic information. Notwithstanding the modifications made to precursor RNA transcripts after transcription, a fundamental principle of biology is that DNA dictates the sequence of mRNA nucleotides and controls the order of amino acids in proteins. And yet, in the mid-1980s, a phenomenon called RNA editing was uncovered that is responsible for post-transcriptional substitutions of some of the nucleotides of an mRNA.
The mRNAs from some nuclear genes in eukaryotes, some plant mitochondrial genes, and some mitochondrial genes of trypanosomes are edited by a specialized RNA called guide RNA (gRNA). A portion of a guide RNA contains a sequence complementary to the region of mRNA that it edits. With the aid of a protein complex, a portion of guide RNA pairs with complementary nucleotides of pre-edited mRNA and acts as a template to direct the insertion (and occasionally the deletion) of uracil (Figure 8.29). Guide RNA releases edited mRNA after editing is complete. The protein translated from edited mRNA may differ from the protein produced from unedited transcript.
Figure 8.29 Guide RNA (gRNA) directs RNA editing.
DNA coding strand 5′ AAAAGGCTTTAA 3′ (the template strand is drawn beneath it)
Transcription
mRNA 5′ AAAAGGCUUUAA 3′
Pairing with guide RNA: Single-stranded guide RNA pairs with a portion of messenger RNA. Note adenine nucleotides in unpaired loops.
mRNA 5′ AAAAGGCUUUAA 3′
gRNA 3′ UUUUCCGAAAUU 5′ (unpaired adenines loop out of the gRNA)
RNA editing: Nuclease enzyme cuts mRNA, and RNA polymerase uses unpaired adenines of guide RNA to add uracils to mRNA.
mRNA 5′ AAAUUUAGGUUUUCUUUAA 3′
gRNA 3′ UUUAAAUCCAAAAGAAAUU 5′
Release of edited mRNA: RNA-edited mature mRNA contains uracil nucleotides not encoded by DNA.
mRNA 5′ AAAUUUAGGUUUUCUUUAA 3′
RNA editing is responsible for producing two different apolipoprotein B proteins from a single gene in human liver and intestinal cells. The same mRNA transcript is initially produced in both types of cells. In liver cells, the mRNA is used to produce an apolipoprotein B protein containing 4563 amino acids. RNA editing substitutes one nucleotide of the mRNA in intestinal cells, and this produces a stop codon part way through the mRNA that stops translation early and results in an apolipoprotein B that is 2152 amino acids in length. These two proteins function differently in their respective cell types.
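The uracil insertions shown in Figure 8.29 can be checked with a short Python sketch. The three sequences below are copied from the figure; the function names are hypothetical, and the sketch assumes insertion-only editing (the occasional deletions mentioned in the text are not handled).

```python
# Sequences as printed in Figure 8.29.
UNEDITED_MRNA = "AAAAGGCUUUAA"          # 5' -> 3'
EDITED_MRNA = "AAAUUUAGGUUUUCUUUAA"     # 5' -> 3'
GUIDE_RNA = "UUUAAAUCCAAAAGAAAUU"       # written 3' -> 5', aligned under the edited mRNA

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_fully_paired(mrna_5to3, grna_3to5):
    """True if every mRNA base forms a Watson-Crick pair with the gRNA base beneath it."""
    return len(mrna_5to3) == len(grna_3to5) and all(
        PAIR[m] == g for m, g in zip(mrna_5to3, grna_3to5)
    )


def inserted_uracils(unedited, edited):
    """Count the U insertions that convert `unedited` into `edited`.

    Returns None if the edited sequence cannot be explained by U insertions alone.
    """
    position = 0   # next unedited base still to be matched
    inserted = 0
    for base in edited:
        if position < len(unedited) and base == unedited[position]:
            position += 1
        elif base == "U":
            inserted += 1
        else:
            return None
    return inserted if position == len(unedited) else None


print(is_fully_paired(EDITED_MRNA, GUIDE_RNA))
print(inserted_uracils(UNEDITED_MRNA, EDITED_MRNA))
```

Under these assumptions, the edited mRNA pairs with the guide RNA along its full length, and the edited sequence differs from the unedited transcript only by added uracils, which is the point the figure makes.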
CASE STUDY
Sexy Splicing: Alternative mRNA Splicing and Sex Determination in Drosophila
What causes pre-mRNA to be edited in one way in one type of cell and in another way in a different type of cell? The answer has to do with differential gene expression in cells, leading to the presence or absence of specific proteins that determine which pattern of pre-mRNA splicing will take place in a given nucleus. A well-characterized example of the molecular basis of this kind of differential pre-mRNA splicing is provided by a part of the mechanism that determines female versus male sex in the fruit fly Drosophila melanogaster.
In Section 3.4 we described the X/autosome ratio (X/A ratio) that causes fruit fly embryos with one X chromosome to develop as males and those with two X chromosomes to develop as females. The molecular explanation of why this ratio causes Drosophila sex determination is much deeper than simply the number of X chromosomes present and depends on a series of steps that begins with the transcription activation of the Sex lethal (Sxl) gene. The process includes alternative splicing of the pre-mRNA transcript of a second gene, the Transformer (Tra) gene, and leads to additional differential gene expression that directs sex development.
The X/A ratio in fly embryos initially influences the level of transcription and translation of two X-linked activator proteins called SisA and SisB compared with that of an autosomal gene producing a transcription repressor protein called Deadpan (Figure 8.30). Since the genes producing SisA and SisB are X-linked, early female embryos produce twice as much of each activator as do early male embryos, and the ratio of SisA + SisB to Deadpan differs between female
and male embryos (step 1). In early female embryos, the ratio of SisA + SisB protein to Deadpan protein leads to transcription of the Sex lethal (Sxl) gene and to the production of Sxl protein. Sxl transcription is repressed in male embryos and no Sxl protein is produced (step 2).
Sxl protein is a pre-mRNA splicing regulator protein that operates on the pre-mRNA transcript of the Transformer (Tra) gene. In female embryos, Tra pre-mRNA is spliced to produce a functional Tra protein (step 3). In male embryos, the absence of Sxl protein leads to alternative Tra pre-mRNA splicing that does not produce functional Tra protein. The Tra protein is also a splicing regulator; it operates on the pre-mRNA of the Double sex (Dsx) gene along with a second protein known as Tra-2 (step 4). In female embryos, Tra protein and Tra-2 protein splice Dsx pre-mRNA in one alternative variant, which when translated produces female-specific Dsx protein. Female-specific Dsx activates transcription of female-specific genes and represses transcription of male-specific genes to produce female flies. Tra protein is absent in male embryos, and Dsx pre-mRNA is spliced in the other alternative variant. Dsx protein in male embryos represses female-specific genes and allows transcription of unrepressed male-specific genes, leading to male sex development.
Figure 8.30 The X/A ratio determines gene transcription and transcript splicing pattern to determine sex in fruit flies. Panel shown: female embryo (X/A = 1.0) with 2 X chromosomes and 2 autosomes. Step 1: X/A ratio determines activator–repressor ratio (SisA + SisB versus Deadpan). Step 2: Sxl transcription and translation in female but not in male embryos. Step 3: Sxl protein directs Tra pre-mRNA splicing to produce Tra protein in female embryos, not male embryos (Tra gene pre-mRNA: exon 1, intron A, exon 2, intron B, exon 3; female-specific Tra gene mature mRNA: exon 1, exon 3). Step 4: Alternative Dsx pre-mRNA splicing is controlled by Tra protein (Tra protein + Tra-2 protein); Dsx activates female genes and represses male genes.
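The chain of dependencies in this case study can be summarized as a toy on/off model in Python. This is a deliberately simplified sketch, not a quantitative model: the function name, the default of two autosome sets, and the use of X/A = 1.0 as a hard threshold are assumptions made for illustration, and intermediate X/A ratios are not considered.

```python
def drosophila_sex_cascade(x_chromosomes, autosome_sets=2):
    """Toy on/off model of the Sxl -> Tra -> Dsx cascade described in the case study."""
    xa_ratio = x_chromosomes / autosome_sets
    # Assumption: Sxl is transcribed only when the X/A ratio reaches 1.0.
    sxl_protein = xa_ratio >= 1.0
    # Functional Tra protein requires Sxl-directed splicing of Tra pre-mRNA.
    tra_protein = sxl_protein
    # Tra (with Tra-2) selects the female-specific Dsx splice variant;
    # without Tra, Dsx pre-mRNA is spliced in the other (male) variant.
    dsx_variant = "female" if tra_protein else "male"
    return {
        "X/A": xa_ratio,
        "Sxl protein": sxl_protein,
        "Tra protein": tra_protein,
        "Dsx variant": dsx_variant,
        "sex": dsx_variant,
    }


for x_count in (1, 2):
    print(x_count, drosophila_sex_cascade(x_count))
```

Each line of the function corresponds to one numbered step in Figure 8.30, which makes it easy to see that removing any upstream protein (for example, setting `sxl_protein` to `False`) switches every downstream outcome to the male pattern.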
SUMMARY
8.1 RNA Transcripts Carry the Messages of Genes
❚❚ RNA molecules are synthesized by RNA polymerases using as building blocks the RNA nucleotides A, G, C, and U to form single-stranded sequences complementary to DNA template strands.
❚❚ Messenger RNA is the transcript that undergoes translation to produce proteins. The many other forms of RNA are also transcribed, and may undergo modification, but are not translated.
8.2 Bacterial Transcription Is a Four-Stage Process
❚❚ Transcription has four stages: promoter recognition, chain initiation, chain elongation, and chain termination.
❚❚ A single RNA polymerase transcribes all bacterial genes. This polymerase is a holoenzyme composed of a five-subunit core enzyme and a sigma subunit that aids the recognition of different forms of bacterial promoters.
❚❚ Bacterial promoters have two consensus sequence regions located upstream of the transcription start at approximately −10 and −35.
❚❚ The core enzyme of bacterial RNA polymerase carries out RNA synthesis following chain initiation by the holoenzyme.
❚❚ Transcription of most bacterial genes terminates by an intrinsic mechanism that depends only on DNA terminator sequences. Certain bacterial genes have a rho-dependent mechanism of transcription termination.
8.3 Eukaryotic Transcription Is More Diversified and Complex than Bacterial Transcription
❚❚ Eukaryotic cells contain three types of RNA polymerases that transcribe mRNA and the various other classes of RNA.
❚❚ RNA polymerase II transcribes mRNA by interaction with numerous transcription factors that lead the enzyme to recognize promoters controlling transcription of polypeptide-coding genes.
❚❚ Promoters recognized by RNA polymerase II have a TATA box and additional regulatory elements that bind transcription factors and RNA pol II during transcription initiation.
❚❚ Tissue-specific and developmental modifications in transcription are regulated by enhancer and silencer sequences.
❚❚ RNA polymerase I uses exclusive transcription factors to recognize upstream consensus sequences of ribosomal RNA genes. Ribosomal RNAs are processed in the nucleolus.
❚❚ RNA polymerase III recognizes promoter consensus sequences that are upstream and downstream of the start of transcription for tRNA genes.
❚❚ Archaeal transcription is a simplified version of eukaryotic transcription and has less in common with bacterial transcription.
❚❚ Comparative studies of transcription reveal that the three domains of life share common transcriptional mechanisms that are attributable to their sharing of a common ancestor.
8.4 Posttranscriptional Processing Modifies RNA Molecules
❚❚ 5′ capping of eukaryotic messenger RNA adds a methylated guanine through the action of guanylyl transferase shortly after transcription is initiated.
❚❚ Polyadenylation at the 3′ end of eukaryotic messenger RNA is signaled by an AAUAAA sequence and is accomplished by a complex of enzymes.
❚❚ RNA splicing is controlled by cellular proteins that identify introns and exons and form spliceosome complexes that remove introns and ligate exons.
❚❚ Consensus sequences at the 5′ splice site, the 3′ splice site, and the branch point serve as guides during RNA splicing.
❚❚ Alternative splicing is regulated by cell-type–specific variation of proteins that identify introns and exons.
❚❚ Some RNA molecules have catalytic activity and are able to self-splice introns without the aid of proteins.
❚❚ Ribosomal and transfer RNA molecules are generated by cleavage of large precursor molecules transcribed in bacterial, archaeal, and eukaryotic genomes.
❚❚ RNA editing is a post-transcriptional altering of nucleotide sequence, causing the transcripts to differ from the corresponding template DNA sequence.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help in solving problems.
1. Understand the structure of bacterial and eukaryotic promoters; be familiar with the structure of genes and the relative positions of their landmarks (promoter, start of transcription, etc.); and be able to identify the template and nontemplate strands of a gene.
2. Be prepared to describe the mechanisms of bacterial and eukaryotic gene transcription initiation, including the complementary and 5′ and 3′ relationships between the nucleic acid strands.
3. Understand the two mechanisms of transcription termination in bacteria, and the connection between transcription termination of eukaryotic genes and posttranscriptional processing.
4. Be prepared to describe the posttranscriptional processing events that modify eukaryotic pre-mRNA.
5. Understand the experimental approaches that can identify promoters and their functional sequences.
6. Be prepared to interpret the results of experiments analyzing DNA binding of transcriptional proteins or the transcription of genes.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. Based on discussion in this chapter,
a. What is a gene?
b. Why are genes for rRNA and tRNA considered to be genes even though they do not produce polypeptides?
2. In one to two sentences each, describe the three processes that commonly modify eukaryotic pre-mRNA.
3. Answer these questions concerning promoters.
a. What role do promoters play in transcription?
b. What is the common structure of a bacterial promoter with respect to consensus sequences?
c. What consensus sequences are detected in the mammalian β-globin gene promoter?
d. Eukaryotic promoters are more variable than bacterial promoters. Explain why.
e. What is the meaning of the term alternative promoter? How does the use of alternative promoters affect transcription?
4. The diagram below shows a DNA duplex. The template strand is identified, as is the location of the +1 nucleotide.
+1
5′ _____________________________ 3′ template strand
3′ _____________________________ 5′ coding strand
a. Assume this region contains a gene transcribed in a bacterium. Identify the location of promoter consensus sequences and of the transcription termination sequence.
b. Assume this region contains a gene transcribed to form mRNA in a eukaryote. Identify the location of the most common promoter consensus sequences.
c. If this region is a eukaryotic gene transcribed by RNA polymerase III, where are the promoter consensus sequences located?
5. The following is a portion of an mRNA sequence:
a. During transcription, was the adenine at the left-hand side of the sequence the first or the last nucleotide used to build the portion of mRNA shown? Explain how you know.
b. Write out the sequence and polarity of the DNA duplex that encodes this mRNA segment. Label the template and coding DNA strands.
c. Identify the direction in which the promoter region for this gene will be located.
6. Compare and contrast the properties of DNA polymerase and RNA polymerase, listing at least three similarities and at least three differences between the molecules.
7. The DNA sequences shown below are from the promoter regions of six bacterial genes. In each case, the last nucleotide in the sequence (highlighted in blue) is the +1 nucleotide that initiates transcription.
Gene 1 ...TTCCGGCTCGTATGTTGTGTGG A...
Gene 2 ...CGTCATTTGATATGATGCGCCCC G...
Gene 3 ...CCACTGGCGGTGATACTGAGCAC A...
Gene 4 ...TTTATTGCAGCTTATAATGGTTAC A...
Gene 5 ...TGCTTCTGACTATAATAGACAGG G...
Gene 6 ...AAGTAAACACGGTACGATGTACCAC A...
a. Examine these sequences and identify the Pribnow box sequence at approximately −10 for each promoter.
b. Determine the consensus sequence for the Pribnow box from these sequences.
8. Bacterial and eukaryotic gene transcripts can differ: in the transcripts themselves, in whether the transcripts are modified before translation, and in how the transcripts are modified. For each of these three areas of contrast, describe what the differences are and why the differences exist.
9. Describe the two types of transcription termination found in bacterial genes. How does transcription termination differ for eukaryotic genes?
10. What is the role of enhancer sequences in transcription of eukaryotic genes? Speculate about why enhancers are not part of transcription of bacterial genes.
12. Draw a bacterial promoter and label its consensus sequences. How does this promoter differ from a eukaryotic promoter transcribed by RNA polymerase II? By RNA polymerase I? By RNA polymerase III?
13. For a eukaryotic gene whose transcription requires the activity of an enhancer sequence, explain how proteins bound at the enhancer interact with RNA pol II and transcription factors bound at the promoter.
14. Three genes identified in the diagram as A, B, and C are transcribed from a region of DNA. The 5′-to-3′ transcription of genes A and C elongates mRNA in the right-to-left direction, and transcription of gene B elongates mRNA in the left-to-right direction. For each gene, identify the coding strand by designating it as an “upper strand” or “lower strand” in the diagram.
A B C
5′ 3′
3′ 5′
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
15. The eukaryotic gene Gen-100 contains four introns labeled A to D. Imagine that Gen-100 has been isolated and its DNA has been denatured and mixed with polyadenylated mRNA from the gene.
a. Illustrate the R-loop structure that would be seen with electron microscopy.
b. Label the introns.
c. Are intron regions single stranded or double stranded? Why?
16. The segment of the bacterial TrpA gene involved in intrinsic termination of transcription is the following:
3′-TGGGTCGGGGCGGATTACTGCCCCGAAAAAAAACTTG-5′
5′-ACCCAGCCCCGCCTAATGACGGGGCTTTTTTTTGAAC-3′
a. Draw the mRNA structure that forms during transcription of this segment of the TrpA gene.
b. Label the template and coding DNA strands.
c. Explain how a sequence of this type leads to intrinsic termination of transcription.
17. A 2-kb fragment of E. coli DNA contains the complete sequence of a gene for which transcription is terminated by the rho protein. The fragment contains the complete promoter sequence as well as the terminator region of
the gene. The cloned fragment is examined by band shift assay (see Research Technique 8.1). Each lane of a single electrophoresis gel contains the 2-kb cloned fragment under the following conditions:
Lane 1: 2-kb fragment alone
Lane 2: 2-kb fragment plus the core enzyme
Lane 3: 2-kb fragment plus the RNA polymerase holoenzyme
Lane 4: 2-kb fragment plus rho protein
a. Diagram the relative positions expected for the DNA fragments in this gel electrophoresis analysis.
b. Explain the relative positions of bands in lanes 1 and 3.
c. Explain the relative positions of bands in lanes 1 and 4.
18. A 3.5-kb segment of DNA containing the complete sequence of a mouse gene is available. The DNA segment contains the promoter sequence and extends beyond the polyadenylation site of the gene. The DNA is studied by band shift assay (see Research Technique 8.1), and the following gel bands are observed.
Lane: 1 2 3 4 5
Match these conditions to a specific lane of the gel.
a. 3.5-kb fragment plus TFIIB and TFIID
b. 3.5-kb fragment plus TFIIB, TFIID, TFIIF, and RNA polymerase II
c. 3.5-kb fragment alone
d. 3.5-kb fragment plus RNA polymerase II
e. 3.5-kb fragment plus TFIIB
19. A 1.0-kb DNA fragment from the 5′ end of the mouse gene described in the previous problem is examined by DNA footprint protection analysis (see Research Technique 8.1). Two samples are end-labeled with ³²P, and one of the two is mixed with TFIIB, TFIID, and RNA polymerase II. The DNA exposed to these proteins is run in the right-hand lane of the gel shown below and the control DNA is run in the left-hand. Both DNA samples are treated with DNase I before running the samples on the electrophoresis gel.
[Gel figure: size scale in bp marked at 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, and 1; negative pole at the 1000-bp end, positive pole at the 1-bp end]
a. What length of DNA is bound by the transcriptional proteins? Explain how the gel results support this interpretation.
b. Draw a diagram of this DNA fragment bound by the transcriptional proteins, showing the approximate position of proteins along the fragment. Use the illustration style seen in Research Technique 8.1 as a model.
c. Explain the role of DNase I.
20. Wild-type E. coli grow best at 37°C but can grow efficiently up to 42°C. An E. coli strain has a mutation of the sigma subunit that results in an RNA polymerase holoenzyme that is stable and transcribes at wild-type levels at 37°C. The mutant holoenzyme is progressively destabilized as the temperature is raised, and it completely denatures and ceases to carry out transcription at 42°C. Relative to wild-type growth, characterize the ability of the mutant strain to carry out transcription at
a. 37°C
b. 40°C
c. 42°C
d. What term best characterizes the type of mutation exhibited by the mutant bacterial strain? (Hint: The term was used in Chapter 4 to describe the Himalayan allele of the mammalian C gene.)
21. A mutant strain of Salmonella bacteria carries a mutation of the rho protein that has full activity at 37°C but is completely inactivated when the mutant strain is grown at 40°C.
a. Speculate about the kind of differences you would expect to see if you compared a broad spectrum of mRNAs from the mutant strain grown at 37°C and the same spectrum of mRNAs from the strain when grown at 40°C.
b. Are all mRNAs affected by the rho protein mutation in the same way? Why or why not?
22. The human β-globin wild-type allele and a certain mutant allele are identical in sequence except for a single base-pair substitution that changes one nucleotide at the end of intron 2. The wild-type and mutant sequences of the affected portion of pre-mRNA are
             Intron 2   Exon 3
wild type 5′-CCUCCCACAG CUCCUG-3′
mutant    5′-CCUCCCACUG CUCCUG-3′
a. Speculate about the way in which this base substitution causes mutation of β-globin protein.
b. This is one example of how DNA sequence change occurring somewhere other than in an exon can produce mutation. List other kinds of DNA sequence changes occurring outside exons that can produce mutation. In each case, characterize the kind of change you would expect to see in mutant mRNA or mutant protein.
312 CHAPTER 8 Molecular Biology of Transcription and RNA Processing
23. Microbiologists describe the processes of transcription and translation as “coupled” in bacteria. This term indicates that a bacterial mRNA can be undergoing transcription at the same moment it is also undergoing translation.
a. How is coupling of transcription and translation possible in bacteria?
b. Is coupling of transcription and translation possible in single-celled eukaryotes such as yeast? Why or why not?
24. A full-length eukaryotic gene is inserted into a bacterial chromosome. The gene contains a complete promoter sequence and a functional polyadenylation sequence, and it has wild-type nucleotides throughout the transcribed region. However, the gene fails to produce a functional protein.
a. List at least three possible reasons why this eukaryotic gene is not expressed in bacteria.
b. What changes would you recommend to permit expression of this eukaryotic gene in a bacterial cell?
for the gene are labeled, and a segment of DNA sequence is given. For this gene segment:
a. Superimpose a drawing of RNA polymerase as it nears the end of transcription of the DNA sequence.
b. Indicate the direction in which RNA polymerase moves as it transcribes this gene.
c. Write the polarity and sequence of the RNA transcript from the DNA sequence given.
d. Identify the direction in which the promoter for this gene is located.
[Figure: DNA segment with the coding strand and template strand labeled; legible sequence fragments are ATTAACGATCGA on one strand and TAATTGCTAG CTA on the other.]
26. DNA footprint protection (described in Research Technique 8.1) is a method that determines whether proteins bind to a specific sample of DNA and thus protect part of the DNA from random enzymatic cleavage by DNase I. A 400-bp segment of cloned DNA is thought to contain a promoter. The cloned DNA is analyzed by DNA footprinting to help determine if it has the capacity to act as a promoter sequence. The accompanying gel has two lanes, each containing the cloned 400-bp DNA fragment treated with DNase I to randomly cleave unprotected DNA. Lane 1 is cloned DNA that was mixed with RNA polymerase II and several TFII transcription factors before exposure to DNase I. Lane 2 contains cloned DNA that was exposed only to DNase I. RNA pol II and TFIIs were not mixed with that DNA before adding DNase I.
a. Explain why this gel provides evidence that the cloned DNA may act as a promoter sequence.
b. Approximately what length is the DNA region protected by RNA pol II and TFIIs?
c. What additional genetic experiments would you suggest to verify that this region of cloned DNA contains a functional promoter?
[Figure: gel with lanes 1 and 2; the base-pair scale is marked at 400, 350, 300, 280, 100, 80, 50, and 1.]
27. Suppose you have a 1-kb segment of cloned DNA that is suspected to contain a eukaryotic promoter including a TATA box, a CAAT box, and an upstream GC-rich sequence. The clone also contains a gene whose transcript is readily detectable. Your laboratory supervisor asks you to outline an experiment that will (1) determine if eukaryotic transcription factors (TF) bind to the fragment and, if so, (2) identify where on the fragment the transcription factors bind. All necessary reagents, equipment, and experimental know-how are available in the laboratory. Your assignment is to propose techniques to be used to address the two items your supervisor has listed and to describe the kind of results that would indicate binding of TF to the DNA and the location of the binding. (Hint: The techniques and general results are discussed in this chapter.)
Problems 313
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
28. Assume that a mutation affects the gene for each of the following eukaryotic RNA polymerases. Match each mutation with the possible effects from the list provided. More than one effect is possible for each mutation.

RNA Polymerase Mutation    Effect(s)
RNA pol I                  _______________
RNA pol II                 _______________

Possible Effects
a. Pre-mRNA does not have introns removed.
b. Some pre-mRNA is not synthesized.
c. Some rRNA is not synthesized.
d. Some tRNA is not synthesized.
e. Ribosomal RNA is not processed.

29. The DNA sequence below gives the first 12 base pairs of the transcribed region of a gene, and the template and nontemplate strands of DNA are identified. The transcription start is the thymine nucleotide at the end of the sequence given. Use the diagram to answer the list of questions. Make a copy of the diagram before you begin answering the questions, or have one group member diagram the answers for bacteria and another group member diagram the answers for eukaryotes.

Nontemplate strand ___________TTGCTACGGTCA___________
Template strand    ___________AACGATGCCAGT___________

a. Write the polarity of the two DNA strands shown.
b. Give the mRNA transcript sequence and the polarity of the transcript.
c. Assuming the sequence shown is part of a bacterial gene, draw the approximate positions of the promoter sequence and the termination sequence.
d. Assuming the sequence shown is part of a bacterial gene, what consensus sequence(s) would you expect to identify in the promoter?
e. Write the anticipated bacterial consensus sequence(s) in the approximate position(s) on the diagram.
f. Assuming the sequence shown is part of a eukaryotic gene, what consensus sequence(s) would you expect to identify within about 100 base pairs of the start of transcription?
g. Write the anticipated eukaryotic consensus sequence(s) in the approximate position(s) on the diagram.

30. Genomic DNA from a mouse is isolated, fragmented, and denatured into single strands. It is then mixed with mRNA isolated from the cytoplasm of mouse cells. The image represents an electron micrograph result showing the hybridization of single-stranded DNA and mRNA.
[Figure: electron micrograph of the DNA–mRNA hybrid with lettered pointers.]
a. Which nucleic acid is indicated by the “a” pointer? Justify your answer.
b. Which nucleic acid is indicated by the “b” pointer? Justify your answer.
c. What term best identifies the nucleic acid region indicated by the “c” pointer?
d. What term best identifies the nucleic acid region indicated by the “d” pointer?
e. Based on this electron micrograph image, how many introns and exons are present in the mouse DNA fragment shown?

31. A portion of a human gene is isolated from the genome and sequenced. The corresponding segment of mRNA is isolated from the cytoplasm of human cells, and it is also sequenced. The nucleic acid strings shown here are from genomic coding strand DNA and the corresponding mRNA.

mRNA
5′ ACGCAUUACGUGGCUAGACAUUUAGC-
CGAUCAGACUAGACAGCGCGCUAGCG-
AUAGCGCUAAAGCUGACUCGCGAUCAGUCUC-
GAGGGCACAUAGUCUA 3′

Genomic Coding Strand DNA
5′ ACGCATTACGTGGCTAGACATTTAGC-
CGATCAGACTAGACAGCGCGCTAGCGAGTC-
TACCTCAAGCCAUAATAGACAGTAGA-
CATTGAAAGACATAGATAGACATAGAGA-
CTTAGACATACGACCGGACATACCAAGAC-
GAATACGAACACTATACAGCCUCAGTAGCGC-
TAAAGCTGACTCGCGATCAGTCTCGAGGGCA-
CATAGTCTA 3′

a. There is one intron in the DNA sequence shown. Locate the intron and underline the splice site sequences.
b. Does this intron contain normal splice site sequences?
9 The Molecular Biology of Translation

CHAPTER OUTLINE
9.1 Polypeptides Are Amino Acid Chains That Are Assembled at Ribosomes
9.2 Translation Occurs in Three Phases
9.3 Translation Is Fast and Efficient
9.4 The Genetic Code Translates Messenger RNA into Polypeptide
9.5 Experiments Deciphered the Genetic Code

[Chapter-opening figure] Ribosomes use codon sequences of messenger RNA to direct the assembly of polypeptides during translation. This rendering of a ribosome engaged in translation shows the large subunit (top), the small subunit (bottom), the path of mRNA through the small subunit, the spaces for E, P, and A sites into which tRNAs fit, and the egress of the polypeptide through the large subunit.

ESSENTIAL IDEAS
❚❚ Translation is the cellular process of polypeptide production carried out by ribosomes under the direction of mRNA.
❚❚ Ribosomes assemble on mRNA and initiate translation at the start codon.
❚❚ Transfer RNA molecules carry amino acids to ribosomes, which assemble polypeptides with the aid of ribosomal proteins.
❚❚ Polypeptide elongation and termination are similar in bacteria and eukaryotes.
❚❚ A virtually universal genetic code comprising 64 mRNA codons directs polypeptide assembly.
❚❚ Polypeptides undergo posttranslational folding and processing, and in eukaryotes are sorted into vesicles for transport to cellular destinations or for secretion.

Long before the discovery that DNA is the hereditary molecule, biologists had established the relationship between genes and proteins. In 1902, Archibald Garrod was the first to explicitly draw this connection when he proposed that the human hereditary disorder alkaptonuria was caused by an inherited defect in the enzyme homogentisic acid oxidase (see Section 4.3 and Figure 4.17b). As Garrod and other biologists expanded their exploration of the gene–protein connection, they found evidence that hereditary variation was closely tied to variations in proteins. Principal among the biologists who developed this connection were George Beadle and Edward Tatum, whose research established
the “one gene–one enzyme” hypothesis (see Experimental Insight 4.1, pp. 125–126).

This chapter discusses translation, the mechanism by which the messenger RNA (mRNA) transcripts of genes are used to assemble amino acids into polypeptides (strings of amino acids) that form proteins. Translation is carried out by ribosomes that bring together mRNA transcripts and transfer RNA (tRNA) molecules carrying amino acids to facilitate the assembly of polypeptides. Polypeptides make up enzymes, structural proteins, transport proteins, signaling proteins, hormones, and other components that are assembled into cell structures or perform biological activities in or among cells.

The story of how polypeptides are produced by translation and of how scientists came to understand the process offers intriguing insight into the design of molecular genetic experiments. In this chapter, we describe some of these experiments and examine the molecular biology of translation. We also look at the homology of proteins that are active in translation in organisms from the three domains of life and describe how this and other features of translation are evidence of a single origin of life and of the evolutionary relationships between bacteria, archaea, and eukaryotes.

9.1 Polypeptides Are Amino Acid Chains That Are Assembled at Ribosomes

Twenty different amino acids are the basic building blocks of polypeptides. All amino acids have features in common and features that are distinct. The distinctive features impart specific characteristics that allow the amino acid to participate in certain chemical reactions or behave in a hydrophilic or hydrophobic manner. In part, the common features allow amino acids to be joined into polypeptides by covalent bond formation between adjacent amino acids in the chain.

Amino Acid Structure
The shared features of amino acids are a central carbon atom known as the α-carbon, an amino (NH3+) group, and a carboxyl (COO−) group (Figure 9.1). The amino and carboxyl groups are each joined to the α-carbon. During polypeptide assembly, an enzyme in the ribosome catalyzes the formation of a peptide bond between the carboxyl group of one amino acid and the amino group of the next amino acid in the chain. Each amino acid added in this way becomes a new monomer in the growing polymer that is the elongating polypeptide. The term polypeptide signifies a string of amino acids that are joined by peptide bonds. Each protein has a unique sequence of amino acids, may be composed of one or more polypeptide chains, and generally has its own characteristic three-dimensional structure.

The distinctive portion of each amino acid is its side chain, known as an R-group, that is also joined to the α-carbon. The R-groups range in complexity from a single hydrogen atom to ringed structures that in themselves contain multiple carbon atoms. Each R-group imparts specific characteristics as shown in Table 9.1. Ten of the amino acids have nonpolar R-groups, meaning they have no charged atoms that can participate in formation of hydrogen bonds with other amino acids. Five other amino acids have polar R-groups that can carry partial charges and can participate in hydrogen bond formation with other amino acids. The five remaining amino acids have electrically charged R-groups: Three are basic and two are acidic. Electrically charged R-groups allow these amino acids to form ionic bonds and hydrogen bonds.

Polypeptide and Transcript Structure
Polypeptide assembly is orchestrated by ribosomes, which are ribonucleoprotein “machines” containing multiple molecules of ribosomal RNA (rRNA) and dozens of proteins.
[Figure 9.1 diagram: two amino acids, each with an amino group (+H3N), an α-carbon carrying an R-group (R1 or R2), and a carboxyl group (COO−), are joined by peptide bond formation, which links the carboxyl carbon of the first to the amino nitrogen of the second and releases H2O.]
Figure 9.1 Amino acids and peptide bond formation. The carboxyl group (COO−) of one amino acid reacts with the amino group (+H3N) of the adjacent amino acid to form a covalent peptide bond that links amino acids in a polypeptide chain. Amino acids contain a central carbon (the α-carbon) and an R-group, here identified as R1 and R2.
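The grouping of amino acids by side chain can be sketched as a small lookup. The Python below is an illustration only, not part of the chapter's methods: the names SIDE_CHAIN_CLASSES, classify_residue, side_chain_profile, and peptide_bond_count are hypothetical, the groups use the standard one-letter codes and follow the counts given in the text (ten nonpolar, five polar, three basic, two acidic), and the example peptide is invented.

```python
# Side-chain classes as grouped in this section (standard one-letter codes).
# All names here are hypothetical and chosen for illustration.
SIDE_CHAIN_CLASSES = {
    "nonpolar": frozenset("ACGILMFPWV"),
    "polar": frozenset("NQSTY"),
    "basic": frozenset("RHK"),
    "acidic": frozenset("DE"),
}


def classify_residue(aa: str) -> str:
    """Return the side-chain class of one amino acid given its one-letter code."""
    for class_name, members in SIDE_CHAIN_CLASSES.items():
        if aa.upper() in members:
            return class_name
    raise ValueError(f"not one of the 20 standard amino acids: {aa!r}")


def side_chain_profile(peptide: str) -> dict:
    """Count how many residues of each side-chain class a peptide contains."""
    counts = {class_name: 0 for class_name in SIDE_CHAIN_CLASSES}
    for aa in peptide:
        counts[classify_residue(aa)] += 1
    return counts


def peptide_bond_count(peptide: str) -> int:
    """A linear chain of n amino acids is joined by n - 1 peptide bonds."""
    return max(len(peptide) - 1, 0)


if __name__ == "__main__":
    example = "MKDS"  # invented tetrapeptide: Met-Lys-Asp-Ser
    print(side_chain_profile(example))
    print(peptide_bond_count(example))
```

Keeping the classes in one dictionary makes it easy to confirm that the four groups together account for all 20 amino acids.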
Table 9.1 Amino Acids Grouped by Their Side Chain Properties

Nonpolar side chains: Have no charged or electronegative atoms at pH 7.0 to form hydrogen bonds.
Alanine (Ala or A)         Methionine (Met or M)
Cysteine (Cys or C)        Phenylalanine (Phe or F)
Glycine (Gly or G)         Proline (Pro or P)
Isoleucine (Ile or I)      Tryptophan (Trp or W)
Leucine (Leu or L)         Valine (Val or V)

Polar side chains: Have partial charges at pH 7.0 and can form hydrogen bonds.
Asparagine (Asn or N)      Threonine (Thr or T)
Glutamine (Gln or Q)       Tyrosine (Tyr or Y)
Serine (Ser or S)

Electrically charged side chains: At pH 7.0, can form hydrogen and ionic bonds.
Basic Side Chains          Acidic Side Chains
Arginine (Arg or R)        Aspartate (Asp or D)
Histidine (His or H)       Glutamate (Glu or E)
Lysine (Lys or K)

Ribosomes of all organisms are composed of two subunits that assemble into a ribosome as translation begins. Ribosomes bind mRNA and provide an environment for complementary base pairing between mRNA codon sequences and the anticodon sequences of tRNA. (See Section 1.3 for a basic review of translation.) Figure 9.2 encapsulates the essential elements of translation. Ribosomes translate mRNA in the 5′ → 3′ direction, beginning with the start codon and ending with a stop codon. At each triplet codon, complementary base pairing between mRNA and tRNA determines which amino acid is added to the nascent (growing) polypeptide. The start codon and stop codon define the boundaries of the translated segment of mRNA. The resulting polypeptides have an N-terminal (amino-terminal) end corresponding to the 5′ end of mRNA and a C-terminal (carboxyl-terminal) end that corresponds to the 3′ end of mRNA (Figure 9.3).

Figure 9.3 identifies two segments of the mRNA transcript that do not undergo translation. Between the 5′ end of mRNA and the start codon is a segment known as the 5′ untranslated region, abbreviated 5′ UTR. The region between the stop codon and the 3′ end of the molecule is the 3′ untranslated region, or 3′ UTR. The 5′ UTR contains sequences that help initiate translation and the 3′ UTR contains sequences associated with transcription termination.

Polypeptides have four levels of organization (Table 9.2). The polypeptide primary structure is the sequence of amino acids contained in the polypeptide. The differences in the order of amino acids and in the lengths of polypeptides (the number of amino acids they contain) are effectively limitless. There are billions of possible amino acid sequences. At the same time, the specific order of amino acids in any given polypeptide is critical to its proper folding and functioning.

[Figure 9.2 diagram: a ribosome (large and small subunits) moving along mRNA from 5′ to 3′, with tRNAs in the E, P, and A sites pairing anticodons with mRNA codons, an incoming charged tRNA carrying its amino acid, an exiting uncharged tRNA, and the polypeptide emerging from the large subunit.]
Figure 9.2 Translation overview.

[Figure 9.3 diagram: a gene (promoter, +1, RNA-coding region, terminator; coding strand 5′ to 3′, template strand 3′ to 5′) is transcribed into mRNA (5′ UTR, start codon, stop codon, 3′ UTR), which is translated into a polypeptide running from the amino terminal (N-terminus, H3N+) to the carboxyl terminal (C-terminus, COO−).]
Figure 9.3 Alignment of DNA, mRNA, and polypeptide.
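The division of a transcript into 5′ UTR, translated segment, and 3′ UTR can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name split_transcript and the example mRNA are hypothetical, the first AUG is taken as the start codon unless an index is supplied, and the three standard stop codons are assumed.

```python
# Hypothetical helper: split an mRNA into 5' UTR, translated segment, and 3' UTR.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_transcript(mrna: str, start_index=None):
    """Return (5' UTR, coding segment including start and stop codons, 3' UTR).

    Simplifying assumption: if no start index is given, the first AUG found
    when reading 5' to 3' is treated as the start codon.
    """
    mrna = mrna.upper()
    if start_index is None:
        start_index = mrna.find("AUG")
    if start_index < 0:
        raise ValueError("no AUG start codon found")
    # Read codon by codon in the frame set by the start codon.
    for i in range(start_index, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            stop_end = i + 3
            return mrna[:start_index], mrna[start_index:stop_end], mrna[stop_end:]
    raise ValueError("no in-frame stop codon found")


if __name__ == "__main__":
    # Invented transcript: 13-nt leader, 4-codon reading frame, 6-nt trailer.
    example = "GGAGGUUCAGGAU" + "AUGCGUUCGUAA" + "CCGAUU"
    utr5, coding, utr3 = split_transcript(example)
    print(utr5, coding, utr3)
```

The codon-by-codon loop reflects the point made in the text: once the start codon sets the reading frame, the first stop codon in that frame marks the end of the translated segment.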
[Table 9.2, surviving rows]
Tertiary | Overall three-dimensional shape of a polypeptide (includes contribution from secondary structures) | Bonds and other interactions between R-groups, or between R-groups and the peptide-bonded backbone | One of hemoglobin’s subunits
Quaternary | Shape produced by combinations of polypeptides (each with its own tertiary structure) | Bonds and other interactions between R-groups, and between peptide backbones of different polypeptides | Hemoglobin consists of four polypeptide subunits

Polypeptide secondary structure consists of certain common configurations adopted by portions of polypeptides, owing primarily to hydrogen bonds and ionic interactions that form between amino acids. Hydrogen bond formation causes amino acids with polar R-groups to align with one another. This can result in local bending or twisting of the polypeptide into one of two possible structures: An α-helix (alpha helix) is a twisted coil of amino acids stabilized by hydrogen bonds between partially charged R-groups; a β-pleated sheet (beta-pleated sheet) is a roughly 130-degree bend created when hydrogen bonding between amino acids induces a segment of a polypeptide to fold.

A polypeptide’s tertiary structure is the three-dimensional structure of the folded polypeptide as a whole. Active polypeptides are in their tertiary structure. Some polypeptides are capable of assuming two or more somewhat different tertiary structures. These may include an active structure and an inactive structure, or other combinations. A range of interactions involving the R-groups (hydrogen bonds, covalent bonds, ionic interactions, and hydrophobic interactions) produce the overall shape of the protein.

Primary, secondary, and tertiary structures of polypeptides are interdependent: the primary structure leads to certain secondary structure possibilities and these, in turn, lead to the formation of the one or more possible tertiary structures of a polypeptide. But some proteins in their active form consist of two or more polypeptides, and this level of organization is called the quaternary structure. Proteins that have two or more polypeptides (and therefore a quaternary structure) are often described as multimers. The individual polypeptides of a multimer may be identical or may be different. A protein composed of four identical polypeptides, for example, can be called a homotetramer, whereas a four-polypeptide protein that contains two or more different polypeptides can be identified as a heterotetramer. Table 9.2 summarizes these four levels of polypeptide structure for the red blood cell protein hemoglobin, a heterotetrameric protein that is responsible for carrying oxygen.

Ribosome Structures
The specific molecules composing bacterial, archaeal, and eukaryotic ribosomes differ, but the overall structures and functions of the ribosomes are similar, reflecting the fundamental nature of the translation process in all forms of life. In all three domains, ribosomes perform three essential tasks:
1. Bind messenger RNA and identify the start codon where translation begins.
2. Facilitate the complementary base pairing of mRNA codons and tRNA anticodons that determines amino acid order in the polypeptide.
3. Catalyze peptide bond formation between amino acids during polypeptide formation.

Differences in ribosomal composition between bacteria, archaea, and eukaryotes include the number and
sequence of rRNA molecules and the number and type of ribosomal proteins. Although the archaeal and bacterial ribosomes are similar in size, and somewhat smaller than the eukaryotic ribosomes, most of the archaeal ribosomal proteins (and the tRNAs and protein factors involved in translation) display homology to their eukaryotic counterparts. In all three domains, ribosomes display key structural similarities, beginning with the fact that each consists of two main subunits, called the large ribosomal subunit and the small ribosomal subunit. By convention, subunit size is measured in Svedberg units (S), which describe the velocity of their sedimentation when subjected to a centrifugal force. Svedberg units are named in honor of Theodor Svedberg, a 1926 Nobel Laureate in Chemistry and inventor of the ultracentrifuge; higher S values indicate faster sedimentation rates and larger molecules. It should be noted that Svedberg units are not additive when ribosomal subunits are combined, because sedimentation is a composite property that is affected by multiple molecular factors, including size, shape, and hydration state.

The ribosomes of E. coli are the most thoroughly studied bacterial ribosomes and serve as a model for general ribosome structure (Figure 9.4). The small subunit of these bacterial ribosomes has a Svedberg value of 30S. It contains 21 proteins and a single 16S rRNA composed of 1541 nucleotides. The large subunit of this bacterial ribosome is a 50S particle composed of 32 proteins, a small 5S rRNA containing 120 nucleotides, and a large 23S rRNA containing 2904 nucleotides. When fully assembled, the intact E. coli ribosome has a Svedberg value of 70S.

Both the large and small subunits contribute to the formation of three regions that play important functional roles during translation: the peptidyl site, or P site, the aminoacyl site, or A site, and the exit site, or E site. The P site holds a tRNA to which the nascent polypeptide is attached. The A site binds a new tRNA molecule carrying the next amino acid to be added to the polypeptide. The E site provides an avenue of egress for tRNAs as they leave the ribosome after their amino acid has been added to the polypeptide chain. The small ribosomal subunit contains a channel to hold the mRNA. In addition, there is a channel in the large subunit through which the nascent polypeptide is extruded from the ribosome (see Figure 9.2).

Among eukaryotes, mammalian ribosomes are the most fully characterized. The small 40S ribosomal subunit contains 34 proteins and a single 18S rRNA composed of 1874 nucleotides. The large mammalian ribosomal subunit has a Svedberg value of 60S and contains 49 proteins, along with three molecules of rRNA. The rRNA molecules have values of 5S (120 nucleotides), 5.8S (160 nucleotides), and 28S (4718 nucleotides). The intact mammalian ribosome has a Svedberg value of 80S. Like the bacterial ribosome, the intact mammalian ribosome possesses a P site, an A site, an E site, and a channel for polypeptide egress.

The ribosomes of archaeal species have not been studied nearly as fully as those of bacteria and eukaryotes, but some information is available. The structure of the ribosomes of archaeal species reveals strong similarity to bacterial ribosomes. The large subunit of archaea contains a 23S and a 5S rRNA and 27 proteins. Analysis of the small

[Figure 9.4 diagram: ribosomes with the E site, P site, and A site labeled.]
Figure 9.4 Ribosomes of bacteria, archaea, and eukaryotes. The ribosomes of E. coli and of archaeal species (such as Sulfolobus solfataricus) are similar in rRNA and protein content, whereas mammalian ribosomes are somewhat different.
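The compositions quoted above for E. coli and mammalian ribosomes can be tabulated to make the non-additivity of Svedberg values concrete. The following Python is a minimal sketch: the class and attribute names (Subunit, Ribosome, naive_subunit_sum, and so on) are hypothetical, and the numbers are simply those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subunit:
    svedberg: int            # sedimentation value of the isolated subunit (S)
    proteins: int            # number of ribosomal proteins
    rrna_nucleotides: dict   # rRNA name -> length in nucleotides


@dataclass(frozen=True)
class Ribosome:
    organism: str
    svedberg: int            # measured value of the intact ribosome (S)
    small: Subunit
    large: Subunit

    def total_proteins(self) -> int:
        return self.small.proteins + self.large.proteins

    def total_rrna_nucleotides(self) -> int:
        return (sum(self.small.rrna_nucleotides.values())
                + sum(self.large.rrna_nucleotides.values()))

    def naive_subunit_sum(self) -> int:
        """Sum of subunit S values; shown only to contrast with the measured value."""
        return self.small.svedberg + self.large.svedberg


E_COLI = Ribosome(
    organism="E. coli",
    svedberg=70,
    small=Subunit(30, 21, {"16S": 1541}),
    large=Subunit(50, 32, {"5S": 120, "23S": 2904}),
)

MAMMAL = Ribosome(
    organism="mammal",
    svedberg=80,
    small=Subunit(40, 34, {"18S": 1874}),
    large=Subunit(60, 49, {"5S": 120, "5.8S": 160, "28S": 4718}),
)

if __name__ == "__main__":
    for ribosome in (E_COLI, MAMMAL):
        print(ribosome.organism,
              ribosome.total_proteins(),
              ribosome.total_rrna_nucleotides(),
              ribosome.naive_subunit_sum(),
              ribosome.svedberg)
```

Protein counts and nucleotide lengths add across subunits, whereas the summed S values (80 and 100) overshoot the measured 70S and 80S, which is the point the text makes about sedimentation being a composite property.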
[Figure 9.5 diagram: panels (a) and (b) show the 50S and 30S subunits with rRNA, protein, mRNA, the E-site and A-site tRNAs, and the anticodon site labeled.]
Figure 9.5 Ribosome structure and tRNA-binding sites interpreted from cryo-EM–generated data.
9.2 Translation Occurs in Three Phases 321
subunit, (3) the large ribosomal subunit, (4) the initiator tRNA, and (5) three essential initiation factor proteins. The sixth component, GTP (guanosine triphosphate), provides energy for this and other steps of translation through the cleavage of individual phosphate groups.

For most of translation initiation in bacteria, the 30S ribosomal subunit is affiliated with an initiation factor (IF) protein called IF3, which facilitates binding between the mRNA and the 30S subunit. IF3 also prevents the 30S subunit from binding to the 50S subunit (Figure 9.6).

[Figure 9.6 diagram, three steps. Step 1 labels: IF3 on the small subunit (E, P, and A sites), Shine–Dalgarno sequence, start codon, and polypeptide-coding sequence, with the pairing
mRNA      5′ AGGAGGUUCAGGAUAUGCGU 3′
16S rRNA  3′ UCCUCC 5′
Step 2: Charged tRNAfMet, IF1, and IF2 join in the formation of the initiation complex; GTP provides energy. Step 3, ribosome assembly: The large subunit joins the initiation complex; IFs dissociate. The next charged tRNA enters the A site.]
Figure 9.6 Initiation of bacterial translation. The Shine–Dalgarno sequence orients the mRNA on the small subunit.
Q In a sentence or two describe the mechanism that places the start codon of a bacterial mRNA in position to begin translation.
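The pairing shown in Figure 9.6 between the mRNA and the 3′ end of 16S rRNA can be sketched as a simple sequence search. The Python below is an illustration under stated assumptions, not a description of how the ribosome or any published tool works: the function names, the six-nucleotide window, the three-to-nine-nucleotide spacing range, and the minimum of four matching positions are all choices made for this sketch.

```python
# 16S rRNA segment written 3'->5', as it appears in Figure 9.6.
ANTI_SD_3_TO_5 = "UCCUCC"
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pairing_partner(seq_3_to_5: str) -> str:
    """mRNA sequence (read 5'->3') that base-pairs with an RNA written 3'->5'."""
    return "".join(PAIR[base] for base in seq_3_to_5)


def find_shine_dalgarno(mrna: str, start_codon_index: int,
                        min_gap: int = 3, max_gap: int = 9,
                        min_matches: int = 4):
    """Look upstream of the start codon for the best match to the SD motif.

    Returns (motif_start_index, gap_to_start_codon, matching_positions),
    or None if no window reaches min_matches. The thresholds are assumptions.
    """
    motif = pairing_partner(ANTI_SD_3_TO_5)  # AGGAGG
    best = None
    for gap in range(min_gap, max_gap + 1):
        begin = start_codon_index - gap - len(motif)
        if begin < 0:
            continue
        window = mrna[begin:begin + len(motif)]
        matches = sum(a == b for a, b in zip(window, motif))
        if matches >= min_matches and (best is None or matches > best[2]):
            best = (begin, gap, matches)
    return best


if __name__ == "__main__":
    figure_mrna = "AGGAGGUUCAGGAUAUGCGU"  # mRNA segment from Figure 9.6
    start = figure_mrna.find("AUG")
    print(pairing_partner(ANTI_SD_3_TO_5))
    print(find_shine_dalgarno(figure_mrna, start))
```

Allowing fewer than six matches reflects the fact that the Shine–Dalgarno sequence is a consensus, so individual mRNAs need not carry a perfect AGGAGG.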
The small subunit–IF3 complex binds near the 5′ end of mRNA, searching for the AUG sequence that serves as the start codon. The preinitiation complex forms when the authentic start codon sequence is identified by base pairing that occurs between the 16S rRNA in the 30S ribosome and a short mRNA sequence located a few nucleotides upstream of the start codon in the 5′ UTR of mRNA (Figure 9.6, step 1). John Shine and Lynn Dalgarno identified the location and sequence of this region in 1974, and it is named the Shine–Dalgarno sequence in recognition of their work.

The Shine–Dalgarno sequence is a purine-rich sequence of about six nucleotides located three to nine nucleotides upstream of the start codon. A complementary pyrimidine-rich segment containing the sequence UCCUCC is found near the 3′ end of 16S rRNA, and it pairs with the Shine–Dalgarno sequence to position the mRNA on the 30S subunit (see Figure 9.6). The Shine–Dalgarno sequence is another example of a consensus sequence. As with the consensus sequences we describe for promoters (see Section 8.2), the precise nucleotide sequence and exact position of the Shine–Dalgarno sequence vary slightly from one mRNA to another (Figure 9.7).

In the next step of translation initiation (Figure 9.6, step 2), the initiator tRNA binds to the start codon at what will be part of the P site after ribosome assembly. The amino acid on the initiator tRNA is a modified methionine called N-formylmethionine (fMet); thus, the charged initiator tRNA is abbreviated tRNAfMet. This tRNA has a 3′-UAC-5′ anticodon sequence that is a complementary mate to the start codon sequence. An initiation factor (IF) protein designated IF2 and a molecule of GTP are bound at the P site to facilitate binding of tRNAfMet. Initiation factor 1 (IF1) also joins the complex to forestall attachment of the 50S subunit. At this point, the 30S initiation complex, consisting of mRNA bound to the 30S subunit, tRNAfMet located at the start codon, three initiation factors, and a molecule of GTP, has been formed.

In the final step of initiation (Figure 9.6, step 3), the 50S subunit joins the 30S subunit to form the intact ribosome. The energy for the union of the two subunits is derived from hydrolysis of GTP to GDP (guanosine diphosphate). The dissociation of IF1, IF2, and IF3 accompanies the joining of subunits that creates the 70S initiation complex. This complex is a fully active ribosome with a P site, an A site, an E site, and a channel for exit of the polypeptide. The first tRNA (tRNAfMet) is already paired with mRNA at the P site, and the open A site contains the second codon and is awaiting the next charged tRNA.

Eukaryotic Translation Initiation The eukaryotic 40S ribosomal subunit complexes with three eukaryotic initiation factor (eIF) proteins (eIF1, eIF1A, and eIF3) to form the preinitiation complex (Figure 9.8, step 1). In step 2, the preinitiation complex joins with the initiator tRNA and eIF5.

The initiation complex is formed by binding of the mRNA. This initiates the process called scanning (Figure 9.8, step 3), in which the small ribosomal subunit moves along the 5′ UTR in search of the start codon. About 90% of eukaryotic mRNAs use the first AUG encountered by the initiation complex as the start codon, but the remaining 10% use the second or, in some cases, the third AUG as the start codon. The initiation complex is able to accurately locate the authentic start codon because the codon is embedded in a consensus sequence that reads

5′-ACCAUGG-3′

(the start codon itself, AUG, is shown in bold). This consensus sequence is called the Kozak sequence after Marilyn Kozak, who discovered it in 1978.
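Scanning for a start codon in a Kozak context can be sketched as follows. This Python is a simplified illustration rather than a model of the initiation complex: the function names are hypothetical, only the four flanking positions of the consensus 5′-ACCAUGG-3′ are scored, the threshold of three matching flanks is an assumption, and the example mRNA is invented.

```python
# Flanking positions of the consensus 5'-ACCAUGG-3', as offsets from the A of AUG.
KOZAK_FLANKS = ((-3, "A"), (-2, "C"), (-1, "C"), (3, "G"))


def kozak_score(mrna: str, aug_index: int) -> int:
    """Count how many of the four flanking positions match the consensus."""
    score = 0
    for offset, base in KOZAK_FLANKS:
        pos = aug_index + offset
        if 0 <= pos < len(mrna) and mrna[pos] == base:
            score += 1
    return score


def scan_for_start(mrna: str, min_score: int = 3):
    """Scan 5'->3' and return the index of the first AUG in a good context.

    Returns None if no AUG reaches min_score (an assumed threshold).
    """
    mrna = mrna.upper()
    index = mrna.find("AUG")
    while index != -1:
        if kozak_score(mrna, index) >= min_score:
            return index
        index = mrna.find("AUG", index + 1)
    return None


if __name__ == "__main__":
    # Invented 5' region: the first AUG sits in a poor context,
    # the second lies within ACCAUGG.
    example = "GGCAUGCUUACCAUGGCU"
    print(scan_for_start(example))
```

In this sketch the first AUG is passed over because its flanking nucleotides do not match the consensus, which mirrors the minority of mRNAs described in the text that initiate at the second or third AUG.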
[Figure 9.7 sequence alignment; each mRNA segment is written 5′ to 3′]
mRNA                         Sequence (Shine–Dalgarno sequence, then start codon)
E. coli araB                 UUUGGAUGGAGUGAAACGAUGGCGAUUGCA
E. coli lacI                 CAAUUCAGGGUGGUGAAUAUGAAACCAGUA
E. coli lacZ                 UUCACACAGGAAACAGCUAUGACCAUGAUU
E. coli thrA                 GGUAACCAGGUAACAAGGAUGCGAGUGUUG
E. coli trpA                 AGCACGAGGGGAAAUCUGAUGGAACGCUAC
E. coli trpB                 AUAUGAAGGAAAGGAACAAUGACAACAUUA
λ phage cro                  AUGUACUAAGGAGGUUGUAUGGAACAACGC
R17 phage A protein          UCCUAGGAGGUUUGACCUAUGCGAGCUUUU
Qβ phage A replicase         UAACUAAGGAUGAAAUGCAUGUCUAAGACA
φX174 phage A protein        AAUCUUGGAGGCUUUUUUAUGGUUCGUUCU
E. coli RNA polymerase β     AGCGAGCUGAGGAACCCUAUGGUUUACUCC
Consensus sequence           AGGAGG
Figure 9.7 The Shine–Dalgarno consensus binding sequence. The AUG start codon (orange) is near the Shine–Dalgarno sequence (gold), which binds to the 3′ end of 16S rRNA.
Q Name two features of a Shine–Dalgarno sequence that are essential to its ability to function in translation initiation.

Locating the start codon leads to recruitment of the 60S subunit to the complex, using energy derived from GTP hydrolysis. This final step 4 in the formation of the 80S ribosome is accompanied by dissociation of the eIF proteins. In the 80S ribosome, the initiator tRNAMet is located at the P site; the A site is vacant, awaiting arrival of the second tRNA (Genetic Analysis 9.1).

Archaeal Translation Initiation and Its Implications for Evolution Archaeal ribosome subunits are composed of rRNAs that are more similar in size to those of bacteria than to those of eukaryotes. However, the ribosomal RNAs that make up the central structure of the subunits are distinct in each domain.

Despite the similarity in size of archaeal and bacterial ribosomes, the process of translation initiation in archaea is decidedly eukaryote-like. One example of this similarity is the archaeal use of methionine as the common first amino acid of polypeptide chains. This is like eukaryotes and unlike bacteria, which use N-formyl-methionine. A second aspect of archaeal translation initiation concerns the presence of
[Figure 9.8 diagram: the 40S subunit (E, P, and A sites) with eIF1, eIF1A, and eIF3 is joined by the initiator tRNAMet (anticodon UAC) and eIF5; the eIF4 complex is shown at the 5′ cap; the 40S subunit moves along the mRNA to the start codon within the Kozak sequence (mRNA 5′ ACCAUGG 3′); the large subunit attaches to form the 80S ribosome that begins translation.]
Figure 9.8 Initiation of eukaryotic translation. The Kozak sequence orients the mRNA on the small subunit and places the authentic start codon in position to begin translation.
Q In a sentence or two describe the mechanism that places the authentic start codon of a eukaryotic mRNA in position to begin translation.
Shine–Dalgarno sequences. These are relatively common in the mRNAs of some archaeal species but not in others.

More significantly, the homology seen between translation initiation factor proteins of archaea and eukaryotes is strong, whereas the homology between those of archaea and bacteria is less so (Table 9.3). Recall from our discussion in Section 1.4 that amino acid or nucleic acid sequences that are homologous have a common ancestral origin. Proteins that have greater degrees of homology have more recent common ancestral history than do proteins with lower levels of homology. Based on the protein homology information in Table 9.3, it appears that translation initiation in archaea is more complex than in bacteria and that known archaeal initiation factor proteins (aIFs) are homologous in structure and function to eIFs.

The archaea have multiple mechanisms of mRNA–ribosome interaction at translation initiation. This is most apparent at 5′ mRNA ends, many of which (some studies say more than 50% in certain archaeal species) appear not to have a 5′ UTR. Those mRNAs lacking a 5′ UTR are said to be “leaderless” mRNAs and are apparently missing all or most of the translation-initiating segments, including the Shine–Dalgarno sequence in some cases. The mechanism through which leaderless mRNA translation is initiated is not yet known. Archaeal species producing mRNAs with 5′ UTRs typically have Shine–Dalgarno sequences
324 CHAPTER 9 The Molecular Biology of Translation
Table 9.3
Function                              Bacteria                    Archaea    Eukaryotes
mRNA binding; start codon fidelity    IF3 (in some phyla only)    aIF1       eIF1
mRNA binding                          IF1                         aIF1a      eIF1A/eIF4
tRNA P-site binding                   IF2                         aIF2/5     eIF5
tRNA(Met) binding                     No homolog                  aIF3       eIF3
a The absence of a homologous protein is identified as “No homolog.”
b Archaeal proteins are identified by the letter a.
c Eukaryotic proteins are identified by the letter e.
to aid translation initiation. This finding does not suggest a specific translational mechanism for leaderless mRNA, but it has led to speculation that the leaderless mRNA state may be ancestral to the state featuring 5′ UTRs. In other words, it is possible that the last universal common ancestor (LUCA) of bacteria, archaea, and eukaryotes produced leaderless mRNAs and that the mRNAs with 5′ UTRs are a more recent development. In this context, archaeal translation may be something of a relic reminiscent of translation in the LUCA.
Polypeptide Elongation
Elongation, the second phase of translation, begins with the recruitment of elongation factor (EF) proteins into the initiation complex. Elongation factors facilitate three steps of polypeptide synthesis:
1. Recruitment of charged tRNAs to the A site
2. Formation of a peptide bond between sequential amino acids
3. Translocation of the ribosome in the 3′ direction along mRNA
GTP cleavage provides the energy for each step of elongation in bacteria, archaea, and eukaryotes. Moreover, the steps in the elongation process are the same in all three types of organisms: Although the elongation factors differ, the ribosomal P, A, and E sites of all three organisms serve nearly identical functions. The rates of elongation seem also to be similar; bacteria add about 20 new amino acids per second to a nascent polypeptide chain, and eukaryotes elongate the polypeptide at a rate of 15 amino acids per second. The elongation rate in archaea has not been established. Lastly, numerous studies indicate high fidelity of translation in all organisms. An error rate of approximately one amino acid in each 10,000 added to polypeptides is estimated for bacteria.
by GTP hydrolysis, the cleavage of one phosphate group from guanosine triphosphate (GTP). Hydrolysis releases energy and converts nucleotide triphosphates to nucleotide diphosphates (i.e., GTP → GDP). In step 1 of elongation as portrayed in the figure, a charged tRNA is bound by the elongation factor EF-Tu and GTP. In step 2, the tRNA enters the A site. If the tRNA has the correct anticodon sequence, it pairs with the mRNA codon. In step 3, hydrolysis of GTP releases EF-Tu–GDP from tRNA. In step 4, the enzyme peptidyl transferase catalyzes peptide bond formation between the amino acid at the P site and the newly recruited amino acid at the A site. This elongates the polypeptide and transfers the polypeptide to the tRNA at the A site. In step 5, the tRNA at the P site departs the ribosome through the E site. Elongation factor EF-G uses GTP hydrolysis to translocate the ribosome by moving it in the 3′ direction on mRNA. This translocation is exactly one codon in length, that is, three nucleotides. Translocation moves the tRNA formerly at the A site to the P site, and opens the A site for binding by a charged tRNA with the correct anticodon sequence. In step 6, the next charged tRNA is ready to enter the A site.
Elongation of Eukaryotic and Archaeal Polypeptides
Evolution has acted to strongly conserve the basic biochemistry of polypeptide elongation in all three domains of life. The elongation factors that carry out polypeptide elongation in eukaryotes and archaea are shown in Table 9.4. All organisms use two elongation factors to carry out polypeptide elongation, and the illustration of polypeptide elongation in Figure 9.9 is an equally accurate
Table 9.4 Translation Elongation Factor Homologs
[Figure 9.9 panels:
1 Elongation factor protein EF-Tu and GTP attach to a charged tRNA.
2 Many charged tRNAs enter the A site; only the one with the correct anticodon sequence pairs with the codon.
3 GTP hydrolysis: A charged tRNA fills the A site using energy obtained by hydrolyzing GTP to GDP. EF-Tu–GDP is released.
4 Peptide bond: Peptidyl transferase catalyzes the formation of a peptide bond between the amino acids in the P and A sites. The peptide chain moves to the A site.
5 Translocation: Elongation factor protein G (EF-G) translocates the ribosome; the uncharged tRNA is moved to the E site.
6 A site open for charged tRNA: The uncharged tRNA is released from the E site, and the open A site is ready to recruit the correct charged tRNA.]
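The bookkeeping of the elongation cycle can be followed with a toy model. The sketch below is an assumption-laden simplification, not a biochemical simulation: it ignores rejected tRNAs, proofreading, and stop codons, and it counts one GTP for EF-Tu delivery and one for EF-G translocation per cycle, as in the steps described for Figure 9.9. The function name and the example codons are hypothetical.

```python
def run_elongation(codons):
    """Step a toy ribosome along `codons` (a list of mRNA triplets).

    Returns (log, gtp_used). Each log entry records which codon sits in
    the E, P, and A sites after one translocation.
    """
    log = []
    gtp_used = 0
    p = 0  # the initiator tRNA starts at the first codon, in the P site
    while p + 1 < len(codons):
        a = p + 1
        gtp_used += 1  # steps 1-3: EF-Tu delivers a charged tRNA to the A site
        # step 4: peptidyl transferase joins the chain to the A-site amino acid
        gtp_used += 1  # step 5: EF-G translocates the ribosome one codon (3 nt)
        e, p = p, a    # old P-site tRNA moves to E; old A-site tRNA moves to P
        next_a = codons[p + 1] if p + 1 < len(codons) else None
        log.append({"E": codons[e], "P": codons[p], "A": next_a})
    return log, gtp_used


if __name__ == "__main__":
    # Hypothetical four-codon message.
    log, gtp = run_elongation(["AUG", "GCC", "UCU", "UGC"])
    for entry in log:
        print(entry)
    print("GTP hydrolyzed:", gtp)
```

Each pass through the loop corresponds to one amino acid added, so a message of n codons needs n - 1 cycles in this model.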
GENETIC ANALYSIS 9.1
PROBLEM In an investigation designed to identify the consensus sequence containing the AUG codon that initiates translation of eukaryotic mRNA, Marilyn Kozak (1986) compared the amounts of protein produced from 10 mutant mRNA molecules having different single-base substitutions flanking the AUG. Protein production was gauged by the optical density (OD) of protein bands in electrophoretic gels. Higher OD values indicated more protein produced. In the two tables shown here, AUG, the start codon, is highlighted (dark blue) and its adenine (A) is labeled the +1 nucleotide of the translated region. Kozak examined six single-base mutants at nucleotides -3 and +4 (light blue). These are identified by number (1 to 6) in Table A. She also examined four single-base mutants of positions -2 and -1 (light blue). These are numbered 7 to 10 in Table B. The OD for protein production by each mutant was measured and is given below the mutant in the table. Use the OD values to determine answers to the problem questions.
BREAK IT DOWN: The Kozak consensus sequence, 5′-ACCAUGG-3′, includes the AUG start codon sequence and several surrounding mRNA nucleotides and is critical to ribosome recognition of the authentic start codon (p. 322).
BREAK IT DOWN: Efficient translation of mRNA produces more protein and is indicated by higher OD values for mutants possessing that capability (p. 322).
Table A Six Position -3 and +4 Mutants
Table B Four Position -2 and -1 Mutants
a. Looking just at the nucleotides in positions -3 and +4 for the six mutants in Table A, decide which
nucleotides give the highest level of protein production.
b. Describe the impact of each nucleotide (A, T, C, and G) in the -3 position.
c. Looking just at nucleotides at positions -2 and -1 for the four mutants in Table B, decide which
nucleotides give the highest level of protein production.
d. Why did Kozak use only A in the -3 position to test the effects of nucleotides at positions -2 and -1?
e. Putting together data from both Table A and Table B, give the sequence of the mRNA region from
-3 to +4 that produces the highest level of translation.
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem involves examination and interpretation of the effects that sequence differences surrounding the mRNA start codon have on translation. The answer requires comparing the effects of base substitutions on translation and identifying the mRNA sequence corresponding to the highest translation level.
2. Identify the critical information given in the problem.
   Two tables provide mRNA sequence for different sequence variants. For each variant, an OD value describes the approximate level of protein produced by translation of the sequence. Higher OD values correspond to more protein production.
   TIP: Notice that AUG is the start codon sequence in all mutants tested. As a consequence, differences in OD result from differences among the surrounding nucleotides.
Deduce
3. Identify the constant and variable nucleotides displayed in Table A.
   In Table A, the nucleotide C is constant at positions -1 and -2, and the start codon nucleotides A, U, and G occupy positions +1, +2, and +3, respectively. Nucleotide variability is limited to positions -3 and +4.
4. Identify the constant and variable nucleotides shown in Table B.
   In Table B, only the nucleotides at the -1 and -2 positions vary; all other nucleotides are constant.
Solve
5. Specify the nucleotides in the -3 and +4 positions (Table A) that give the highest OD.
   Answer a: In Table A, the presence of A in position -3 and G in position +4 produces the highest OD value. At the +4 position, G produces two high OD values and two low ODs, and T produces one high and one low OD.
6. Assess how each nucleotide in the -3 position affects OD.
   Answer b: At position -3, A produces the highest and the third-highest OD values; G produces the second-highest and the lowest OD; T and C produce the same low OD value.
7. Evaluate how nucleotide differences at the -1 and -2 positions (Table B) affect OD.
   Answer c: In Table B, a C in position -2 and an A in position -1 produce the highest OD. Considering only the variable position -2, C produces higher OD values than does G.
8. Explain the decision to base Table B evaluations only on sequences with A in the -3 position.
   Answer d: Adenine is selected as the nucleotide in position -3 for Table B evaluations based on the high average OD value reported for this nucleotide in the -3 position in Table A in comparison with other nucleotides. The average OD for A in the -3 position in Table A is (5.0 + 2.6)/2 = 3.8 versus the next-highest average of (3.1 + 0.7)/2 = 1.9 for G in the -3 position.
   TIP: Compare OD values and nucleotide differences from both tables to determine the most efficient consensus sequence.
9. Identify the start codon consensus sequence that results in the highest level of translation.
   Answer e: Data from the two tables combined identify the sequence ACCAUGG (start codon AUG) as the most efficient consensus sequence for the start codon. For the nucleotide positions immediately surrounding the start codon, A is most efficient at -3, C is more efficient than G at -2, C is more efficient than A or G at -1, and G is more efficient than U at +4.
For more practice, see Problems 34, 35, and 36.
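The averaging in Answer d is easy to check. The short sketch below uses only the four OD values quoted in that answer (the full Table A is not reproduced here); the variable names are illustrative.

```python
from statistics import mean

# OD values quoted in Answer d for the two best nucleotides at position -3.
od_at_minus3 = {
    "A": [5.0, 2.6],
    "G": [3.1, 0.7],
}

# Mean OD per nucleotide, and the nucleotide with the highest mean.
averages = {base: mean(values) for base, values in od_at_minus3.items()}
best = max(averages, key=averages.get)

if __name__ == "__main__":
    for base, avg in averages.items():
        print(f"{base} at -3: mean OD = {avg:.1f}")
    print("highest mean OD:", best)
```

The means come out at 3.8 for A and 1.9 for G, the comparison used to justify holding A constant at position -3 in Table B.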
portrayal of the process in eukaryotes and archaea. Based on sequence comparisons, the archaeal and eukaryotic elongation factor homologs are more alike than are archaeal and bacterial EFs.
Translation Termination
The elongation cycle continues until one of the three stop codons, UAG, UGA, or UAA, enters the A site of the ribosome. There are no tRNAs with anticodons complementary to stop codons, so the entry of a stop codon into the A site is a translation-terminating event. All organisms use release factors (RF) to bind a stop codon in the A site (Figure 9.10, step 1). The catalytic activity of RFs releases the polypeptide bound to tRNA at the P site (step 2). Polypeptide release causes ejection of the RF from the A site and leads to the separation of the ribosomal subunits (step 3).
In bacteria, two release factors, RF1 and RF2, recognize stop codons. RF1 recognizes UAG and UAA, and RF2 recognizes UAA and UGA. A third bacterial release factor, RF3, is active in recycling RF1. Eukaryotic and archaeal translation are terminated by the action of a single release factor, identified as eRF1 in eukaryotes and aRF1 in archaea, that recognizes all three stop codons in organisms of both of these domains. Eukaryotes have a second RF that, like RF3 of bacteria, participates in recycling eRF1. The currently available information on sequence and function of RFs suggests that archaea and eukaryotes have RFs that are more like one another than either is like bacterial RFs (Table 9.5).
9.3 Translation Is Fast and Efficient
With mRNA transcripts of hundreds to thousands of genes in cells, translation is an active and ongoing process that must efficiently initiate, elongate, and terminate polypeptide synthesis. In recent decades, research has uncovered several aspects of the translation machinery that help explain the speed, accuracy, and efficiency of polypeptide production.
The Translational Complex
Cell biologists estimate that each bacterial cell contains about 20,000 ribosomes, collectively constituting nearly one-quarter of the mass of the cell. The number of ribosomes per eukaryotic cell is variable, but it too is in the tens of thousands. Given these numbers, it is not surprising that translation is almost never a matter of a solitary ribosome
translating a single mRNA. Rather, electron micrographs reveal structures called polyribosomes, busy translational complexes containing multiple ribosomes that are each actively translating the same mRNA (Figure 9.11). Each ribosome in the polyribosome structure independently synthesizes a polypeptide, markedly increasing the efficiency of utilization of an mRNA.
[Figure 9.10 step labels: 2 Polypeptide release (released polypeptide, uncharged tRNA); 3 Ribosome dissociation and mRNA release (eRF1, 40S subunit).]
Figure 9.10 Termination of translation by release factor (eRF) proteins in eukaryotes. A similar process terminates bacterial and archaeal translation.
Q In a sentence or two describe the mechanism that terminates translation in bacteria and eukaryotes.
[Figure 9.11 labels: (a) Transcription, DNA; (b) Translation, Polypeptide.]
Figure 9.11 Polyribosomes in bacteria. (a) Electron micrograph of polyribosomes shows that as mRNAs are being transcribed from DNA, multiple ribosomes are bound to each mRNA, to translate it and to produce polypeptides. (b) Artist rendition of … is shown on one mRNA. It begins at the bottom (the 5′ mRNA end) and progresses in the 3′ direction toward the top.
In bacteria, the absence of a nucleus and of pre-mRNA processing leads to the “coupling” of transcription and translation seen in Figure 9.11. This means that multiple ribosomes can be engaged in translation of the 5′ region of mRNAs whose 3′ end is still being synthesized by RNA polymerase. In Figure 9.11, transcription occurs along DNA in the left-hand to right-hand direction. Translation of the mRNA transcripts begins before transcription is complete and stops when the mRNA degrades. The average half-life of bacterial mRNA is a few minutes, but many polypeptides can be translated in that time span.
By contrast, transcription and translation in eukaryotes are uncoupled. Transcription takes place in the nucleus, where pre-mRNA is processed to form mature mRNA. Translation occurs in the cytoplasm after release of mature mRNA.
However, once in the cytoplasm, each individual eukaryotic mRNA is translated by multiple ribosomes simultaneously. The half-life of an average mature mRNA is several hours, and many polypeptides can be produced in that time span.
Translation of Polycistronic mRNA
Each polypeptide-producing gene in eukaryotes produces monocistronic mRNA, meaning mRNA that contains the transcript of a single gene. According to the scanning model described earlier for translation in eukaryotes, each eukaryotic mRNA contains a single authentic start codon and a nucleotide sequence that codes for only one kind of polypeptide chain. In contrast, groups of bacterial and archaeal genes often share a single promoter, and the resulting mRNA transcript contains the information for synthesizing several different polypeptides. These polycistronic mRNAs are produced as part of operon systems that regulate the transcription of sets of bacterial genes functioning in the same metabolic pathway (a form of regulation we discuss in Section 12.2). The term “cistron” is equivalent to “gene”; thus, a polycistronic mRNA contains the transcripts of two or more genes.
To repeat, polycistronic mRNAs consist of multiple polypeptide-producing segments, so when a polycistronic mRNA is translated, two or more polypeptides are produced. Each of the polypeptides encoded by a polycistronic mRNA has its own start codon and stop codon. In the case of bacteria, and in all but the leaderless mRNAs in archaea, most, but not all, translation-initiating regions contain a Shine–Dalgarno sequence. Intercistronic spacer sequences separate the cistrons of polycistronic mRNA, and they are not translated (Figure 9.12).
Bacterial intercistronic spacers are variable in length: Some are just a few nucleotides long, although most are 30 to 40 nucleotides long. If the intercistronic spacer is a few nucleotides in length, it is short enough to be spanned by a ribosome. In such systems, the ribosome remains intact after completing synthesis of one polypeptide, and it goes on to translate the other genes encoded in the polycistronic mRNA. On the other hand, when the intercistronic spacer is longer, the initial ribosome dissociates and new translation initiation must occur to translate the next polypeptide encoded by the polycistronic mRNA.
9.4 The Genetic Code Translates Messenger RNA into Polypeptide
In chemical terms, nucleic acids and amino acids are very different compounds, and there is no direct mechanism by which mRNA could synthesize a polypeptide. Nevertheless, the nucleotide sequences of mRNA do provide a means by which the amino acid sequences of polypeptides can be specified. This vehicle is the “genetic code,” the name used to describe the correspondence between nucleotide triplets in mRNA and individual amino acids.
The conversion of an mRNA sequence into a polypeptide depends on interactions between mRNA and the transfer RNAs (tRNAs) that carry amino acids to the ribosome. At ribosomes, complementary base pairing binds consecutive sets of three mRNA nucleotides (the codons) to the three nucleotide bases of the correct tRNA anticodons. Once the correct tRNA is bound by a codon, it transfers its amino acid to the end of a growing polypeptide chain. Transfer RNA molecules facilitate the translation of genetic information from one chemical language (nucleic acid) to another (amino acid). That is, tRNA is an adaptor molecule that interprets and then acts on the information carried in mRNA.
Our review of translation and the genetic code in Section 1.3 depicts a triplet genetic code containing 64 different codons, more than enough to encode the 20 common amino acids used to construct polypeptides (Figure 9.13; see also the genetic code inside the front cover). The greater number of codons than amino acids leads to redundancy in the genetic code, as evidenced by the observation that single amino acids are specified by from one to as many as six different codons. Codons that specify the same amino acid are called synonymous codons.
To an extent, this redundancy has a specific pattern. Notice, for example, that the two synonymous codons for histidine (His) and the two synonymous codons for glutamine (Gln) all share the same first two bases in the same order: C and A. What distinguishes one codon pair from the other is that both His codons have a pyrimidine at the third position, whereas the two Gln codons have a purine in the third position. As you look at Figure 9.13, you will
Figure 9.12 Polycistronic mRNA. A polycistronic mRNA is a transcript of multiple genes. A separate polypeptide is produced from each gene.
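The layout in Figure 9.12, with several start-to-stop segments separated by untranslated spacers, can be illustrated with a short scan for cistrons. This is a sketch under simplifying assumptions: it treats every AUG found after the previous stop codon as a start codon and ignores Shine–Dalgarno sequences and spacer length. The function name and the example mRNA are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_cistrons(mrna):
    """Return (start, end) index pairs, one per AUG...stop segment.

    Scans 5' to 3'. After each stop codon the search resumes downstream,
    so the untranslated spacers between cistrons are skipped.
    """
    cistrons = []
    pos = 0
    while True:
        start = mrna.find("AUG", pos)
        if start == -1:
            break
        end = None
        # Read triplets in the frame set by this AUG until a stop codon.
        for i in range(start + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        if end is None:
            break  # no in-frame stop codon: incomplete cistron
        cistrons.append((start, end))
        pos = end
    return cistrons


if __name__ == "__main__":
    # Hypothetical transcript: leader, cistron 1, 4-nt spacer, cistron 2, trailer.
    mrna = "GG" + "AUGGCCUAA" + "CCCC" + "AUGAAAUGA" + "G"
    for start, end in find_cistrons(mrna):
        print(start, end, mrna[start:end])
```

Each reported segment has its own start codon and stop codon, which is the defining feature of a polycistronic message described above.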
[Figure 9.13: genetic code chart showing the codons, the three-letter and one-letter abbreviations of the amino acids they specify, and the stop codons.]
led many researchers to conclude that the genetic code was most likely triplet. This simple solution to the question of how amino acid sequences could be coded by nucleic acid sequences posited that a doublet genetic code (two nucleotides per codon) could produce just 16 (4²) combinations of codons, which is not enough different combinations to specify 20 amino acids. On the other hand, a quadruplet genetic code would generate 4⁴, or 256, different combinations of codons, far too many for the needs of genomes. In contrast, a triplet genetic code, yielding 4³, or 64, different codons, provides enough variety to encode 20 amino acids with some, but not excessive, redundancy. Among the 64 codons, 61 specify amino acids, and the remaining 3 are the stop codons that terminate translation. Only two amino acids, methionine (Met), with the codon AUG, and tryptophan (Trp), with the codon UGG, are encoded by single codons. The other 18 amino acids are specified by two to six codons.
Each transfer RNA molecule carries a particular amino acid to the ribosome, where complementary base pairing between each mRNA codon sequence and the corresponding anticodon sequence of a correct tRNA takes place. This complementary base pairing requires antiparallel alignment of the mRNA and tRNA strands. Recall that Figure 8.28 illustrates a two-dimensional and a three-dimensional view of a tRNA molecule. The tRNA in Figure 8.28 has the anticodon sequence 3′-CGC-5′. This corresponds to the mRNA codon sequence 5′-GCG-3′, which specifies alanine (Ala). To visualize the codon–anticodon base-pairing
[Figure 9.14 labels: amino acids Asp and Asp; anticodons 3′-CUG-5′ and 3′-CUA-5′; mRNA codons 5′-GAC-3′ and 5′-GAU-3′.]
Figure 9.14 Complementary base pairing of codons and anticodons. Isoaccepting aspartic acid (Asp) tRNAs illustrate complementary antiparallel base pairing of codon and anticodon sequences.
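The antiparallel pairing shown in Figure 9.14 can be expressed as a small helper. The sketch below assumes strict Watson–Crick pairing at all three positions (no wobble); the function names are illustrative.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # Watson-Crick RNA base pairs


def anticodon_3to5(codon_5to3):
    """Return the fully complementary anticodon, written 3'->5'.

    Because the strands are antiparallel, the anticodon read 3'->5'
    lines up position by position with the codon read 5'->3'.
    """
    return "".join(PAIR[base] for base in codon_5to3)


def anticodon_5to3(codon_5to3):
    """Return the same anticodon rewritten in the 5'->3' direction."""
    return anticodon_3to5(codon_5to3)[::-1]


if __name__ == "__main__":
    for codon in ("GCG", "GAC", "GAU"):
        print(f"codon 5'-{codon}-3'  pairs with anticodon 3'-{anticodon_3to5(codon)}-5'")
```

For the alanine codon 5′-GCG-3′ this gives the anticodon 3′-CGC-5′ quoted for the tRNA in Figure 8.28, and for the two Asp codons it gives the two anticodons in Figure 9.14.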
Francis Crick devised the wobble hypothesis in 1966, proposing the possibility of nonstandard base pairing between the third-position nucleotides of the codon and anticodon. For example, Figure 9.15 shows third-base wobble for two pairs of the six codons of serine (Ser) and the three codons of isoleucine (Ile). Stated differently, two tRNAs with distinct anticodon sequences are enough to recognize the four Ser codons; and a single tRNA recognizes all three Ile codons.
Third-base wobble occurs through flexible base pairing between the wobble nucleotide (that is, the 3′ nucleotide of a codon) and the 5′ nucleotide of an anticodon. At this position, base pairing between the nucleotides of the codon and anticodon need not be complementary. They must, however, be a purine and a pyrimidine (with one exception explained momentarily). Third-base wobble pairings are summarized in Table 9.6. The nucleotides at the wobble position in different anticodons include all the RNA nucleotides and also the modified nucleotide inosine (I). Inosine is structurally similar to G but lacks the amino group attached to guanine’s 2 carbon. As a result, inosine base-pairs with either purines or pyrimidines.
The patterns of third-base wobble are tied directly to the patterns of genetic code redundancy. Specifically, synonymous codons that share the first two nucleotides of the codons and differ only by having alternative purines or pyrimidines in the third position are subject to third-base wobble. Different organisms take greater or lesser advantage of wobble and have evolved different numbers of different tRNA genes. Theoretical calculations find that a minimum of 31 tRNA anticodon sequences are required to recognize the 61 mRNA codon sequences, but as far as is known, all organisms encode more than the minimum required number of tRNAs.
The (Almost) Universal Genetic Code
In astonishing testimony to the conclusion that life on Earth had a single origin, and to the power of natural selection to, in this case, maintain virtually complete uniformity over hundreds of millions of years, every living organism uses the same genetic code to synthesize polypeptides. In all living things, from bacteria to humans, the hereditary script carried by a given sequence of mRNA is translated by a similar mechanism and produces the same polypeptide. The universality of the genetic code has led to technologies in which bacterial systems are used to express biologically important plant or animal protein products.
As with most general rules, however, there are a few exceptions to the universality of the genetic code; thus, biologists characterize the genetic code as almost universal. The 10 known exceptions to the universal genetic code are summarized in Table 9.7. Most are found in mitochondria, but three exceptions occur in the translation of genetic information encoded in nuclear DNA.
Familiarize yourself with Figure 9.13 and the genetic code information inside the front cover by using them to decipher the mutations shown in Genetic Analysis 9.2.
[Figure 9.15 labels (wobble position marked for each tRNA):
tRNA Ser1: anticodon AGG; mRNA codons UCC, UCU
tRNA Ser2: anticodon AGU; mRNA codons UCA, UCG
tRNA Ile: anticodon UAI; mRNA codons AUC, AUA, AUU]
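The three tRNAs in Figure 9.15 can be used to check the wobble idea in code. The pairing rules in the dictionary below are the commonly stated wobble rules and are an assumption here, since Table 9.6 is not reproduced in this section. The sketch also assumes the figure's anticodons are written 3′ to 5′, so that the last letter is the wobble (5′) nucleotide. Names are illustrative.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # standard pairs, positions 1 and 2

# Assumed wobble rules: anticodon 5' nucleotide -> codon 3' nucleotides it can read.
WOBBLE = {
    "C": {"G"},
    "A": {"U"},
    "U": {"A", "G"},
    "G": {"C", "U"},
    "I": {"U", "C", "A"},  # inosine; I-A is the purine-purine exception
}


def codons_read(anticodon_3to5):
    """List the codons (5'->3') that one anticodon (written 3'->5') can read."""
    first, second, wobble = anticodon_3to5
    prefix = PAIR[first] + PAIR[second]
    return sorted(prefix + third for third in WOBBLE[wobble])


if __name__ == "__main__":
    for name, anticodon in (("Ser1", "AGG"), ("Ser2", "AGU"), ("Ile", "UAI")):
        print(name, anticodon, codons_read(anticodon))
```

Under these assumptions the two Ser anticodons cover four Ser codons between them, and the single inosine-containing anticodon covers all three Ile codons, matching the figure.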
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns the identification of DNA coding and template strands; the transcription of DNA to mRNA and translation of mRNA into a polypeptide; and an evaluation of a mutation of the DNA sequence. The answer requires identification of the DNA strands, identification of start and stop codons, and determination of the amino acid sequence of wild-type and mutant polypeptides.
2. Identify the critical information given in the problem.
   DNA sequence that includes a start (AUG) codon and a stop codon is given.
Deduce
3. Identify the start codon by inspecting both DNA strands for 5′-ATG-3′ sequences that potentially encode start (AUG) codons.
   Scanning both DNA strands in their 3′-to-5′ direction identifies a single 5′-ATG-3′ sequence. The sequence is on the lower strand in the diagram beginning with the seventh nucleotide from the right.
   TIP: The AUG start codon, the most common codon for translation initiation, corresponds to the DNA triplet 5′-ATG-3′ on the coding strand.
4. Survey the putative coding strand identified in the previous step and determine if DNA triplets 5′-TAG-3′, 5′-TGA-3′, and 5′-TAA-3′ corresponding to possible stop codons occur as the seventh codon of an mRNA sequence.
   Since just one DNA triplet encoding a start codon is present, a scan of the strand at the correct distance from the start codon finds a 5′-TAG-3′ triplet sequence encoding a UAG stop codon:
   3′-GGGTCG GAT CGGAAACGTTCTCCG GTA TAGCTC-5′
   TIP: The stop codons UAG, UGA, and UAA correspond to DNA triplets on the coding strand.
   TIP: Substituting U for T on the coding strand produces mRNA sequence. Alternatively, arranging RNA nucleotides complementary to the template strand and assigning antiparallel polarity produces mRNA.
Solve
5. Identify the mRNA sequence encoding the six amino acids of the polypeptide.
   Answer a: The mRNA sequence is
   5′-AUG GCC UCU UGC AAA GGC UAG-3′
   TIP: The mRNA sequence can be determined from either the coding strand or the template strand of DNA.
6. List the amino acid sequence of the polypeptide.
   Answer b: The polypeptide sequence is
   N-Met-Ala-Ser-Cys-Lys-Gly-C
7. Identify the effect of the G → A base substitution on the polypeptide.
   Answer c: Substituting the first transcribed G → A on the template strand alters the second codon of mRNA by changing GCC → GUC and substitutes valine (Val) for alanine (Ala) in the second position of the polypeptide sequence.
For more practice, see Problems 7, 12, and 30.
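The answers above can be reproduced with a few lines of Python. This sketch builds the standard genetic code from a compact string (bases ordered U, C, A, G at each position) and assumes translation starts at the first AUG and stops at the first in-frame stop codon. The function names are illustrative, and the DNA string is the strand printed in step 4.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val",
}


def transcribe(coding_strand_5to3):
    """mRNA has the coding-strand sequence with U in place of T."""
    return coding_strand_5to3.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(THREE_LETTER[aa])
    return peptide


if __name__ == "__main__":
    # Strand from step 4, printed 3'->5'; reverse it to read 5'->3'.
    shown_3to5 = "GGGTCG GAT CGGAAACGTTCTCCG GTA TAGCTC".replace(" ", "")
    mrna = transcribe(shown_3to5[::-1])
    print("N-" + "-".join(translate(mrna)) + "-C")

    # Answer c: second codon GCC -> GUC.
    second = mrna.find("AUG") + 3
    mutant = mrna[:second + 1] + "U" + mrna[second + 2:]
    print("N-" + "-".join(translate(mutant)) + "-C")
```

The first line printed corresponds to Answer b, and the second shows valine replacing alanine at position 2, as in Answer c.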
[Figure 9.17 labels: cleavage of pro–amino acids; insulin, with chain A and chain B joined by disulfide (S–S) bonds.]
Figure 9.17 Examples of posttranslational processing.
connecting segment, called the pro–amino acid segment, that separates the A-chain segment and the B-chain segment, the two functional pieces of the polypeptide. In posttranslational processing of preproinsulin, the pre–amino acids of the signal sequence are removed, after the polypeptide is transported through the cell membrane, to form proinsulin. Next, three disulfide bonds form within and between the A-chain and B-chain segments, followed by polypeptide cleavage that removes the pro–amino acid segment. What results is a functional insulin molecule consisting of 21 amino acids in the A-chain segment and 30 amino acids in the B-chain segment.
9.5 Experiments Deciphered the Genetic Code
A remarkable set of experiments performed over less than 4 years in the early 1960s deciphered the genetic code and opened the way for biologists to understand the molecular processes that convert a messenger RNA nucleotide sequence into a polypeptide. At the time, biologists knew what the hereditary material was (DNA), and they knew what molecule conveyed the genetic message to ribosomes for translation (mRNA), but they did not know how the protein-coding information carried by messenger RNA was deciphered during the assembly of polypeptides. Several questions had to be answered about the structural organization of the genetic code before the code itself could be deciphered. The three most important questions, listed here, are examined in the sections below:
1. Do neighboring codons overlap one another, or is each codon a separate sequence?
2. How many nucleotides make up a messenger RNA codon?
3. Is the polypeptide-coding information of messenger RNA continuous, or is coding information interrupted by gaps?
9.5 Experiments Deciphered the Genetic Code 335
Golgi apparatus
Figure 9.18 Translation at endoplasmic reticulum–bound ribosomes and the signal hypothesis.
Translated polypeptides enter the cisternal space through ER receptors to which ribosomes are attached. The
cleavage of signal sequences facilitates packaging and transmembrane transport of polypeptides in vesicles.
Conclusive evidence of a nonoverlapping genetic code came from a 1960 study of single-nucleotide substitutions induced by the mutation-producing compound nitrous acid. Heinz Fraenkel-Conrat and his colleagues studied the effect of nitrous acid on the coat protein of tobacco mosaic virus (TMV). Nitrous acid causes mutations by inducing single-base substitutions in the RNA genome of TMV that lead to mutant mRNA molecules with one nucleotide base change compared with wild-type mRNA. A single base change in mRNA would alter three consecutive codons if the genetic code were overlapping, but just a single codon if the genetic code were nonoverlapping (Figure 9.19). Fraenkel-Conrat’s mutation analysis revealed that only single amino acid changes occurred as a result of mutation by nitrous acid. This result is consistent with that predicted for a nonoverlapping genetic code, and it is inconsistent with the prediction for an overlapping genetic code.

A Triplet Genetic Code

Proof of a triplet genetic code came in 1961 when Francis Crick, Leslie Barnett, Sidney Brenner, and R. J. Watts-Tobin used the compound proflavin to create mutations in a gene called rII in T4 bacteriophage. Proflavin causes mutations by inserting or deleting single base pairs in DNA. Such deletions, for example, lead to the absence of single nucleotides from mRNA, thus changing the reading frame of the mRNA. Reading frame refers to the specific codon sequence determined by the point at which the grouping of nucleotides into triplets begins. The addition or deletion of nucleotides changes the reading frame and produces a mutation called a frameshift mutation.

The following analogy illustrates the impact of frameshift mutations. Single-letter additions or deletions garble the translated message by changing the reading frame:

wild-type: YOUMAYNOWSIPTHETEA (“you may now sip the tea”)
mutant (addition): YOUMA C YNOWSIPTHETEA (“you mac yno wsi pth ete a”)
mutant (deletion): YOUMAYNO || SIPTHETEA (“you may nos ipt het ea”)

Frameshift mutations can be reverted (i.e., the correct reading frame can be restored) by a second mutation in a different location within the same gene. This second mutation, a type of reversion mutation, counteracts (“reverses”) the reading frame disruption by inserting a nucleotide, if the initial mutation was a deletion, or by deleting a nucleotide, if the initial mutation was an insertion. For example, here is how the two frameshift mutations shown above might be reverted:

mutant (addition): YOUMA C YNOWSIPTHETEA (“you mac yno wsi pth ete a”)
reversion mutant (deletion): YOUMA C YNO || SIPTHETEA (“you mac yno sip the tea”)
mutant (deletion): YOUMAYNO || SIPTHETEA (“you may nos ipt het ea”)
reversion mutant (addition): YOUMAYNO || SIP R THETEA (“you may nos ipr the tea”)

Crick and his colleagues analyzed numerous bacteriophage
proflavin-induced rII-gene mutants, designating each addition mutant as a (+) and each deletion mutant as a (−). They guessed that the first rII-gene mutant they examined, a mutation designated FC 0, resulted from insertion (“FC” stands for Francis Crick). Designating FC 0 as a (+) mutation turned out to be a correct guess. Based on their assumptions that (1) the genetic code is a nonoverlapping triplet and (2) FC 0 is an insertion, or (+), mutation, the data reported by Crick and colleagues supported the notion of a triplet genetic code by showing that the presence of one or two (+) or one or two (−) mutations disrupts the reading frame but that the reading frame is restored by the presence of three (+) mutations or three (−) mutations.

(a) An overlapping genetic code would change three consecutive codons with each base mutation.

           Wild-type sequence   Mutant sequence
           ACUCAGAUA            ACUCGGAUA
Codon 1    ACU                  ACU
Codon 2    CUC                  CUC
Codon 3    UCA                  UCG
Codon 4    CAG                  CGG
Codon 5    AGA                  GGA
Codon 6    GAU                  GAU
Codon 7    AUA                  AUA
Codon 8    UA…                  UA…

(b) A nonoverlapping genetic code would change one codon with each base mutation.

           Wild-type sequence   Mutant sequence
           ACUCAGAUA            ACUCGGAUA
Codon 1    ACU                  ACU
Codon 2    CAG                  CGG
Codon 3    AUA                  AUA

Figure 9.19 Predictions for the results of mutation of an overlapping and a nonoverlapping genetic code. (a) Wild-type and mutant DNA sequences for an overlapping genetic code. A base-pair substitution mutation is predicted to change three consecutive codons, and therefore three consecutive amino acids. (b) Wild-type and mutant DNA for a nonoverlapping genetic code. A base-pair substitution mutation is predicted to change only one amino acid.

No Gaps in the Genetic Code

In their 1961 research, Crick and colleagues also suggested that the genetic code is read as a continuous string of mRNA nucleotides uninterrupted by any kind of gap, space, or pause. If a gap or spacer were present between mRNA codons, the mRNA transcript might be represented as follows (x indicates the gap between codons):

YOUxMAYxNOWxSIPxTHExTEAx (“you may now sip the tea”)

If the genetic code were structured in some such way, with each codon set off from its neighbors, insertion or deletion of a nucleotide would not cause the kind of frameshift mutation
that Crick and colleagues had observed. Instead, insertion or deletion of nucleotides could be expected to alter the affected codon but not the identity of adjoining codons. For example, consider the following insertion mutation, where the separation between codons confines the alteration to a single word:

YOUxMA T YxNOWxSIPxTHExTEAx (“you ma t y now sip the tea”)

Deciphering the Genetic Code

Once it had been established that the genetic code consists of triplets, researchers sprang to the task of establishing which triplets are associated with each amino acid in the process of translation. Marshall Nirenberg and Johann Heinrich Matthaei performed a simple experiment in 1961 that laid the groundwork for later experiments in deciphering the genetic code. Their experimental design was straightforward: Construct synthetic strings of repeating nucleotides, and use an in vitro translation system to translate the sequence into a polypeptide. For example, Nirenberg and Matthaei synthesized an artificial mRNA containing only uracils, known as a poly(U).

They devised an in vitro translation system composed of the known cellular components of bacterial translation: ribosomes, charged transfer RNA molecules, and essential translational proteins. Regardless of where translation might begin along the poly(U) mRNA, the only possible codon it contained was UUU. The researchers were therefore hoping to determine which amino acid corresponds to the UUU codon.

Twenty separate in vitro translations of poly(U) mRNA were carried out, each time using a pool of 19 unlabeled amino acids and one amino acid labeled with radioactive carbon (14C). To determine which amino acid is encoded by poly(U) mRNA, Nirenberg and Matthaei used a different radioactive amino acid in each translation. They detected production of a highly radioactive polypeptide after conducting translation in a system containing radioactively labeled phenylalanine (Figure 9.20). The radioactive polypeptide was poly-phenylalanine (poly-Phe). Since the only possible triplet codon in the mRNA was UUU, Nirenberg and Matthaei reasoned that 5′-UUU-3′ codes for phenylalanine. They went on to construct poly(A), poly(C), and poly(G) synthetic mRNAs and identified 5′-AAA-3′ as a codon for lysine (Lys), 5′-CCC-3′ as a proline (Pro) codon, and 5′-GGG-3′ as a codon for glycine (Gly) (Table 9.8).

[Figure 9.20 diagram: synthetic poly(U) mRNA, 5′-UUUUUUUUUUUUUUUUUUUUU-3′, is added to in vitro translation systems, each containing a different 14C-labeled amino acid. The product, N-Phe-Phe-Phe-Phe-Phe-Phe-Phe-Phe-C, is formed, and the resulting polypeptides are tested for radioactivity.]

Figure 9.20 Use of synthetic mRNAs to determine genetic code possibilities. Synthetic poly(U) mRNA, forming only UUU codons, is translated in vitro in a series of experiments, each using a different 14C-labeled amino acid (in this example, phenylalanine). A polypeptide consisting of phenylalanine (Phe) is formed.
Mutant 1: N...Asn-Cys-Leu-Thr-His-Thr-C
Mutant 2: N...Asn-Cys-Leu-Thr-His-Thr-Tyr-His-Lys-C
Mutant 3: N...Asn-Cys-Leu-Thr-His-Thr-Tyr-His-Tyr-Ser-Ser-Leu-Ala-Val-C
Identify the mutational events that produce each of the mutant proteins.
BREAK IT DOWN: Mutations occur at the level of
DNA. Comparison of each mutant DNA and amino
acid sequences with the wild-type sequence will reveal
how the DNA sequence is changed (p. 241).
Solve
5. Identify the mutation and its consequence for translation in Mutant 1.
   Two different base substitutions altering the tyrosine (Tyr) codon UAU to a stop codon could cause Mutant 1. The wild-type UAU codon was most likely altered by base substitution to form either a UAA or a UAG stop codon.
6. Identify the mutation and its consequence in Mutant 2.
   Lysine (Lys), which was added to the mutant polypeptide, is encoded by AAA or AAG. Deletion of the U from the wild-type stop codon would produce an AAG codon followed by UAG, a stop codon.
7. Identify the mutation and its consequence in Mutant 3.
   Tyrosine, specified by codons UAU and UAC, is found in place of the normal stop codon. This is followed by a serine codon (UCN or AGU/C), rather than the GUA (Val) that follows the “in-frame” stop codon in the wild type. A base-pair insertion that adds a U or a C into the third position of the normal UAA stop codon forms a UAU or a UAC tyrosine (Tyr) codon. The altered reading frame from that point would then read AGU (Ser), followed by AGC (Ser), CUA (Leu), GCA (Ala), GUC (Val), and UGA (stop).
   TIP: Examine the wild-type nucleotide sequence at the place where mutation is expected to have occurred, and identify ways in which base substitution, insertion, or deletion could have had the observed effect on the amino acid sequence.
For more practice, see Problems 5, 11, 16, and 32. Visit the Study Area to access study tools. Mastering Genetics
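The three solutions above can be checked with a short Python sketch. It rests on stated assumptions: the wild-type sequence from the UAA stop codon onward (UAAGUAGCCUAGCAGUCUGA) is inferred from the codons named in steps 6 and 7, while the codons chosen for Asn, Cys, Leu, Thr, and His upstream are hypothetical picks among synonymous codons (only UAU for Tyr is stated in step 5). The names `CODON_TABLE` and `translate` are illustrative, and the table is the standard genetic code in one-letter abbreviations.

```python
from itertools import product

# Standard genetic code, bases ordered U, C, A, G at each position; "*" = stop.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first nucleotide until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical codon choices for Asn-Cys-Leu-Thr-His-Thr-Tyr-His (N C L T H T Y H).
upstream = "AAUUGUCUUACUCAUACUUAUCAU"
# Inferred from the solution steps: stop codon UAA followed by GUA GCC UAG ...
downstream = "UAAGUAGCCUAGCAGUCUGA"
wild_type = upstream + downstream

# Mutant 1: substitution U -> A in the third base of the Tyr codon (UAU -> UAA).
mutant1 = wild_type[:20] + "A" + wild_type[21:]
# Mutant 2: deletion of the U of the stop codon (UAA GUA G -> AAG UAG).
mutant2 = wild_type[:24] + wild_type[25:]
# Mutant 3: insertion of U at the third position of the stop codon (UAA -> UAU A).
mutant3 = wild_type[:26] + "U" + wild_type[26:]

for name, seq in [("wild type", wild_type), ("Mutant 1", mutant1),
                  ("Mutant 2", mutant2), ("Mutant 3", mutant3)]:
    print(f"{name:>9}: {translate(seq)}")
```

With these assumptions the printed polypeptides end in Thr for Mutant 1, in Lys for Mutant 2, and in Tyr-Ser-Ser-Leu-Ala-Val for Mutant 3, in agreement with the mutant sequences given in the problem.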
340 CHAPTER 9 The Molecular Biology of Translation
CASE STUDY
Antibiotics and Translation Interference

We have all taken antibiotics at various times during our lives to counteract a painful or persistent microbial infection. As a result of the efficiency of these compounds, we have experienced rapid relief of symptoms and elimination of the infection. These beneficial effects are accomplished by selective cell death or through blocking cell proliferation. Specifically, the antibiotic either kills microorganisms without harming our own cells in the process or it acts to prevent further microbial cell growth. What is the biochemical basis of antibiotic action? How do antibiotic compounds specifically target microbial cells for destruction?

PROTEIN SYNTHESIS INHIBITION BY ANTIBIOTIC COMPOUNDS You will probably not be surprised to learn that different antibiotics target different aspects of microbe biology. But you may be surprised to learn that many different antibiotics target microbial translation as their mode of action (Table 9.9). Familiar antibiotics such as tetracycline, streptomycin, and chloramphenicol target different stages of microbial translation, as do less familiar antibiotics such as erythromycin, puromycin, and cycloheximide. Each antibiotic contains a different active compound that takes advantage of unique features of bacterial translation to disrupt the production of bacterial proteins while not interfering with the translation of proteins in our cells.

TRANSLATION DISRUPTION BY AMINOGLYCOSIDES Streptomycin is one of several antibiotics in a class of biochemical compounds called aminoglycosides. Streptomycin inhibits bacterial translation by interfering with binding of N-formylmethionine tRNA to the ribosome, thus preventing the initiation of translation. Streptomycin can also cause misreading of mRNA during translation by generating mispairing between codons and anticodons. For example, the codon UUU normally specifies phenylalanine, but streptomycin induces pairing between a UUU codon and the tRNA carrying isoleucine, whose codon is AUU. This error leads to amino acid changes in proteins and potentially to defective protein activity. Other aminoglycosides, such as neomycin, kanamycin, and gentamicin, also cause mispairing between codons and anticodons and can generate defective proteins. Erythromycin also impairs bacterial translation, but it does so in a very different way. It binds to the 50S (large) subunit in the tunnel from which the newly synthesized polypeptide emerges. The effect of its binding is to block the polypeptide from passing out of the ribosome. This causes the ribosome to stall on mRNA, bringing translation to a halt. Table 9.9 provides details about these and other actions of antibacterial agents.

TRANSLATION BLOCKAGE BY ANTIFUNGAL COMPOUNDS Single-celled eukaryotic microorganisms, such as fungi, can also cause human infections. To fight these infections, antibiotics such as puromycin and cycloheximide, which target translational activities of fungal cells, are used. Puromycin has a three-dimensional structure similar to that of the 3′ end of a charged tRNA. It stops translation of bacterial and eukaryotic mRNAs by binding at the ribosomal A site and acting as an analog of charged tRNA. When puromycin is bound at the A site, its amino group forms a peptide bond with the carboxyl group of the P-site amino acid. However, puromycin does not contain a carboxyl group. This difference prevents formation of any additional peptide bonds and puts an end to translation. Cycloheximide exclusively blocks fungal translation by binding to the 60S subunit and inhibiting peptidyl transferase activity, much like chloramphenicol does to bacterial peptidyl transferase (see Table 9.9).

Table 9.9 Antibiotic Inhibitors of Protein Synthesis

Antibiotic         Inhibitory Action
Chloramphenicol    Blocks polypeptide formation by inhibiting peptidyl transferase in the 70S ribosome (antibacterial action)
Erythromycin       Blocks translation by binding to 50S subunit and inhibiting polypeptide release (antibacterial action)
Streptomycin       Inhibits translation initiation and causes misreading of mRNA by binding to the 30S subunit (antibacterial action)
Tetracycline       Binds to the 30S subunit and inhibits binding of charged tRNAs (antibacterial action)
Cycloheximide      Blocks polypeptide formation by inhibiting peptidyl transferase activity in the 80S ribosome (antieukaryote action)
Puromycin          Causes premature termination of translation by acting as an analog of charged tRNA (antibacterial and antieukaryote action)
SUMMARY
Mastering Genetics: For activities, animations, and review quizzes, go to the Study Area.

9.1 Polypeptides Are Amino Acid Chains That Are Assembled at Ribosomes

❚❚ Polypeptides contain 20 kinds of amino acids that carry side chains, giving them specific properties.
❚❚ Translation takes place at the ribosome, where mRNA codons are coupled to transfer RNA anticodons by complementary base pairing.
❚❚ Polypeptides have four structural levels: the amino acid order (primary), intrachain folding (secondary), three-dimensional functional folding (tertiary), and multimeric protein structure (quaternary).
❚❚ Polypeptides have an N-terminal (amino) end and a C-terminal (carboxyl) end.
❚❚ Ribosomes are composed of two subunits that each consist of ribosomal RNA and numerous proteins.
❚❚ Ribosomes have three functional sites of action: the P site, where the polypeptide is held; the A site, where tRNA molecules bind to add their amino acid to the end of the polypeptide; and the E site, which provides an exit point for uncharged tRNAs.

9.2 Translation Occurs in Three Phases

❚❚ Bacterial translation is initiated with the binding of the Shine–Dalgarno sequence on the 5′ mRNA end to a complementary sequence of nucleotides on the 3′ end of the 16S rRNA in the small ribosomal subunit. The nearby start codon is the site where translation commences.
❚❚ In eukaryotic mRNA, the 5′ cap is the binding site for eukaryotic initiation factors that cause the small ribosomal subunit to begin scanning in search of the start codon, which is part of the Kozak sequence.
❚❚ Archaea carry multiple translation-initiation factors that are homologous to eukaryotic initiation factors, but archaea also produce a high proportion of leaderless mRNAs that have an unknown translation-initiation mechanism.
❚❚ During polypeptide synthesis, charged tRNAs enter the A site, and peptidyl transferase catalyzes peptide bond formation, transferring the polypeptide from the P-site tRNA to the A-site tRNA. Elongation factor proteins translocate the ribosome, shifting the tRNA–polypeptide complex from the A site to the P site and opening the A site for the next charged tRNA.
❚❚ Translation terminates when a stop codon enters the A site. Release factor proteins, rather than tRNA, bind to stop codons. Release factors cause release of the polypeptide and lead to the dissociation of the ribosome from mRNA.

9.3 Translation Is Fast and Efficient

❚❚ An mRNA undergoes simultaneous translation by several ribosomes that attach to it sequentially to form a polyribosome.
❚❚ Usually, a ribosome will dissociate from mRNA upon encountering a stop codon, but the small size of some intercistronic spacers in bacterial polycistronic mRNAs permits a ribosome to translate two or more polypeptides consecutively from the mRNA before dissociating.
❚❚ The evolutionary evidence derived from homologies among translationally active proteins of members of the three domains of life suggests that archaea are more closely related to eukaryotes than they are to bacteria.

9.4 The Genetic Code Translates Messenger RNA into Polypeptide

❚❚ Each mRNA codon is composed of three consecutive nucleotides. Of the 64 codons contained in the genetic code, 61 specify amino acids and 3 are stop codons.
❚❚ The genetic code is redundant, meaning that most amino acids are specified by more than one codon. Redundancy of the genetic code is made possible by third-base wobble that relaxes the strict complementary base-pairing requirements at the third base of the codon.
❚❚ The genetic code is essentially universal among living organisms. The few exceptions to the genetic code are found mainly in mitochondria.
❚❚ Properly charged tRNAs play the central role in converting mRNA sequence into polypeptide sequence.
❚❚ Specialized enzymes called aminoacyl-tRNA synthetases catalyze the addition of a specific amino acid to each tRNA.
❚❚ Proteins in eukaryotic cells are sorted to their cellular destinations by signal sequences at their N-terminal ends. Signal sequences are removed from polypeptides in the ER, where they are sorted for their cellular destinations.

9.5 Experiments Deciphered the Genetic Code

❚❚ In vitro experimental analysis demonstrates that the genetic code is triplet and does not contain gaps or overlaps.
❚❚ The genetic code was deciphered by analysis of in vitro translation of synthetic messenger RNA.
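The counts quoted in the summary of Section 9.4 (64 codons, 61 sense codons, 3 stop codons, and redundancy for most amino acids) can be verified with a brief Python sketch. The table below is the standard genetic code written with one-letter amino acid abbreviations; the variable names are illustrative assumptions, not notation from the chapter.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered U, C, A, G at each position; "*" = stop.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# Number of synonymous codons for each amino acid (redundancy of the code).
codons_per_amino_acid = Counter(CODON_TABLE[c] for c in sense_codons)
single_codon = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 1)
six_codons = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 6)

print("total codons:", len(CODON_TABLE))
print("sense codons:", len(sense_codons))
print("stop codons:", stop_codons)
print("amino acids encoded:", len(codons_per_amino_acid))
print("amino acids with a single codon:", single_codon)  # Met (M) and Trp (W)
print("amino acids with six codons:", six_codons)        # Leu (L), Arg (R), Ser (S)
```

Only methionine and tryptophan have a single codon in the standard code, which is why the summary says that most, rather than all, amino acids are specified by more than one codon.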
PREPARING FOR PROBLEM SOLVING

In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.

1. Know the general structure of genes and the relationships between a gene, its mRNA transcript, and the polypeptide translated from the mRNA. Be able to describe the relative positions of the transcription start, transcription termination, 5′ UTR, 3′ UTR, start codon, and stop codon, and be able to assign polarity to strands of nucleic acids and to identify the N-terminal and C-terminal ends of polypeptides.
2. Be familiar with the genetic code and be able to use it to deduce the primary structure of a polypeptide from an mRNA sequence.
3. Be familiar with amino acid structure and with the four levels of polypeptide structure.
4. Be able to use an amino acid sequence to determine the corresponding mRNA and DNA sequences.
5. Know the general structure of ribosomes and the steps and processes that initiate translation.
6. Be able to describe the steps of polypeptide elongation and the processes that produce polypeptides.
7. Know the similarities and the differences between bacterial and eukaryotic translation.
8. Be prepared to describe mechanisms of posttranslational polypeptide processing.
9. Be familiar with the experimental evidence that deciphered the genetic code.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.

1. Some proteins are composed of two or more polypeptides. Suppose the DNA template strand sequence 3′-TACGTAGGCTAACGGAGTAAGCTAACT-5′ produces a polypeptide that joins in pairs to form a functional protein.
   a. What is the amino acid sequence of the polypeptide produced from this sequence?
   b. What term is used to identify a functional protein like this one formed when two identical polypeptides join together?
2. In the experiments that deciphered the genetic code, many different synthetic mRNA sequences were tested.
   a. Describe how the codon for phenylalanine was identified.
   b. What was the result of studies of synthetic mRNAs composed exclusively of cytosine?
   c. What result was obtained for synthetic mRNAs containing AG repeats, that is, AGAGAGAG...?
   d. Predict the results of experiments examining GCUA repeats.
3. Several lines of experimental evidence pointed to a triplet genetic code. Identify three pieces of information that supported the triplet hypothesis of genetic code structure.
4. Outline the events that occur during initiation of translation in E. coli.
11. Consider translation of the following mRNA sequence:
    5′-...AUGCAGAUCCAUGCCUAUUGA...-3′
   a. Diagram translation at the moment the fourth amino acid is added to the polypeptide chain. Show the ribosome; label its A, P, and E sites; show its direction of movement; and indicate the position and anticodon triplet sequence of tRNAs that are currently interacting with mRNA codons.
   b. What is the anticodon triplet sequence of the next tRNA to interact with mRNA?
   c. What events occur to permit the next tRNA to interact with mRNA?
12. The diagram of a eukaryotic ribosome shown below contains several errors.
    [Figure: diagram of a ribosome on an mRNA. Recoverable labels: 80S, 60S, the E, P, and A sites, amino acids Phe, Ala, and Gly with N termini marked, and “Ribosome movement along mRNA.”]
17. The line below represents a mature eukaryotic mRNA. The accompanying list contains many sequences or structures that are part of eukaryotic mRNA. A few of the items in the list, however, are not found in eukaryotic mRNA. As accurately as you can, show the location, on the line, of the sequences or structures that belong in eukaryotic mRNA; then, separately, list the items that are not part of eukaryotic mRNA.
19. Define and describe the differences in the primary, secondary, and tertiary structures of a protein.
20. Describe the roles and relationships between
   a. tRNA synthetases and tRNA molecules.
   b. tRNA anticodon sequences and mRNA codon sequences.
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.

21. In an experiment to decipher the genetic code, a poly-AC mRNA (ACACACAC...) is synthesized. What pattern of amino acids would appear if this sequence were to be translated by a mechanism that reads the genetic code as
   a. a doublet without overlaps?
   b. a doublet with overlaps?
   c. a triplet without overlaps?
   d. a triplet with overlaps?
   e. a quadruplet without overlaps?
   f. a quadruplet with overlaps?
22. Identify and describe the steps that lead to the secretion of proteins from eukaryotic cells.
23. The amino acid sequence of a portion of a polypeptide is
    N...Cys-Pro-Ala-Met-Gly-His-Lys...C
   a. What is the mRNA sequence encoding this polypeptide fragment? Use N to represent any nucleotide, Pu to represent a purine, and Py to represent a pyrimidine. Label the 5′ and 3′ ends of the mRNA.
   b. Give the DNA template and coding strand sequences corresponding to the mRNA. Use the N, Pu, and Py symbols as placeholders.
24. Har Gobind Khorana and his colleagues performed numerous experiments translating synthetic mRNAs. In one experiment, an mRNA molecule with a repeating UG dinucleotide sequence was assembled and translated.
   a. Write the sequence of this mRNA and give its polarity.
   b. What is the sequence of the resulting polypeptide?
   c. How did the polypeptide composition help confirm the triplet nature of the genetic code?
   d. If the genetic code were a doublet code instead of a triplet code, how would the result of this experiment be different?
   e. If the genetic code were overlapping rather than nonoverlapping, how would the result of this experiment be different?
25. An experiment by Khorana and his colleagues translated a synthetic mRNA containing repeats of the trinucleotide UUG.
   a. How many reading frames are possible in this mRNA?
   b. What is the result obtained from each reading frame?
   c. How does the result of this experiment help confirm the triplet nature of the genetic code?
26. The human β-globin polypeptide contains 146 amino acids. How many mRNA nucleotides are required to encode this polypeptide?
27. The mature mRNA transcribed from the human β-globin gene is considerably longer than the sequence needed to encode the 146–amino acid polypeptide. Give the names of three sequences located on the mature β-globin mRNA but not translated.
28. Figure 9.7 contains several examples of the Shine–Dalgarno sequence. Using the seven Shine–Dalgarno
sequences from E. coli, determine the consensus sequence and describe its location relative to the start codon.
29. Figure 9.17 shows three posttranslational steps required to produce the sugar-regulating hormone insulin from the starting polypeptide product preproinsulin.
   a. A research scientist is interested in producing human insulin in the bacterial species E. coli. Will the genetic code allow the production of human proteins from bacterial cells? Explain why or why not.
   b. Explain why it is not feasible to insert the entire human insulin gene into E. coli and anticipate the production of insulin.
   c. Recombinant human insulin (made by inserting human DNA encoding insulin into E. coli) is one of the most widely used recombinant pharmaceutical products in the world. What segments of the human insulin gene are used to create recombinant bacteria that produce human insulin?
30. A DNA sequence encoding a five–amino acid polypeptide is given below.
    ...ACGGCAAGATCCCACCCTAATCAGACCGTACCATTCACCTCCT...
    ...TGCCGTTCTAGGGTGGGATTAGTCTGGCATGGTAAGTGGAGGA...
   a. Locate the sequence encoding the five amino acids of the polypeptide, and identify the template and coding strands of DNA.
   b. Give the sequence and polarity of the mRNA encoding the polypeptide.
   c. Give the polypeptide sequence, and identify the N-terminus and C-terminus.
   d. Assuming the sequence above is a bacterial gene, identify the region encoding the Shine–Dalgarno sequence.
   e. What is the function of the Shine–Dalgarno sequence?
31. A portion of the coding strand of DNA for a gene has the sequence
    5′-...GGAGAGAATGAATCT...-3′
   a. Write out the template DNA strand sequence and polarity as well as the mRNA sequence and polarity for this gene segment.
   b. Assuming the mRNA is in the correct reading frame, write the amino acid sequence of the polypeptide using three-letter abbreviations and, separately, the amino acid sequence using one-letter abbreviations.
32. A eukaryotic mRNA has the following sequence. The 5′ cap is indicated in italics (CAP), and the 3′ poly(A) tail is indicated by italicized adenines.
    5′-CAPCCAAGCGUUACAUGUAUGGAGAGAAUGAAACUGAGGCUUG
    CCACGUUUGUUAAGCACCUAUGCUACCGAAAAAAAAAAAAAAAAA
    AAAAAAA-3′
   a. Locate the start codon and stop codon in this sequence.
   b. Determine the amino acid sequence of the polypeptide produced from this mRNA. Write the sequence using the three-letter and one-letter abbreviations for amino acids.
33. Diagram a eukaryotic gene containing three exons and two introns, the pre-mRNA and mature mRNA transcript of the gene, and a partial polypeptide that contains the following sequences and features. Carefully align the nucleic acids, and locate each sequence or feature on the appropriate molecule.
   a. the AG and GU dinucleotides corresponding to intron–exon junctions
   b. the +1 nucleotide
   c. the 5′ UTR and the 3′ UTR
   d. the start codon sequence
   e. a stop codon sequence
   f. a codon sequence for the amino acids Gly-His-Arg at the end of exon 1 and a codon sequence for the amino acids Leu-Trp-Ala at the beginning of exon 2
34. Table C contains DNA-sequence information compiled by Marilyn Kozak (1987). The data consist of the percentage of A, C, G, and T at each position among the 12 nucleotides preceding the start codon in 699 genes from various vertebrate species and at the first nucleotide after the start codon. (The start codon occupies positions +1 to +3, and the first nucleotide immediately after the start codon occupies position +4.) Use the data to determine the consensus sequence for the 13 nucleotides (-12 to -1 and +4) surrounding the start codon in vertebrate genes.
35. Table D lists α-globin and β-globin gene sequences for the 11 or 12 nucleotides preceding the start codon and the first nucleotide following the start codon (see Problem 34). The data are for 16 vertebrate globin genes reported by Kozak (1987). The sequences are written from -12 to +4 with the start codon sequence in capital letters. Use the data in this table to
   a. Determine the consensus sequence for the 16 selected α-globin and β-globin genes.
   b. Compare the consensus sequence for these globin genes to the consensus sequence derived from the larger study of 699 vertebrate genes in Problem 34.
36. The six nucleotides preceding the start codon and the first nucleotide after the start codon in eukaryotes exhibit strong sequence conservation as determined by the percentages of nucleotides in the -6 to -1 positions and the +4 position (see Problem 34). Use the data given in the table for Problem 35 to determine the seven nucleotides that most commonly surround the start codon in vertebrates.
Table C
Position    -12  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1  [start]   +4
Percent A    23   26   25   23   19   23   17   18   25   61   27   15   [AUG]    23
Percent C    35   35   35   26   39   37   19   39   53    2   49   55   [AUG]    16
Percent G    23   21   22   33   23   20   44   23   15   36   13   21   [AUG]    46
Percent T    19   18   18   18   19   20   20   20    7    1   11    9   [AUG]    15
Table D
α-Globin Family                            β-Globin Family
Gene                Sequence (-12 to +4)   Gene                Sequence (-12 to +4)
Human adult         agagaacccaccATGg       Human fetal         agtccagacgccATGg
Human embryonic     caccctgccgccATGt       Human embryonic     aggcctggcatcATGg
Baboon              ccagcgcgggcATGg        Rabbit adult        aaaccagacagaATGg
Mouse adult         caggaagaaaccATGg       Rabbit embryonic    agaccagacatcATGg
Rabbit adult        gaaggaaccaccATGg       Chicken adult       ccaaccgccgccATGg
Goat embryonic      tcagctgccaccATGt       Chicken embryonic   cccgctgccaccATGg
Duck adult          ggagctgcaaccATGg       Xenopus adult       tcaactttggccATGg
Chicken embryonic   ctctcctgcacaATGg       Xenopus larval      tctacagccaccATGg
37. In terms of the polycistronic composition of mRNAs and the presence or absence of Shine–Dalgarno sequences, compare and contrast bacterial, archaeal, and eukaryotic mRNAs.
38. Organisms of all three domains of life usually use the mRNA codon AUG as the start codon.
   a. Do organisms of the three domains use the same amino acid as the initial amino acid in translation? Identify similarities and differences.
   b. Despite AUG being the most common start codon sequence, very few proteins have methionine as the first amino acid. Why is this the case?
Collaboration and Discussion For answers to selected even-numbered problems, see Appendix: Answers.
A heel stick is a minimally invasive procedure, being used here to collect a small amount of
blood from a newborn infant. The blood is used to screen for disorders on the Recommended
Uniform Screening Panel (RUSP) list of human hereditary diseases, as discussed in this chapter.
The odds worked out for Kristen and for Nate, but that’s not always the case.
Kristen graduated from Stanford University in 2016, and she finished Twitch. The
documentary is about HD, her mother’s disease progression, Kristen’s decision
making regarding testing, the consequences of her test result, and the stigma
and trauma of HD. See it, if you can. It carries an important message we should
all hear. You can go to http://www.twitchdocumentary.com to read Kristen’s story
and gain access to Twitch.
and soon the infant is experiencing permanent mental and developmental impairment. Prior to the
availability of treatment, PKU was the cause of thousands of cases of profound mental incapacity
annually around the world.

Fortunately, discovery of the abnormality in Phe metabolism led directly to creation of a disease
management protocol able to prevent the development of PKU symptoms. Understanding of the
abnormality also led Robert Guthrie to develop a newborn test for PKU. The Guthrie test, as it was
known, was a simple, inexpensive procedure that accurately identified newborn infants with PKU,
using just a few drops of their blood obtained from a “heel stick” (see the chapter opener photo).
A heel stick is done using a small sterile lance to puncture the skin on the heel of the foot
(where very few nerve endings are located), drawing a small amount of blood. Heel sticks are
performed in the first few hours after birth, and they are the principal way material is collected
for newborn genetic screening.

The original Guthrie test involved an examination of bacterial growth on a Petri dish. A positive
Guthrie test, indicating possible PKU, was identified by bacterial growth in the presence of a few
drops of the infant’s blood along with a compound that normally inhibits bacterial growth. The
excessive level of Phe in an affected infant’s blood allowed abnormal bacterial growth to occur.

Today, the newborn test for PKU is done using mass spectrometry (MS). MS is an analytical
chemistry technique that ionizes a test substance and then measures the abundance of specific gas
ions that are released. MS is particularly useful for identifying the composition of proteins,
nucleic acids, and other organic chemical compounds. An MS analysis of a newborn infant’s blood
sample can identify scores of proteins in the blood, as well as the concentrations of other
compounds, including amino acids such as Phe. MS can complete numerous chemical analyses of heel
stick blood in a matter of minutes. This substantially cuts the time and expense required by the
Guthrie test, and it greatly expands the number of tests that can be performed using a small
amount of blood from a newborn.

Living with PKU

A positive result for PKU immediately initiates an array of other tests to verify the diagnosis,
followed within hours by the beginning of treatment that will last a lifetime. The principal
component of treatment to prevent PKU is a special low-protein diet, beginning with an infant
formula that is phenylalanine-free. This diet, along with regular monitoring of the child’s blood
for its Phe concentration, keeps the blood levels of Phe and PPA near the normal ranges. Doing
this prevents the symptoms of PKU from developing.

The Phe-free diet is more expensive than a typical diet, and managing the dietary intake of
infants and children (and later, of teenagers and young adults) can be difficult, but the outcome
is well worth the expense and effort. The result is normal development, fully intact mental and
motor capabilities, and the likelihood of a normal life span. The diet must be followed closely,
however, especially by women with PKU who intend to have children. There is strong evidence that
excessive levels of Phe in their blood circulation are a risk factor for birth defects in their
children. Moreover, there is evidence of significant declines in mental capacity in people with
PKU when their blood Phe levels have been persistently high for an extended period of time.

There are also a number of dietary traps to be avoided by those with PKU. One of the most common
dietary pitfalls is the artificial sweetener known as aspartame, an ingredient of NutraSweet and
many other “sugar-free” products (Figure B.3). This compound is manufactured by linking together
two amino acids, aspartic acid and phenylalanine. On ingestion, the aspartame is broken down and
Phe is released. Occasional exposures to the artificial sweetener are not serious, but persistent
intake can lead
to serious complications of the disease. For this reason, all food and beverage products
containing this artificial sweetener carry a warning label for phenylketonurics (Figure B.4).

Figure B.3 Aspartame. The artificial sweetener aspartame (found in sugar substitutes like
NutraSweet) is composed of two amino acids, aspartic acid (Asp) and phenylalanine (Phe). After
consumption, the breakdown of the sweetener releases both amino acids. Phe is the compound to be
avoided by people with phenylketonuria; thus aspartame poses a danger to these individuals.

Figure B.4 An aspartame warning label. The danger of Phe to people with phenylketonuria has
prompted the U.S. Food and Drug Administration and similar agencies in other countries to require
warning labels on all products containing aspartame.

Years ago those with PKU were doomed to suffer from severe mental impairment and numerous other
problems that led to short lives of complete dependency. Today, newborn detection of PKU and a
specialized diet mean that people with the condition can avoid these impairments and are just as
likely as anyone else to be honors students. It is estimated that worldwide since the 1960s, more
than 50,000 babies born with PKU have gone on to develop normal cognitive ability thanks to
newborn genetic testing and the special diet. Two organizations, the National PKU Alliance
(npkua.org) and National PKU News (pkunews.org), provide support and information for people with
phenylketonuria, their families, and their friends.

The Recommended Uniform Screening Panel

Apart from the success of the Guthrie test and the Phe-free diet that prevents PKU, advances in
newborn genetic screening occurred slowly at first. By 1999, just five disorders could be
screened in newborn infants, and states were slow to mandate the available tests. But in that
year, the U.S. Department of Health and Human Services (HHS) set up advisory panels to search out
additional testable and treatable genetic diseases and to make recommendations to the secretary
of HHS for newborn genetic testing. In 2003, the HHS secretary’s Advisory Committee on Heritable
Disorders in Newborns and Children established a list of genetic diseases recommended for such
testing: the Recommended Uniform Screening Panel (RUSP). As a consequence, by 2007, all U.S.
states provided newborn genetic screening for 25 disorders. As of November 2016, the American
College of Medical Genetics lists 34 “core” hereditary conditions on the RUSP list (Table B.1).
These are recommended for screening by all states, and nearly all states test for all of these
core conditions. There are an additional 25 “secondary” conditions for states to consider for
inclusion on their test list (Table B.2).

There are two principal criteria for placement of a genetic disease on these RUSP lists. First,
the disease must be reliably detected in newborn infants, and second, the disease must either be
preventable or its symptoms and prognosis must be substantially improved with treatment. Many of
the disorders currently on these lists are metabolic disorders of organic acid, fatty acid, or
amino acid production or breakdown. They are caused by the absence or severely reduced action of
single proteins. Like PKU (in Table B.1) or argininemia (ARG; in Table B.2 and described in
Application Chapter A), these conditions are often treated with dietary restrictions and drug
therapy. Hemoglobin disorders are generally treated with drug therapy and blood transfusions.
Endocrine and other disorders are commonly treated by drug therapy, dietary restrictions, or
other interventions.

Most of the diseases on the RUSP list are rare, occurring in just a few of every 25,000 to
100,000 infants born. Yet despite their individual rarity, their combined frequency is high
enough that newborn genetic screening is estimated to save or improve the lives of approximately
12,000
352 APPLICATION B Human Genetic Screening
Table B.1 RUSP 34 Core Conditions for Newborn Genetic Screeningᵃ
ACMGᵇ Code      Condition
Organic Acid Disorders
CblA, CblB      Methylmalonic acidemia (cobalamin disorders)
GA1             Glutaric acidemia type 1

Table B.2 RUSP 25 Secondary Conditions for Newborn Genetic Screeningᵃ
ACMGᵇ Code      Condition
Organic Acid Disorder
Cbl C, D        Methylmalonic acidemia with homocystinuria
symptoms. Furthermore, the emotional and other benefits for families may be incalculable. One
recent study of the costs, benefits, and impact of newborn genetic screening was conducted in the
state of Washington for the 10-year period from 2004, when the state first mandated screening,
through 2014. The study found that during this period there was a 20% decrease in infant
mortality and a 14% decrease in serious developmental disabilities, and that the cost savings
were many times the cost of carrying out the screening program.

You can learn more about the conditions on the RUSP list in your state, and about other
hereditary and childhood diseases, at two websites: the HHS-sponsored website
http://www.babysfirsttest.org/newborn-screening/states provides details on the RUSP, and the
National Institutes of Health-sponsored website for the Eunice Kennedy Shriver National Institute
of Child Health and Human Development (http://www.nichd.nih.gov) offers details on the effects of
RUSP diseases on infants and children.

B.3 Genetic Testing to Identify Carriers

Genetic carrier screening is used to determine the genotypes of adults for the purpose of
identifying those who are heterozygous for mutations that cause serious or fatal diseases in
children with homozygous recessive genotypes. This type of genetic testing has been in use for
three decades and examines either blood proteins or DNA, depending on the genetic condition of
interest.

Testing Blood Proteins

The first and most frequently used adult carrier genetic screens examine blood proteins of
individuals from populations known to have elevated frequencies of certain mutant alleles and
therefore higher numbers of heterozygous carriers. In these carrier genetic screening tests, the
heterozygous genotype could be determined by detection of both the wild-type protein product and
the mutant protein product in a blood sample. Figure 1.13 shows an example of detection of the
heterozygous genotype in a carrier of the recessive allele for sickle cell disease (SCD; OMIM
141900). SCD is one of the conditions examined in carrier screening. It is particularly prevalent
among people of African and Mediterranean ancestry. Tay–Sachs disease (OMIM 272800) and Gaucher
disease Type I (OMIM 230800) have been frequent subjects of carrier genetic testing in
populations of Ashkenazi Jewish ancestry since the 1990s. The purpose of identifying
heterozygotes is so that male and female partners who are both heterozygous for a serious
condition will know of their one in four chance of having a child with the condition and can make
informed decisions about the pregnancy and care of the newborn infant.

DNA-Based Carrier Screening and Diagnostic Verification

In the past two decades, the direct testing of DNA has become possible. DNA genetic testing
allows the direct detection of mutant DNA sequences producing mutant, disease-causing alleles.
The purpose of DNA-based genetic testing is twofold. One use is to identify carrier status for
conditions that do not have signature protein variation in the blood. The other use is to verify
clinical diagnoses by determining that a person suspected of having a particular hereditary
condition is homozygous for variant alleles causing the condition. DNA-based genetic testing is
often capable of identifying more different disease-causing alleles than is possible with genetic
tests of blood-protein variants. Dozens of different hereditary diseases and conditions are
detected and diagnosed by the direct examination of DNA.

For example, DNA genetic testing for cystic fibrosis (OMIM 219700), which occurs predominantly in
people of Caucasian ancestry, not only can detect the most common disease-causing allele (that
produces serious cases of cystic fibrosis and accounts for almost 50 percent of the cystic
fibrosis alleles in the population) but also can identify dozens of other mutant alleles of the
same gene. Any genotype that contains two mutant copies of the gene will result in cystic
fibrosis in a child. Homozygosity for the most common mutant allele produces a severe form of the
disease, but either homozygosity for another of the mutations or so-called compound
heterozygosity, the presence of two different mutant alleles in a genotype, can lead to milder,
but still serious, forms of cystic fibrosis. The same is true for many of the diseases detected
by DNA analysis. In a clinical setting, this information can have an important impact on patient
care and case management. With a disease like cystic fibrosis, cases that are potentially more
serious may be more responsive to certain types of care than less serious cases are.

Carrier Screening Criteria

Whether carrier genetic screening is performed by assessment of blood proteins or by DNA testing,
there are two different screening strategies that can be followed. The first strategy is a
population-based or community-based screening effort. In these instances, members of certain
populations in which a particular hereditary disease is prevalent are recruited to participate in
carrier testing programs. Carrier testing programs for Tay–Sachs disease and Gaucher disease in
Ashkenazi Jewish populations are examples. The participants in these programs are all free of the
disease and they might or might not have family members with the disease. The purpose of the
genetic screening is to identify individuals who are heterozygous carriers of the disease so that
they can use this information for decisions such as family planning. Prospective parents who each
know their genotypes will have solid genetic information to use for these purposes.
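
The one in four chance facing two heterozygous partners, and the way compound heterozygosity arises, can be checked with a small Punnett-square calculation. The sketch below is only an illustration: it assumes a single autosomal gene with simple Mendelian segregation (each parental allele passed on with probability 1/2), and the allele labels "A", "a", "m1", and "m2", as well as the function names, are hypothetical placeholders rather than anything defined in this chapter.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return {genotype: probability} for one child of two parents.

    Each parent is a pair of allele labels. Assumption: every allele is
    transmitted with probability 1/2 (simple Mendelian segregation).
    """
    # Enumerate the four equally likely allele pairings (a Punnett square).
    counts = Counter(tuple(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def affected_risk(parent1, parent2, normal="A"):
    """Probability that a child carries no copy of the normal allele.

    Assumption: the condition is recessive, so any genotype made of two
    mutant alleles (identical or different) counts as affected.
    """
    dist = offspring_genotypes(parent1, parent2)
    return sum(p for genotype, p in dist.items() if normal not in genotype)


carrier = ("A", "a")        # heterozygous carrier (hypothetical labels)
non_carrier = ("A", "A")    # homozygous for the normal allele

print(affected_risk(carrier, carrier))          # 1/4
print(affected_risk(carrier, non_carrier))      # 0
# Two carriers of different mutant alleles: an affected child would be a
# compound heterozygote (m1/m2), still with probability 1/4.
print(affected_risk(("A", "m1"), ("A", "m2")))  # 1/4
```

Exact fractions are used instead of floating-point numbers so that the result reads directly as the familiar "one in four".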
Alternatively, a woman who is a member of a population in which a certain disease is prevalent
but who does not know her genotype can take the second approach to carrier screening. If, for
example, a woman in a population in which cystic fibrosis is prevalent intends to have a child,
she can have her genotype identified. If she is homozygous for the dominant allele, she has no
chance of having a child with cystic fibrosis, and testing goes no further. If, on the other
hand, she is a heterozygous carrier of a mutation producing cystic fibrosis, her partner can be
tested to determine his genotype. If he is homozygous for the dominant allele, then there is no
chance the child will have cystic fibrosis. If he is also a heterozygous carrier, however, then
the couple can seek additional medical and genetic services to minimize the chance that a child
of theirs could have cystic fibrosis.

Pharmacogenetic Screening

A special category of carrier testing is the developing area of pharmacogenetic screening that
can be important in guiding drug treatment of disease, as it can predict individual
responsiveness or reaction to certain medications. Inherited genetic variation has been shown to
influence the effectiveness of, or to increase the likelihood of adverse reactions to, about one
dozen commonly used drugs. Among the dozen or so well-documented examples of a genotype–drug
influence is the use of the blood thinner warfarin that is often given to help prevent blood
clots in heart patients. Proper dosages of warfarin are critical for management of blood-clotting
risk. The CYP2C9 gene (cytochrome P450) produces an enzyme that metabolizes warfarin. More than
30 alleles of the gene have been identified. Most genotypes metabolize warfarin at the wild-type
rate, but individuals who are homozygous for either the CYP2C9*2 or the CYP2C9*3 allele, and also
those who have a heterozygous genotype involving the two alleles (i.e., CYP2C9*2/CYP2C9*3),
metabolize warfarin at a significantly lower rate than wild type, and they require a lower drug
dose to prevent overdosing.

some that cause severe disability to a fetus and some that are fatal.

The abnormalities screened in prenatal genetic testing fall into three categories (Table B.3).
The first is chromosomal abnormalities: an extra or a missing chromosome, extra or missing
chromosome segments, or structural abnormalities of chromosomes. The most common criteria for
recommending prenatal screening of chromosomes are maternal age over 35, a previous child born
with a chromosome abnormality, or the presence of a chromosome abnormality in one parent. The
second category of conditions examined by prenatal genetic screening is developmental or growth
conditions. These include neural tube (spinal cord and brain) abnormalities, bone or skeletal
abnormalities, such as osteogenesis imperfecta (brittle bone disease), and stature-dwarfing
conditions. A history of any of these conditions in a family or in a prior pregnancy is a common
criterion for recommending this screening. The final category of prenatal genetic screening
conditions is hereditary disease. Several genetic diseases tested prenatally also appear on the
RUSP list for newborn genetic testing. Once again, a family history or a previous child with the
condition are common reasons a physician might recommend prenatal screening.

Invasive Screening Using Amniocentesis or Chorionic Villus Sampling

Amniocentesis uses a needle to penetrate the uterus and placenta of a pregnant woman to withdraw
10 to 20 mL (roughly 2–4 teaspoons) of amniotic fluid. This fluid contains fetal cells that can
be cultured and used for genetic testing

Table B.3 Examples of Conditions Detected by Prenatal Genetic Screening Methods
Type of Condition                  Detection Methods
Chromosome conditions such as . . .
  Trisomies of 21, 18, or 13       Primarily by amniocentesis or CVS following MSS
  Sex chromosome

Figure B.6 Chorionic villus sampling (CVS). Cells obtained by CVS can be used directly in
biochemical tests, DNA analysis, or chromosome examination. (Figure labels: ultrasound monitor;
uterus; 10- to 11-week fetus; chorion; extraction catheter; step 2, “A small amount of material
from the chorion...”)
Noninvasive Prenatal Testing

One of the most common reasons a physician might recommend amniocentesis or CVS in a pregnancy
has to do with the risk of the numerical chromosome condition called trisomy 21, or Down
syndrome. In this condition, the fetus carries three copies of chromosome 21 rather than the
normal two copies. The term trisomy means “three chromosomes.” Maternal age over 35 is strongly
linked to an elevated risk of trisomy 21, as we discuss in Section 10.2. Before a recommendation
for amniocentesis or CVS is made, however, a noninvasive prenatal test called maternal serum
screening is usually performed.

Maternal Serum Screening

Maternal serum screening (MSS), also called triple screening, measures the levels of three
proteins in a pregnant woman’s blood circulation between the 15th and 20th week of gestation. MSS
requires nothing more than drawing a small amount of blood from a vein. The three proteins are
alpha fetoprotein (AFP), a form of the hormone estriol (uE3), and human chorionic gonadotropin
(HCG). The levels of these three proteins are associated with elevated risks of two chromosome
trisomy conditions, as Table B.4 indicates. A significantly elevated level of AFP by itself is an
indicator of a possible neural tube defect.

It is important to recognize that MSS, which is used routinely in many obstetric practices, is a
screening test, not a diagnostic test. In other words, MSS results can indicate the increased
likelihood of a chromosome trisomy or a neural tube defect, but they do not mean a condition is
present. Protein levels detected in MSS are simply associated with these conditions. Should an
MSS produce results in the normal ranges (i.e., the results do not indicate a potential
chromosome trisomy or a neural tube defect), the risk of these conditions is not zero, but
amniocentesis or CVS is unlikely to be recommended. Should an MSS produce abnormal results, a
recommendation of amniocentesis or CVS is very likely as a follow-up.

Table B.4 Maternal Serum Screen Results Indicating an Abnormality
Condition            AFP level    uE3 level    HCG level
Trisomy 21           Decreased    Decreased    Elevated
Trisomy 18           Decreased    Decreased    Decreased
Neural tube defect   Elevated     Not applicable to this condition

Prenatal Ultrasound Imaging

As noted above, a significantly elevated level of AFP in MSS indicates the possibility of a
neural tube defect. These defects are diagnosed by the use of ultrasound, or ultrasonic sound
wave frequencies, to produce images of the fetus. Since neural tube defects affect brain
development or the spinal cord, the head and spine of the fetus are carefully examined.
Ultrasound can also be used to diagnose other developmental abnormalities, including growth
disorders, heart or kidney abnormalities, and physical abnormalities associated with some
chromosome anomalies and some hereditary conditions.

Prenatal ultrasound is performed in a large proportion of pregnancies in which there is no
indication an abnormality is present. Ultrasound may be used routinely to obtain an accurate
measurement of fetal age. The due date for a baby’s birth can be set accurately by determining
the age of a fetus during the first or second trimester of pregnancy. One by-product of this use
of ultrasound is that fetal sex can be ascertained at the same time.

Karyotyping, the identification of the chromosomes carried in cell nuclei (see Section 10.2 and
Figure 10.4), may also be indicated as a follow-up to MSS if the result suggests the possibility
of Down syndrome (trisomy 21) in a fetus.

Fetal Cell Sorting

Noninvasive methods for the diagnosis of inherited conditions or chromosome abnormalities are
highly desirable because they pose no risk to the fetus. In 1969 it was discovered that a small
number of fetal cells were present in maternal blood circulation, opening the possibility for a
noninvasive pathway to diagnosis. The technique of fetal cell sorting involves identifying and
then isolating fetal cells in maternal blood circulation for analysis of DNA and chromosomes.
Several advances in the sorting and analysis of fetal cells have been made, but moving from
research to reliable application as a clinical technique remains elusive.

Fetal cells, and some fetal DNA from ruptured fetal cells, are present in maternal circulation as
early as the 8th week of gestation, but the cells are fragile and present in very low numbers;
perhaps one in 1 billion cells in maternal circulation is of fetal origin. To date, there has
been some success in identifying fetal cells that contain a Y chromosome. These cells from male
fetuses are the easiest to distinguish from maternal cells, since female cells contain only X
chromosomes. Some success has also been seen using isolated fetal cells to identify genetic
disorders, including cystic fibrosis and spinal muscular atrophy. Studies in 2012 reported that
in tests involving cells taken from a number of pregnancies, all fetuses affected by cystic
fibrosis or by spinal muscular atrophy were correctly identified using molecular genetic
analysis. Research continues in an attempt to turn fetal cell sorting into a reliable method for
prenatal diagnosis.

Preimplantation Genetic Screening

In vitro fertilization (IVF) is a long-standing method of assisted reproduction for individuals
and couples who have difficulty reproducing without assistance, or choose not to do so. In this
method, ovulation is induced in a woman with the aid of hormone injections. A large number of
eggs are then removed from the surface of the ovaries. The number
collected depends on the age and fertility of the egg donor. These eggs can either be frozen for
later use or used immediately for fertilization by sperm. Following fertilization in a laboratory
dish, embryos go through a small number of cell divisions over 3 to 5 days, and they are then
ready for implantation into the uterus.

The success rate of IVF for a fertilized embryo varies with the age of the woman into whom the
fertilized embryos are implanted, but for all ages it is less than 50%. Women under age 35 have
about a 40% success rate. Those who are 35 to 37 have about a 35% success rate. The success rate
is about 25% for ages 38 to 40, and it drops to about 15% for women over 40. As a consequence of
these low rates, it is common for a woman to have to undergo two or more IVF implantation
treatments to attain a successful pregnancy.

IVF was first successfully used to assist human reproduction in 1978. In that year, in England,
Louise Brown became the first human IVF baby. IVF was introduced into the United States in 1981
and since that time has resulted in more than one million babies being born. IVF is expensive and
not usually covered by medical insurance. Depending on the methods used and on other
circumstances, the cost for each IVF cycle is about $12,000–$17,000.

There are numerous reasons for opting for IVF, but one of them is the risk of a genetic disease.
The most common situations are those in which either both prospective parents are heterozygous
carriers of an autosomal recessive condition or a prospective mother is a carrier of an X-linked
recessive condition. In either case, couples may choose IVF in combination with preimplantation
genetic screening (PGS) as a way of minimizing the risk of having a child with the condition.

PGS begins with IVF. After in vitro fertilization, embryos are normally allowed to rest for
several cycles of mitotic division prior to implantation. Once they reach the 8-cell or the
16-cell stage, one cell can be removed from the embryo without risk of harm. DNA is taken from
this single cell, and the segment targeted for genetic analysis is amplified by PCR (polymerase
chain reaction). Any embryo that tests positive for the genetic condition will not be used for
implantation, whereas those that test negative are known to be free from the condition and will
be implanted. To date, thousands of healthy babies have been born subsequent to PGS.

B.5 Direct-to-Consumer Genetic Testing

There is no doubt that the genomics era has provided new opportunities to gain insight on the
impact of genomic variation. In addition to the discoveries that human biologists and medical
professionals have made through the recent technical advances in genomics, for-profit companies
have now developed commercial applications that provide several kinds of personal genetic
information directly to individual customers.

Direct-to-consumer genetic analysis is a new and growing wave in personal genetic testing. The
Palo Alto, California-based company 23andMe is one of the more readily recognized for-profit
companies involved in the direct-to-consumer genetics market. 23andMe, and companies like it,
offer several kinds of personal genetic information and testing. One testing component is carrier
genetic testing for more than 40 recessive conditions. These are the same tests included in our
discussion in Section B.3, with results identifying carriers of recessive genetic conditions. A
second genetic-testing component identifies inherited variants influencing drug response. We
described this interaction above under “Pharmacogenetic Screening.” Direct-to-consumer genetic
testing can also identify the likely presence of certain physical traits in people based on the
alleles in their genome. For example, individual differences in caffeine metabolism, an aversion
to cilantro (coriander), the presence of freckling, lactose intolerance, male pattern baldness,
and red or blond hair color can be identified as likely to be present based on the inheritance of
specific genetic variants. Personal genetic testing for ancestry relationships and evidence of
individual geographic origins can also be provided. We discuss the details of these genetic
analyses in Application Chapter E: Forensic Genetics.

The most recent application of personal genetic testing, however, is perhaps one of the most
far-reaching. In April 2017 the U.S. Food and Drug Administration (FDA) and 23andMe announced the
approval of genetic testing to identify individuals’ risks of developing 10 medical conditions
that are influenced, but are not exclusively caused, by genetics. Each of the 10 conditions
covered by the agreement is more likely to occur in individuals who carry specific single
nucleotide polymorphism (SNP) variants or alleles of certain genes. These SNP variants or markers
are not equivalent to recessive or dominant mutant alleles that cause a condition. Instead, these
markers are associated with the occurrence of a condition. In the context of inherited traits,
the presence of an association means that people who inherit a particular SNP variant or marker
are significantly more likely to develop a specific hereditary condition than those who do not
inherit the SNP variant or marker. Stated another way, the presence of a SNP variant or marker is
often necessary for a genetic condition to develop, but it is not sufficient. Some additional
nongenetic event or set of events is required for the genetic condition to manifest itself. This
means that heredity can make individuals susceptible to developing the condition, but other
factors must also have their effect for the condition to develop.

Table B.5 identifies the 10 genetic conditions covered by the FDA-23andMe agreement. It also
identifies the gene affected and the specific SNP variants or markers that are
associated with development of the condition. Some of the SNP variants associated with a
condition are identified by their specific sequence location in the human genome. This
genomic-sequence-location address is called the rs or rsid number, “rsid” being an abbreviation
for the reference SNP ID cluster that identifies the genomic location of a SNP or identifies the
allele associated with the condition.

The association between the identified SNP or allelic marker and each of these conditions is
strong. In each case, the presence of specific alleles or SNPs identifies the inheritance of
alleles that increase the likelihood of developing a condition. An example is the condition
alpha-1 antitrypsin deficiency (AATD), a degenerative lung and liver condition that most
frequently results in chronic obstructive pulmonary disease (COPD), a severe breathing
difficulty. The alpha-1 antitrypsin protein (AATP) is produced by the SERPINA1 gene, and AATD is
the result of inheriting specific variant alleles producing a protease inhibitor (PI) protein.
The most common form of the PI protein, known as PI*M, protects sensitive lung and other tissues
from the protein-destroying (protease) activity of the protein trypsin. AATD results when either
of two variant forms of the protein, called PI*Z and PI*S, is present. Both of these variant
proteins are defective in protease inhibition. Individuals with genotypes encoding a variant
protease inhibitor protein, that is, PI*Z/PI*Z, PI*Z/PI*S, or PI*S/PI*S, are at increased risk of
developing COPD, especially when they are also smokers or are exposed to high levels of
particulate matter in the air and other airborne pollutants over a long period of time.

The genetic test offered by 23andMe identifies individual genotypes for this gene, and this
identification is the basis for assigning the risk of AATD. Importantly, genotyping for this
condition and the other nine conditions does not diagnose the presence or absence of the
diseases. The presence of a given genotype can only indicate the presence of a significant
elevation of risk of an associated condition. In the case of AATP genotyping, having one or two
alleles encoding PI*Z or PI*S significantly increases an individual’s likelihood of developing
COPD, but it does not necessarily mean that a person will develop the condition.

One might ask, what is the value of having such a genetic test if it identifies increased risk
but does not identify the actual presence of a condition? The answer is that knowing you have a
genotype associated with an increased risk of disease can be an important element in personal
decision making. In this instance, knowing the increased risk of AATD and the likelihood of COPD
can motivate the person to undergo regular pulmonary screening for breathing difficulties and for
early signs of disease. It can also influence personal decisions such as whether to smoke or
whether to work in an environment with a high level of exposure to airborne pollutants.

One pivotal but as yet unresolved issue to consider when providing genetic information that is
associated with but is not diagnostic of disease is the question of whether to also provide
individuals with the information and support they need to make informed decisions. Test results
can identify those individuals who are at high genetic risk for a condition, but the accessing of
support and additional information from a genetic counselor or other medical professional is
likely to be left up to the individuals themselves. These needs may be long-term or life-long.
Among the conditions listed in Table B.5 are Parkinson disease and late onset Alzheimer disease,
conditions that might never develop or might not manifest symptoms for several decades after
genetic testing. A significant public health issue arising from direct-to-consumer genetic
testing concerns how genetic counseling and medical monitoring for the conditions will be
accessed, paid for, and managed. Direct-to-consumer genetic testing will likely never achieve the
same level of follow-up genetic counseling as genetic testing conducted under the auspices of a
medical practice, so it is incumbent on consumers to think carefully about their personal needs
in regard to obtaining, understanding, and managing the genetic
Problems 359
information they seek and about the source from which that genetic information is obtained.

B.6 Opportunities and Choices

The kinds of genetic testing described in this chapter offer us an unprecedented view of ourselves that was unavailable to previous generations. Today, we have the chance to know if we carry a mutant allele that might combine with another mutant allele to produce a serious or fatal disorder. We can screen our pregnancies for possible chromosome, developmental, or genetic problems. Newborn infants and their families can be spared the worst ravages of certain hereditary conditions, and we can even use genetics to look into our futures to foresee the onset of certain diseases. At the same time, with the exception of newborn genetic screening mandated by state laws, the various testing and screening approaches described are options, not requirements.

With opportunities come risks, and with the possibility of obtaining information, certain choices must be faced and certain decisions must be made: the choice of whether or not to obtain information through available genetic testing and the decision about what to do once the information is in hand. There is no universally right or wrong decision, nor is there one right decision for every situation. Whether for decision making today, using currently available knowledge and technology, or for decision making in the future, when additional information and choices are available, what is most important is the access to accurate information and the freedom to make the individual choices that are right for each person and each new situation. The roles of genetic counseling and related support services, as described in Application Chapter A: Human Hereditary Disease and Genetic Counseling, are and will continue to be integral parts of these information-gathering, information-delivery, and decision-making processes.
can deduce a genotype. If a genotype cannot be determined completely, list the alleles you know or deduce must be present.
   c. Explain why you are able to assign genotypes to the man’s parents despite their not being tested.

7. Diseases and conditions on the RUSP list are tested on every newborn infant, and if the baby has one of the conditions, the parents are immediately informed. What kind of information and counseling should be provided to the parents along with the diagnosis?

8. Do you think it is important that participation in community-based genetic screening be entirely voluntary? Why or why not?

9. If a man and a woman are each heterozygous carriers of a mutation causing a disease on the RUSP list, what do you think are the three or four most important factors they should consider in their decision making about having children?

10. Suppose a man and a woman are each heterozygous carriers of a mutation causing a fatal hereditary disease not on the RUSP list. Prenatal genetic testing can identify the genotype of a fetus with regard to this disease and can identify fetuses with the disease. What do you think are the three or four most important factors this couple should consider in their decision making about having children?

11. The most common reason a physician might recommend that a woman have maternal serum screening and a karyotype analysis is concern that her fetus may have Down syndrome. Log on to the OMIM website at www.ncbi.nlm.nih.gov/omim and look up Down syndrome (OMIM 190685).
   a. List the main symptoms of Down syndrome.
   b. Look at the “Mapping” and “Molecular Genetics” sections and describe what is meant by the Down syndrome critical region (DSCR).
   c. Summarize what is known about the location and genes found within the DSCR.
   d. How might those genes lead to the main symptoms of Down syndrome?

12. If you were to look up Gaucher disease on the OMIM website, you would see that there are three major types, designated Type I (OMIM 230800), Type II (OMIM 230900), and Type III (OMIM 231000). All three types are mutations of the gene for acid β-glucosidase, encoded on chromosome 1. Different mutations of this gene produce the three types of Gaucher disease that differ somewhat in their symptoms and disease severity.
   a. For each mutation, speculate about whether the acid β-glucosidase enzyme is merely reduced in function or whether its production is eliminated, and explain why.
   b. Thinking about the production or function of the acid β-glucosidase enzyme, why do you suppose different mutations of this gene produce differences in symptoms and disease severity?

13. Imagine yourself in the same position as Kristen Powers, faced with the decision of whether or not to undergo a genetic test that will discover if you have inherited Huntington’s disease. List five life decisions or choices that you think are likely to be affected by the results of the genetic test. Do you think you would make the same choice to test that Kristen made? Why or why not?

14. Select one of the hereditary conditions from either the RUSP core conditions list or the RUSP list of secondary conditions and do some online research to find the following information:
   a. The frequency of the condition in newborn infants (note any populations in which the condition is more frequent).
   b. The defect that characterizes the condition.
   c. The symptoms and consequences of the condition if it is not treated.
   d. The recommended treatment for those with the condition.
   e. The duration of treatment.
   f. The anticipated outcome if treatment is applied.
10 Eukaryotic Chromosome Abnormalities and Molecular Organization

CHAPTER OUTLINE
10.1 Chromosome Number and Shape Vary among Organisms
10.2 Nondisjunction Leads to Changes in Chromosome Number
10.3 Changes in Euploid Content Lead to Polyploidy
10.4 Chromosome Breakage Causes Mutation by Loss, Gain, and Rearrangement of Chromosomes
10.5 Chromosome Breakage Leads to Inversion and Translocation of Chromosomes
10.6 Eukaryotic Chromosomes Are Organized into Chromatin

ESSENTIAL IDEAS
❚❚ The unique chromosome content of each genome can be visualized and analyzed by microscopic and molecular methods to yield information about normal chromosomes and to compare between species.
❚❚ Nondisjunction causes changes in the number of chromosomes and may result in gametes containing the wrong chromosome number.
❚❚ Changes in the number of sets of chromosomes alter phenotypes and can confer evolutionary advantages.
❚❚ Chromosome breakage can change chromosome structure and may lead to loss or duplication of genes.
❚❚ Chromosome breakage can lead to chromosome inversions and translocations.
❚❚ Large amounts of protein affiliate with eukaryotic chromosomes to form a complex called chromatin that condenses chromosomes during cell division and plays an important role in regulating gene transcription.

[Chapter-opening figure caption] Chromosome translocations are mutations that rearrange chromosome structure. This electron micrograph shows two pairs of homologous chromosomes that have exchanged segments and must form a tetravalent structure involving the four chromosomes to synapse their homologous regions during prophase

The genome of a species is the totality of hereditary information carried in the DNA of the species. This information is contained in chromosomes. Bacterial and archaeal species generally carry all of their genomic information in a single chromosome. Some bacterial species have their genomes divided into two or more chromosomes, but all bacterial and archaeal species have only a single copy of each gene. As a consequence, these species are haploid, and the number of chromosomes they possess is represented by the variable n.

Eukaryotic genomes differ substantially from those of bacteria and archaea by having at least two copies of each
gene. All animal species and many plant species are diploids, having two gene copies in their genome. Their cell nuclei carry the characteristic diploid number of chromosomes for the species, a number described as 2n. Numerous plant species have more than two copies of each gene and therefore more than a diploid number of chromosomes. These species are polyploid and may have up to 12n or more as their chromosome number.

Each chromosome is composed of a single, long DNA molecule. The chromosomes of bacteria and archaea are associated with small amounts of protein that help compact the chromosome in cells. In contrast, eukaryotic chromosomes contain as much protein as they do DNA. The protein and DNA are combined in a complex called chromatin, and this complex is critical for accomplishing four essential functions. First, the chromatin helps compact chromosomes so they fit efficiently into the eukaryotic nucleus. Second, chromatin helps stabilize DNA and protects it from damage. Third, chromatin promotes chromosome condensation and decondensation that are required for cell division. Finally, chromatin is a major factor in regulating DNA replication and gene transcription.

We begin this chapter with a discussion of natural variation in chromosome number and structure among eukaryotic species. After that we look at several kinds of abnormalities of chromosome number and structure. We then return to normal chromosomes to describe the basic organization of chromatin. The latter discussion sets the stage for a more detailed examination in Section 13.2 of the role of chromatin and chromatin modification in the regulation of eukaryotic gene transcription.

10.1 Chromosome Number and Shape Vary among Organisms

The content of a genome, the number of chromosomes contained in a nucleus, and the relative size and shape of each chromosome are species-specific characteristics. Chromosome number varies widely among species, though closely related species tend to have similar numbers. Similarly, chromosome shapes vary, with three or four general chromosome shapes being found in most species. Each pair of chromosomes in a diploid genome is distinctive in the size, shape, and genetic content of the homologs, and these differences can be visualized by molecular and microscopic methods. The use of these methods enables researchers to identify individual chromosomes of genomes. It is important to note that even though chromosome numbers, sizes, and shapes are species-specific, none of these parameters is directly associated with the complexity of the organism (Table 10.1).

Table 10.1 Chromosome Number in Selected Animal Species
Species                                  Diploid Chromosome Number (2n)
Carp (Cyprinus carpio)                   104
Cat (Felis catus)                        38
Chicken (Gallus domesticus)              78
Chimpanzee (Pan troglodytes)             48
Cow (Bos taurus)                         60
Dog (Canis familiaris)                   78
Frog (Rana pipiens)                      26
Fruit fly (Drosophila melanogaster)      8
Horse (Equus caballus)                   64
Human (Homo sapiens)                     46
Mouse (Mus musculus)                     40
Rat (Rattus norvegicus)                  42
Rhesus monkey (Macaca mulatta)           42

Chromosomes in Nuclei

Early observers of chromosomes in the nucleus, including Edmund Beecher Wilson, Walter Sutton, and Theodore Boveri, hypothesized that chromosomes contained the genetic material and noticed that their movement and separation during meiosis, and their union at fertilization, mirrored the separation and transmission of genes. Biologists now know that these early investigators were correct, and contemporary biologists have learned a great deal about the structure of chromosomes and their behavior during the cell cycle.

Chromosome behavior during interphase has been of particular interest, since chromosomes are highly decondensed and difficult to visualize during this period. Cell biologists Thomas Cremer and Christoph Cremer have used specialized methods to determine that interphase chromosomes are partitioned into their own chromosome territories (Figure 10.1). A chromosome territory is a small region of the nucleus that is the domain of a single chromosome. It is not bounded by any sort of membrane, nor is it demarcated in any distinctive manner. Chromosomes do not occupy
[Figure residue: chromosome diagram labeled short arm (p), long arm (q), replicated centromere, and satellite (no p arm); human chromosome banding ideogram with numbered regions and bands, example band label 5q2.3.1]

some spreads are often photographed for karyotyping.

An international symposium in Paris, France, was convened in 1971 to agree on the standard banding pattern for each human chromosome as well as on a standardized nomenclature for identifying chromosome banding patterns based on karyotypes of metaphase chromosomes. This nomenclature remains in use today to ensure accuracy in identifying each chromosome and in describing any chro-
regions are lightly staining regions of G-banded chromosomes. Conversely, chromosome regions in which chromatin is tightly condensed are said to contain heterochromatin and are called heterochromatic regions. Heterochromatic regions contain many fewer expressed genes than do euchromatic regions. With fewer expressed gene sequences, heterochromatic DNA is more likely than euchromatic DNA to contain repetitive DNA sequences that may be located in multiple regions of the genome. In G-banded chromosomes, heterochromatin is identified as darkly staining chromosome regions and euchromatin as lightly staining regions.

We will return to the theme of chromatin condensation and gene transcription in the last section of this chapter, where we describe the fundamental molecular organization of chromatin and discuss a mutation in Drosophila that demonstrates the role of chromatin condensation in gene transcription. Chromatin and its role in regulating gene transcription is also discussed in Section 13.2.

Genetic Analysis 10.1 gives you practice with these concepts as you interpret the results of a hypothetical experiment involving the use of FISH probes that have unknown sequence targets within chromosomes.

10.2 Nondisjunction Leads to Changes in Chromosome Number

In Section 3.2, we discussed the connection between Mendel’s two laws of heredity and the disjunction of homologous chromosomes and sister chromatids during meiosis. In the discussion that now follows, we focus on nondisjunction, the failure of chromosomes and sister chromatids to properly disjoin during cell division. As we describe, nondisjunction is the cause of abnormalities of chromosome number in cells.

The changes in chromosome number we describe in this section exert their effects primarily by addition or removal of one or more chromosomes of the normal complement in a nucleus. Such changes are mutations that add or remove large numbers of genes. In animal species, but less so in plant species, these abnormalities almost always alter the phenotype and can affect development and reduce the fertility and viability of the affected organism.

Chromosome Nondisjunction

With a few unusual exceptions, the number of chromosomes is the same for males and females of a species, and the number of chromosomes in nuclei of normal cells is a multiple of the haploid number (n), the number in a single set of chromosomes. In nearly all animal species, the total chromosome number is 2n (diploid), but in plants, 3n (triploid) or higher multiples of n are relatively common. Chromosome numbers that are a multiple of the haploid number are identified as euploid. In contrast, the addition or removal of a chromosome alters the euploid number and generates a chromosome count known as aneuploidy (i.e., “not euploid”). Chromosome nondisjunction is the cause of aneuploidy.

Nondisjunction in germ-line cells produces aneuploid gametes, reproductive cells that have one or more extra or missing chromosomes. These errors lead to aneuploidy in fertilized eggs. Meiotic nondisjunction can occur in either meiosis I or II and most often affects just a single homologous pair or a single pair of sister chromatids in a gametocyte (gametocytes are the cells that undergo meiosis to produce gametes). Meiosis I nondisjunction is the failure of homologous chromosomes to separate. It results in both homologs moving to a single pole. One of the gametocytes produced in meiosis I contains both chromosomes, and the other contains neither chromosome (Figure 10.6). These gametocytes contain aneuploid chromosome numbers of n + 1 and n - 1 (assuming only one chromosome pair is affected). Meiosis II usually proceeds normally even when meiosis I is aberrant, and its completion sends the sister chromatids to different gametes. If nondisjunction occurs in meiosis I, each of the four resulting gametes is aneuploid, either n + 1 or n - 1. The union of an aneuploid gamete with a normal haploid gamete at fertilization results in a fertilized egg with an aneuploid number of chromosomes that will be either trisomic (2n + 1), having three of one of the chromosomes rather than a homologous pair, or monosomic (2n - 1), having just a single copy of one of the chromosomes rather than a homologous pair.

Nondisjunction occurring in meiosis II typically follows a normal meiosis I that produced normal secondary gametocytes, both containing the haploid (n) number of chromosomes (Figure 10.7). Since these gametocytes are separate cells, they independently divide during meiosis II; thus, if nondisjunction occurs, only one of the secondary gametocytes will be affected. Among the four resulting gametes, two are normal because a normal disjunction took place during each meiotic division. The other two gametes are aneuploid: one contains n + 1 chromosomes and the other n - 1 chromosomes. Trisomic or monosomic fertilized eggs are produced when one of these aneuploid gametes unites with a normal gamete at fertilization.

Gene Dosage Alteration

In 1913, at about the same time Calvin Bridges was demonstrating the chromosome theory of heredity by examining nondisjunction in fruit flies (see Section 3.3), Albert Francis Blakeslee and John Belling reported the phenotypic consequences of aneuploidy in the diploid (2n = 24) jimson weed (Datura stramonium), in which 12 chromosome pairs are identified as A to L. Blakeslee and Belling identified 12 phenotypically distinct lines of trisomic Datura, one for each of the chromosome pairs (Figure 10.8). Their results documented that aneuploidy causes phenotypic consequences. Over ensuing decades, this observation was expanded and it was found that aneuploidy profoundly
affects the phenotype and development of nearly all animal species. The effects on the phenotype of plants were also further documented.

Figure 10.6 Meiosis I nondisjunction. Homologous chromosomes fail to disjoin in meiosis I, and all resulting gametes are aneuploid. Fertilization by a normal haploid gamete produces fertilized eggs that are trisomic (2n + 1) or monosomic (2n - 1). [Diagram: a primary gametocyte (2n, A/a) undergoes nondisjunction in meiosis I; after meiosis II the gametes are (n + 1), (n + 1), (n - 1), and (n - 1); fertilization with a normal gamete (n) gives trisomic (2n + 1) or monosomic (2n - 1) fertilized eggs.]

The phenotypic and developmental abnormalities associated with aneuploidy result from changes in gene dosage, the number of copies of a gene in the genome. Aneuploidy changes the dosage of all the genes on the affected chromosome. In a diploid organism, where two copies of a gene, on a homologous pair of chromosomes, generate 100% of gene dosage, a monosomic mutant has just one gene copy and just 50% of normal gene dosage for each gene on the chromosome. In contrast, a trisomic mutant has three copies and 150% of normal gene dosage for each of the genes on the chromosome.
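
The dosage percentages quoted for monosomic and trisomic mutants follow from a simple ratio of copies present to the normal diploid copy number. The short Python sketch below illustrates that arithmetic only; the function name `gene_dosage_percent` and its parameters are hypothetical, introduced here for illustration, and the sketch assumes dosage scales linearly with copy number, as the text describes.

```python
def gene_dosage_percent(copies_present, normal_copies=2):
    """Gene dosage as a percentage of the normal diploid dosage.

    Assumes (as in the text) that two copies on a homologous pair
    correspond to 100% dosage and that dosage scales with copy number.
    """
    return 100.0 * copies_present / normal_copies


# Copy number of a gene on the affected chromosome in each case.
cases = [("monosomic", 1), ("normal diploid", 2), ("trisomic", 3)]
for label, copies in cases:
    print(f"{label}: {gene_dosage_percent(copies):.0f}% of normal gene dosage")
```

The same ratio applies to every gene on the affected chromosome, which is why a single extra or missing chromosome shifts the dosage of many genes at once.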
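[Figure 10.7 residue, caption not recovered: meiosis II nondisjunction. A primary gametocyte (2n, A/a) completes a normal meiosis I; nondisjunction in one secondary gametocyte during meiosis II gives gametes that are (n + 1), (n - 1), (n), and (n). Fertilization with a normal gamete (n) gives trisomic (2n + 1), monosomic (2n - 1), and normal diploid (2n) fertilized eggs.]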
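
The outcomes of meiosis I and meiosis II nondisjunction can be tabulated in a few lines of Python. This is a minimal sketch, assuming that only one chromosome pair is affected and tracking only the copy number of that chromosome in each gamete; the function names (`gametes_after_meiosis`, `classify_zygote`, `is_euploid`) are hypothetical and are not taken from the text.

```python
def gametes_after_meiosis(nondisjunction=None):
    """Copies of the affected chromosome in each of the four gametes.

    nondisjunction: None (normal meiosis), "I", or "II".
    A normal gamete carries 1 copy (n); 2 means n + 1; 0 means n - 1.
    """
    if nondisjunction is None:
        return [1, 1, 1, 1]
    if nondisjunction == "I":
        # Both homologs move to one pole, so all four gametes are aneuploid.
        return [2, 2, 0, 0]
    if nondisjunction == "II":
        # Only one secondary gametocyte is affected; the other divides normally.
        return [2, 0, 1, 1]
    raise ValueError("nondisjunction must be None, 'I', or 'II'")


def classify_zygote(gamete_copies, partner_copies=1):
    """Classify the fertilized egg formed with a normal haploid gamete."""
    total = gamete_copies + partner_copies
    labels = {1: "monosomic (2n - 1)", 2: "normal diploid (2n)", 3: "trisomic (2n + 1)"}
    return labels[total]


def is_euploid(chromosome_count, n):
    """A count is euploid when it is a whole multiple of the haploid number n."""
    return chromosome_count > 0 and chromosome_count % n == 0


for stage in (None, "I", "II"):
    zygotes = [classify_zygote(g) for g in gametes_after_meiosis(stage)]
    print(f"nondisjunction in meiosis {stage}: {zygotes}")
```

For a human cell with n = 23, `is_euploid(46, 23)` is true while `is_euploid(47, 23)` is false, matching the definition of aneuploidy as "not euploid."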
GENETIC ANALYSIS 10.1

PROBLEM Suppose Dr. O. Sophila receives three new FISH probes from a colleague with the request that Dr. Sophila’s laboratory determine the likely hybridization targets of the probes on human chromosomes. Each of the three FISH-probe designs contains a different nucleotide sequence and is labeled with a different-colored fluorophore. Chromosome spreads are prepared, and the FISH probes are added. The following results are obtained: Probe A is several dozen nucleotides in length, and it labels each chromosome centromere but no other parts of any chromosome; probe B is about a dozen nucleotides in length, and it labels the telomeres on every chromosome but no other parts of any chromosome; probe C is about a dozen nucleotides in length, and it labels a single spot on each copy of chromosome 4 at band position 4q3.2. Dr. Sophila asks you to ponder these experimental results and to help his colleague by hypothesizing about the likely sequence-binding target of each probe.

BREAK IT DOWN: Review the discussion of FISH on p. 364.

Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   This problem concerns the interpretation of hybridization results of FISH (fluorescent in situ hybridization) in human chromosomes. The answer must identify the likely target sequences detected by each of three FISH probes.
2. Identify the critical information given in the problem.
   Hybridization patterns for three FISH probes are described.

Deduce
3. Review your knowledge of the different portions of chromosomes to which these probes hybridize. [See discussions in Section 3.1 (centromeres) and Section 7.4 (telomeres).]
   Centromeres contain specialized DNA sequences that are bound by microtubules during cell division. Telomeres are located at chromosome ends and are composed of hundreds of copies of short, repetitive DNA sequences.
4. Recall the makeup of eukaryotic chromosomes in terms of their content of protein-coding genes and other types of DNA sequences.
   Euchromatic DNA contains most of the expressed genes. Heterochromatic DNA contains few expressed genes and is more likely than euchromatic regions to contain repetitive sequences.

Solve
5. Provide an interpretation of the DNA sequence targeted by probe A.
   By hybridizing exclusively to centromeric regions, probe A is likely to be targeting these specialized DNA sequences.
6. Provide an interpretation of the DNA sequence targeted by probe B.
   Hybridization exclusively to telomeres indicates that probe B is targeting the short repetitive DNA sequences of telomeres.
7. Provide an interpretation of the DNA sequence targeted by probe C.
   Probe C hybridizes to a single location on homologous copies of chromosome 4 that is most likely to be a protein-coding gene. The band 4q3.2 is a euchromatic region of the chromosome, where many expressed genes are located. The identity of the gene cannot be determined, however, without additional information.

For more practice, see Problems 11 and 28.
Changes in gene dosage lead to an imbalance of gene products from the affected chromosome relative to unaffected chromosomes, and this imbalance is at the heart of alterations of normal development and the production of abnormal phenotypes. Most animals are highly sensitive to changes in gene dosage, and their developmental biology, especially within the nervous system, does not proceed normally in the presence of gene dosage imbalance.

In contrast to animals, which are profoundly, often lethally, affected when aneuploidy occurs, gene dosage changes are more easily tolerated in many species of plants, owing in part to their having developmental programs that differ distinctly from those of animals. It is not unusual to find plant strains with more than two copies of each chromosome. We describe this situation in more detail in a later section.

Aneuploidy in Humans

Humans are enormously sensitive to changes in gene dosage, and almost all human aneuploidies are incompatible with life. Theoretically, there are potentially 24 different kinds of trisomy in humans: one for each autosome, and
Autosomal Aneuploidy
Trisomy 13    Patau syndrome                  1 in 15,000    Mental retardation and developmental delay, possible deafness, major organ abnormalities, early death
Trisomy 18    Edward syndrome                 1 in 8000      Mental retardation and developmental delay, skull and facial abnormalities, early death
Trisomy 21    Down syndrome                   1 in 1500      Mental retardation and developmental delay, characteristic facial abnormalities, short stature, variable life span

Sex-Chromosome Aneuploidy
47, XXY       Klinefelter syndrome (males)    1 in 1000      Variable secondary sexual characteristics, infertility, frequent breast swelling; no impact on mental capacity
47, XYY       Jacob syndrome (males)          1 in 1000      Tall stature common; possible reduction but not loss of fertility; no impact on mental capacity
47, XXX       Triple X syndrome (females)     1 in 1000      Tall stature common; possible reduction of fertility; menstrual irregularity; no impact on mental capacity
45, XO        Turner syndrome (females)       1 in 5000      No secondary sexual characteristics; infertility, short stature; webbed neck common; no impact on mental capacity
Maternal Age Range Total Live Births Studied Trisomy 21 Births Rate per 1000 Births
15–19 30,272 18 0.49
20–24 117,593 87 0.73
25–29 108,746 96 0.90
30–34 49,487 72 1.56
35–39 19,522 73 4.19
40–44 4880 73 18.02
45–49 304 19 55.02
Note: Data adapted from E. B. Hook and A. Lindsjo, Down syndrome in live births by single year maternal age interval in a Swedish study: Comparison with results from a New York State study. Am. J. Hum. Genet. 30 (1978): 19–27.
In the 500,000 or so follicles in each of the two fetal ovaries, meiosis reaches the point of homologous chromosome synapsis in prophase I and then arrests. At puberty, or at any point over a woman’s span of reproductive fertility, monthly hormone cycling reinitiates meiosis in a few follicles. Meiosis I (homologous chromosome separation) leads to an egg that is released into the fallopian tube. If the egg is fertilized by a sperm cell, meiosis II is stimulated to occur, the two parental haploid nuclei fuse, and fertilization is complete.

If maternal age at conception is functionally linked to the risk of trisomy 21, then researchers should find that nondisjunction errors in maternal meiosis I are more often the cause than are errors in maternal meiosis II. In fact, molecular genetic analysis of the chromosomes in infants with trisomy 21 has indeed determined that more than 90% of cases of trisomy 21 are attributable to a maternal nondisjunction, and that the majority of nondisjunction events are errors in meiosis I. Predominantly, infants with trisomy 21 have two different (homologous, nonidentical) copies of a maternal chromosome 21 and one copy of a paternal chromosome 21. This circumstance arises through maternal meiosis I nondisjunction.

Molecular and genomic analyses have also determined that a small number of genes on chromosome 21 are responsible for mental and developmental delays and heart abnormalities, which are the principal symptoms of trisomy 21. The critical portion of chromosome 21 for trisomy 21 is the Down syndrome critical region (DSCR). Its discovery came from the study of individuals with the symptoms of Down syndrome who have two complete copies of chromosome 21 and an additional fragment of a third copy of the chromosome. Only when the additional fragment contains the DSCR are the symptoms of trisomy 21 present.

Research in mice has pointed to a potential explanation for the role of DSCR in generating the symptoms of Down syndrome. Among a handful of candidate genes located in the DSCR, one gene, DYRK, has a homolog that produces dosage-sensitive learning defects. Mice with an extra copy of the DYRK homolog have a reduction in brain size. DSCAM is a second gene whose increased dosage is linked to Down syndrome. This gene also has homologs in mouse and Drosophila, where its protein product participates in the formation of the heart and components of the developing nervous system.

A different kind of change in gene dosage is seen in humans with Turner syndrome, a monosomy of the X chromosome in which there is one X chromosome but no second sex chromosome (see Table 10.2). Despite the occurrence of random X-inactivation in human female embryos that leads to one expressed X chromosome and one inactive X chromosome in each nucleus, two sex chromosomes are necessary for normal early development. In female embryos that are XO (Turner syndrome), the single copy of the gene SHOX, located in pseudoautosomal region 1 on the short arm of the X chromosome (and in males also on the Y chromosome; Section 3.2), is insufficient to direct certain aspects of normal development. The haploinsufficiency of SHOX appears to play a central role in producing Turner syndrome.

Genetic Analysis 10.2 guides you through an analysis of chromosome 21 nondisjunction.

Mosaicism

Our discussion of random X-inactivation of mammalian females in Chapter 3 identified the phenomenon as an example of naturally occurring mosaicism, in which different cells of the organism contain differently functioning X chromosomes (see Section 3.6). Mosaicism is the condition of being composed of two or more cell types having different genetic or chromosomal makeup. In addition to the random X-inactivation process, mosaicism can also develop as a consequence of mitotic nondisjunction early in embryogenesis. Mosaicism derived in this way is one of the many kinds of chromosome abnormalities that occur in newborn infants. For example, 25–30% of cases of Turner syndrome, the X-chromosome monosomy (XO), occur in females exhibiting mosaicism in which some cells are 45, XO and others are 46, XX. Some individuals with mosaic Turner syndrome carry 47, XXX cells as well. This kind of mosaicism is usually derived from mitotic nondisjunction in a 46, XX zygote (Figure 10.9).
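
How a single mitotic nondisjunction event can generate the three cell lines mentioned above can be traced with a small Python sketch. It is an illustration under stated assumptions, not a model from the text: the embryo is followed for only two rounds of division, nondisjunction is assumed to affect the sister chromatids of one X chromosome in one cell at the second division, and the helper names `divide` and `karyotype` are hypothetical.

```python
def divide(x_copies, nondisjunction=False):
    """Return the X-chromosome counts of the two daughter cells of a mitosis.

    With nondisjunction, both sister chromatids of one X go to the same
    daughter, so one daughter gains an X and the other loses one.
    """
    if nondisjunction:
        return [x_copies + 1, x_copies - 1]
    return [x_copies, x_copies]


def karyotype(x_copies, autosomes=44):
    """Format a female karyotype label such as '46, XX' or '45, XO'."""
    sex = "XO" if x_copies == 1 else "X" * x_copies
    return f"{autosomes + x_copies}, {sex}"


# 46, XX zygote: the first division is normal.
cells = divide(2)
# Second division: nondisjunction in one cell, normal division in the other.
cells = divide(cells[0], nondisjunction=True) + divide(cells[1])

lineages = sorted({karyotype(c) for c in cells})
print(lineages)
```

Under these assumptions the embryo contains 45, XO, 46, XX, and 47, XXX cell lines. If the nondisjunction event instead occurred at the very first division, no 46, XX line would remain in this simple sketch.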
Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   This problem deals with chromosome nondisjunction and requires a prediction of PCR results expected for different nondisjunction events.
2. Identify the critical information given in the problem.
   The four alleles inherited on parental copies of chromosome 21 are given. These four alleles are of different lengths, making it possible to identify each chromosome uniquely.

Deduce
3. Review the abnormal chromosome combinations that result from nondisjunction in meiosis I and meiosis II.
   TIP: See Figures 10.6 and 10.7, p. 367.
   Meiosis I nondisjunction is the failure of homologous chromosomes to disjoin. An abnormal secondary gametocyte contains homologous copies of the chromosome that are different from one another. Meiosis II nondisjunction is the failure of sister chromatids to disjoin. An abnormal secondary gamete contains identical copies of the chromosome.
4. Identify the alleles that would be present on the two copies of chromosome 21 produced by maternal meiosis I nondisjunction and by paternal meiosis I nondisjunction.
   The marker alleles present if meiosis I nondisjunction occurs will include the two alleles from one parent’s homologous copies of chromosome 21. These are 310 and 380 for maternal meiosis I nondisjunction and 290 and 340 for paternal meiosis I nondisjunction.
5. Identify the alleles that would be present on the two copies of chromosome 21 produced by maternal meiosis II nondisjunction and by paternal meiosis II nondisjunction.
   The marker alleles present if meiosis II nondisjunction occurs will include two identical markers from one parent. These are either both 310 or both 380 for maternal meiosis II nondisjunction, or both 290 or both 340 for paternal meiosis II nondisjunction.

Solve
Answer a
6. List the alleles expected if trisomy 21 is produced by maternal meiosis I nondisjunction.
   The gel for the trisomy 21 child’s DNA will have three bands, since all three chromosomes carry different PCR alleles. In the case of maternal meiosis I nondisjunction, two bands will be maternal and one will be paternal. The gel band patterns are either 310, 380, 290 or 310, 380, 340.
Answer b
7. List the alleles expected if trisomy 21 7. The gel will have two PCR bands—one representing the two identical
is produced by maternal meiosis II maternal alleles and the other representing the one paternal allele. There
nondisjunction. are four possible gel band patterns for the child with trisomy 21: (1) 310,
290; (2) 310, 340; (3) 380, 290; or (4) 380, 340.
Answer c
8. List the alleles expected if trisomy 8. The gel will have PCR bands for both of the paternal chromosomes and
21 is produced by paternal meiosis I one of the maternal chromosomes. There are two possible patterns for
nondisjunction. these three PCR gel bands: (1) 290, 340, 310; or (2) 290, 340, 380.
Answer d
9. List the alleles expected if trisomy 9. The gel will have two PCR bands—one representing the two identical
21 is produced by paternal meiosis II paternal alleles and the other representing the one maternal allele. There
nondisjunction. are four possible gel band patterns for the child: (1) 290, 310; (2) 290,
380; (3) 340, 310; or (4) 340, 380.
For more practice, see Problem 25. Visit the Study Area to access study tools. Mastering Genetics
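The reasoning in this analysis can be checked by enumerating the band patterns directly. The following is a minimal sketch, not part of the original problem. It assumes the marker lengths given in the solution steps (310 and 380 maternal, 290 and 340 paternal) and assumes that identical alleles co-migrate as a single band. The function name `band_patterns` is hypothetical.

```python
from itertools import product

# Marker allele lengths taken from the solution steps above.
MATERNAL = (310, 380)
PATERNAL = (290, 340)


def band_patterns(nondisjoining, normal, division):
    """Return the distinct sets of gel bands expected in a trisomic child.

    nondisjoining: the two alleles of the parent in whom nondisjunction occurred
    normal: the two alleles of the other parent, who contributes one chromosome
    division: "I" (homologs fail to disjoin) or "II" (sister chromatids fail)
    """
    if division == "I":
        # Both (different) homologs end up in the same gamete.
        disomic_gametes = [tuple(nondisjoining)]
    elif division == "II":
        # Two identical sister chromatids end up in the same gamete.
        disomic_gametes = [(allele, allele) for allele in nondisjoining]
    else:
        raise ValueError("division must be 'I' or 'II'")

    patterns = set()
    for pair, single in product(disomic_gametes, normal):
        # Assumption: identical alleles co-migrate, so only distinct
        # lengths appear as separate bands on the gel.
        patterns.add(frozenset(pair + (single,)))
    return patterns


if __name__ == "__main__":
    for label, parent, other in (("maternal", MATERNAL, PATERNAL),
                                 ("paternal", PATERNAL, MATERNAL)):
        for division in ("I", "II"):
            bands = sorted(sorted(p) for p in band_patterns(parent, other, division))
            print(f"{label} meiosis {division}: {bands}")
```

Under these assumptions, meiosis I nondisjunction gives two possible three-band patterns and meiosis II nondisjunction gives four possible two-band patterns, in line with answers a through d.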
and an n pollen could become 6n (autohexaploid) by a doubling of the chromosomes through mitotic nondisjunction. In all of the examples described here, all the chromosomes present in the polyploid cell originate from the same species; thus, these are examples of autopolyploidy.

The strawberries you eat each summer are autooctaploids (8n) and have had chromosome sets duplicated by the processes outlined here. Strawberries have a haploid number of n = 7 and a diploid number of 2n = 14, thus the commercial octaploid varieties contain 8n = 56 chromosomes. Octaploid strawberries are prized for their bright red color, sweet taste, and juicy texture. In these traits, as we describe below, strawberries reveal some of the reasons why agricultural products are so often polyploids. Octaploid strawberries are larger (and much better tasting) than their diploid counterparts (Figure 10.10).

In contrast to autopolyploids, the multiple sets of chromosomes in allopolyploids originate in different species. The union of a haploid gamete from species 1 (n1) and a haploid gamete from species 2 (n2) produces a hybrid organism that may have either an even number or an odd number of chromosomes, depending on the haploid number that is normal for each species. The chromosomes of the two contributing species are not homologous and may have difficulty pairing in meiosis. However, mitotic duplication of chromosomes doubles the total chromosome number and generates homologous pairs of chromosomes.

An example of these events is the emergence of a new species of salt grass, Spartina anglica, along the English coastline in the late 1800s. S. anglica is a naturally occurring allopolyploid possessing 122 chromosomes. It arose through the interspecific hybridization of native salt grass, Spartina maritima (2n = 60), with a non-native salt grass, Spartina alterniflora (2n = 62; Figure 10.11). Haploid gametes from the two parental species fused to produce an interspecific hybrid with 61 chromosomes. Chromosome nondisjunction that doubled the chromosome number to 122 generated fertility in the hybrid and stabilized its genome. With an even number of chromosomes, balanced gametes were able to form. This established the new species that grew vigorously and spread its range along the English coast.

[Figure 10.11: Spartina alterniflora (2n = 62) and Spartina maritima (2n = 60) undergo meiosis to produce gametes with n1 = 31 and n2 = 30. Gamete union yields an interspecific hybrid (n1 + n2 = 61) that is infertile due to nonhomology of chromosomes. Chromosome doubling by nondisjunction is followed by meiosis and gamete formation.]

Consequences of Polyploidy

Polyploids of plant species frequently occur naturally and are also produced by human manipulation. When produced for commercial purposes, plant polyploidy has three main consequences. First, fruit and flower size are increased. The nuclei and cells of polyploid strains are larger than those of
diploid strains, and many familiar fruit and vegetable varieties benefit from this effect. Apples (3n = 51), bananas (3n = 33), strawberries (8n = 56), peanuts (4n = 40), and potatoes (4n = 48) are just a few examples.

Increased fruit and flower size in polyploid plants comes at the cost of fertility, the second consequence. The problem is particularly acute for odd-numbered polyploids (3n, 5n, etc.), in which the odd number of chromosomes cannot be evenly divided at the first meiotic division. The result is an unequal distribution of chromosomes that makes almost all of the resulting gametes nonviable. In some cases, this reproductive disadvantage can be turned into commercial advantage: Certain “seedless” fruits and vegetables in the produce aisle of your local grocery store are odd-numbered polyploids.

The grass carp furnishes an animal example of the commercial benefits of infertility. While most animals do not tolerate polyploidy, there are some exceptions among certain fishes and amphibians, and the grass carp (Ctenopharyngodon idella) is one of them. It is a weed-eating fish that is being employed to reduce weed growth in more than 50 countries worldwide. Triploid grass carp are created by first artificially fertilizing carp eggs and then heat-shocking the newly fertilized eggs. Heat-shock causes the diploid fertilized eggs to divide unevenly, producing a triploid cell that goes on to develop into a fish that is fully viable. The triploid grass carp eat weeds vigorously and, in doing so, help reduce weed growth in bodies of water without the use of herbicides. As a consequence of their triploidy, however, the carp are infertile, so they are unable to reproduce and don’t invade the habitats into which they are introduced. The triploid grass carp must be restocked periodically if its continued presence is desired to control weed growth.

Polyploids exhibit a third characteristic of commercial importance: an increase in heterozygosity relative to diploids that comes about when inbred lines are crossed and is the basis of additional growth vigor. This phenomenon is known as hybrid vigor, and it consists of more rapid growth, increased production of fruits and flowers, and improved resistance to disease among the heterozygous (hybrid) progeny of inbred lines.

[Figure 10.12: Timeline (years BCE) of ancestral and modern wheat species. At about 12,000: Triticum searsii (possibly another Triticum species), wild grass, 2n = 14 (BB) × Triticum urartu, wild einkorn wheat, 2n = 14 (AA). Triticum monococcum, cultivated einkorn wheat, 2n = 14 (AA). Triticum dicoccoides, wild emmer wheat, 4n = 28 (AABB). At about 8000: × Triticum tauschii, wild grass, 2n = 14 (DD). Triticum dicoccum, cultivated emmer wheat, 4n = 28 (AABB). Triticum turgidum, durum pasta wheat, 4n = 28 (AABB).]

evolutionary impact of polyploidy more dramatically than Triticum aestivum, common bread wheat, and Triticum spelta, spelt wheat (Figure 10.12). Both of these species are allohexaploids. Their development came about through the
union of diploid genomes of three ancestral species in two hybridization events. This evolutionary history of modern wheat begins about 12,000 years ago with the hybridization of two diploid species that contain 14 chromosomes each. Einkorn wheat, T. monococcum, is a cultivated variety of wheat that can still be found around the world and is the modern form of wild einkorn wheat, T. urartu. Represented by the chromosome designation AA, T. urartu hybridized with a wild grass species, either T. searsii or T. tripsacoides, each with chromosomes represented as BB, to form an allotetraploid variety called emmer wheat, T. dicoccoides. Emmer wheat has 28 chromosomes and a chromosome formula AABB and was being cultivated approximately 8000 years ago when it underwent a second hybridization event with another wild diploid grass species, T. tauschii (chromosome formula DD), to form T. aestivum and T. spelta (both AABBDD), the modern allohexaploid species, which each have 42 chromosomes.

[Figure: (a) Loss of terminal fragment. A chromosome break in a wild-type chromosome (segments A through I) produces a partial deletion chromosome with a terminal deletion; the acentric fragment is lost in subsequent cell division. (b) Terminal deletion in cri-du-chat syndrome. Banding diagrams of a normal chromosome 5 and a terminal deletion chromosome 5, showing telomeres, centromere, p and q arms, and the break point.]
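The chromosome arithmetic behind these polyploid examples can be sketched in a few lines. This is an illustrative sketch only. The function names are hypothetical, and the triploid calculation rests on a simplifying assumption that is not stated in the text: each set of three homologs segregates 2:1 at random and independently of the others, and a gamete is treated as balanced only if it receives exactly one copy, or exactly two copies, of every chromosome.

```python
from fractions import Fraction


def allopolyploid_counts(n1, n2):
    """Chromosome numbers of an interspecific hybrid before and after doubling.

    n1, n2: chromosome numbers of the two gametes that fuse.
    Returns (hybrid number, number after chromosome doubling).
    """
    hybrid = n1 + n2
    return hybrid, 2 * hybrid


def triploid_balanced_gamete_fraction(x):
    """Fraction of balanced gametes from a triploid with x chromosome types.

    Assumption (not from the text): each trivalent segregates 2:1 at random
    and independently, so a gamete gets one copy of every type with
    probability (1/2)**x and two copies of every type with the same
    probability.
    """
    return 2 * Fraction(1, 2) ** x


if __name__ == "__main__":
    # Spartina anglica: gametes of 31 and 30 chromosomes.
    print("Spartina hybrid, doubled:", allopolyploid_counts(31, 30))

    # Wheat: AA x BB gametes (7 + 7), doubled to give emmer (AABB);
    # an emmer gamete (14) x a DD gamete (7), doubled to give AABBDD.
    _, emmer = allopolyploid_counts(7, 7)
    _, bread_wheat = allopolyploid_counts(emmer // 2, 7)
    print("Emmer wheat:", emmer, "Bread wheat:", bread_wheat)

    # Triploid banana, 3n = 33, so 11 chromosome types.
    print("Balanced gametes in a 3n = 33 triploid:",
          triploid_balanced_gamete_fraction(11))
```

The first function reproduces the 61 and 122 chromosomes of the Spartina hybrid and the 28 and 42 chromosomes of emmer and bread wheat. The second gives a rough sense of why almost all gametes of an odd-numbered polyploid are nonviable, within the limits of the stated model.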
Unequal Crossover

The process of reciprocal recombination (crossing over) achieves the recombination of alleles on homologous chromosomes without causing a gain or loss of chromosomal material that would result in mutation (see Sections 5.2 and 12.6). Occasionally, however, crossing over between homologs is inaccurate, resulting in chromosome mutations that are due to unequal crossover. These mutations result in the partial duplication and partial deletion of chromosome segments on the resulting recombinant chromosomes. An organism carrying one homolog with duplicated material

Figure 10.15 Unequal crossover in creation of Williams–Beuren syndrome. [Panel labels: Hybrid gene; 17 genes; No phenotypic abnormalities; …and a partial duplication chromosome with PMSA, PMSB, a hybrid gene, and duplication of the 17 genes.]
10.4 Chromosome Breakage Causes Mutation by Loss, Gain, and Rearrangement of Chromosomes 377
of PMS on each chromosome is looped out from each homolog during misalignment (Figure 10.15b). Unequal crossing over between the misaligned chromosomes results in one recombinant chromosome, a partial deletion chromosome 7, that results in WBS. This chromosome contains a nonfunctional hybrid PMSA/PMSB gene and is missing intact PMSA and PMSB genes as well as the 17 genes normally found between PMSA and PMSB (Figure 10.15c). The partial duplication chromosome (containing the PMSA and PMSB genes, a hybrid PMSA/PMSB gene, and duplicated copies of the 17 intervening genes) does not cause readily identifiable phenotypic abnormalities.

Detecting Duplication and Deletion

Large deletions or duplications of chromosome segments can be detected by microscopic examination that reveals altered chromosome banding patterns resulting from the structural change to the chromosome. Such deletions and duplications are generally quite large. In human chromosomes, duplications and deletions of about 100,000 to 200,000 base pairs are at the lower limit of chromosome banding visualization. Microdeletions and microduplications are considerably smaller and are generally not easily detected by chromosome banding analysis. Instead, molecular techniques such as FISH (fluorescent in situ hybridization; Section 10.1) can be used to detect the absence or duplication of a particular gene or chromosome sequence (Figure 10.16).

[Figure 10.16: FISH probes A, B, and C applied to (a) a wild-type chromosome, (b) a microinterstitial deletion, in which no fluorescence is detected from probe B, and (c) a microduplication.]

Q Refer back to Figure 10.14. If a fluorescent label for chromosome band 11p2 was used to stain different copies of the chromosome, each having one of the nine partial deletions shown, which partial deletion chromosomes would be labeled by fluorescence and which would not?

Irrespective of the mechanism that may have created a partial chromosome duplication or deletion, prophase I homologous chromosome synapsis during meiosis produces a telltale signature of their existence. Homologous pairs that are mismatched because one contains a large duplication or deletion will form an unpaired loop in synapsis (Figure 10.17). Along most of the length of the homologous pair, normal synaptic pairing occurs. But in regions of structural difference, the extra material present on one chromosome bulges out to allow synaptic pairing on either side. The material in the loop is normal genetic material if one chromosome carries a deletion, and it is duplicated genetic material if one homolog carries a duplication.

Figure 10.17 An unpaired loop at synapsis. The partial duplication heterozygote shown here has duplicated genetic material of bands 5 through 9. The extra material forms an unpaired loop at synapsis to allow homologous regions to align correctly.

Deletion Mapping

Pseudodominance is a genetic phenomenon that occurs when a normally recessive allele is “unmasked” and expressed in the phenotype because the dominant allele on the homologous chromosome has been deleted. Pseudodominance is used to map genes in deleted chromosome regions by a method known as deletion mapping.

We discussed a version of deletion mapping in Section 6.5 in connection with Benzer’s fine-structure analysis of the genes involved in bacterial lysis by bacteriophage. In that analysis, Benzer mapped mutations by ascertaining whether it was possible to form a wild-type lysis recombinant between a lysis-deficient phage with a point mutation (a revertible mutation) and one with a deletion mutation (a nonrevertible mutation). In studies using deletion mutation analysis in diploid organisms, the unmasking of a recessive allele (the observation of pseudodominance) is central to gene mapping. Figure 10.18 shows deletion mapping using pseudodominance to map the Notch gene (n) in Drosophila. The Notch gene resides on the X chromosome, and its location is revealed by the detection of pseudodominance in female fruit flies that are heterozygous for partial X-chromosome deletions. Pseudodominance appears when the portion containing the dominant allele has been deleted from one X chromosome, allowing the recessive allele that still resides on the other, intact X chromosome to be
378 CHAPTER 10 Eukaryotic Chromosome Abnormalities and Molecular Organization
[Figure grid: Drosophila X-chromosome regions 2D, 2E, 2F, 3A, 3B, 3C, 3D, 3E, 3F, and 4A, each subdivided into numbered bands, with gene positions z, w, rst, n, and dm marked above the grid.]

Partial deletion mutant    Phenotype
rJ1                        Dominant
258-42                     Dominant
62d18                      Pseudodominant
N71a                       Pseudodominant
264-32                     Pseudodominant
264-39                     Pseudodominant
Figure 10.18 Deletion mapping of the Drosophila Notch (n) gene. The open blue sections of the
grid without bisecting lines show the extent of each partial deletion of the Drosophila X chromosome
for six partial deletion mutants. The retention of the dominant character or the emergence of notch by
pseudodominance is indicated in the right-hand column. The smallest X-chromosome segment missing
from all pseudodominant mutants is region 3C7, indicating this as the location of the gene.
expressed. In the figure, the gray segments in the grid represent chromosome segments remaining on the partial deletion X chromosomes of six different mutants. The colored portions of the grid identify segments that have been deleted from that chromosome in each mutant. The first two partial deletions (rJ1 and 258-42) do not lead to pseudodominance (in other words, the dominant wild-type phenotype is observed), indicating that the regions deleted do not contain the Notch gene. The next two partial deletions, 62d18 and N71a, do result in pseudodominance (in other words, the recessive phenotype is observed), indicating that the Notch gene locus containing the dominant allele is in the region 3C4 to 3C8. To home in on the location of Notch, progressively smaller partial deletions are used to identify the smallest deletion segment common to all deletions resulting in pseudodominance. In this instance the smallest partial deletion common to genomes expressing pseudodominance for Notch is region 3C7, which is missing from mutant 264-39. This is where the gene resides. Genetic Analysis 10.3 guides you through analysis of deletion mapping.

10.5 Chromosome Breakage Leads to Inversion and Translocation of Chromosomes

Chromosome breakage involves double-strand DNA breaks that sever a chromosome. Breakage that is not followed by reattachment of the broken segment leads to partial chromosome deletion, but what happens if the broken chromosome reassembles with the broken segment reattached in the wrong orientation or if the broken segment reattaches to a nonhomologous chromosome? The answers are that reattachment in the wrong orientation produces a chromosome inversion, whereas attachment to a nonhomologous chromosome results in chromosome translocation. We discuss two types of chromosome inversion events and two types of chromosome translocation in this section. A repeating theme that emerges is that as long as no critical genes or regulatory regions are mutated by chromosome breakage, and as long as dosage-sensitive genes are retained in their proper balance, individuals that have a chromosome inversion or a chromosome translocation might not experience any phenotypic abnormalities. However, complications during meiosis may affect the efficiency of chromosome segregation, and fertility may be affected in those individuals.

Chromosome Inversion

Chromosome inversions occur as a result of chromosome breaks followed by reattachment of the free segment in the reverse orientation. Two kinds of chromosome inversion are observed, depending on whether the centromere is part of the inverted segment (Figure 10.19). Paracentric inversion results from the inversion of a chromosome segment on a single arm and does not involve the centromere, whereas pericentric inversion reorients a chromosome segment that includes the centromere.

Inversion most commonly affects just one member of a homologous pair of chromosomes, and individuals who have one inverted chromosome and a homologous chromosome without the inversion are designated as inversion heterozygotes. The definition might be more specific (for instance, paracentric inversion heterozygote or pericentric inversion heterozygote) if the type of inversion is known.

Chromosome inversion causes a difference in linear order of genes on homologous chromosomes by a 180-degree reorientation of the inverted segment. Again, if the chromosome breakage event leading to the inversion
GENETIC ANALYSIS 10.3

PROBLEM In Drosophila, the X-linked recessive mutant traits singed bristle, lozenge eye, and cut wing are encoded at linked genes. Five strains of Drosophila produced by the cross of pure-breeding wild-type and pure-breeding mutant flies (SLC/SLC × slc/slc) are expected to have the trihybrid genotype SLC/slc and express the wild-type phenotypes. Females of each strain exhibit pseudodominance for one or more of the traits, however, due to partial deletion of the X chromosome. Comparative X-chromosome maps showing the extent of deletions in each pseudodominant strain (indicated by dashed lines) are given here along with the pseudodominant phenotypes found in each strain. Use this information to locate each gene as accurately as possible along the X chromosome.

BREAK IT DOWN: Pseudodominance can emerge in heterozygous organisms when the dominant allele on one copy of a chromosome pair is deleted, leaving only the recessive allele on the unaltered chromosome (p. 377).

BREAK IT DOWN: Gene mapping by pseudodominance seeks to identify the smallest region of chromosome that might contain a particular gene (p. 378).

[Figure: X chromosome map scaled from 2 to 20 map units, showing the deletion in each strain as a dashed line.]

Strain      Pseudodominant phenotype
Strain 1    singed
Strain 2    singed, cut
Strain 3    lozenge
Strain 4    singed, cut
Strain 5    cut

Deduce
3. Review the meaning of pseudodominance and the connection between chromosome deletion and pseudodominance.
   Pseudodominance is the appearance of a recessive trait in a presumed heterozygous organism due to deletion of a chromosome segment carrying the dominant allele. In deletion mapping using pseudodominance, the location of a gene maps to the smallest common deletion region shared by all organisms expressing the pseudodominant trait.

Solve
4. Interpret the meaning of the pseudodominant phenotype in strain 1.
   TIP: Compare deletion mutants that share pseudodominance phenotypes to see where their deletions overlap.
   Strain 1 is missing chromosome material from the 8th to the 14th map unit. The appearance of the pseudodominant phenotype singed indicates that the singed gene maps to this interval.
5. Compare strain 2 with strain 1, and interpret the meaning of the new pseudodominant phenotype cut.
   Strain 2 has a deletion from map units 4 to 13 that includes both singed and cut. This narrows the location of singed to the interval between 8 and 13 map units. The cut location is between the 4th and 8th map unit, based on its appearance with the deletion of this interval.
6. Assess pseudodominance of strain 3.
   Co-occurrence of the deletion between map units 16 and 20 and the appearance of the pseudodominant lozenge phenotype maps the lozenge gene to this location.
7. Assess strains 4 and 5, and refine the locations of the genes further where possible.
   TIP: Again, compare deletion mutants that share pseudodominance phenotypes to see where their deletions overlap.
   Strain 4 contains a deletion between map units 4 and 12, confining the location of singed to the interval between 8 and 12. This strain provides no additional information about the location of cut. The deletion between map units 3 and 6 in strain 5 includes cut and refines its location to between map units 4 and 6.
8. Identify gene locations based on the deletion-mapping analysis.
   Based on the data for pseudodominance in these five strains, cut resides in the interval between units 4 and 6, singed lies between 8 and 12, and lozenge is between 16 and 20.

For more practice, see Problems 8, 22, and 24. Visit the Study Area to access study tools. Mastering Genetics
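The logic of this analysis can be expressed as interval arithmetic: a gene must lie in the region shared by every deletion that unmasks its recessive trait, and it cannot lie in any deletion that leaves the trait dominant. The sketch below is one possible way to encode that reasoning, not a method given in the text. The deletion endpoints are those stated in the solution steps, and the names `STRAINS`, `intersect`, `subtract`, and `map_gene` are hypothetical.

```python
# Deletion interval (start, end) in map units and the pseudodominant
# phenotypes of each strain, taken from the solution steps above.
STRAINS = {
    "Strain 1": ((8, 14), {"singed"}),
    "Strain 2": ((4, 13), {"singed", "cut"}),
    "Strain 3": ((16, 20), {"lozenge"}),
    "Strain 4": ((4, 12), {"singed", "cut"}),
    "Strain 5": ((3, 6), {"cut"}),
}


def intersect(a, b):
    """Overlap of two intervals, or None if they do not overlap."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None


def subtract(segments, removed):
    """Remove one interval from a list of intervals."""
    result = []
    for lo, hi in segments:
        if removed[1] <= lo or removed[0] >= hi:
            result.append((lo, hi))          # no overlap, keep as is
            continue
        if removed[0] > lo:
            result.append((lo, removed[0]))  # piece left of the removed span
        if removed[1] < hi:
            result.append((removed[1], hi))  # piece right of the removed span
    return result


def map_gene(strains, trait):
    """Return the interval(s) in which the gene for `trait` can reside."""
    shared = None
    for deletion, phenotypes in strains.values():
        if trait in phenotypes:
            shared = deletion if shared is None else intersect(shared, deletion)
            if shared is None:
                return []
    if shared is None:
        return []
    segments = [shared]
    # Deletions that do not unmask the trait cannot contain the gene.
    for deletion, phenotypes in strains.values():
        if trait not in phenotypes:
            segments = subtract(segments, deletion)
    return segments


if __name__ == "__main__":
    for trait in ("cut", "singed", "lozenge"):
        print(trait, map_gene(STRAINS, trait))
```

With the five strains above, the sketch places cut between map units 4 and 6, singed between 8 and 12, and lozenge between 16 and 20, matching step 8.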
[Figure 10.19: Chromosome inversion. Paracentric inversion: chromosome breakage, free-segment rotation, paracentric inversion, and the paracentric inversion heterozygote. (b) Pericentric inversion: breakage and the inverted segment.]

[Figure: Crossover between homologs in the inversion loop of a paracentric inversion heterozygote (chromatids 1 to 4), followed by anaphase I migration, producing dicentric and acentric products.]
tion (the dot represents the centromere). The recombinant chromosomes, however, are abnormal: One is a dicentric chromosome with two centromeres (2 • ABCDA • 4), and the other is an acentric fragment that has no centromere (2′ IHGFEDCBEFGHI 4′). At anaphase I, when centromeres on homologous chromosomes normally migrate toward opposite poles, a dicentric bridge forms as the dicentric chromosome is pulled toward both poles of the cell. Eventually the bridge snaps under the tension, at a random break point. Both products of the break have a centromere, but both are also missing genetic material. In contrast, the acentric fragment, lacking a centromere, has no mechanism by which to migrate to a pole of the cell and will be lost during meiosis. The completion of meiosis of this paracentric inversion heterozygote results in two viable gametes, one with the normal-order chromosome (1 • ABCDEFGHI 1′) and one with the inverted-order chromosome (3 • ADCBEFGHI 3′), and two nonviable gametes with partial deletion chromosomes.

Crossover in the inversion loop in a pericentric inversion heterozygote also yields two viable gametes and two nonviable gametes (Figure 10.21). One viable gamete contains the normal-order chromosome (1 ABCDE • FGHI 1′) and one contains the inversion-order chromosome (3 ABCHGF • EDI 3′). Each of the two nonviable gametes has a combination of deletions and duplications (2 ABCDE • FGHCBA 4 and 4′ IDE • FGHI 2′).

Figure 10.21 The consequences of crossover in the inversion loop in pericentric inversion heterozygotes. [Panels: crossover between homologs; anaphase I migration; meiosis II completion. Products: normal chromosome (viable), two duplication/deletion chromosomes (nonviable), and inversion chromosome (viable). Crossover in the inversion loop results in two viable gametes and two nonviable gametes.]

Q If two chromosome homologs each contain the same inversions, would they form an inversion loop? Why or why not?

Three observations about recombination in inversion heterozygotes have important genetic implications:

1. The probability of crossover within the inversion loop is linked to the size of the inversion loop. Small inversions produce small inversion loops that have a low frequency of crossover. On the other hand, larger inversions produce loops that span more of the chromosome and correlate with a higher probability of crossover.

2. Inversion suppresses the production of recombinant chromosomes. The viable gametes produced by inversion heterozygotes contain either the normal-order chromosome or the inversion-order chromosome, but no recombinant chromosomes are viable, due to duplications and deletions of chromosome segments. The absence of recombinant chromosomes in progeny is identified as crossover suppression. In reality, crossovers do occur between homologous chromosomes carried by inversion heterozygotes, but because the recombinant chromosomes contain duplications and deletions, there is little possibility of viability for any progeny formed from the gametes that contain them. Geneticists have taken advantage of crossover suppression in research by marking crossover-suppressed chromosomes with dominant alleles that aid in the interpretation of genetic crosses. Experimental Insight 10.1 describes research by Hermann Muller, who used the so-called ClB (“See-el-bee”)
mutation. They can be used in additional studies to characterize the nature of the lethal mutation.

Identifying X-ray–induced recessive lethal mutations using the ClB method is highly accurate: It requires only a determination of whether or not males are produced by Cross II. Nonlethal X-ray–induced mutations can also be identified, by examining the males produced by Cross II. Muller used the ClB method to demonstrate that X-ray exposure induces mutations at a rate more than 150 times greater than the spontaneous mutation rate in Drosophila. His work led to the characterization of many of these mutations and to the identification of the linear relationship between the level of X-ray exposure and the frequency of induced lethal mutations.
chromosome to identify and later investigate lethal X-linked mutations induced in Drosophila by X-ray exposure.

3. Fertility may be altered if an inversion heterozygote carries a very large inversion. When an inversion spans all or nearly all the length of a chromosome, any crossover that occurs will produce two viable and two nonviable gametes. This means that approximately half the gametes will be lost in the specific case of an inversion heterozygote who carries a very large inversion. No such loss of fertility is expected for organisms with small inversions.

Chromosome Translocation

Chromosome translocation takes place when chromosome breakage is followed by the reattachment of a broken segment to a nonhomologous chromosome. Once again, if no critical genes are severed or have their regulatory regions disrupted by the breakage or translocation events, translocation heterozygotes, with one normal chromosome and one altered chromosome in each affected homologous pair, may display no outward phenotype effects. Even if no phenotypic abnormalities are detected, however, certain translocation heterozygotes can experience semisterility as a result of abnormalities of chromosome segregation, as described below.

Three principal types of translocation are observed. Unbalanced translocation arises from a chromosome break and subsequent reattachment of the fragment to a nonhomologous chromosome in a one-way event; that is, a piece of one chromosome is translocated to a nonhomologous chromosome and there is no reciprocal event (Figure 10.22a). Reciprocal balanced translocation is produced when breaks occur on two nonhomologous chromosomes and the resulting fragments switch places when they are reattached (Figure 10.22b). Robertsonian translocation, also known as chromosome fusion, involves the fusion of two nonhomologous chromosomes. Chromosome fusion is accompanied by loss of one of the centromeres and by the loss of a chromosome short (p) arm (Figure 10.22c). The chromosomes involved in Robertsonian translocation are usually acrocentric or telocentric chromosomes. These have little or no genetic information in the short arm, thus the organism does not suffer harm by the chromosome fusion event. Were chromosome fusion to lead to the loss of critical genes, the organism would not survive. One consequence of Robertsonian translocation is the reduction of chromosome number.

Patterns of Reciprocal Balanced Translocation

In reciprocal balanced translocation, one member of each homologous pair is altered by translocation, and none of the four chromosomes has a fully homologous partner. Instead, the translocated chromosome segments homologous to the normal member of each pair are dispersed on two other chromosomes. The absence of complete homology between chromosome pairs requires formation of an unusual tetravalent synaptic structure, a cross-like configuration made up of the four chromosomes related by the translocation, to enable homologous regions to synapse during metaphase I, as shown in Figure 10.23. The chromosomes in the figure are labeled I, II, III, and IV so that we may more easily follow their progress in meiosis and meiotic outcomes.

Two main patterns of chromosome segregation emerge from the tetravalent structures found in translocation heterozygotes. Alternate segregation and adjacent-1 segregation each occur in approximately 50% of meiotic divisions, although the actual proportions vary somewhat among different species. At anaphase I in alternate segregation, chromosomes I and IV move to one cell pole and chromosomes II and III move to the opposite pole. At the completion of meiosis, all gametes are viable because each contains a complete set of genetic information for the two chromosomes. Fertilization of a gamete containing chromosomes I and IV will produce a normal zygote, whereas fertilization of a gamete containing chromosomes II and III will produce a zygote with reciprocal balanced translocation heterozygosity, like the parent chromosomes at the top of the figure.

In anaphase I of adjacent-1 segregation, chromosomes I and III are moved to one cell pole and chromosomes II and IV go to the opposite pole. None of the gametes formed by this pattern of segregation is viable because of duplications and deletions of genetic information. Gametes containing chromosomes I and III have a duplication of the F and G regions, along with deletion of the R and S regions. Conversely, gametes containing chromosomes II and IV have a duplication of the R and S regions and a deletion of regions F and G.
384 CHAPTER 10 Eukaryotic Chromosome Abnormalities and Molecular Organization
[Figure 10.23, image not reproduced. Tetravalent complex at metaphase I formed by chromosomes I (ABCDEFG), II (ABCDESR), III (GFTUV), and IV (RSTUV). Gametes shown: alternate segregation, I + IV and II + III; adjacent-1 segregation, I + III and II + IV; adjacent-2 segregation, I + II and III + IV.]
Alternate segregation separates homologous centromeres and produces normal gametes.
Adjacent-1 segregation separates homologous centromeres and produces nonviable gametes with duplications and deletions.
Adjacent-2 segregation is very rare because it does not separate homologous centromeres; gametes are nonviable due to duplications and deletions.
Conclusion: Only alternate segregation produces viable gametes and progeny. This segregation pattern occurs in about half of meioses and accounts for semisterility of translocation heterozygotes.
Figure 10.23 The tetravalent synaptic structure and alternate and adjacent chromosome segregation in reciprocal balanced translocation heterozygotes.
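The segregation outcomes summarized for Figure 10.23 can be checked by simple bookkeeping of chromosome segments. The following Python sketch is only an illustration: the segment strings are read from the figure labels, and the function and variable names (`gamete_report`, `is_balanced`, `SEGREGATION`) are hypothetical names chosen here, not part of the text.

```python
from collections import Counter

# Segment content of the four chromosomes as labeled in Figure 10.23.
CHROMOSOMES = {
    "I": "ABCDEFG",    # normal chromosome
    "II": "ABCDESR",   # translocated chromosome
    "III": "GFTUV",    # translocated chromosome
    "IV": "RSTUV",     # normal chromosome
}

# A balanced gamete carries every segment exactly once.
BALANCED = Counter("ABCDEFG" + "RSTUV")

# Chromosome pairs delivered to each pole under the three patterns.
SEGREGATION = {
    "alternate": [("I", "IV"), ("II", "III")],
    "adjacent-1": [("I", "III"), ("II", "IV")],
    "adjacent-2": [("I", "II"), ("III", "IV")],
}


def gamete_report(pair):
    """Return (duplicated, deleted) segment lists for a gamete's chromosome pair."""
    counts = Counter("".join(CHROMOSOMES[name] for name in pair))
    duplicated = sorted(seg for seg in BALANCED if counts[seg] > 1)
    deleted = sorted(seg for seg in BALANCED if counts[seg] == 0)
    return duplicated, deleted


def is_balanced(pair):
    """A gamete is balanced when no segment is duplicated or deleted."""
    duplicated, deleted = gamete_report(pair)
    return not duplicated and not deleted


if __name__ == "__main__":
    for pattern, gametes in SEGREGATION.items():
        for pair in gametes:
            dup, dele = gamete_report(pair)
            print(pattern, pair, "duplicated:", dup, "deleted:", dele)
```

Under these assumptions, only the two alternate-segregation gametes come out balanced, which matches the conclusion stated with the figure.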
10.6 Eukaryotic Chromosomes Are Organized into Chromatin
We return now to the subject of chromosome structure and the chromatin organization of eukaryotic chromosomes. This molecular organization is essential to the normal function and distribution of chromosomes in cell division, and it plays a pivotal role in the regulation of gene expression that typifies all kinds of eukaryotic cells. In this section we describe the typical chromatin organization of chromosomes and its impact on chromosomes throughout the cell cycle. At the end of this section we introduce the concept of the influence of chromatin compaction on gene expression by examining a mutant phenotype in fruit flies. In Section 13.2 we take up further discussion of the regulation of gene expression in eukaryotes, exploring the normal and essential role chromatin plays in these processes.
Histone^a   Ratio of Basic/Acidic Amino Acids   Molecular Weight (D)   Number of Amino Acids   Location
H1          5.4                                 23,000                 224                     Linker DNA
H2A         1.4                                 13,960                 129                     Nucleosome
H2B         1.7                                 13,774                 125                     Nucleosome
H3          1.8                                 15,273                 135                     Nucleosome
H4          2.5                                 11,236                 102                     Nucleosome
^a Histone proteins from calf thymus gland.
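As a small worked example of using the table, the approximate mass of one histone octamer can be obtained by summing the tabulated molecular weights. This sketch assumes the standard composition of the nucleosome core particle, two copies each of H2A, H2B, H3, and H4; the names `HISTONE_MW` and `octamer_mass` are hypothetical and introduced only for illustration.

```python
# Molecular weights in daltons, copied from the histone table (calf thymus).
HISTONE_MW = {
    "H1": 23_000,
    "H2A": 13_960,
    "H2B": 13_774,
    "H3": 15_273,
    "H4": 11_236,
}

CORE_HISTONES = ("H2A", "H2B", "H3", "H4")
COPIES_PER_OCTAMER = 2  # assumption: two copies of each core histone


def octamer_mass():
    """Approximate mass (D) of one histone octamer, excluding H1 and DNA."""
    return COPIES_PER_OCTAMER * sum(HISTONE_MW[h] for h in CORE_HISTONES)


if __name__ == "__main__":
    print(octamer_mass())
```

The result is about 108,000 D of protein per nucleosome core particle, before the mass of the wrapped DNA is counted.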
FOUNDATION FIGURE 10.24 [image not reproduced]. Labels recovered from the figure, in order of increasing compaction: nucleosomes and histone octamers (10-nm fiber); histone H1 and the solenoid, 34 nm (30-nm fiber) (c); looped chromatin on scaffold proteins (extended chromatin, 300-nm fiber) (d); coiled chromosome arm, 700 nm, with sister chromatids and centromere (condensed chromatin, 1400 nm) (e).
The 146 bp of DNA wrapped around a nucleosome core particle is called core DNA, and the combination of a nucleosome core particle wrapped with core DNA is identified as a nucleosome. Electron micrographs of chromatin fibers in a highly decondensed state show a regular series of circular structures strung together by connecting filaments (see Figure 10.24b). This form of chromatin is identified as the “beads on a string” morphology of chromatin. The “beads” are nucleosomes that are a little more than 11 nm in diameter, and the “string” is called linker DNA. Linker DNA is the DNA between regions of core DNA.
The length of linker DNA segments varies among organisms, although in each species it is a consistent length and thus nucleosomes occur at regular intervals. In the yeast Saccharomyces cerevisiae, linker DNA is 13 to 18 bp in length. Linker DNA is about 35 bp long in the fruit fly Drosophila. In humans and other mammals, linker DNA spans about 40 to 50 bp; in sea urchins, linker DNA is very long, approximately 110 bp. If the 146-bp length of core DNA is added to the length of linker DNA, the nucleosome repeat distance of the beads-on-a-string structure is approximately 160 to 260 bp. This beads-on-a-string form of chromatin is identified as the 10-nm fiber, since the diameter of nucleosomes is approximately 10 nm.
This nucleosome-based model of chromatin was proposed by Roger Kornberg in 1974. Kornberg based his model on biochemical observations that chromatin contains a ratio of one molecule of each of the four core histone proteins (H2A, H2B, H3, and H4) to each 100 base pairs and one molecule of the histone H1 to each 200 base pairs.
Structural protein imaging (described momentarily) supported Kornberg’s model, but the molecular proof of the model’s validity came from research by Markus Noll, who treated eukaryotic chromatin with different concentrations of the enzyme DNase I to cut DNA where it is not protected by bound proteins. Recall from Research Technique 8.1 (pp. 288–289) and discussion in Section 8.3 in connection with DNA footprint-protection analysis that DNase I cuts DNA that has no protein bound to it but is unable to cut DNA in regions bound by protein. Noll’s most important result was obtained by mixing mammalian chromatin with a high concentration of DNase I and using gel electrophoresis to determine that the DNA fragments produced by DNase I digestion were approximately 200 bp long. This is precisely the length Kornberg predicted, as it is the sum of the approximately 145 bp of DNA wrapping a nucleosome core particle and the 55 bp of linker DNA between nucleosomes.
Kornberg’s model was also supported by structural protein studies, X-ray diffraction imaging, and cryogenic electron microscopy (cryo-EM). These imaging methods have produced detailed images of nucleosome structure and revealed the likely points of interaction between the octameric nucleosome core particle and core DNA. Timothy Richmond and his colleagues have described the crystal structure of the nucleosome using X-ray crystallography at 2.8-Å resolution (Figure 10.25). Richmond’s analysis indicates that there are 1.65 turns of core DNA around each nucleosome core particle. The analysis identifies additional molecular interactions between the N-terminal (amino terminal) tails of histone proteins and core and linker DNA. These interactions are critically important to the types of chromatin structure present in different regions of eukaryotic chromosomes.
The 10-nm fiber is an unnatural state for chromatin. To achieve it, chromatin must be chemically treated and held in conditions that are not found in cells. Under in vitro conditions, chromatin forms the 30-nm fiber, although it is not certain this structure forms in vivo (see Figure 10.24c). Electron micrographs and molecular modeling help us visualize how the 30-nm fiber is assembled. It is produced by coalescence of the 10-nm fiber into a cylindrical filament of coiled nucleosomes that is hollow in the middle. Due to its coiled structure and open middle, the 30-nm fiber is often also called the solenoid structure (like the coil of wire in
the starter of a car). Each turn of the solenoid structure contains six to eight nucleosomes. The diameter of the solenoid is approximately 34 nm. Research examining in vivo chromatin structures may soon be able to determine whether solenoids occur in cells or only in vitro.
The histone protein H1 plays a key role in stabilizing the solenoid structure. The long N-terminal and C-terminal ends of the H1 protein attach to adjacent nucleosome core particles. H1 protein pulls the nucleosomes into an orderly solenoid array and lines the inside of the structure. Experimental analysis shows that chromatin from which H1 has been removed can form 10-nm fibers but not 30-nm fibers. Chromatin exists in a 30-nm-fiber state or a more condensed state during interphase.
Higher Order Chromatin Organization and Chromosome Structure
Beyond the 30-nm stage, chromatin compaction and the presence of nonhistone proteins are integral to the structure of chromosomes and the process of chromosome condensation that initiates with the onset of prophase in the M phase of the cell cycle. Nonhistone proteins perform multiple roles in influencing chromosome structure and in facilitating M phase chromosome condensation. Interphase chromosome structure results from the formation of looped domains of chromatin similar to supercoiled bacterial DNA (see Figure 10.24d). The loops are variable in size, containing from tens to hundreds of kilobase pairs and consisting of 30-nm-fiber DNA looped on a category of nonhistone proteins that are the foundation of chromosome shape. The diameter of looped chromatin is approximately 300 nm, so looped chromatin is called the 300-nm fiber. With continued condensation, the chromatin loops form the sister chromatids. In metaphase, chromosome condensation reaches its zenith, resulting in chromosomes that are easily visualized by microscopy (see Figure 10.24e).
The chromosome scaffold is a filamentous framework made up of a large number of distinct nonhistone scaffold proteins. The chromosome scaffold gives a chromosome its shape. The scaffold is in some ways like the steel infrastructure that provides the shape, strength, and support for a building. In the case of chromosomes, the chromatin is “hung” on the scaffold. Figure 10.26a shows a fully condensed chromosome at metaphase, and Figure 10.26b shows the protein scaffold of a metaphase chromosome after being stripped of DNA. The shape of the chromosome scaffold is clearly reminiscent of the metaphase chromosome structure, consisting of sister chromatids joined at the centromere, which is visible as a constriction near the midpoint of the scaffold. The stringy gray material surrounding the scaffold is the DNA of the chromosome.
Chromatin loops containing 20,000 to 100,000 bp of DNA are anchored to the chromosome scaffold by other nonhistone proteins at sites called matrix attachment regions (MARs). Contemporary models of chromatin organization predict that the chromatin loops progressively consolidate and are further compressed by nonhistone proteins. Ultimately, the compaction of chromatin achieved by metaphase is approximately a 250-fold compaction of the 300-nm fiber, which, as shown in Foundation Figure 10.24, already represents significant compaction.
Nucleosome Disassembly, Synthesis, and Reassembly during Replication
Our current discussion of histones as a basic organizing element of chromatin and our discussion of DNA replication in Chapter 7 invite a question about the chromatin organization of newly synthesized DNA. Specifically, when the amount of DNA doubles during S phase, does the number of nucleosomes also double to organize the newly synthesized DNA? If so, how are the new nucleosomes constructed? It would be of interest to know whether old nucleosomes are recycled during replication or whether the new nucleosomes are composed entirely of newly produced proteins.
Experimental research has answered these questions. The evidence collected by numerous investigators finds that
Looking at the chromosomes of mutant flies with variegated eye color, Muller discovered that the X chromosome had undergone an inversion. Exposure to X-rays had broken the X chromosome, and the broken ends reattached to form a paracentric inversion. Muller examined the banding patterns in the inverted X chromosome and noticed that variegated flies had a particular kind of paracentric inversion. Their inversions had moved the w gene from its normal location near the telomere of the X chromosome to a new location very near the chromosome centromere (Figure 10.28).
The chromosome region immediately surrounding the centromere is a heterochromatic region that in Drosophila and most eukaryotes contains very few expressed genes. During S phase there is a temporary dissociation of nucleosomes as DNA replicates. The reassociation of nucleosomes after DNA replication leads to the reformation of heterochromatin around the region of the centromere, but the distance to which the reformed heterochromatin extends can vary from chromosome to chromosome. This variability is permissible because the centromeric region normally contains few if any expressed genes. Specifically, on some X chromosomes the centromeric heterochromatin spreads a greater distance outward from the centromere than on other X chromosomes. A greater extent of heterochromatin spread leads to more DNA sequence being included in the heterochromatic region.
Muller was not able to provide a molecular explanation for variegation, but research in the decades since Muller made his observations about eye color variegation has provided an explanation for both the patches of red and white eye color and the variability of the variegation pattern. The X-chromosome-to-X-chromosome variability of centromeric heterochromatin spread following DNA replication has a very specific consequence for the expression of the w+ alleles that are close to the centromere as a result of paracentric inversion. As Figure 10.28b indicates, if centromeric heterochromatin spread is limited (top image) and does not reach the new position of w+, the allele is expressed in the cell. All cells descending from this initial cell grow in a cluster in the eye, and the cells in such a cluster will have red pigment and form red patches in the variegated eye. If, on the other hand, centromeric heterochromatin spread is more extensive, and the relocated w+ allele is covered by reformed heterochromatin (bottom image), the allele is not expressed in the cell. All cells descending from this one also grow in a cluster in the eye, and they have no pigment (i.e., they are white). This is the source of white patches in the variegated eye. Because the spread of centromeric heterochromatin can vary from chromosome to chromosome, and the development of patches of eye tissue is also variable from eye to eye, there is a great deal of observed variation in the patterns of eye color variegation.
Since the time when Muller first described position effect variegation and the time when its molecular basis was identified, geneticists and cell biologists have come to understand that chromatin structure and the degree of chromatin compaction are critical components of gene expression in eukaryotic genomes. Research on PEV and on chromatin state has led to two central conclusions: (1) Gene expression can be controlled by the state of the chromatin in which a gene is located, and (2) gene expression or gene silencing can be dictated by chromatin structure that is transmissible from one cell generation to the next. We discuss these and other topics related to the modification of eukaryotic gene expression in Section 13.3.
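The logic of position effect variegation described above can be expressed as a toy model: a founder cell gives rise to a red patch when centromeric heterochromatin stops short of the relocated w+ allele and to a white patch when the spread reaches it. The Python sketch below is a simplified illustration only. The distances, the uniform distribution of spread, and the function names (`w_allele_state`, `patch_color`, `simulate_patches`) are hypothetical assumptions, not measurements or methods from the text.

```python
import random


def w_allele_state(gene_distance_kb, spread_kb):
    """Return 'silenced' if heterochromatin spread reaches the w+ allele."""
    return "silenced" if spread_kb >= gene_distance_kb else "expressed"


def patch_color(gene_distance_kb, spread_kb):
    """Clonal descendants of a founder cell share its w+ expression state."""
    state = w_allele_state(gene_distance_kb, spread_kb)
    return "white" if state == "silenced" else "red"


def simulate_patches(n_founder_cells, gene_distance_kb, max_spread_kb, seed=0):
    """Assign each founder cell a random spread distance (hypothetical model)."""
    rng = random.Random(seed)
    return [
        patch_color(gene_distance_kb, rng.uniform(0, max_spread_kb))
        for _ in range(n_founder_cells)
    ]


if __name__ == "__main__":
    # Hypothetical numbers: w+ sits 50 kb from the centromeric heterochromatin,
    # and spread varies between 0 and 100 kb from cell to cell.
    print(simulate_patches(12, gene_distance_kb=50, max_spread_kb=100))
```

In this sketch, a mixture of red and white patches appears whenever the range of spread distances straddles the position of the relocated allele, which is the qualitative behavior the text attributes to variegated eyes.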
[Figure 10.28 labels, image not reproduced: w+ allele silenced; centromeric heterochromatin spread is variable.]
CASE STUDY
Human Chromosome Evolution
Researchers can trace the evolution of human chromosomes by comparing chromosome structure and genetic composition of humans with those of other species that share an ancestor with us. We describe two such comparative approaches here: One approach compares syntenic clusters of genes (genes grouped on the same chromosome) in related species to identify how species diversification has affected the distribution of those genes among chromosomes. The second approach compares banding patterns of chromosomes in closely related species to reconstruct the chromosome-level events that have produced the contemporary chromosomes of the species.
In Figure 10.29, syntenic clusters of genes on 20 chromosomes (19 autosomes and the X chromosome) in the mouse genome are colored to show their correspondence to the sequences making up the 23 chromosomes (22 autosomes and the X chromosome) in humans. Published in 2002 by a large research group known as the Mouse Genome Sequencing Consortium, this study compares 342 syntenic chromosome segments. The average size of the syntenic segments is a little less than 10 million base pairs. Syntenic groups of genes found in the human genome are dispersed among several chromosomes in the mouse genome. Interestingly, human chromosomes 17 and 20 each correspond entirely to a portion of mouse chromosomes 11 and 2, respectively. In both cases, the human chromosome corresponds to a long cluster of contiguous syntenic groups in the identified mouse chromosome. Comparison of X chromosomes of human and mouse reveals very strong sequence and genetic similarity.
This comparison leads to two salient evolutionary conclusions. First, mouse and human share similar syntenic clusters because their common ancestor carried these clusters. Human and mouse chromosomes have diverged from those of their common ancestor by numerous rearrangements, including chromosome translocation, chromosome fusion, and chromosome inversion, that have changed many attributes of chromosome structure, but they also retain large segments of genes and sequences as syntenic clusters. Second, for X-linked genes specifically, the strong syntenic relationship has been maintained by natural selection driven by the requirements of embryonic development and the necessity to maintain a balance in dosage of X-linked genes by random X-inactivation.
[Figure 10.29, image not reproduced: mouse chromosomes 1 to 19 and X compared with human chromosomes 1 to 22 and X.]
Q The genetic composition of the mouse and that of the human X chromosome look very similar based on this image. Thinking back to the discussion of X-linked genes and X-inactivation in Section 3.6, explain why this makes sense in evolutionary terms.
Figure 10.30 Human and great ape chromosome evolution. Chromosomes 1, 2, and 3 of human (H), chimpanzee (C), gorilla (G), and orangutan (O) are compared to determine the events leading to different chromosome numbers and structures. [Image not reproduced; figure labels: Inversion, Robertsonian translocation.]
Figure 10.30 illustrates the banding patterns of chromosomes 1, 2, and 3 of human (H), chimpanzee (C), gorilla (G), and orangutan (O). These four closely related primate species last shared a common ancestor roughly 12 to 16 million years ago. In each of the three chromosomes, strong similarity of banding patterns directly reflects the strong genetic similarity between the species. Structural and numerical differences between the chromosomes allow reconstruction of the evolutionary events that shaped the contemporary chromosomes of each species. By comparing the human chromosome with each of the others, we can reconstruct some of that evolutionary history as follows.
• Chromosome 1 is very similar in the four primate species, with the exception of a pericentric inversion and the addition of a small segment near the centromere of the human chromosome (1q1.2 to 1q2.1).
• Chromosome 2 holds the explanation for the difference in diploid number between humans (2n = 46) and our close relatives (2n = 48). The reduction in human diploid number is the result of a Robertsonian translocation fusing two small acrocentric chromosomes that belong to separate chromosome pairs in chimp, gorilla, and orangutan.
• Chromosome 3 shows strong similarity of banding pattern in the four species with the exception of the orangutan chromosome, which has undergone a pericentric inversion that changed the relative arm lengths and altered the position of the centromere in comparison with the other primate chromosomes.
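The chromosome-number arithmetic behind the chromosome 2 comparison can be written out explicitly. The sketch below assumes that a fusion is present in homozygous form, so that each fusion lowers the haploid number by one and the diploid number by two; the function names are hypothetical and serve only as illustration.

```python
def haploid_after_fusions(haploid_number, fusions=1):
    """Each fusion joins two nonhomologous chromosomes into one."""
    if fusions < 0 or fusions >= haploid_number:
        raise ValueError("number of fusions must be between 0 and n - 1")
    return haploid_number - fusions


def diploid_after_fusions(diploid_number, fusions=1):
    """Diploid number once the fusion is carried on both homologs (assumption)."""
    return 2 * haploid_after_fusions(diploid_number // 2, fusions)


if __name__ == "__main__":
    # Great apes have 2n = 48; a single fusion gives the human 2n = 46.
    print(diploid_after_fusions(48, fusions=1))
```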
SUMMARY
10.1 Chromosome Number and Shape Vary among Organisms
❚❚ Chromosomes are categorized by shape on the basis of the centromere position and the ratio of long-arm (q arm) length to short-arm (p arm) length.
❚❚ Specialized molecular probes are used for in situ hybridization to locate specific genes or chromosome-specific DNA sequences. These probes often utilize fluorescent labels for detection.
❚❚ During interphase, each chromosome inhabits a territory of its own in the nucleus.
❚❚ Each chromosome has a distinctive banding pattern created by applying stains or dyes to solutions of condensed chromosomes from a single nucleus that are spread on microscope slides.
❚❚ Heterochromatic DNA forms darkly staining bands that contain relatively few expressed genes.
❚❚ Euchromatic DNA forms lightly staining bands that contain the majority of expressed genes.
10.2 Nondisjunction Leads to Changes in Chromosome Number
❚❚ In euploid nuclei, the number of chromosomes is equal to a multiple of the haploid number (n), whereas aneuploid nuclei have additional or missing chromosomes.
❚❚ Chromosome nondisjunction is the failure of homologous chromosomes or sister chromatids to separate and is a common cause of aneuploid gametes.
❚❚ Aneuploidy alters the phenotype of an organism by changing the balance of gene dosage of critical genes.
❚❚ Human aneuploidy manifests as trisomy of certain autosomes and as trisomy or monosomy of sex chromosomes.
❚❚ Chromosomal mosaics are organisms containing cells with two or more genetic or chromosomal constitutions.
❚❚ Uniparental disomy occurs when both homologous copies of a chromosome originate in a single parent.
10.3 Changes in Euploid Content Lead to Polyploidy
❚❚ Polyploids carry three or more haploid sets of chromosomes.
❚❚ Allopolyploids carry chromosome sets from different species, whereas autopolyploids have multiple chromosome sets from a single species.
❚❚ Polyploidy is common in plant species, causing increases in fruit and flower size that alter fertility and producing hybrid vigor.
10.4 Chromosome Breakage Causes Mutation by Loss, Gain, and Rearrangement of Chromosomes
❚❚ Chromosome breakage can result in terminal deletion or in interstitial deletion and may alter chromosome banding patterns.
❚❚ Heterozygosity for partial deletion or partial duplication produces phenotypic abnormalities through disturbances of gene dosage balance.
❚❚ Homologous chromosome synapsis involving a partial deletion or partial duplication chromosome produces a characteristic unpaired loop.
❚❚ Microdeletions and microduplications too small to be seen by banding changes are detected by molecular methods.
❚❚ The detection of pseudodominance provides important positional indicators for deletion mapping of genes.
10.5 Chromosome Breakage Leads to Inversion and Translocation of Chromosomes
❚❚ Chromosome breakage can lead to inversion or translocation of chromosome segments.
❚❚ Chromosome inversion heterozygotes have one chromosome with the normal order but have an inversion in the homolog. Homologs in these organisms form an inversion loop at synapsis.
❚❚ Paracentric inversions have two break points on one arm only, and the inversion does not include the centromeric region. Pericentric inversions have break points on each arm, and the centromeric region is included in the inverted region.
❚❚ Chromosome inversion is a crossover-suppression mechanism.
❚❚ A tetravalent synaptic structure containing chromosomes involved in reciprocal translocation leads to two main patterns of chromosome segregation in meiosis.
❚❚ The reduction in the number of viable gametes produced by reciprocal balanced translocation heterozygotes results in semisterility.
❚❚ Robertsonian translocation occurs by the fusion of nonhomologous chromosomes.
10.6 Eukaryotic Chromosomes Are Organized into Chromatin
❚❚ Eukaryotic nuclei contain multiple chromosomes, and they are highly compacted.
❚❚ Eukaryotic chromosomes are composed of chromatin, a mixture of DNA, histone proteins, and nonhistone proteins.
❚❚ Eight histone protein molecules form nucleosomes around which 146 bp of DNA wraps to form the 10-nm fiber.
❚❚ The 10-nm fiber condenses to form the 30-nm fiber.
❚❚ Nonhistone proteins form the chromosome scaffold that gives structure to chromatids and aids in additional chromosome compaction during prophase of the cell cycle.
❚❚ Chromatin loops form with the aid of the proteins that comprise the chromosome scaffold.
❚❚ Studies of position effect variegation (PEV) have determined that the structure of chromatin surrounding a gene directly influences transcription.
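The nucleosome repeat lengths discussed in Section 10.6 follow from adding the 146 bp of core DNA to the species-specific linker length. The Python sketch below restates that arithmetic using the linker lengths quoted in the section; the dictionary layout and the function names (`repeat_length`, `repeat_range`, `nucleosomes_in`) are hypothetical, and the count of nucleosomes per segment is an idealized estimate that assumes perfectly regular spacing.

```python
CORE_DNA_BP = 146  # core DNA wrapped around one nucleosome core particle

# Linker DNA lengths (bp) as (minimum, maximum), taken from Section 10.6.
LINKER_BP = {
    "S. cerevisiae": (13, 18),
    "Drosophila": (35, 35),
    "mammals": (40, 50),
    "sea urchin": (110, 110),
}


def repeat_length(linker_bp):
    """Nucleosome repeat length = core DNA + linker DNA."""
    return CORE_DNA_BP + linker_bp


def repeat_range(organism):
    """Return the (minimum, maximum) repeat length for an organism."""
    low, high = LINKER_BP[organism]
    return repeat_length(low), repeat_length(high)


def nucleosomes_in(segment_bp, linker_bp):
    """Idealized count of whole nucleosome repeats in a DNA segment."""
    return segment_bp // repeat_length(linker_bp)


if __name__ == "__main__":
    for organism in LINKER_BP:
        print(organism, repeat_range(organism))
```

The shortest and longest repeats in this set, 159 bp and 256 bp, correspond to the approximate 160 to 260 bp range given in the section.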
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Be familiar with general chromosome nomenclature, including the system used to describe chromosome banding.
2. Be prepared to describe the basis for chromosome banding and the molecular components and general structure of chromatin.
3. Understand the role of chromatin in chromosome condensation and the general role chromatin structure plays in gene transcription.
4. Be familiar with experimental approaches to analysis of chromosomes, including G banding, karyotype analysis, DNase I analysis, and the interpretation of PEV (position effect variegation).
5. Understand the errors in meiosis that lead to abnormalities in chromosome number and the role chromosome breakage plays in generating structural abnormalities of chromosomes.
6. Understand the mechanisms and origins of polyploidy and its consequences for the phenotype of organisms.
7. Be prepared to describe and predict the effects of abnormalities of chromosome number and structure on the phenotype of organisms.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.
1. Give descriptions for the following terms:
a. histone proteins
b. nucleosome core particle
c. scaffold proteins
d. G bands
e. euchromatin
f. heterochromatin
g. nucleosome
h. chromosome territory
2. The human genome contains 2.9 × 10^9 base pairs. Approximately how many nucleosomes are required to organize the 10-nm-fiber structure of the human genome? Show the calculation you use to determine the answer.
3. In eukaryotic DNA,
a. where are you most likely to find histone protein H4?
b. where are you most likely to find histone protein H1?
c. along a 6000-bp segment of DNA, approximately how many molecules of each kind of histone protein do you expect to find? Explain your answer.
d. how does the role of H1 differ from the role of H3 in chromatin formation?
4. Describe the importance of light and dark G bands that appear along chromosomes.
5. Human late prophase karyotypes have about 2000 visible G bands. The human genome contains approximately 22,000 genes. Consider the region 5p1.5 through the end of the short arm of chromosome 5 that is identified on the late prophase chromosome in Figure 10.5, and assume the entire region is deleted. Approximately how many genes will be lost as a result of the deletion?
6. Consider synapsis in prophase I of meiosis for two plant species that each carry 36 chromosomes. Species A is diploid and species B is triploid. What characteristics of homologous chromosome synapsis can be used to distinguish these two species?
7. From the following list, identify the types of chromosome changes you expect to show phenotypic consequences.
a. pericentric inversion
b. interstitial deletion
c. duplication
d. terminal deletion
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
12. A pair of homologous chromosomes in Drosophila has the following content (single letters represent genes):
Chromosome 1 RNMDHBGKWU
Chromosome 2 RNMDHBDHBGKWU
a. What term best describes this situation?
b. Diagram the pairing of these homologous chromosomes in prophase I.
c. What term best describes the unusual structure that forms during pairing of these chromosomes?
d. How does the pairing diagrammed in part (b) differ from the pairing of chromosomes in an inversion heterozygote?
13. An animal heterozygous for a reciprocal balanced translocation has the following chromosomes:
MN • OPQRST
MN • OPQRjkl
cdef • ghijkl
cdef • ghiST
a. Diagram the pairing of these chromosomes in prophase I.
b. Identify the gametes produced by alternate segregation. Which if any of these gametes are viable?
c. Identify the gametes produced by adjacent-1 segregation. Which if any of these gametes are viable?
d. Identify the gametes produced by adjacent-2 segregation. Which if any of these gametes are viable?
e. Among the three segregation patterns, which is least likely to occur? Why?
14. Dr. Ara B. Dopsis has an idea he thinks will be a boon to agriculture. He wants to create the “pomato,” a hybrid between a tomato (Lycopersicon esculentum) that has 12 chromosomes and a potato (Solanum tuberosum) that has 48 chromosomes. Dr. Dopsis is hoping his new pomato will have tuber growth like a potato and the fruit production of a tomato. He joins a haploid gamete from each species to form a hybrid and then induces doubling of chromosome number.
a. How many chromosomes will the hybrid have before chromosome doubling?
b. Will this hybrid be infertile?
c. How many chromosomes will the polyploid have after chromosome doubling?
d. Can Dr. Dopsis be sure the polyploid will have the characteristics he wants? Why or why not?
15. A normal chromosome and its homolog carrying a paracentric inversion are shown here. The dot (•) represents the centromere.
Normal ABC • DEFGHIJK
Inversion abc • djihgfek
a. Diagram the alignment of chromosomes during prophase I.
b. Assume a crossover takes place in the region between F and G. Identify the gametes that are formed following this crossover, and indicate which if any gametes are viable.
c. Assume a crossover takes place in the region between A and B. Identify the gametes that are formed by this crossover event, and indicate which if any gametes are viable.
16. The accompanying chromosome diagram represents a eukaryotic chromosome prepared with Giemsa stain. Indicate the heterochromatic and euchromatic regions of the chromosome, and label the chromosome’s centromeric and telomeric regions.
[Chromosome diagram not reproduced; the centromere is labeled.]
a. What term best describes the shape of this chromosome?
b. Do you expect the centromeric region to contain heterochromatin? Why or why not?
c. Why are expressed genes not found in the telomeric region of chromosomes?
d. Are you more likely to find the DNA sequence encoding the digestive enzyme amylase in a heterochromatic, euchromatic, centromeric, or telomeric region? Explain your reasoning.
17. Histone protein H4 isolated from pea plants and cow thymus glands contains 102 amino acids in both cases. A total of 100 of the amino acids are identical between the two species. Give an evolutionary explanation for this strong amino acid sequence identity based on what you know about the functions of histones and nucleosomes.
18. A survey of organisms living deep in the ocean reveals two new species whose DNA is isolated for analysis. DNA samples from both species are treated to remove nonhistone proteins. Each DNA sample is then treated with DNase I that cuts DNA not protected by histone proteins but is unable to cut DNA bound by histone proteins. Following DNase I treatment, DNA samples are subjected to gel electrophoresis, and the gels are stained to visualize all DNA bands in the gel. The staining patterns of DNA bands from each species are shown in the figure. The number of base pairs in small DNA fragments is shown at the left of the gel. Interpret the gel results in terms of chromatin organization and the spacing of nucleosomes in the chromatin of each species.
[Gel figure not reproduced: lanes for Species A and Species B, with size markers at 800, 600, 400, and 200 bp between the – (top) and + (bottom) ends of the gel.]
[...] researcher wants to determine the location and order of genes on the chromosome, so he sets up a series of crosses in which flies homozygous for a mutant allele are crossed with flies that are homozygous for a partial deletion. The progeny are scored to determine whether they have the mutant phenotype (“m” in the table) or the wild-type phenotype (“+” in the table). Use the partial deletion map and the table of progeny phenotypes to determine the order of genes on the chromosome.
[Partial deletion map not reproduced: a chromosome with deletions 1 through 7.]
            Mutation
Deletion    a  b  c  d  e  f  g
1           +  m  +  m  +  +  +
2           m  +  +  +  +  m  +
3           m  +  +  +  +  +  m
4           m  +  +  m  +  m  m
5           +  m  +  m  m  +  +
6           m  m  m  m  +  m  m
7           m  +  +  +  +  +  +
22. Two experimental varieties of strawberry are produced by crossing a hexaploid line that contains 48 chromosomes and a tetraploid line that contains 32 chromosomes. Experimental variety 1 contains 40 chromosomes, and experimental variety 2 contains 56 chromosomes.
a. Do you expect both experimental lines to be fertile?
19. In humans that are XX/XO mosaics, the phenotype is Why or why not?
highly variable, ranging from females who have classic b. How many chromosomes from the hexaploid line are
Turner syndrome symptoms to females who are essen- contributed to experimental variety 1? To experimental
tially normal. Likewise, XY/XO mosaics have phenotypes variety 2?
that range from Turner syndrome females to essentially c. How many chromosomes from the tetraploid lines are
normal males. How can the wide range of phenotypes be contributed to experimental variety 1? To experimental
explained for these sex-chromosome mosaics? variety 2?
23. In the tomato, Solanum esculentum, tall (D- ) is domi-
20. A plant breeder would like to develop a seedless variety of
nant to dwarf (dd) plant height, smooth fruit (P - ) is
cucumber from two existing lines. Line A is a tetraploid
dominant to peach fruit (pp), and round fruit shape
line, and line B is a diploid line. Describe the breeding
(O- ) is dominant to oblate fruit shape (oo). These
strategy that will produce a seedless line, and support
three genes are linked on chromosome 1 of tomato in
your strategy by describing the results of crosses.
the order dwarf–peach–oblate. There are 12 map units
21. In Drosophila, seven partial deletions (1 to 7) shown as between dwarf and peach and 17 map units between
gaps in the following diagram have been mapped on a peach and oblate. A trihybrid plant (DPO/dpo) is test-
chromosome. This region of the chromosome contains crossed to a plant that is homozygous recessive at the
genes that express seven recessive mutant phenotypes, three loci (dpo/dpo). The accompanying table shows the
identified in the following table as a through g. A progeny plants. Identify the mechanism responsible for
the resulting data that do not agree with the established genetic map.
Progeny Phenotype          Number
Tall, smooth, round           473
Dwarf, peach, oblate          476
Tall, smooth, oblate           12
Dwarf, peach, round             8
Tall, peach, oblate            17
Dwarf, smooth, round           13
Tall, peach, round              0
Dwarf, smooth, oblate           1
Total                        1000
24. A boy with Down syndrome (trisomy 21) has 46 chromosomes. His parents and his two older sisters have a normal phenotype, but each has 45 chromosomes.
a. Explain how this is possible.
b. How many chromosomes do you expect to see in karyotypes of the parents?
c. What term best describes this kind of chromosome abnormality?
d. What is the probability the next child of this couple will have a normal phenotype and have 46 chromosomes? Explain your answer.
25. Experimental evidence demonstrates that the nucleosomes present in a cell after the completion of S phase are composed of some "old" histone dimers and some newly synthesized histone dimers. Describe the general design for an experiment that uses a protein label such as ³⁵S to show that nucleosomes are often a mixture of old and new histone dimers following DNA replication.
26. DNase I cuts DNA that is not protected by bound proteins but is unable to cut DNA that is complexed with proteins. Human DNA is isolated, stripped of its nonhistone proteins, and mixed with DNase I. Samples are removed after 30 minutes, 1 hour, and 4 hours and run separately in gel electrophoresis. The resulting gel is stained to make all DNA fragments in it visible, and the results are shown in the figure. DNA fragment sizes in base pairs (bp) are estimated by the scale to the left of the gel.
[Figure: stained gel with lanes for the 30 min, 1 hr, and 4 hr samples; the size scale at the left marks 800, 600, 400, and 200 bp, with the negative pole at the top and the positive pole at the bottom.]
a. Examine the gel results and speculate why longer DNase I treatment produces different results.
b. Draw a conclusion about the organization of chromatin in the human genome from this gel.
27. Genomic DNA from the nematode worm Caenorhabditis elegans is organized by nucleosomes in the manner typical of eukaryotic genomes, with 145 bp encircling each nucleosome and approximately 55 bp in linker DNA. When C. elegans chromatin is carefully isolated, stripped of nonhistone proteins, and placed in an appropriate buffer, the chromatin decondenses to the 10-nm fiber structure. Suppose researchers mix a sample of 10-nm–fiber chromatin with a large amount of the enzyme DNase I that randomly cleaves DNA in regions not protected by bound protein. Next, they remove the nucleosomes, separate the DNA fragments by gel electrophoresis, and stain all the DNA fragments in the gel.
a. Approximately what range of DNA fragment sizes do you expect to see in the stained electrophoresis gel? How many bands will be visible on the gel?
b. Explain the origin of DNA fragments seen in the gel.
c. How do the expected results support the 10-nm–fiber model of chromatin?
28. A small population of deer living on an isolated island is separated for many generations from a mainland deer population. The populations retain the same number of chromosomes, but hybrids are infertile. One chromosome (shown here) has a different banding pattern in the island population than in the mainland population.
[Figure: banded diagrams of the mainland and island chromosomes. Band labels as read from top to bottom:
Mainland: p2.2, p2.1, p1, centromere, q1, q2, q3.1, q3.2, q4.1, q4.2
Island: p3, p2, p1, centromere, q1, q2.1, q2.2, q2.3, q2.4, q3.5]
a. Describe how the banding pattern of the island population chromosome most likely evolved from the mainland chromosome. What term or terms describe the difference between these chromosomes?
b. Draw the synapsis of these homologs during prophase I in hybrids produced from the cross of mainland with island deer.
c. In a mainland–island hybrid deer, recombination takes place in band q1 of the homologous chromosomes. Draw the gametes that result from this event.
d. Suppose that 40% of all meioses in mainland–island hybrids involve recombination somewhere in the chromosome region between q2.1 and p2. What proportion of the gametes of hybrid deer are viable? What is the cause of the decreased proportion of viable gametes in hybrids relative to the parental populations?
Collaboration And Discussion For answers to selected even-numbered problems, see Appendix: Answers.
29. A eukaryote with a diploid number of 2n = 6 carries the chromosomes shown below and labeled (a) to (f).
[Figure: six chromosomes labeled (a) through (f).]
a. Carefully examine and redraw these chromosomes in any valid metaphase I alignment. Draw and label the metaphase plate, and label each chromosome by its assigned letter.
b. Explain how you determined the correct alignment of homologous chromosomes on opposite sides of the metaphase plate.
30. Human chromosome 5 and the corresponding chromosomes from chimpanzee, gorilla, and orangutan are shown here. Describe any structural differences you see in the other primate chromosomes in relation to the human chromosome.
[Figure: banded diagrams of chromosome 5 and its counterparts, labeled H, C, G, and O; individual band labels are not reproduced here.]
d. A man who is color blind and has hemophilia and a woman who is wild type have a daughter with triple X syndrome (XXX) who has hemophilia and normal color vision.
32. A healthy couple with a history of three previous spontaneous abortions has just had a child with cri-du-chat syndrome, a disorder caused by a terminal deletion of chromosome 5. Their physician orders karyotype analysis of both parents and of the child. The karyotype results for chromosomes 5 and 12 are shown here.
[Figure: banded diagrams of both copies of chromosome 5 and chromosome 12 for the mother, the father, and the child; individual band labels are not reproduced here.]
CHAPTER 11 Gene Mutation, DNA Repair, and Homologous Recombination
ESSENTIAL IDEAS
❚❚ Gene mutations are rare and random.
❚❚ Mutations change DNA sequence and can alter polypeptide composition and protein function. Mutations can also cause phenotypic variation.
❚❚ Spontaneous nucleotide changes can lead to mutation.
❚❚ Chemical mutagens and radiation can damage DNA and produce mutations.
❚❚ DNA repair systems can directly repair DNA damage or can remove and replace damaged segments.
❚❚ Specialized enzymes can bypass a blockage of DNA replication caused by unrepaired damage.
❚❚ Controlled DNA double-strand breaks initiate homologous recombination and also recombination between homologous chromosomes in meiosis.
❚❚ Transposable genetic elements move throughout the genome and may mutate genes and alter genomes.
Chapter-opening photo: The baby kangaroo peeking out of its mother’s pouch has autosomal recessive albinism, a condition that occurs in about 1 in 20,000 births due to spontaneous mutation.
Mutation can be defined most simply as a heritable change in DNA sequence, a definition that covers an enormous range of changes. Mutation is indispensable in two ways. First, from an evolutionary perspective, mutations generate new hereditary variety. These new variants can influence the evolution of a species through the action of any of the four evolutionary processes described in Section 1.4. For example, mutations that cause phenotypic changes can be subject to natural selection. A few of these changes will have a positive effect on the fitness of organisms, and natural selection will favor their perpetuation in populations. Many, however, will have a negative consequence for the organism, and selection will operate to reduce the frequency of the mutant allele or eliminate it entirely from the population.
The second way in which mutations are indispensable has to do with their role in genetic analysis. Whether for studying the phenotypic effects of variant alleles on organisms, the processes that damage DNA, the molecular biology of DNA damage detection and repair, the structure and function of genes, or the patterns of hereditary transmission, mutation analysis is at the heart of genetics.
In this chapter, we focus on mutation at the level of the individual gene, that is, gene mutation. We first discuss the nature of gene mutation and then describe spontaneous changes to DNA nucleotide base structure and the occasional DNA replication errors that can generate gene mutations. We also examine the DNA-damaging actions of chemical and physical agents and the role this damage plays in producing gene mutation, after which we describe DNA damage repair mechanisms and the connection between mechanisms of DNA double-strand break repair and homologous recombination. We end the chapter with a discussion of the role of transposition in generating mutations.
11.1 Mutations Are Rare and Random and Alter DNA Sequence
Gene mutations are random and their occurrence is rare. The mutation rate is measured in two primary ways: at the phenotypic level, by counting the number of mutations affecting a phenotype; and at the molecular level, by determining the frequency of mutations per base pair. By either measure, average mutation rates are very low. Owing to the mechanisms of DNA replication that are shared by bacteria, archaea, and eukaryotes, there is a remarkable similarity between mutation rates at the DNA level, on the order of about 10⁻⁹ per replicated base pair in all organisms. Mutation rates at the phenotypic level are higher and more variable among organisms; 10⁻⁶ to 10⁻⁸ per gene are typical.
Certain genes in specific genomes have elevated mutation rates. These genes are identified as being mutation hotspots. There are multiple reasons why a gene might be a hotspot, but large gene size is a frequent cause. For example, the human X-linked dystrophin (DYS) gene, whose mutation causes Duchenne muscular dystrophy, and the autosomal gene NF1, whose mutation causes autosomal dominant neurofibromatosis, each have a mutation rate of about 10⁻⁴. These high mutation rates are due to the large size of those genes. DYS is the largest gene in the human genome, spanning almost 2.5 million base pairs, and NF1 is well over 1 million base pairs in length.
Proof of the Random Mutation Hypothesis
Despite some difference in average mutation rates per gene (10⁻⁶ to 10⁻⁷ for most genes in most organisms), mutations occur at random in genomes. The random nature of mutations was first experimentally demonstrated by Salvador Luria and Max Delbrück in 1943 in an experiment called the fluctuation test. This experiment tested the nature of bacterial mutations that produced resistance to bacteriophage infection and thus protected the bacterium Escherichia coli (E. coli) from lysis.
At the time there were two competing hypotheses to explain the occurrence of mutations. One, the random mutation hypothesis (which turned out to be the correct hypothesis), predicted that mutations occur at random. With respect to cultures of bacteria and their bacteriophage resistance, the random mutation hypothesis predicted that different cultures would develop resistance mutations at different times. When a bacteriophage-resistant mutation occurs early in the history of a bacterial culture, large numbers of resistant bacteria will be present when the culture is tested. Other cultures may develop a resistance mutation later in their history and will have fewer resistant bacteria when tested. Under the random mutation hypothesis, a comparison of the number of resistant bacteria in several different cultures will reveal a great deal of fluctuation, or variation, in the number, hence the name of the experiment (Figure 11.1a). The alternative hypothesis, known as the adaptive mutation hypothesis, proposed that environmental change triggers mutation. For the bacterial culture experiment, this hypothesis predicted that cultures exposed at the same time to a trigger (bacteriophage exposure, in this case) would respond the same way. This means that the number of bacteriophage-resistant bacteria in each culture should be about equal, with little variation (Figure 11.1b).
Luria and Delbrück began with a single large culture of bacteria that had never been exposed to bacteriophage. They split the large culture into about two dozen smaller cultures and allowed them to grow for multiple generations, still free from bacteriophage exposure. After several generations of growth, samples from each culture were plated on growth plates containing bacteriophage, and the number of phage-resistant bacterial colonies was counted on each. The results revealed a great deal of fluctuation in the number of phage-resistant bacteria in different cultures, closely mirroring the predictions of the random mutation hypothesis.
In genetics, the term “random mutation” means that mutations occur by chance, with each base pair having an equal probability of mutating. Mutations do not occur with any predetermined purpose, and they occur independently of whether they will prove to be favorable or detrimental or to have no impact on the fitness of an organism.
Germ-Line and Somatic Mutations
Gene mutation can occur in any cell at any time. Mutations that occur in germ-line cells, such as those giving rise to sperm and egg, can be passed from one generation to the next. These are identified as germ-line mutations. The seven traits studied by Mendel and the various human autosomal and X-linked conditions described in this book are examples of inherited variation originating with germ-line mutations. All of the body’s cells that are not in the germ line are somatic cells, and mutations in these cells are called somatic mutations. Somatic mutations can be passed to subsequent generations of cells in a cell lineage through mitotic cell division, but only the direct descendants of the original mutated cell carry the gene mutation.
Point Mutations
The most common kinds of gene mutations are those that substitute, add, or delete one or more DNA base pairs. These kinds of mutations are confined to a specific base pair or location in a gene and are called point mutations. There are point mutations of several types, each having characteristic consequences. They can occur anywhere in the genome. Those that occur in the coding sequence of a gene can lead to changes in the amino acid composition of the protein product of the gene. In contrast, those occurring in a regulatory sequence of a gene can alter the amount of wild-type protein product produced by the gene. When base-pair substitution mutations occur in the coding sequence of a gene, they are further categorized at the molecular level by the manner in which they alter the informational content of the gene: They may be synonymous mutations (also known as silent mutations), missense mutations, or nonsense mutations. Table 11.1 and the following discussions summarize the types of point mutations.
Table 11.1 Point Mutations
Type                                Consequence
Coding-Sequence Mutations
  Synonymous                        No amino acid sequence change.
  Missense                          Changes one amino acid.
  Nonsense                          Creates stop codon and terminates translation.
  Frameshift                        Wrong sequence of amino acids.
Regulatory Mutations
  Promoter                          Changes timing or amount of transcription.
  Polyadenylation                   Alters sequence of mRNA.
  Splice site                       Improperly retains an intron or excludes an exon.
  DNA replication mutation,         Increases (or less often, decreases) number of
  e.g., triplet-repeat expansion    short repeats of DNA.
Base-Pair Substitution Mutations
The replacement of one nucleotide base pair by another is a common type of point mutation. These mutations, called base-pair substitution mutations, are of two types. Transition mutations are those in which one purine replaces the other (A → G; G → A) or one pyrimidine replaces the other (C → T; T → C). The four different transition mutations shown in the previous sentence are all that are possible. Transversion mutations, on the other hand, are those in which a purine is replaced by a pyrimidine (A → T, A → C, G → T, and G → C), or a pyrimidine is replaced by a purine (T → A, T → G, C → A, or C → G). Eight different transversion mutations are possible. Based on the number of different transition and transversion mutations that are possible, one might think that transversion mutations would outnumber transition mutations, but the opposite is true. In nature, transition mutations are about twice as common as transversion mutations.
The bias toward transition mutations has important implications for base substitutions in the third position of codons. Recall from the discussion of the redundancy of the genetic code in Section 9.4 (see also Figure 9.13) that for most codons that end with a purine, either purine will code for the same amino acid. Likewise, for most codons that end with a pyrimidine, either pyrimidine will code for the same amino acid. This pattern means that transition mutations in the third positions of codons are likely to encode the same amino acid. Mutations of this type are known as synonymous mutations.
Synonymous Mutation A base-pair substitution producing an mRNA codon that specifies the same amino acid as the wild-type mRNA is known as a synonymous mutation (also known as a silent mutation). Figures 11.2a and 11.2b illustrate a synonymous mutation in which an A-T to G-C transition mutation changes the wild-type leucine codon (5′-UUA-3′) to a mutant codon (5′-UUG-3′) that also encodes leucine.
Missense Mutation A base-pair substitution that results in an amino acid change to the protein is a missense mutation. Figure 11.2c shows a T-A to A-T transversion mutation that alters the wild-type 5′-UGG-3′ codon to 5′-AGG-3′, changing the amino acid from tryptophan to arginine. Protein function may be altered by a missense mutation. The specific consequence of the protein change (i.e., whether it results in complete or only partial loss of protein function) depends on what kind of amino acid change takes place and where in the polypeptide chain the change occurs. The tall versus short stature of pea plants (stem length) studied by Mendel is caused by a missense mutation. See Experimental Insight 11.1 for a discussion.
Nonsense Mutation A base-pair substitution that creates a stop codon in place of a codon specifying an amino acid is a nonsense mutation. The G-C to A-T base-pair substitution shown in Figure 11.2d that changes the UGG (Trp) codon to a UGA (stop) codon is an example of a nonsense mutation.
Figure 11.2 The consequences of base-pair substitutions.
(a) Wild-type sequence
DNA coding strand      5′ TTA TTT AGA TGG TGT 3′
DNA template strand    3′ AAT AAA TCT ACC ACA 5′
mRNA                   5′ UUA UUU AGA UGG UGU 3′
Polypeptide            N  Leu Phe Arg Trp Cys  C
(b) Synonymous mutation
DNA coding strand      5′ TTG TTT AGA TGG TGT 3′
DNA template strand    3′ AAC AAA TCT ACC ACA 5′
mRNA                   5′ UUG UUU AGA UGG UGU 3′
Polypeptide            N  Leu Phe Arg Trp Cys  C
(c) Missense mutation
DNA coding strand      5′ TTA TTT AGA AGG TGT 3′
DNA template strand    3′ AAT AAA TCT TCC ACA 5′
mRNA                   5′ UUA UUU AGA AGG UGU 3′
Polypeptide            N  Leu Phe Arg Arg Cys  C
(d) Nonsense mutation
DNA coding strand      5′ TTA TTT AGA TGA TGT 3′
DNA template strand    3′ AAT AAA TCT ACT ACA 5′
mRNA                   5′ UUA UUU AGA UGA UGU 3′
Polypeptide            N  Leu Phe Arg STOP
Frameshift Mutations
Insertion or deletion of one or more base pairs in the coding sequence of a gene leads to addition or deletion of mRNA nucleotides. This can alter the reading frame of the codon sequence, beginning at the point of mutation. The result is a frameshift mutation, in which the mutant polypeptide contains an altered amino acid sequence from the point of mutation to the end of the polypeptide (Figure 11.3). In addition to producing the wrong amino acids in a portion of the polypeptide, frameshift mutations commonly generate premature stop codons that result in a truncated polypeptide. For these reasons, frameshift mutations usually result in the complete loss of protein function and thus produce null alleles. The yellow versus green seed pod trait studied by Mendel is caused by an insertion of six base pairs of DNA. Since the insertion is a multiple of three nucleotides, it adds two codons to the mutant allele mRNA without changing the reading frame. Thus, this particular mutant is not the result of a frameshift mutation. Nevertheless, the insertion of DNA base pairs is a common mechanism producing frameshift mutations. (See the discussion of Mendel’s pod-color mutation in Experimental Insight 11.1 for details.)
Regulatory Mutations
Some point mutations have the effect of reducing or increasing the amount of wild-type gene transcript and the amount of wild-type polypeptide without affecting the transcript
Experimental Insight 11.1: Mendel’s Mutations
Table 2.8 on page 57 and the accompanying text briefly describe the wild-type and mutant alleles of the four genes studied by Mendel that have been identified to date. For three of the genes, described in this Experimental Insight, the inherited phenotype differences result from point mutations of various types. Variation of the fourth gene is described in the Case Study at the end of this chapter.
STEM LENGTH: A MISSENSE MUTATION
The Le gene variation was identified in 1997 by research groups led by Diane Lester and David Martin, who determined that the wild-type dominant allele of this gene (Le) produces the enzyme gibberellin 3β-hydroxylase, which is active in the biosynthetic pathway that produces the growth hormone gibberellin. The effect of the dominant allele is to generate the wild-type level of growth hormone production, which, in turn, produces the long stems that characterize tall pea plants. The recessive mutant allele (le) is unable to produce a functional enzyme, and this reduces the biosynthesis of the growth hormone to about 5% of the wild-type level. The result is poor stem growth and short plants.
The le allele is the result of a missense mutation that changes an alanine to a threonine in the polypeptide product of the gene. This missense change is brought about by a G-C to A-T transition mutation in the le allele’s DNA sequence. It is an example of a missense mutation that inactivates the function of the allele’s protein product. In this case, the consequence of the mutation is the significant reduction of the synthesis of a growth hormone.
POD COLOR: AN INSERTION MUTATION
The 2007 studies of the Sgr (“stay green”) gene by research groups led by Ian Armstead and Sylvain Aubry identified the molecular basis for the dominant wild-type yellow seed pod and the recessive mutant green seed pod. The wild-type allele produces an enzyme that participates in the breakdown of chlorophyll contained in the seed pod. This breakdown normally occurs in conjunction with pod maturation, and it results in mature seed pods that are yellow. The mutant allele produces a very poorly functioning enzyme, largely disabling a critical step of chlorophyll breakdown. Consequently, chlorophyll is retained in mature pods, making them green.
The mutant allele contains a 6-bp insertion that changes the enzyme product by adding two additional codons to mRNA and two amino acids to the protein. Because 6 bp is a multiple of the three nucleotides in a codon, the insertion does not change the reading frame. Thus, in the mutant protein, the amino acid sequence is normal except for the presence of the two additional amino acids. Since the mutant protein is largely normal, it is able to retain partial function, albeit significantly reduced in comparison with the wild type.
FLOWER COLOR: AN mRNA-SPLICING MUTATION
Purple flower color is dominant in pea plants, and it results from the production of the pigment anthocyanin. The recessive mutant phenotype is white flower color, and in these plants there is no anthocyanin production. A research group led by Roger Hellens identified the bHLH gene as the source of the white flower mutation in pea plants. This gene produces a transcription factor protein that helps activate the transcription of several genes, including some in the anthocyanin-production pathway. In the absence of a functioning protein product from the bHLH gene, anthocyanin production does not take place.
The mutation in the recessive allele is a G-C to A-T base-pair substitution that alters the guanine at the 5′ splice site of one of the introns of the allele. Recall that 5′ splice sites have an invariant GU dinucleotide in mRNA (Section 8.4). The base substitution identified by Hellens changes the 5′ sequence to an AU dinucleotide that is not recognized as a splice site. An alternative splice site (known as a cryptic splice site; see the text for discussion) is used instead to process the mutant mRNA transcript. The aberrant splicing elongates the mature mRNA by eight nucleotides. This addition of mRNA nucleotides results in a frameshift during translation, and the protein product is nonfunctional.
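The difference between the 6-bp pod-color insertion and the 8-nucleotide addition in the flower-color transcript comes down to whether the added length is a multiple of three. The short sketch below illustrates that arithmetic. It is not based on the actual pea sequences: the mRNA is the illustrative sequence from Figure 11.2a, and the inserted bases and function names are hypothetical.

```python
def codons(mrna):
    """Split an mRNA string into complete codons, reading from its first base."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

def insert_bases(mrna, position, extra):
    """Return the mRNA with `extra` inserted before index `position`."""
    return mrna[:position] + extra + mrna[position:]

def preserves_reading_frame(n_inserted):
    """An insertion keeps the downstream frame only if its length is a multiple of three."""
    return n_inserted % 3 == 0

WILD_TYPE = "UUAUUUAGAUGGUGU"  # illustrative mRNA from Figure 11.2a, not a pea sequence
SIX_NT = "GCUGCU"              # hypothetical 6-nucleotide insertion
EIGHT_NT = "GCUGCUGC"          # hypothetical 8-nucleotide insertion

if __name__ == "__main__":
    print("wild type:", codons(WILD_TYPE))
    for label, extra in (("6-nt insertion", SIX_NT), ("8-nt insertion", EIGHT_NT)):
        mutant = insert_bases(WILD_TYPE, 3, extra)
        print(label, "| frame preserved:", preserves_reading_frame(len(extra)),
              "|", codons(mutant))
```

With the 6-nucleotide insertion, every codon downstream of the insertion is read exactly as in the wild type; with the 8-nucleotide insertion, the downstream codons are regrouped, which is why such a change alters every amino acid after the insertion point.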
In intron 1 of the β-globin gene, two separate mutations that substitute the guanine of the GT dinucleotide abolish normal splicing entirely; such mutations are known as splicing mutations (Figure 11.4b). Additionally, one base-pair substitution mutation at position 5 of intron 1 by itself also prevents the production of normally spliced mRNA. The translation of the abnormally spliced transcripts produced by these three mutations does not produce wild-type β-globin protein. Other base-pair substitution mutations in intron 1 result in production of a mixture of normally and abnormally spliced transcripts and produce some wild-type β-globin protein, but in reduced amounts.
One of Mendel’s traits, the purple versus white flower phenotype, is caused by a splicing mutation. See the discussion of Mendel’s flower color mutation in Experimental Insight 11.1 for details concerning this pre-mRNA–splicing mutation.
Cryptic Splice Sites Certain base-pair substitution mutations produce new splice sites that replace or compete with authentic splice sites during pre-mRNA processing. These newly formed splice sites are known as cryptic splice sites. Intron 1 of the human β-globin gene is 130 nucleotides in length. A base-pair substitution mutation that changes G to A at position 110 of intron 1 creates an AG dinucleotide that is a cryptic splice site (Figure 11.5). The cryptic splice site is used in about 90% of the intron 1 3′ splicing events. This aberrant splicing leaves 19 additional nucleotides in the mature mRNA; these nucleotides are removed in the other 10% of mature transcripts, which are spliced at the authentic 3′ splice site for intron 1.
Polyadenylation Mutations Processing of the 3′ end of eukaryotic mRNAs is initiated by the presence of a 5′-AAUAAA-3′ polyadenylation signal sequence (see Section 8.4), and mutation of this sequence can block proper 3′ processing of mRNA. One example of this mutation is found in a rare variant of the human α-globin gene in which the DNA coding strand sequence is mutated from 5′-AATAAA-3′ to 5′-AATAAG-3′. The A-T to G-C base substitution blocks recognition of the polyadenylation signal sequence, generates abnormal mRNA, and leads to a severe reduction in the amount of functional α-globin protein.
11.2 Gene Mutations May Arise from Spontaneous Events
Spontaneous mutations are naturally occurring mutations that arise occasionally through errors during DNA replication or, much more often, through spontaneous changes in the chemical structure of nucleotide bases.
Figure 11.6 Reversion mutations. (a) This true reversion restores the wild-type amino acid sequence to
the polypeptide. (b) This intragenic reversion reverts a frameshift mutation caused by a 2-bp deletion by
insertion of 2 bp at a nearby site in the gene. (c) Second-site reversion restores a nearly wild-type pheno-
type through a compensatory mutation of a second gene.
Q Using the genetic code (Figure 9.13), make a base substitution mutation in a DNA triplet so as to
produce a nonsense mutation, and then make a reversion mutation in the triplet so as to produce the
original amino acid but with a different codon than the wild-type sequence.
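One possible answer to the question above can be checked with a short sketch. It is only an illustration: the codon table is a deliberately partial subset of the genetic code, and `translate_codon` is a hypothetical helper name. The codons are written as mRNA; the corresponding DNA coding-strand triplets would be TAT, TAA, and TAC.

```python
# Partial genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "UAU": "Tyr", "UAC": "Tyr",
    "UAA": "stop", "UAG": "stop",
}

def translate_codon(codon: str) -> str:
    """Return the amino acid (or 'stop') for an mRNA codon in the partial table."""
    return CODON_TABLE[codon]

wild_type = "UAU"   # Tyr
nonsense = "UAA"    # third-base U -> A substitution creates a stop codon
revertant = "UAC"   # third-base A -> C substitution restores Tyr with a new codon

for label, codon in [("wild type", wild_type),
                     ("nonsense", nonsense),
                     ("revertant", revertant)]:
    print(f"{label}: {codon} -> {translate_codon(codon)}")
```

The revertant encodes the original amino acid with a codon that differs from the wild-type codon, which is what the question asks for.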
Strand slippage mutations were first identified in the mid-1980s when an unusual X chromosome disorder called fragile X syndrome (OMIM 309550) was shown to be caused by increases in the number of DNA sequence repeats in a gene known as FMR1. Since then, a number of strand slippage mutations have been identified as the causes of several hereditary diseases in humans and other organisms. The human diseases are classified as trinucleotide repeat expansion disorders (Table 11.2). The wild-type alleles of the genes in question normally contain a variable number of DNA trinucleotide repeats. On rare occasions, these gene regions undergo mutations through strand slippage that cause the number of trinucleotide repeats to increase. For each of these disorders, expansion of the number of trinucleotide repeats beyond the wild-type range results in a hereditary disorder. Most often the mutations block the production of wild-type mRNA and reduce or eliminate the production of wild-type protein.
Mispaired Nucleotides The accuracy of DNA replication is due in large measure to complementary base pairing, A with T and G with C. We saw in Section 9.4, however, that third-base wobble during translation offers occasional exceptions to complementary base-pair rules. Similar noncomplementary base pairing occasionally occurs during DNA replication. These so-called non–Watson-and-Crick base pairs can include the mispairing of guanine with thymine or the mispairing of cytosine with adenine. Both sets of mispaired nucleotides form two hydrogen bonds.
The mispairing of a nucleotide in a newly synthesized DNA strand is identified as an incorporated error.
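The trinucleotide repeat expansions described earlier in this passage come down to counting consecutive copies of a three-base unit. The sketch below shows one way such a count might be made. The function name `longest_repeat_run`, the two example alleles, and their repeat numbers are all hypothetical; no disease threshold is implied.

```python
import re

def longest_repeat_run(sequence: str, unit: str = "CAG") -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

# Hypothetical alleles: the flanking bases are arbitrary.
shorter_allele = "TAA" + "CAG" * 11 + "TC"
expanded_allele = "TAA" + "CAG" * 45 + "TC"

print(longest_repeat_run(shorter_allele))   # 11
print(longest_repeat_run(expanded_allele))  # 45
```

Comparing the count for an allele with the wild-type range for that gene (Table 11.2) is the step that would identify an expansion.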
GENETIC ANALYSIS 11.1
PROBLEM In a mutant analysis, a goal is often to identify the type of mutation that has occurred. In this problem, a fragment of a polypeptide with the wild-type amino acid sequence is given:
Met–His–Ala–Trp–Asn–Gly–Glu–His–Arg
The amino acid sequences of three mutants are shown below. For each mutant, identify the type of mutation that has occurred and specify how the mRNA sequence has been changed.
Mutation 1: Met–His–Ala–Trp–Lys–Gly–Glu–His–Arg
Mutation 2: Met–His–Ala
Mutation 3: Met–Met–Leu–Gly–Met–Ala–Glu–His–Arg
BREAK IT DOWN: Use the wild-type amino acid sequence to determine the mRNA sequence, including all possible synonymous codons, as the starting point for mutant analysis. (Use the genetic code, p. 330; see also inside the front cover)
BREAK IT DOWN: Identification of the mutations requires deducing each mutant mRNA sequence and comparing it to the wild-type mRNA sequence. (pp. 401–402)
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns mutations affecting the amino acid sequence of a gene. The type of change causing each mutation must be identified, and the effect of the mutation on mRNA must be described.
2. Identify the critical information given in the problem.
   The wild-type amino acid sequence and the corresponding portions of the mutant polypeptides are given.
Deduce
3. Determine the sequence of the wild-type mRNA.
   The sequence of the wild-type mRNA is
   5′-AUG CAU/C GCN UGG AAU/C GGN GAA/G CAU/C A/CGN-3′
   TIP: Use N if the position could be occupied by any nucleotide, A/G for the alternative purines, and U/C for alternative pyrimidines.
   TIP: Use the genetic code in Figure 9.13 or in Table B inside the front cover.
Solve
4. Compare each mutant sequence with the wild-type polypeptide, and identify the probable types of mutations.
   Mutant 1: This is a missense mutation in which the mutant polypeptide has one amino acid changed from Asn to Lys.
   Mutant 2: This is a nonsense mutation in which a Trp codon is changed to a stop codon.
   Mutant 3: This mutant contains alterations of five consecutive amino acids, beginning with the second amino acid (His to Met). The wild-type sequence is restored beginning with the seventh amino acid (Glu). This mutant results from two frameshift mutations. The first alters the reading frame, and the second restores it.
5. Determine the mRNA change producing the missense mutant.
   The wild-type (Asn) codon is AAU/C, and the mutant (Lys) codon is AAA/G. Because the third base changes from a pyrimidine (U or C) to a purine (A or G), this change results from a transversion mutation.
6. Determine the mRNA change producing the nonsense mutant.
   The wild-type Trp (UGG) codon is changed to a stop codon. The change is either UGG to UGA or UGG to UAG. In either case, this is a transition mutation.
7. Determine the mRNA change producing the frameshift mutant.
   The appearance of Met in position 2 means the second codon of the frameshift mutant is AUG. This change requires deletion of the first C of the wild-type sequence and means that U, not C, is present as the sixth nucleotide of the wild type. Beginning with Glu, the wild-type amino acid sequence is restored. This requires insertion of G immediately after the Ala codon.
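The single-base changes considered in steps 5 and 6 can be classified mechanically, since a transition swaps a purine for a purine or a pyrimidine for a pyrimidine and a transversion swaps one class for the other. The following is a minimal sketch; `substitution_type` is a hypothetical helper, not part of the problem.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

def substitution_type(original: str, mutant: str) -> str:
    """Classify a single-base RNA substitution as a transition or transversion."""
    if original == mutant:
        raise ValueError("bases are identical; no substitution")
    pair = {original, mutant}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Mutant 1: Asn (AAU or AAC) -> Lys (AAA or AAG); only the third base changes.
for wt_base in "UC":
    for mut_base in "AG":
        print(f"AA{wt_base} -> AA{mut_base}: {substitution_type(wt_base, mut_base)}")

# Mutant 2: Trp (UGG) -> stop (UGA or UAG); a G is replaced by an A.
print("G -> A:", substitution_type("G", "A"))
```

Every pyrimidine-to-purine change in the Asn-to-Lys case is reported as a transversion, and the G-to-A change in the Trp-to-stop case is reported as a transition.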
For more practice, see Problems 4, 11, 22, and 32.
[Figure residue: a DNA duplex with 11 CAG repeats, 5′ TAA (CAG)11 TC 3′ paired with 3′ ATT (GTC)11 AG 5′.]
Figure 11.8 provides an example in which DNA replication cycle 1 produces a wild-type DNA duplex (with an A-T base pair highlighted in the box) and a second DNA duplex with a G-T base pair (highlighted) as an incorporated error. This means there is an abnormality in DNA that might be repaired or might lead to a mutation. DNA replication cycle 2 is where the mispairing of nucleotides is converted into a mutation (the sequence change that will be transmitted through replication) in an event known as a replicated error. Here an A-T to G-C base-pair substitution takes place. The figure shows the thymine-containing strand producing a wild-type DNA duplex with an A-T base pair, and the guanine-containing strand producing a duplex with a base substitution (G-C base pair) mutation. Both the wild type and the mutant contain complementary base pairs, so there is no possibility of detection by DNA repair systems. DNA replication frequently plays an important role in permanently establishing incorporated errors and other kinds of DNA damage as mutations. Several examples are seen in this and the following section.
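The two replication cycles described for Figure 11.8 can be traced one base pair at a time. This is a simplified sketch that assumes replication is faithful except for the single misincorporation; the names `replicate`, `mispaired_daughter`, and the tuple layout (template base, new base) are illustrative choices.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def replicate(template_base: str) -> tuple[str, str]:
    """Copy one template base faithfully; return (template base, new base)."""
    return (template_base, COMPLEMENT[template_base])

# Cycle 1 on a wild-type A-T pair: the A strand is copied correctly, but G is
# misincorporated opposite the template T (the incorporated error).
wild_type_daughter = replicate("A")      # ('A', 'T')
mispaired_daughter = ("T", "G")          # template T, misincorporated G

# Cycle 2: each strand of the mispaired duplex now acts as a template.
from_t_strand = replicate(mispaired_daughter[0])   # ('T', 'A'), wild type
from_g_strand = replicate(mispaired_daughter[1])   # ('G', 'C'), replicated error

print(from_t_strand, from_g_strand)
```

After cycle 2 both duplexes contain complementary pairs, so nothing in the G-C duplex marks it as abnormal.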
Figure 11.10 Deamination. (a) Unmethylated cytosine is deaminated to form uracil. (b) Deamination of 5-methylcytosine forms thymine that is mismatched to guanine. (c) Mismatch repair can create a C-G to T-A transition mutation or can remove the thymine to restore wild-type sequence.
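The two outcomes named in panel (c) depend only on which base of the T-G mismatch is kept as the template. The sketch below makes that choice explicit. It is a toy model: `repair_mismatch` and its `template_index` parameter are hypothetical, and real mismatch repair involves strand discrimination that is not modeled here.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def repair_mismatch(pair: tuple[str, str], template_index: int) -> tuple[str, str]:
    """Resolve a mismatched pair by keeping the base at template_index and
    replacing the other base with the complement of the kept base."""
    kept = pair[template_index]
    fixed = COMPLEMENT[kept]
    return (kept, fixed) if template_index == 0 else (fixed, kept)

# Deamination of 5-methylcytosine turns a C-G pair into a T-G mismatch.
mismatch = ("T", "G")

mutant_pair = repair_mismatch(mismatch, template_index=0)     # T strand kept
restored_pair = repair_mismatch(mismatch, template_index=1)   # G strand kept

print(mutant_pair)    # ('T', 'A')
print(restored_pair)  # ('C', 'G')
```

Keeping the thymine fixes the C-G to T-A transition; keeping the guanine restores the original C-G pair.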
A different scenario occurs, however, when deamination takes place on a cytosine that has been methylated. A methylated cytosine has the hydrogen atom at the number 5 carbon replaced with a CH3 (methyl) group. Deamination of 5-methylcytosine (5meC) creates thymine and generates a base-pair mismatch between the newly formed thymine on one strand and the previously complementary guanine on the other strand (Figure 11.10b). If DNA base-pair mismatch repair does not correct the mismatch, the next round of DNA replication will include a C-G to T-A base-pair substitution (Figure 11.10c).
Deamination of 5-methylcytosine is associated with hotspots of mutation. Cytosines that are side by side with guanines in a DNA strand are identified as CpG dinucleotides; the p signifies the single phosphate group between the nucleotides, and the C is upstream of the G. Cytosines of CpG dinucleotides are frequent targets for methylation, particularly in mammalian promoters, where methylation helps regulate transcription (Section 13.2). Experimental evidence shows that base-pair substitution mutations at CpG dinucleotides are common in mammals.
11.3 Mutations May Be Caused by Chemicals or Ionizing Radiation
Mutations can be produced by interactions between DNA and chemical agents or between DNA and ionizing radiation. The agents generating mutation-inducing DNA damage are called mutagens. These mutations can occur in nature, but frequently mutagens are used in an experimental setting to generate induced mutations for the purpose of studying the types of damage done to DNA by mutagen exposure, the mutational process itself, or the organism’s repair responses to DNA damage. This section looks at some of the specific ways chemical mutagens and ionizing radiation interact with DNA to create particular mutations. Some mutagens are exotic or rare, but others are routinely present in the everyday life of an organism. For this reason, the study of mutagenesis through the production of induced mutations is an important form of biological and public health research.
Chemical Mutagens
Chemical compounds that induce mutations do so by specific and characteristic interactions with DNA nucleotide bases or with the DNA molecule. As a result, they can be classified by their mode of action on DNA, creating DNA damage by acting as (1) nucleotide base analogs, (2) deaminating agents, (3) alkylating agents, (4) oxidizing agents, (5) hydroxylating agents, or (6) intercalating agents. As we discuss these various mutagen–DNA interactions, remember that each category of chemical mutagen reacts in a specific way with DNA and produces a consistent and particular kind of mutation as a result. Compounds in each of these categories and the types of mutations they cause are listed in Table 11.3.
Nucleotide Base Analogs A nucleotide base analog is a chemical compound that has a structure similar to one of the DNA nucleotide bases and therefore can work its way into DNA, where it pairs with a nucleotide base in the DNA duplex. DNA polymerases are unable to distinguish nucleotide base analogs from normal nucleotide bases due to their similarity in molecular size and shape.
Figure 11.11 Mutation by incorporation of the nucleotide base analog 5-bromouridine (BU). (d) Wild type: G-C; Bu incorporation: G-Bu (enol); base mispair in replication cycle 1: A-Bu (keto); mutation in replication cycle 2: A-T.
Q Does a mutation exist before or after DNA replication cycle 1? Why?
Deaminating Agents Nitrous acid (HNO2) is a deaminating agent, meaning an agent that removes an amino group (NH2) from a nucleotide base with a mutagenic effect. One example is how nitrous acid deaminates adenine. The product of this deamination is the modified nucleotide hypoxanthine that can mispair with cytosine and lead to an A-T to G-C base-pair substitution mutation (Figure 11.12a).
Alkylating Agents Alkylating agents add bulky side groups such as methyl (CH3) and ethyl (CH3–CH2) groups to nucleotide bases. Ethyl methanesulfonate (EMS) is a powerful alkylating agent that adds an ethyl group to thymine, producing 4-ethylthymine, or an ethyl group to guanine, creating O6-ethylguanine (Figure 11.12b). This interferes with normal DNA base pairing by distorting the DNA double helix. EMS induces transition mutations through its action on guanine.
Figure 11.12 Examples of the action of chemical mutagens. In (a), H* is hypoxanthine. In (b) and (c), the asterisks (*) denote modified nucleotides.
(a) Nitrous acid (HNO2): Deamination converts adenine to hypoxanthine (H*), which mispairs with cytosine. Transition mutation (A–T to G–C).
(c) Hydroxylamine (NH2OH): Hydroxylation of cytosine produces hydroxylaminocytosine (C*), which mispairs with adenine. Transition mutation (C–G to T–A).
strand nicking (the breakage of a phosphodiester bond on one DNA strand) that is not efficiently repaired. In the next DNA replication cycle, the nicked strands can gain or lose one or more nucleotides. As a result, intercalating agents cause frameshift mutations.
Figure 11.13 DNA intercalating agents. Proflavin and benzo(a)pyrene intercalate into the double helix and distort its shape, generating strand nicking that can produce frameshift mutations.
Figure 11.14 UV photoproducts. UV irradiation forms photoproducts from adjacent pyrimidines (shown for adjacent thymines), distorting the double helix and potentially blocking replication.
bonds that create aberrant structures called photoproducts. These photoproducts most often form between two adjacent pyrimidine nucleotide bases in a DNA strand. Two adjacent thymines are the most frequent locations for the creation of UV photoproducts that contain one or two additional covalent bonds (Figure 11.14). One common photoproduct called a thymine dimer contains two additional covalent bonds that join the 5 and 6 carbons of adjacent thymines. Another, called a 6-4 photoproduct, also joins adjacent thymines, in this case by formation of a bond between the 6 carbon of one thymine and the 4 carbon of the other thymine.
Organisms that experience regular UV exposure, and they range from bacteria to humans, have DNA repair systems that identify and correct most pyrimidine dimers. But a few pyrimidine dimers may escape repair; and when they do, DNA replication can be disrupted. These disruptions lead to mutations, and they are a primary cause of the strong association between excessive UV exposure and skin cancer. Some of the specific DNA repair mechanisms that repair UV-induced photoproducts are discussed in the following section.
The Ames Test
In our day-to-day lives, we encounter scores of naturally occurring and synthetic chemicals and compounds: in the food we eat, the air we breathe, the cars we drive, and even the books we read. Each year new chemical compounds are introduced as part of various commercial and industrial processes. How do we determine which of these chemicals can increase the mutation frequencies of genes and therefore pose a hazard to our health? Occasionally, the mutagenic or carcinogenic potential of a compound is so great that evidence of its danger is relatively easy to identify. Much more often, however, the mutagenicity of a compound is more subtle, and careful analysis of experimental data is required to ascertain it.
For nearly 40 years, thousands of natural and synthetic compounds have been assayed for mutagenic potential by a simple biological test developed by Bruce Ames. This procedure, called the Ames test, exposes bacteria to experimental compounds in the presence of a mixture of purified enzymes produced by the mammalian liver. In animals, ingested chemicals are routed to the liver, where they are broken down by detoxifying enzymes. Using a critical subset of detoxifying liver enzymes called the S9 extract, the Ames test mimics the biological defense processes that take place in the liver of animals exposed to chemical compounds. During enzymatic breakdown in the liver, numerous intermediate products can be produced, some of which may be mutagenic, even if the original compound was not. The purpose of the Ames test is to detect whether the original compound or any of its normal breakdown products is mutagenic.
The Ames test most commonly uses strains of the bacterium Salmonella typhimurium that carry mutations affecting their ability to synthesize the amino acid histidine. These bacteria are designated his− to indicate that their mutation prevents histidine synthesis. They will not grow unless they are provided with a medium that is supplemented with histidine. The Ames test measures rates of new mutations by identifying the rate of reversion mutations (his− to his+) that restore the ability of bacteria to synthesize their own histidine, thus eliminating the need for histidine supplementation of the growth medium.
The Ames test uses multiple his− strains of S. typhimurium, each carrying different kinds of mutations of histidine-synthesizing genes. Some of the strains carry base-pair substitution (transition and transversion) mutations; others carry frameshift mutations. The use of these different mutant strains allows detection both of compounds that induce base-pair substitution mutations and of those that induce frameshift mutations, by comparing reversion rates in experimental bacterial cultures exposed to potential mutagens with spontaneous reversion rates in control bacterial cultures.
In the experimental cultures, the S9 extract is added to different mutant strains of S. typhimurium. Because the S. typhimurium strains have different mutations, researchers are able to test both the base substitution potential and the frameshift potential of a test compound. Each mixture is separately plated onto a medium lacking histidine (Figure 11.15). The test compound is then added to a filter
Figure 11.15 The Ames test for potential mutagenicity of chemical compounds. Example experiment: his−1 is a base-substitution mutant, and his−2 is a frameshift mutant. Plate labels recovered from the figure: a significant number of revertant colonies; few, if any, spontaneous revertant colonies; few, if any, revertant colonies. An insignificant number of revertant colonies indicates the test compound does not induce frameshift mutations.
Q On the culture plate containing the Ames test of a powerful chemical mutagen, there will often be
an empty zone, with no revertants growing, immediately surrounding the paper disk. Farther away
from the disk, however, revertants will grow. Why does this zone of no growing revertants occur?
11.4 Repair Systems Correct Some DNA Damage 415
tains a frameshift mutation. This result indicates that aflatoxin B1 actively induces reversion of base-pair substitution mutants but not of frameshift mutants. Genetic Analysis 11.2 guides you through an analysis of an Ames test of potential mutagens.
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns interpretation of the results of an Ames test of three compounds. The answer must identify which if any of the compounds are mutagenic and describe the nature of that mutagenicity.
2. Identify the critical information given in the problem.
   The number of revertant colonies is given for each compound. A control result is also given. The cause of auxotrophy in each mutant strain is identified.
Deduce
3. Describe the meaning of growth results on the control plate.
   The control plates have had no test compound added. The growing colonies on these plates are spontaneous revertants from each of the auxotrophic tester strains.
4. Deduce the meaning of growth results on each of the experimental plates.
   Compound A produces many revertants in strain 1 but no reversion over spontaneous levels in strain 2. Compound C generates many revertants in strain 2 but does not produce revertants at a rate greater than the control in strain 1. Compound B does not increase the reversion rate above the spontaneous level in either strain.
Solve
Answer a
5. Identify the mutagenic compounds and justify your answer.
   Compounds A and C are mutagenic, but compound B is not. The large numbers of revertant colonies on the strain 1 test of compound A and the number of revertants on the strain 2 test of compound C identify these compounds as mutagens. Compound B does not show an increased rate of reversion relative to the background numbers on the control plates.
Answer b
6. Describe the nature of mutagenicity for each compound.
   Compound A causes frameshift mutations by inducing a high rate of reversion of his− strain 1 auxotrophs. Compound C causes a high rate of reversions of strain 2 auxotrophs by inducing base-pair substitution reversions.
   TIP: Base-pair substitution mutagens generally revert base-pair substitution auxotrophs, and frameshift mutagens revert frameshift auxotrophs.
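The comparison made in steps 3 through 5 can be written as a small calculation. The sketch below is illustrative only: the colony counts are invented to match the qualitative pattern described in the solution (the problem's data table is not reproduced here), and the two-fold cutoff in `fold_threshold` is an assumed rule of thumb rather than a value given in the text.

```python
# Hypothetical revertant colony counts per plate (not the problem's real data).
control = {"strain 1": 20, "strain 2": 25}
tests = {
    "A": {"strain 1": 640, "strain 2": 27},
    "B": {"strain 1": 22, "strain 2": 24},
    "C": {"strain 1": 19, "strain 2": 580},
}

def induced_strains(counts, spontaneous, fold_threshold=2.0):
    """Return the strains whose revertant count is at least `fold_threshold`
    times the spontaneous (control) count for the same strain."""
    return [strain for strain, n in counts.items()
            if n >= fold_threshold * spontaneous[strain]]

for compound, counts in tests.items():
    strains = induced_strains(counts, control)
    verdict = f"mutagenic in {', '.join(strains)}" if strains else "not mutagenic"
    print(f"Compound {compound}: {verdict}")
```

The kind of mutation each compound induces then follows from the known lesion in the tester strain that reverted, as the TIP above indicates.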
For more practice, see Problems 9, 30, and 35.
Photoreactive repair: Repair of UV-induced photoproducts catalyzed by photolyase activated by visible light
Base excision repair (BER): Removal of an incorrect or damaged DNA base and repair by synthesis of a new strand segment (nick translation)
Nucleotide excision repair (NER): Removal of a strand segment containing DNA damage and replacement by new DNA synthesis
Mismatch repair: Removal of a DNA base-pair mismatch by excision of a segment of the newly synthesized strand followed by resynthesis of the excised segment
DNA damage is through photoreactive repair. This direct DNA repair mechanism is found in bacteria, single-celled eukaryotes, plants, and some animals (e.g., Drosophila) but not in humans. Photoreactive repair utilizes the enzyme photolyase to bind to a UV-induced photoproduct. Once bound, photolyase uses visible light to direct energy into breaking the bonds that produce the photoproduct. In E. coli, photolyase is the product of the phr (photoreactive repair) gene. Mutations of this gene result in a substantial increase in UV-induced mutations in bacteria. Photolyase mutations in other organisms similarly result in increases in the mutation rate.
Base Excision Repair Damage to a DNA base or the presence of an incorrect base can initiate the direct DNA-repair process known as base excision repair (BER). This process first identifies and removes the damaged DNA base. It then breaks one strand of DNA near the excised base and utilizes the single-stranded break as the site from which to initiate synthesis of a short DNA segment that replaces several nucleotides, including the damaged base. Figure 11.17 illustrates an example of BER that is initiated by the recognition and removal of a uracil that is mispaired with a guanine. The uracil was derived from the deamination of cytosine, as described in the previous section. The enzyme DNA N-glycosylase removes the uracil, creating an AP (apyrimidinic) site. The enzyme AP endonuclease then cuts the sugar-phosphate backbone at the 5′ side of the AP site. This single-stranded break is called a “nick.” In a process called nick translation, DNA polymerases recognize the nick and initiate the removal and replacement of DNA nucleotides, including the AP site. Nick translation is essentially identical to the process that removes and replaces the RNA primer during DNA replication. After several nucleotides have been replaced, DNA ligase seals the sugar-phosphate backbone, and repair is complete. Different DNA polymerases undertake BER in bacteria and eukaryotes, and the precise mechanisms of repair vary somewhat, but the overall process of BER is the same in all organisms.
Nucleotide Excision Repair A third process for directly repairing DNA damage is nucleotide excision repair (NER). It is a very common repair process found in virtually all bacterial and eukaryotic species, including humans. NER is frequently used to repair UV-induced damage to DNA. For this reason it is also known as ultraviolet (UV) repair (Figure 11.18). In UV-damage repair, NER is carried out by the protein products of four UV repair genes called uvrA, uvrB, uvrC, and uvrD. Two molecules of UVR A protein and one molecule of UVR B protein bind on one strand of DNA opposite the site of the photoproduct. The two molecules of UVR A dissociate from the strand, and a molecule of UVR C joins UVR B to form a UVR BC complex. Each UVR C cleaves a bond about four or five nucleotides to the 3′ side or the 5′ side of the photoproduct. The single-stranded fragment of approximately 12 nucleotides containing the photoproduct is released with the help of UVR D, which is a DNA helicase. A DNA polymerase binds to the exposed 3′ OH end created by the removal of a strand segment and synthesizes a replacement for the lost segment, using the complementary strand as a template. When synthesis is complete, DNA ligase binds to the gap to reseal the sugar-phosphate backbone. This
Figure 11.17 Base excision repair. DNA N-glycosylase and AP (apurinic) endonuclease remove mismatched and damaged nucleotides from DNA.
Starting duplex: 5′ AGTCGACTTAG 3′ paired with 3′ TCAGUTGAATC 5′.
1 DNA N-glycosylase recognizes a base-pair mismatch...
2 ...and removes the incorrect uracil, creating an AP (apyrimidinic) site.
3 AP endonuclease generates a single-stranded nick on the 5′ side of the AP site...
4 ...and DNA polymerase removes and replaces several nucleotides of the nicked strand by nick translation, giving 3′ TCAGCTGAATC 5′.
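The steps of Figure 11.17 can be traced on the figure's own sequences with a few lines of code. This is a simplified sketch: the function name `base_excision_repair` is hypothetical, the two strands are written in the same left-to-right register so that position i of one pairs with position i of the other, and the three-nucleotide patch length is an assumption standing in for "several nucleotides."

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def base_excision_repair(top: str, bottom: str, patch_length: int = 3) -> str:
    """Repair the first uracil in `bottom`, using `top` as the template.

    `top` is written 5'->3' and `bottom` 3'->5', so bottom[i] pairs with top[i].
    """
    site = bottom.index("U")                  # uracil found by DNA N-glycosylase
    # The nick lies on the 5' side of the AP site (to its right in this
    # orientation), so resynthesis runs leftward through the AP site.
    start = max(0, site - patch_length + 1)
    patch = "".join(COMPLEMENT[base] for base in top[start:site + 1])
    return bottom[:start] + patch + bottom[site + 1:]

top = "AGTCGACTTAG"       # 5'->3'
bottom = "TCAGUTGAATC"    # 3'->5', with U mispaired opposite G

repaired = base_excision_repair(top, bottom)
print(repaired)  # TCAGCTGAATC
```

The repaired strand is fully complementary to the template, matching the final duplex shown in the figure.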
their cancer risk, and in extreme cases they must even avoid exposure to fluorescent lights, which emit a low level of UV irradiation.
[Figure residue: mismatch repair by MutS, MutL, and MutH at a GATC site; caption and step 1 not recovered.]
2 MutS binds to a base-pair mismatch and attracts MutL, and the complex contacts MutH.
3 MutH cleaves the unmethylated (new) DNA strand, generating a single-strand gap.
4 The gap is filled by DNA polymerase activity to repair the mismatch.
of genes involved in NER, mutations of the base-pair mismatch repair process appear to lead to the accumulation of mutations and to the development of cancer.
DNA Damage-Signaling Systems
The biochemical mechanisms that recognize DNA damage and mount a damage-repair response are crucial to the health and survival of an organism. They consist of tightly regulated genetic processes involving numerous genes and proteins. In humans and other mammals, a certain multiprotein complex acts as a genomic sentry to identify damage. This damage-response process is active throughout the cell cycle and is especially important in regulating the G1-to-S transition, preventing the cell cycle from progressing to S phase until the cell has adequately repaired any mutations.
One important protein in this process is BRCA1, the product of the first gene implicated in familial breast and ovarian cancer susceptibility (see Experimental Insight 5.1, p. 171). A second protein that plays a pivotal role in communicating DNA damage is called ATM. DNA damage acquired through chemical or radiation exposure is sensed using ATM as a signal transduction molecule to activate transcription of the p53 gene that produces the protein p53. By this mechanism, ATM activates the “p53 repair pathway” that controls cellular response to mutation by deciding either (1) to pause the cell cycle at the G1-to-S transition to allow time for mutation repair or (2) to direct the cell to the apoptotic pathway, in which it undergoes programmed cell death.
In healthy cells, p53 level is low, but the level increases in response to DNA damage. A high level of p53 initiates G1 arrest of the cell cycle. The p53-induced pause in the cell cycle allows time for the repair of DNA damage. The completion of DNA repair depletes p53, and the cell cycle transitions to S phase. If the p53-induced pause goes on too long, however, the pathway senses that there is a large amount of DNA damage that cannot be quickly repaired. The long pause allows the apoptotic pathway to go forward, and the cell undergoes programmed cell death. As these alternatives indicate, p53 sits at a critical junction of cell behavior, determining whether the cell is merely pausing in its replication cycle or whether it will self-destruct by apoptosis.
Given the critical role of p53, it may not come as a surprise that mutation of the p53 gene is strongly associated with cancer development. We take up details of the connection between mutations of p53, the occurrence of cancer, and the transmission of p53 mutation in the human familial cancer syndrome known as Li–Fraumeni syndrome (OMIM 151623) in Application Chapter C: The Genetics of Cancer. Mutation of p53 is also implicated, more broadly, in other cancers. Information accumulated over the past two decades indicates that p53 is one of the most commonly mutated genes in cancer cells.
11.5 Proteins Control Translesion DNA Synthesis and the Repair of Double-Strand Breaks
The repair mechanisms described to this point are able to repair DNA damage, but not all DNA damage is repaired in those ways. Damage that escapes repair before the initiation of DNA replication has the potential to block replication. Circumventing this potential blockage requires mechanisms that can permit replication to progress despite the presence of damage that is potentially mutagenic. Another challenge to organisms is the occasional breakage of one or both DNA strands, which can also block DNA replication and may lead to cell death if it is not successfully overcome.
Translesion DNA Synthesis
In response to widespread DNA damage, molecular activities in the cell may direct the cell to initiate apoptosis. The activity of the p53 protein in eukaryotic cells can lead to this outcome. E. coli cells that undergo extensive damage might also die, but there is a second repair mechanism that can be activated in E. coli in response to massive DNA damage. This repair system, called SOS repair, has been known for decades but has only recently been understood at the molecular level. The system takes its name from the old maritime phrase “save our ship,” used when sinking was imminent.
Recent research demonstrates that SOS repair is accomplished by activating specialized translesion DNA polymerases in a process known as translesion DNA synthesis. This short-lived process allows DNA replication by alternative polymerases able to bypass lesions that block the action of DNA polymerase III (pol III), the main DNA-replicating polymerase in E. coli. Translesion DNA polymerases only function in the synthesis of short DNA segments, and they lack the DNA proofreading capability of DNA pol I and DNA pol III. Translesion polymerases are considered to be “error-prone,” since their use can lead to uncorrected DNA replication errors.
The SOS system in E. coli operates through a translesion DNA polymerase identified as polymerase V (“polymerase five”), or pol V. When pol III stalls at damaged DNA, a protein called RecA coats the template strand ahead of the lesion. This part of the template strand is already bound by single-stranded binding protein (SSB). Recall that SSB coats the single, separated DNA strands ahead of the replication fork (see Foundation Figure 7.14). The RecA protein in the resulting DNA–RecA–SSB complex activates transcription of several genes, including pol V. Pol V displaces polymerase III, synthesizes a short portion of the daughter strand across the DNA lesion, and is then replaced by pol III, which resumes its normal replication activity. The evidence indicates that eukaryotes use a similar system of specialized DNA polymerases to bypass DNA damage that blocks replication.
Double-Strand Break Repair
A frequent feature of the DNA repair mechanisms that circumvent replication blockage is the use of the template strand to guide DNA repair, replacement, and synthesis by specialized polymerases. These repair systems are effective as long as one strand of DNA is intact and can serve as a template. But what happens if both strands of DNA are damaged in a manner that does not provide a template strand for strand repair? Such lesions are known as double-strand breaks (DSBs). Because they can cause chromosome instability and incomplete replication of the genome, double-strand breaks are potentially lethal to cells and elevate the risk of cancer and the chance of chromosome structural mutations.
To protect organisms from the unpleasant consequences of double-strand breaks, two mechanisms have evolved to carry out double-strand break repair. The first is an error-prone repair process known as nonhomologous end joining that repairs double-strand breaks occurring before
Figure 11.20 Nonhomologous end joining (NHEJ). NHEJ is an error-prone system that rejoins DNA strands following a double-stranded break. Labeled steps: (2) the Ku80–Ku70–PKCS protein complex binds DNA ends; (3) ends are trimmed, resulting in a loss of nucleotides; (4) DNA ligase ligates blunt ends to reform an intact duplex.

[...] and Figure 11.21 shows a double-stranded break. Notice that one chromatid has two broken DNA strands but that the sister chromatid is undamaged. SDSA begins with trimming of one of the broken strands. This is followed by attachment of the protein Rad51. Rad51 binds to the strands and facilitates the invasion of the intact chromatid by the resected end of a strand from the sister chromatid. This strand invasion process displaces one strand of the intact duplex and creates a displacement (D) loop. DNA replication within the D loop synthesizes new DNA strands from intact template strands. The sister chromatids are reformed by dissociation and annealing of the nascent strands to repair the breaks. By accomplishing the removal of DNA in the immediate vicinity of a double-stranded break and the replacement of the excised DNA with a duplex identical to that in the sister chromatid, SDSA carries out error-free repair of double-stranded breaks. This mechanism for repairing double-strand breaks is closely related to the molecular mechanism that generates homologous recombination during meiosis.
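The difference between the two repair outcomes can be illustrated with a toy string model. The sketch below is only an illustration under simplifying assumptions: a chromatid is treated as a Python string, no proteins are modeled, and the function names `nhej_repair` and `sdsa_repair`, the example sequence, the break position, and the trim and resection lengths are all hypothetical choices made for the example.

```python
def nhej_repair(chromatid: str, break_pos: int, trim: int) -> str:
    """Toy NHEJ: trim `trim` nucleotides from each broken end, then ligate.

    The trimmed nucleotides are lost, which is why this route is error-prone.
    """
    left_end = chromatid[:break_pos]
    right_end = chromatid[break_pos:]
    left_trimmed = left_end[:max(0, len(left_end) - trim)]
    right_trimmed = right_end[trim:]
    return left_trimmed + right_trimmed


def sdsa_repair(chromatid: str, break_pos: int, resect: int, sister: str) -> str:
    """Toy SDSA: remove DNA around the break and resynthesize it
    using the intact sister chromatid as the template."""
    start = max(0, break_pos - resect)
    end = min(len(chromatid), break_pos + resect)
    return chromatid[:start] + sister[start:end] + chromatid[end:]


# Hypothetical 20-nucleotide chromatid; the sister chromatid is identical.
sister = "ATGGCCTTAGCAGGTCAAGT"
broken = sister  # same sequence, broken between positions 10 and 11

after_nhej = nhej_repair(broken, break_pos=10, trim=2)
after_sdsa = sdsa_repair(broken, break_pos=10, resect=4, sister=sister)

print(after_nhej, len(sister) - len(after_nhej))  # shorter: nucleotides lost
print(after_sdsa == sister)                       # sequence restored
```

In this model the NHEJ product is shorter than the original by twice the trim length, whereas the SDSA product matches the sister chromatid exactly.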
422 CHAPTER 11 Gene Mutation, DNA Repair, and Homologous Recombination
11.6 DNA Double-Strand Breaks Initiate Homologous Recombination

Homologous recombination is the exchange of genetic material between homologous molecules of DNA. All organisms undertake homologous recombination. In bacteria, homologous recombination occurs during events such as conjugation and as a consequence of the repair of double-strand breaks. Archaea undertake homologous recombination under circumstances similar to those in bacteria. In eukaryotes, recombination between homologous chromosomes is essential in prophase I of meiosis, where it is initiated by controlled double-strand DNA breaks in a process that is reminiscent of synthesis-dependent strand annealing.

The Holliday Model

The first viable molecular model of meiotic recombination was proposed by Robin Holliday in 1964 and was based on the study of homologous recombination in E. coli. Known as the Holliday model, it offered a plausible scheme for meiotic recombination by hypothesizing that spontaneously generated single-stranded breaks in one chromatid led to invasion of a homologous molecule. Holliday’s scheme for breaking and rejoining DNA strands suggested that some encounters between homologous chromosomes would produce crossovers whereas others would not.

The original Holliday model ultimately proved to be too simplistic and has been superseded by more accurate models of meiotic recombination. The more recent models rely on some of the features of the Holliday model but incorporate new knowledge and steps. Perhaps the most important features distinguishing the current model of meiotic recombination from the original Holliday model are, first, that meiotic recombination is now known to be initiated by double-stranded DNA breaks and, second, that the double-stranded breaks initiating meiotic recombination are generated in a programmed manner by the activity of a specialized enzyme.

The Bacterial RecBCD Pathway

Homologous recombination in all organisms shares many features in terms of the mechanical processes involved as well as the homologies of proteins that are active in recombination. The first, and still the most detailed, molecular description of homologous recombination comes from research on E. coli. This homologous recombination model describes the action of several proteins that are critical to initiating and completing homologous recombination.

Known as the RecBCD pathway, the system of homologous recombination in bacteria relies on the occurrence of DNA double-strand breaks to initiate the process. Double-strand DNA breaks attract the protein RecA (described above as part of the SOS system in E. coli). Bacterial RecA is a homolog of the eukaryotic and archaeal protein Rad51, which performs a similar function in those organisms (recall its role in synthesis-dependent strand annealing, above). The multiprotein complex known as RecBCD then attaches to the region of a bacterial chromosome to which RecA is bound, and this complex promotes single-strand invasion and the formation of D loops. The process is highly similar in appearance to the strand invasion and D-loop formation we saw in SDSA. RecBCD activity is followed by binding of RuvAB and RuvC proteins. The Ruv complex completes homologous recombination between the bacterial DNA molecules.

The Double-Stranded Break Model of Homologous Recombination

The bacterial RecBCD pathway of homologous recombination was the starting point for the study of meiotic recombination in eukaryotes, where numerous protein homologies have since been identified. The outline of the current model of meiotic recombination was proposed in 1983 by Jack Szostak, Terry Orr-Weaver, Rodney Rothstein, and Franklin Stahl. Their model was the first to predict that the creation of double-stranded breaks controlled by the activity of a specific protein was the foundation of meiotic recombination. The accumulated experimental evidence has confirmed this view, and the research has added major new details to the original proposal by Szostak and his colleagues.

Among these new findings is the determination that the double-strand breaks that precede meiotic recombination are under precise protein control. This is in contrast to a more generalized and diverse process of generating double-strand breaks in bacterial DNA. A second finding is the strong homology that exists between the genes and proteins involved in bacterial homologous recombination and homologous recombination in archaea and eukaryotes.

As currently understood, eukaryotic homologous recombination is initiated by the protein Spo11 (“Spoh eleven”) that was first discovered in yeast (Foundation Figure 11.22, step 1). The proteins Mrx and Exo1 (homologs of RecBCD helicase and nuclease) associate with Spo11 and help trim the cut strands (step 2). Two RecA homolog proteins, Rad51 and Dmc1, join at the trimmed region (step 3). Rad51 and Dmc1 are RecA homologs. This protein complex helps form a strand-exchange assemblage, facilitating strand invasion and formation of a D loop (steps 4 and 5). (Note the similarity of this structure to the D loop formed during SDSA.)

The invading strand pairs with the complementary strand in the D loop. Outside the D loop, the two strands that appear to cross over one another form a Holliday junction, an interim structure proposed in the original Holliday model. Notice that there is also a heteroduplex region, containing two complementary strands of DNA that originated in different homologs. Also identified as heteroduplex DNA, these regions are a molecular signature of homologous recombination.

FOUNDATION FIGURE 11.22 (captions for steps 3 to 8; the strand diagrams of homologs B1 A1 and B2 A2 are not reproduced)
3 Dmc1 and Rad51c assemble strand-exchange nucleoprotein filaments.
4 The strand-exchange filaments promote strand invasion.
5 Strand invasion creates one D loop and the first heteroduplex region. Rad52, Rad59, and other proteins participate.
6 Strand extension by DNA polymerase displaces D loop DNA and pairs with complementary single-stranded DNA to form the second heteroduplex region.
7 DNA pol strand extension and ligation fills the single-stranded gap in the strand paired with D loop DNA.
8 Double Holliday junctions form after the nick is sealed; chromatids contain offset heteroduplexes.

Because the two strands
of the heteroduplex DNA originate in different homologs, there may be mismatched base pairs between them. In other words, if heterozygosity is present in the DNA sequences forming a heteroduplex region, one or more base pairs will be mismatched in the heteroduplex DNA.

Extension of the invading strand and DNA synthesis within the broken strand are guided by intact template strands (step 6) and are assisted by additional proteins, including Rad52 and Rad59, that are RecBCD homologs (step 7). At this point, a second heteroduplex region has formed. The 3′ end of the invading strand next connects with the 5′ end of a strand segment that was initially part of the invading strand (step 8) to form a second Holliday junction. Now the nonsister chromatids of the recombining chromosomes are interconnected to one another by the presence of double Holliday junctions (DHJs): The recombining chromosomes contain DHJs and two heteroduplex regions.

The DHJs appearing in step 8 of Foundation Figure 11.22 are present during prophase I of meiosis. For meiosis to proceed, the homologous chromosomes must be disentangled. This involves cutting and rejoining the DNA strands in at least one of the Holliday junctions, and the pattern of cutting and rejoining is what leads to genetic recombination between the homologs.

The resolution of Holliday junctions generates genetic recombination. Figure 11.23a shows the double Holliday junction structure present at the end of the events in Foundation Figure 11.22. Through opposite sense resolution (that is, the cutting and rejoining of the DNA strands in one of the Holliday junctions, with the one on the left illustrated here, and cutting and rejoining of the two strands outside the second Holliday junction), genetic recombination is achieved, leaving heteroduplex DNA in both recombinant chromosomes. One recombinant chromosome is B1A2 and the other is B2A1. The same outcome can be achieved if the two Holliday junction strands at the right are cut and rejoined and the strands outside the Holliday junction on the left are cut and rejoined. Figure 11.23b illustrates same sense resolution, the resolution that comes about by cutting and rejoining of the DNA strands in both Holliday junctions. In this case, heteroduplex DNA is present in both resulting chromosomes, but genetic recombination does not take place. Opposite sense resolution is more common than same sense resolution; thus, homologous recombination in meiosis is likely to lead to the production of recombinant chromosomes. There is evidence that some homologous recombination events do not produce recombinant chromosomes, however, and same sense resolution explains this outcome.
Figure 11.23 Resolution of the double Holliday junctions of homologous chromosomes. (a) Opposite
sense resolution generates genetic recombinants. (b) Same sense resolution does not produce genetic
recombination.
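The relationship between the pattern of cutting and the flanking-marker outcome can be summarized in a few lines of code. This is a minimal sketch, not a molecular model: the labels `"junction"` (the strands within a Holliday junction are cut and rejoined) and `"outside"` (the two strands outside that junction are cut and rejoined) and the function name `resolve_double_holliday` are hypothetical, and treating the case in which both cuts are `"outside"` as same sense resolution is an assumption made by symmetry rather than a statement from the text.

```python
def resolve_double_holliday(left_cut: str, right_cut: str) -> tuple[str, str]:
    """Return the flanking-marker arrangement of the two chromatids
    after resolution of a double Holliday junction.

    Parental chromatids are taken to be B1A1 and B2A2, as in the figures.
    """
    valid = {"junction", "outside"}
    if left_cut not in valid or right_cut not in valid:
        raise ValueError("each cut must be 'junction' or 'outside'")
    if left_cut != right_cut:
        # Opposite sense resolution: flanking markers are exchanged.
        return ("B1A2", "B2A1")
    # Same sense resolution: heteroduplex DNA remains, but no crossover.
    return ("B1A1", "B2A2")


for cuts in [("junction", "outside"), ("outside", "junction"),
             ("junction", "junction")]:
    print(cuts, "->", resolve_double_holliday(*cuts))
```

Both opposite sense patterns give the recombinant chromosomes B1A2 and B2A1, while cutting within both junctions returns the parental arrangement.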
11.7 Transposable Genetic Elements Move throughout the Genome

Transposable genetic elements are DNA sequences of various lengths and sequence composition that have evolved the ability to move within the genome by an enzyme-driven process known as transposition. Transposition has two principal effects on genomes. First, transposition can be a mutational event, one that has a biological basis as opposed to a chemical or physical (irradiation) cause. Second, transposition can increase genome size through duplication of the transposable genetic elements.

The movement of transposable genetic elements throughout the genome occurs in two ways. One is through the excision of a transposable element from its initial location and its insertion in a new location. This process is potentially mutagenic, but it does not contribute to a meaningful increase in genome size. The second mechanism of transposition is a duplication mechanism that generates a copy of the transposable element for insertion in a new location. As a result, the genome is left with both the original copy of the element and the new copy. This process can be mutagenic and can also lead to an increase in genome size, particularly when large numbers of copies of the transposable element are present.

The Characteristics and Classification of Transposable Elements

Transposable elements have been found in all organisms. They exist in a wide array of types that vary from the simplest, encoding only the information required for transposition of the element, to much more complex structures that encode numerous functions beyond transposition. Antibiotic resistance is an example of the additional functions that can be included.

Despite these differences, transposable elements have two distinctive sequence features that make them recognizable in genomes and leave a “molecular signature” of their presence: (1) The transposable element itself contains terminal inverted repeats on both its ends, and (2) the inserted transposable element is bracketed by flanking direct repeats. Figure 11.24 illustrates a transposable genetic element from E. coli that has 5-bp flanking direct repeats on either side of the element and two 6-bp terminal inverted repeats surrounding the central region that contains the DNA sequence of the transposable element.

Figure 11.24 The general structure of DNA transposons.
5′ TGAAC TAAATC [central region] GATTTA TGAAC 3′
3′ ACTTG ATTTAG [central region] CTAAAT ACTTG 5′
The outer 5-bp sequences are the flanking direct repeats; the inner 6-bp sequences are the terminal inverted repeats.

Terminal inverted repeats are part of the sequence of a transposable element, but flanking direct repeats are not. The transposition process generates these flanking sequences, as shown in Figure 11.25. Staggered cuts of both strands of a DNA sequence targeted for insertion leave short single-stranded ends where the cut occurred. The DNA target sequence can potentially be any sequence in the genome. The enzyme transposase, produced by the transposable element, generates the staggered cuts of the target sequence. Figure 11.25 shows that transposition of the genetic element into the target sequence leaves short single-stranded gaps. These are filled by DNA synthesis, completing the insertion of the transposable genetic element. The insertion event illustrated here generates the same 5-bp flanking direct repeats as are next to the transposable element shown in Figure 11.24 (which also has the same 6-bp terminal inverted repeats).

Transposable elements fall into two categories. DNA transposons (also called Class II transposable elements) transpose as DNA sequences. Their transposition produces flanking direct repeats at the site of insertion. At a minimum, all DNA transposons carry the transposase gene that produces the transposase enzyme required for the movement of the transposon, but many DNA transposons also carry other genes.

The second category of transposable elements are retrotransposons (also called Class I transposable elements), which transpose through an RNA intermediate. Retrotransposons are composed of DNA, but they are transcribed into RNA before transposition, and the RNA transcript is then copied back into DNA by the specialized enzyme reverse transcriptase. The reverse-transcribed DNA is then inserted into a new location, where flanking direct repeats are formed. Some, but not all, retrotransposons carry the gene for reverse transcriptase, an enzyme that copies single-stranded RNA into DNA. Retrotransposons carrying the reverse transcriptase gene can initiate their own transposition, whereas those lacking the gene must utilize reverse transcriptase synthesized by another retrotransposon.

DNA transposons follow one of two modes of insertion. Replicative transposition can be thought of as a “copy-and-paste” process, whereby the original copy of the transposable element remains in place and a new copy is transposed to another location. Alternatively, some DNA transposons undergo nonreplicative transposition; this can be thought of as a “cut-and-paste” mechanism. In this process, the original copy of the transposon is excised, and it is then reinserted into a new location. As indicated above, both modes of transposition can cause mutations, but whereas replicative transposition increases the transposable element copy number and potentially increases genome size, nonreplicative transposition does not. Retrotransposons are also a frequent source of increases in genome size in eukaryotic genomes.
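The way a staggered cut followed by gap filling produces flanking direct repeats, and the way terminal inverted repeats can be recognized, can be sketched in code. The example below is a simplified, single-strand illustration and not a model of transposase activity: the function names, the target sequence, and the 10-bp central region are hypothetical, while the 5-bp target site TGAAC and the 6-bp inverted repeats TAAATC and GATTTA follow the structure described for Figure 11.24.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def insert_with_staggered_cut(target: str, site_start: int,
                              site_len: int, element: str) -> str:
    """Insert `element` at a staggered cut spanning `site_len` bases.

    Filling the single-stranded gaps duplicates the target site, so the
    element ends up bracketed by two copies of it (flanking direct repeats).
    """
    site = target[site_start:site_start + site_len]
    upstream = target[:site_start]
    downstream = target[site_start + site_len:]
    return upstream + site + element + site + downstream


def terminal_inverted_repeat_length(element: str) -> int:
    """Length of the longest stretch at the left end of `element` whose
    reverse complement matches the right end."""
    n = 0
    while (n < len(element) // 2
           and reverse_complement(element[:n + 1]) == element[-(n + 1):]):
        n += 1
    return n


# Hypothetical element: 6-bp inverted repeats around an invented central region.
element = "TAAATC" + "ACCGTGAGCA" + "GATTTA"
target = "GGCAT" + "TGAAC" + "CTAGG"  # hypothetical target; site is TGAAC

inserted = insert_with_staggered_cut(target, site_start=5, site_len=5,
                                     element=element)
print(inserted)
print(terminal_inverted_repeat_length(element))
```

The inserted element is flanked on both sides by TGAAC, which is not part of the element itself, whereas the 6-bp inverted repeats travel with the element wherever it inserts.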
The Mutagenic Effect of Transposition

Transposable elements create mutations by their insertion into wild-type alleles. The insertion of new DNA into a functional gene is the equivalent of inserting a random string of letters into a sentence. Just as the sentence is rendered unintelligible and therefore nonfunctional by a random insertion, so is the wild-type allele rendered unable to produce a wild-type gene product and thus nonfunctional by the random insertion of a transposon. This mutational process is known as insertional inactivation.

Numerous examples of insertional inactivation mutations by transposition are known in bacteria, plants, and animals, including humans. A number of human hereditary conditions are caused by transposition. The blood clotting disorder hemophilia A (OMIM 300841) is caused by absence of activity of the blood clotting protein factor VIII (“factor eight”). The F8 gene is X-linked, and one of the many mutations of the gene is the result of the insertion of a transposable element. A second example is Coffin–Lowry syndrome (OMIM 303600). This X-linked condition produces skeletal malformations, growth retardation, hearing deficits, and mental impairment. Among numerous mutations of the RPS6KA3 gene that controls Coffin–Lowry syndrome is one involving the insertion of a transposable element that inactivates the gene. But the original example of mutation by insertional inactivation was the round versus wrinkled pea phenotype examined by Mendel. Research led by Cathie Martin in the early 1990s identified the gene Mendel examined and described its mutation by the insertion of a transposable element. The Case Study at the end of this chapter describes how transposition alters the DNA, mRNA, and protein from the mutant allele.

Transposable Elements in Bacterial Genomes

Bacterial genomes, as well as plasmids and viruses, contain three types of transposable elements: (1) simple transposons known as insertion sequences (ISs), containing terminal inverted repeats surrounding a gene (sometimes two genes) encoding transposase; (2) composite transposons, designated Tn in bacteria, that contain a transposase gene, two flanking IS elements, and one or more additional genes; and (3) noncomposite transposons, similar to composite transposons but lacking insertion sequences.

Insertion Sequences Numerous IS elements are found in bacterial, archaeal, and viral genomes and also in plasmids. These are simple DNA sequences that contain only the genetic information necessary for their own transposition. Ranging between about 800 and 2000 bp, IS elements insert by either replicative or nonreplicative transposition. All IS elements have terminal inverted
repeats surrounding the transposase gene, and a few have one additional gene. The smallest of the IS elements, designated IS1, is typical of many IS elements. Totaling 768 bp in length, IS1 contains the transposase gene surrounded by two 23-bp terminal inverted repeats and two 9-bp flanking direct repeats.

Composite Transposons Bacterial composite transposons (Tn) are considerably larger than IS elements, and they can contain multiple genes in addition to their transposase gene. The additional genes in Tn elements are variable and are contained in a central region that is flanked by the two IS elements. The genes in the central region confer characteristics such as antibiotic resistance and resistance to the toxic consequences of heavy metal exposure. These transposable elements can thus carry genes that may confer a growth advantage in certain environments.

Tn10 has a structure typical of most composite transposons (Figure 11.26). It contains two copies of the IS10 element, each with its terminal inverted repeats. These are designated IS10R on the right (R) side and IS10L on the left (L) side, and they flank the central region. Each of the IS elements is about 1300 bp in length, and the Tn10 central region is about 6600 bp in length. It contains a TetR gene for resistance to the antibiotic tetracycline. The total length of Tn10 is about 9300 bp. The Tn10 transposon readily inserts into plasmid DNA, allowing rapid dissemination of tetracycline resistance among bacterial strains that carry the plasmid.

Figure 11.26 Structure of a composite transposon, Tn10. Labeled features: IS10L (1329 bp) and IS10R (1329 bp), each with a transposase gene and terminal inverted repeats, flank a central region of about 6600 bp that carries the TetR tetracycline resistance gene; a flanking direct repeat lies at each end.

Noncomposite Transposons Bacteria can also carry a third type of DNA transposon, known as a noncomposite transposon. These transposons do not contain insertion sequences but do carry additional genes. They transpose in the same manner as composite transposons. The noncomposite transposon Tn3, for example, carries two 38-bp inverted repeats flanking a 4957-bp central region that encodes three enzymes: transposase and resolvase, both of which are required for transposition, and β-lactamase, which provides resistance to the antibiotic ampicillin.

Genetic Analysis 11.3 guides you through an assessment of potential terminal inverted repeat sequences of IS elements.

GENETIC ANALYSIS 11.3

PROBLEM Each pair of DNA sequences shown below occurs on the same strand of DNA, and the two sequences are separated by a large number of nucleotides. Which of these sequences might be found flanking an insertion sequence? Explain your answer, and identify the relevant parts of your selected sequences.

a. 5′-TTAGCAC...CAGGATT-3′
b. 5′-GGCCAAT...ATTGGCC-3′
c. 5′-CCGACCGTA...CCGACCGTA-3′
d. 5′-AGTATACCGC...GCGGTATGGC-3′

BREAK IT DOWN: Terminal inverted repeat sequences are characteristically found at the ends of insertion sequences (p. 425).

Evaluate

1. Identify the topic this problem addresses and the nature of the required answer.
   This problem requires you to recognize DNA sequences that might flank a bacterial insertion sequence. You must identify one or more of the given choices as candidate flanking sequences, explain your answer, and identify the relevant portions of the sequences.

2. Identify the critical information given in the problem.
   We are given four single-stranded segments of DNA, each identifying the sequences sitting on opposite sides of potential insertion sequences.

Deduce

3. Determine the double-stranded sequences for each of the single-stranded sequences listed.
   The double-stranded sequences are
   a. 5′-TTAGCAC...CAGGATT-3′
      3′-AATCGTG...GTCCTAA-5′
   b. 5′-GGCCAAT...ATTGGCC-3′
      3′-CCGGTTA...TAACCGG-5′
   c. 5′-CCGACCGTA...CCGACCGTA-3′
      3′-GGCTGGCAT...GGCTGGCAT-5′
   d. 5′-AGTATACCGC...GCGGTATGGC-3′
      3′-TCATATGGCG...CGCCATACCG-5′

4. Review what you know about the sequences flanking insertion elements.
   The sequences flanking insertion elements are inverted repeat sequences.

Solve

5. Identify any sequence that might be found flanking an insertion sequence.
   Sequences b and d in step 3 are the ones most likely to be found flanking insertion sequences. The inverted repeat sequences in double-stranded DNA are highlighted.
   5′-GGCCAAT...ATTGGCC-3′
   3′-CCGGTTA...TAACCGG-5′
   5′-AGTATACCGC...GCGGTATGGC-3′
   3′-TCATATGGCG...CGCCATACCG-5′

For more practice, see Problems 12 and 19.

Transposable Elements in Eukaryotic Genomes

Transposable genetic elements are plentiful and highly varied in eukaryotic genomes. These elements fall into two groups. The first are similar to bacterial transposable elements. These are generally short sequences that carry inverted repeats. Examples of these bacterial-like transposable elements, described in this section, include Ac and Ds elements in maize and P elements in Drosophila. The second category of eukaryotic transposable elements are the retrotransposons, which transpose through an RNA intermediate. Examples of these elements, also discussed in this section, are human Alu sequences and Ty and copia elements of yeast and Drosophila, respectively.

Eukaryotic genome sequence analysis finds that substantial proportions of many genomes are composed of transposable DNA. For example, nearly half of the human genome (well more than 1 billion base pairs) is composed of transposable DNA. Much of this DNA is repetitive in sequence, indicating that tens to hundreds of thousands of copies of various transposable elements are present. Many eukaryotic genomes exhibit a similar profile, evidence that transposition has been a major factor in eukaryotic genome evolution. It is equally evident that transposition continues to play an active role in the evolution of genomes and in mutation.

The Discovery of Ds and Ac Elements in Maize

Transposable genetic elements were discovered in eukaryotes. Barbara McClintock discovered transposition in a series of studies of a mutant phenotype of kernel color in maize (Zea mays) that she conducted in the 1930s. When McClintock proposed her model of transposition, it was not well received. The overwhelming prevailing notion at the time was that except for the rare occurrence of gene mutations, genomes were stable, and the idea that pieces of the genome could jump from place to place seemed untenable. Resistance to McClintock’s ideas barely wavered for more than two decades before it began to crumble in the face of the discovery of transposable genetic elements in bacteria.

McClintock had been studying the C gene for maize. The dominant wild-type allele C produces purple kernels, and a mutant c1 allele produces yellow kernels. One gene that is closely linked to C produces plump (Sh) or shrunken (sh) kernels, and a second closely linked gene produces shiny (Wx) or waxy (wx) kernels (Figure 11.27a). In experiments with several trihybrid strains of maize with the genotype C Sh Wx/c1 sh wx, McClintock found a
few unusual kernels that were mostly purple but had yellow (colorless) sectors that varied among different kernels. Invariably, however, the purple regions were plump and shiny, but the yellow sectors were shrunken and waxy. At the same time, the kernel color mutation appeared to be unstable. Specifically, the appearance of the mutant yellow kernel phenotype often changed to an appearance that was mostly yellow but with purple spots. McClintock investigated the production of yellow kernels and the frequent appearance of purple spots.

Her first clue to the puzzle came from the observation that chromosome breakage frequently occurred at a gene designated Ds (short for “dissociation,” meaning chromosome breakage), but only when another gene called Ac (for “activator,” meaning it activated chromosome breakage) was present. Ac elements contain a transposase gene that is used to activate transposition. The Ds elements appeared to move around the maize genome, and they appeared to be the cause of the unstable kernel color mutation. She called Ds a “control element,” meaning that it controlled the expression of other genes. Ds elements do not contain a transposase gene and require an Ac element to activate their transposition.

McClintock’s examination of maize chromosomes and kernel color revealed that when the trihybrid (C Sh Wx/c1 sh wx) had both chromosomes intact and complete, kernels were purple. Chromosome breakage and loss of the C Sh Wx chromosome segment produced a yellow sector that was also shrunken and waxy (Figure 11.27b). The Ac-activated transposition of Ds into the C gene inactivated the expression of C, and kernels were yellow (Figure 11.27c). Lastly, she discovered that purple spotting of otherwise yellow kernels came about when a Ds element inserted into a C gene was excised by Ac action. This process took place cell by cell, resulting in purple spotting in those segments of a kernel that were derived from a cell in which Ds was excised (Figure 11.27d). Kernel segments in which Ds remained in C were yellow.

Figure 11.27 Production of colorless sectors and reversion of the unstable colorless mutation in maize by the transposable genetic elements Ds and Ac. (a) Trihybrid, wild-type phenotype: purple, plump, shiny. (b) Partial deletion, mutant phenotype: Ac-activated chromosome breakage at Ds; the chromosome fragment is lost, giving colorless, shrunken, waxy sectors. (c) Unstable colorless mutant. (d) Reversion of unstable mutant phenotype, purple spots: Ac-activated excision of Ds.

Drosophila P Elements

The genome of Drosophila melanogaster carries several different types of transposable elements, but the most prominent of these is a DNA transposon called a P element. These DNA transposons were not part of the genome of D. melanogaster collected from the wild before about 1960. Today, however, all D. melanogaster collected in the wild carry P elements in their genome, suggesting that P elements were introduced into D. melanogaster about 1960, perhaps by cross-species transfer from a distantly related species. Since their introduction to the genome, P elements have quickly proliferated. The Drosophila life cycle can produce 20 to 25 generations per year; thus, P elements have been evolving for about 1000 generations or so in D. melanogaster since first being introduced into the genome.

The P elements exist in multiple forms. Full-length P elements encode transposase and are capable of autonomous transposition. These P elements are approximately 2900 bp in length, and they have a central region containing a gene for transposase that is encoded in four exons and three introns flanked by 31-bp inverted repeats. Transcription and translation of the transposase gene in full-length P elements produces an 87-kD transposase enzyme that activates P element transposition in germ-line cells. Several types of nonfunctional P elements are also found in the D. melanogaster genome, none producing functional transposase and all being shorter than 2900 bp.

The P elements were discovered in D. melanogaster by Margaret Kidwell in 1985, when she identified hybrid dysgenesis, a phenomenon in which sterility occurs in the F1 progeny of a cross between laboratory-bred female flies and males derived from natural populations (Figure 11.28). In these crosses, the female laboratory fly has the so-called M cytotype (M is for “maternal”), and the wild-type male fly has the P (“paternal”) cytotype. The P-cytotype male has three to four dozen P elements scattered throughout its genome. In contrast, the M-cytotype female has no P elements. The progeny of this cross between laboratory (female, M cytotype) and wild flies (male, P cytotype) are hybrids that have a normal external appearance, but they are dysgenic; in other words, they are biologically deficient. The term hybrid dysgenesis refers to the combination of sterility, a high mutation rate, and a propensity for chromosomal aberrations and nondisjunction present in these flies. Importantly, the mutations found in dysgenic flies are unstable, reverting to wild-type or mutating again at a high rate. Curiously, the reciprocal
430 CHAPTER 11 Gene Mutation, DNA Repair, and Homologous Recombination
cross, a P-cytotype female (this genome contains P elements) crossed to an M-cytotype male (this genome is P element-free), results in normal flies that show no evidence of hybrid dysgenesis.
The key to the action of P elements appears to be that the transposase genes are silenced by a suppressor protein in P-cytotype strains. This inhibits their transposition and potential for causing mutations. In matings of P-cytotype males and M-cytotype females, sperm from P-cytotype males contain virtually no cytoplasmic material. The chromosomes carry P elements, but as there is no cytoplasmic material, sperm do not possess the transposition repressor protein. The eggs of M-cytotype females contain abundant cytoplasmic material but carry no transposition repressor protein because the chromosomes in the M cytotype are free of P elements. At fertilization, sperm add P element–laden chromosomes into an egg lacking transposition-repressing protein. Extensive transposition takes place, creating multiple mutations by insertion of P elements into functional genes or by inducing chromosome breaks that result in hybrid dysgenesis.

Retrotransposons
Retrotransposons are the most common transposable elements in eukaryotic genomes. They are related to RNA-containing retroviruses that reverse transcribe their genetic information into DNA to parasitize host cells, but retrotransposons do not infect cells; instead, they transpose throughout the genome. Retrotransposons use reverse transcriptase to synthesize a DNA copy of the retrotransposon transcript for insertion into new genome locations.
Fully functional retroviruses that infect cells encode at least three genes, called gag, env, and pol. Gag and env encode proteins that form the retroviral particle. New retroviral particles are produced within infected cells and perpetuate the infection by invading new cells. The pol gene encodes the enzyme reverse transcriptase that directs the synthesis of double-stranded DNA from single-stranded RNA.
Figure 11.29 illustrates the structures of three eukaryotic retrotransposons that transpose within eukaryotic genomes. Two constant features of retrotransposons are seen. First, all retrotransposons encode reverse transcriptase (pol) to catalyze transposition, and some contain gag, but none contains env. Second, the gene or genes carried by retrotransposons are flanked by long terminal repeats (LTRs) that may be up to several hundred base pairs in length.

LINE, SINE, and Alu Elements of Humans As mentioned above, more than 45% of the human genome is composed of transposable DNA. Among the functional transposable genetic elements in the human genome, LINE (long interspersed nuclear elements) and SINE (short interspersed nuclear elements) families of elements stand out because of their relative abundance and their ability to cause spontaneous human gene mutations. LINEs are up to several thousand base pairs in length and have an average length of about 900 bp. SINEs are much shorter and have their sequences truncated at one end of the element, most likely because the reverse
transcription process used for their transfer terminates before the entire sequence has transposed.
Almost 1 million copies of LINE sequences are found in the human genome. Collectively, these sequences constitute a little more than 20% of the total genome sequence. Human L1 elements are the most common members of the LINE family of elements in the human genome, which contains approximately 600,000 copies of L1 alone, constituting more than 17% of the total genome. The L1 elements vary in length from about 6500 to 8000 bp. Full-length L1 elements encode a protein with nuclease and reverse transcriptase function and may also encode a second RNA-binding protein, but shortening of the element affects its ability to transpose. Full-length L1 elements actively transpose in the human genome and produce mutations. The transposable element referred to earlier in the chapter as the cause of the X-linked blood-clotting disorder hemophilia A is an L1 element.
SINE elements, too, are common, composing a little more than 10% of human genome sequence. The most common human SINE element is called an Alu element. Alu elements vary in length from 100 to 300 bp and are each flanked

Ty Elements of Yeast Many different forms of Ty retrotransposons of yeast are found, all sharing the common features of retrotransposons. In Ty elements, the central element is approximately 6 kb, flanked by LTRs that are each about 330 bp in length. Both LTRs contain promoters that direct the transcription of different genes in the central region. Approximately 50 to 100 copies of Ty elements are present in the typical Saccharomyces cerevisiae genome. The Ty elements cause mutation in yeast genes by insertion.

Copia Elements of Drosophila Multiple forms of the retrotransposon copia are found in the Drosophila genome. Copia elements have a central element of 5 to 8.5 kb that contains pol and gag genes and is flanked by LTRs of 250 to 600 bp each. The word copia comes from the Latin for “abundance,” and befitting this designation, more than 5% of the Drosophila genome is composed of copia retrotransposons. This abundance leads to many mutations throughout the genome that are usually the result of insertion of copia into a wild-type gene.
CASE STUDY
Mendel’s Peas Are Shaped by Transposition
Gregor Mendel left good descriptions, data, and analyses of the crosses he used for establishing the law of segregation and the law of independent assortment, but he did not leave any seeds to give geneticists direct access to the genes themselves. Experimental Insight 11.1 identifies three of the genes studied by Mendel that have now been identified and analyzed. Details of the discovery in 1990 of a fourth gene are described here. It is the gene responsible for the round and wrinkled seed shapes described by Mendel, now known as SBE1, the starch-branching enzyme 1 gene.
The gene was identified and shown to be responsible for the seed shape variation Mendel reported by a laboratory group led by Cathie Martin (Bhattacharyya et al., 1990). In their paper, the group reports DNA, mRNA, and protein evidence that the recessive mutant allele, r, is altered by the insertion of approximately 800 bp of DNA. The insertion is of transposable DNA, and its effect is insertional inactivation of the ability to produce a starch-branching enzyme that is the normal gene product. The researchers also provide a physiological explanation for the appearance of wrinkled seed shape.

PROTEIN ANALYSIS Prior to the start of this study, considerable evidence already suggested that seed shape variation was due to differences in starch synthesis. Among candidate enzymes known to be important in starch synthesis was SBE1. The researchers used RR (pure-breeding round) plants as a source of SBE1 to raise an antibody for use as a probe for the enzyme. They then used protein gel electrophoresis and
[Panel 1: Protein-detection analysis of RR and rr plants; label “No spot”]

MESSENGER RNA ANALYSIS The researchers next tested mRNA from the SBE1 gene for evidence of the basis of the mutation. Testing mRNA from RR and rr, the researchers detected a 3300-nucleotide mRNA derived from RR plants and a 4100-nucleotide mRNA from rr plants. They found as well that the larger transcript from rr plants was about tenfold less abundant than the smaller transcript from RR plants (panel 2). These results indicate that the transcript of SBE1 in rr plants is longer than in RR plants and that it is present at just a fraction of the level found in RR plants.

[Panel 2: mRNA analysis of RR and rr plants; bands at 4100 nt and 3300 nt]

DNA ANALYSIS DNA encoding the SBE1 gene was isolated from RR and rr plants and was fragmented for analysis by DNA gel electrophoresis. This analysis revealed a DNA fragment approximately 3.5 kb in length from RR plants and a corresponding DNA fragment of about 4.3 kb from rr plants (panel 3). One possible explanation for this result is the insertion of approximately 800 bp of DNA into the r allele. Subsequent analysis revealed that an 800-bp insertion of transposable DNA into the R allele was the mutational event that generated the r allele (panel 4). This event caused insertional inactivation of the r allele of SBE1.

[Panel 3: DNA-fragment analysis; 3.5 kb]
[Panel 4: R allele, 3.5 kb between EcoRI sites; r allele, 4.3 kb between EcoRI sites, containing a transposable element (800 bp)]

WRINKLED SEED DEVELOPMENT The physiological explanation of wrinkled seed development is tied to the loss of function of SBE1. In mature round peas, almost half of the dry weight is starch. About 35% of the starch is in a simple linear form known as amylose. The remainder is in complexly branched forms, most commonly a form known as amylopectin. Free molecules of sucrose make up about 5% of the dry weight. Amylose is actively converted to amylopectin by SBE1 in round seeds. In wrinkled seeds, only about 30% of starch is amylopectin, and about 70% is amylose. Amylose readily loses molecules of free glucose, and the sugar accounts for more than 10% of the dry weight of wrinkled seeds.
During early seed development, SBE1 is active in immature seeds that will become round, but it is inactive due to mutation in immature seeds that will become wrinkled. In seeds that will be wrinkled, the high percentage of free sucrose causes cells to import large amounts of water to dilute the excess sugar. The extra water results in larger cells and larger immature seeds that stretch the seed membrane. As all pea seeds mature, they dehydrate to the same level, and this is when wrinkling appears in rr seeds. The overstretched membranes of those seeds collapse, much like an overinflated balloon that has lost air, causing the seeds to look wrinkled. Membranes of RR and Rr seeds have not been stretched by extra water importation. They are resilient, and the seeds appear round.
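The sizes reported in the DNA and mRNA analyses of this case study are mutually consistent, and a short calculation makes that explicit. The Python sketch below is illustrative only: the variable and function names are hypothetical, and the numbers are the approximate values quoted in the case study (fragments of about 3.5 kb and 4.3 kb, transcripts of 3300 and 4100 nucleotides, and an insertion of about 800 bp).

```python
# Approximate sizes quoted in the case study. All names are hypothetical
# and chosen only for this illustration.
INSERTION_BP = 800  # transposable DNA inserted into the R allele

DNA_FRAGMENT_BP = {"RR": 3500, "rr": 4300}  # DNA fragments (3.5 kb, 4.3 kb)
MRNA_LENGTH_NT = {"RR": 3300, "rr": 4100}   # SBE1 transcripts


def size_difference(sizes: dict) -> int:
    """Return how much longer the rr molecule is than the RR molecule."""
    return sizes["rr"] - sizes["RR"]


dna_difference = size_difference(DNA_FRAGMENT_BP)
mrna_difference = size_difference(MRNA_LENGTH_NT)

print(f"DNA fragment difference: {dna_difference} bp")
print(f"mRNA length difference:  {mrna_difference} nt")
print("Both match the insertion size:",
      dna_difference == INSERTION_BP and mrna_difference == INSERTION_BP)
```

Under these rounded values, both differences equal the size of the insertion, which is consistent with the inserted DNA being carried into the longer rr transcript.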
SUMMARY
Mastering Genetics For activities, animations, and review quizzes, go to the Study Area.

11.1 Mutations Are Rare and Random and Alter DNA Sequence
❚❚ Mutations occur at random in genomes.
❚❚ Mutation frequencies are low in all organisms.
❚❚ Mutational hotspots are genes or regions where mutations occur much more often than average.
❚❚ Base-pair substitution mutations can be either transitions or transversions.
❚❚ Base-pair substitutions can change one amino acid of a polypeptide, can create a new stop codon, or can leave the polypeptide unchanged.
❚❚ Frameshift mutations result from the insertion or deletion of one or more base pairs that shift the mRNA reading frame during translation.
❚❚ Regulatory mutations alter gene transcription or pre-mRNA splicing.
❚❚ Forward mutation alters a wild-type allele to mutant form, and reversion changes a mutant back to wild-type or near wild-type form.

11.2 Gene Mutations May Arise from Spontaneous Events
❚❚ DNA replication errors can substitute base pairs, and strand slippage can modify the number of repeats of a DNA sequence.
❚❚ Different kinds of spontaneous changes in nucleotide structure can result in mutation of DNA sequence by base-pair mismatching.

11.3 Mutations May Be Caused by Chemicals or Ionizing Radiation
❚❚ Mutagenic chemicals interact in characteristic reactions with DNA nucleotides and generate specific mutations.
❚❚ Chemical compounds may create mutations by acting as nucleotide base analogs, adding or removing side groups from nucleotides, or intercalating into DNA.
❚❚ Energy in the ultraviolet range and higher (shorter in wavelength) is mutagenic. Ultraviolet radiation induces the formation of photoproducts that lead to base-pair substitution mutations.
❚❚ The Ames test identifies mutagenic chemical compounds by testing for increased reversion rates in auxotrophic bacteria exposed to a test compound in the presence of detoxifying enzymes from the eukaryotic liver.

11.4 Repair Systems Correct Some DNA Damage
❚❚ Direct repair of DNA lesions removes damaged nucleotides and prevents mutation.
❚❚ Mismatched DNA nucleotides, photoproducts induced by UV radiation, and modified nucleotide side chains are removed by direct repair.
❚❚ Nucleotide excision repair and UV repair remove segments of DNA single strands containing damaged nucleotides and direct new synthesis to fill the resulting single-stranded gap.
❚❚ Genetically controlled systems monitor the genome and regulate DNA repair.

11.5 Proteins Control Translesion DNA Synthesis and the Repair of Double-Strand Breaks
❚❚ SOS repair, controlled by the RecA protein, is a specialized process activated during replication in bacteria in response to widespread DNA damage.
❚❚ Translesion DNA synthesis uses translesion DNA polymerases to complete replication when damage is present.
❚❚ Nonhomologous end joining repairs double-strand DNA breaks occurring before DNA replication.
❚❚ Synthesis-dependent strand annealing repairs double-strand breaks occurring after the completion of replication.

11.6 DNA Double-Strand Breaks Initiate Homologous Recombination
❚❚ Homologous recombination is controlled by the RecBCD pathway in bacteria. In eukaryotes, meiotic recombination is initiated through the activity of Spo11 that regulates the production of double-strand breaks.
❚❚ In meiotic recombination, strand invasion and new DNA synthesis form heteroduplex DNA in both homologous chromosomes.
❚❚ Heteroduplex DNA contains base-pair mismatches if DNA sequences are heterozygous.
❚❚ DNA strands forming double Holliday junctions are cut and rejoined to different homologs before their separation in meiosis.
❚❚ Resolution of double Holliday junctions generates heteroduplex DNA and can produce recombinant or nonrecombinant chromosomes.

11.7 Transposable Genetic Elements Move throughout the Genome
❚❚ Transposable genetic elements, found in all genomes, are DNA sequences that can move about the genome by either a “cut-and-paste” or a “copy-and-paste” process.
❚❚ DNA transposons encode transposase and perhaps other genes and transpose as DNA sequences.
❚❚ Retrotransposons encode reverse transcriptase and perhaps other genes and transpose through an RNA intermediate.
❚❚ Transposase is the enzyme responsible for transposition, and it is encoded by many transposable genetic elements.
❚❚ Transposition can produce mutations through insertional inactivation that modifies gene expression or by contributing to unequal crossing over between homologous chromosomes.
❚❚ Transposable elements can contribute to great expansion of genome size.
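The transition/transversion distinction listed under 11.1 follows a simple rule: a transition replaces a purine with a purine or a pyrimidine with a pyrimidine, whereas a transversion exchanges a purine for a pyrimidine or the reverse. A minimal Python sketch of that rule is given below; the function name `classify_substitution` is hypothetical, and the code handles only the four DNA bases.

```python
# Purines and pyrimidines among the four DNA bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original: str, replacement: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    original, replacement = original.upper(), replacement.upper()
    valid_bases = PURINES | PYRIMIDINES
    if original not in valid_bases or replacement not in valid_bases:
        raise ValueError("bases must be one of A, C, G, T")
    if original == replacement:
        raise ValueError("no substitution: the two bases are identical")
    # Same chemical class on both sides means a transition.
    both_purines = {original, replacement} <= PURINES
    both_pyrimidines = {original, replacement} <= PYRIMIDINES
    return "transition" if both_purines or both_pyrimidines else "transversion"


for pair in [("A", "G"), ("C", "T"), ("A", "T"), ("G", "C")]:
    print(pair, classify_substitution(*pair))
```

Of the twelve possible single-base substitutions, four are transitions and eight are transversions under this rule.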
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Understand how to analyze and predict the effects of mutations on DNA, mRNA, and proteins.
2. Understand how the phenotypic effects of mutations are described and analyzed.
3. Be prepared to describe the molecular mechanisms that generate mutations.
4. Know the processes that ensure the accuracy of DNA replication.
5. Understand the molecular basis of homologous recombination.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. Identify two general ways chemical mutagens can alter DNA. Give examples of these two mechanisms.
2. Nitrous acid and 5-bromodeoxyuridine (BU) alter DNA by different mechanisms. What type of mutation does each compound produce?
3. What is the difference between a transition mutation and a transversion mutation?
4. What are the differences between a synonymous mutation, a missense mutation, and a nonsense mutation?
5. UV irradiation causes damage to bacterial DNA. What kind of damage is frequently caused and how does photolyase repair the damage?
6. Ultraviolet (UV) radiation is mutagenic.
a. What kind of DNA lesion does UV energy cause?
b. How do UV-induced DNA lesions lead to mutation?
c. Identify and describe two DNA repair mechanisms that remove UV-induced DNA lesions.
7. Researchers interested in studying mutation and mutation repair often induce mutations with various agents. What kinds of gene mutations are induced by
a. chemical mutagens? Give two examples.
b. radiation energy? Give two examples.
8. The effect of base-pair substitution mutations on protein function varies widely from no detectable effect to the complete loss of protein function (null allele). Why do the functional consequences of base-pair substitution vary so widely?
9. Describe the purpose of the Ames test. How are his⁻ bacteria used in the Ames test? What mutational event is identified using his⁻ bacteria?
10. In numerous population studies of spontaneous mutation, two observations are made consistently: (1) most mutations are recessive, and (2) forward mutation is more frequent than reversion. What do you think are the likely explanations for these two observations?
11. Two different mutations are identified in a haploid strain of yeast. The first prevents the synthesis of adenine by a nonsense mutation of the ade-1 gene. In this mutation, a base-pair substitution changes a tryptophan codon (UGG) to a stop codon (UGA). The second affects one of several duplicate tRNA genes. This base-pair substitution mutation changes the anticodon sequence of a tRNA^Trp from 3′-ACC-5′ to 3′-ACU-5′.
a. Do you consider the first mutation to be a forward mutation or a reversion? Why?
b. Do you consider the second mutation to be a forward mutation or a reversion? Why?
c. Assuming there are no other mutations in the genome, will this double-mutant yeast strain be able to grow on minimal medium? If growth will occur, characterize the nature of growth relative to wild type.
12. What is the phenotypic effect of inserting a Ds element into the maize C gene? How do Ds and Ac produce maize kernels that are mostly yellow with purple spots?
13. Answer the following questions concerning the accuracy of DNA polymerase during replication.
a. What general mechanism do DNA polymerases use to check the accuracy of DNA replication and identify errors during replication?
b. If a DNA replication error is detected by DNA polymerase, how is it corrected?
c. If a replication error escapes detection and correction, what kind of abnormality is most likely to exist at the site of replication error?
d. Identify two mechanisms that can correct the kind of abnormality resulting from the circumstances identified in part (c).
e. If the kind of abnormality identified in part (c) is not corrected before the next DNA replication cycle, what kind of mutation occurs?
f. DNA mismatch repair can accurately distinguish between the template strand and the newly replicated strand of a DNA duplex. What characteristic of DNA strands is used to make this distinction?
14. Several types of mutation are identified and described in the chapter. These include (1) promoter mutation, (2) splice site mutation, (3) missense mutation, (4) frameshift mutation, and (5) nonsense mutation. Match the following mutation descriptions with the type(s) of mutations listed above. More than one mutation type might match a description.
a. A mutation that changes several amino acids in a protein and results in a protein that is shorter than the wild-type product.
b. A mutation that produces about 5% of the wild-type amount of an mRNA.
c. A mutation that produces a mutant protein that differs from the wild-type protein at one amino acid position.
d. A mutation that produces a protein that is shorter than the wild-type protein but does not have any amino acid changes in the portion produced.
e. A null mutation that does not produce any functional protein product.
15. A 1-mL sample of the bacterium E. coli is exposed to ultraviolet light. The sample is used to inoculate a 500-mL flask of complete medium that allows growth of all bacterial cells. The 500-mL culture is grown on the benchtop, and two equal-size samples are removed and plated on identical complete-medium growth plates. Plate 1 is immediately wrapped in a dark cloth, but plate 2 is not covered. Both plates are left at room temperature for 36 hours and then examined. Plate 2 is seen to contain many more growing colonies than plate 1. Thinking about DNA repair processes, how do you explain this observation?
16. A strain of E. coli is identified as having a null mutation of the RecA gene. What biological property do you expect to be absent in the mutant strain? What is the molecular basis for the missing property?
17. Describe the difference between DNA transposons and retrotransposons.
18. How are flanking direct repeat sequences created by transposition?
19. Using the adenine–thymine base pair in this DNA sequence
...GCTC...
...CGAG...
a. Give the sequence after a transition mutation.
b. Give the sequence after a transversion mutation.
20. The partial amino acid sequence of a wild-type protein is
. . . Arg–Met–Tyr–Thr–Leu–Cys–Ser . . .
The same portion of the protein from a mutant has the sequence
. . . Arg–Met–Leu–Tyr–Ala–Leu–Phe . . .
a. Identify the type of mutation.
b. Give the sequence of the wild-type DNA template strand. Use A/G if the nucleotide could be either purine, T/C if it could be either pyrimidine, N if any nucleotide could occur at a site, or the alternative nucleotides if a purine and a pyrimidine are possible.
21. The two DNA and polypeptide sequences shown are for alleles at a hypothetical locus that produce different polypeptides, both five amino acids long. In each case, the lower DNA strand is the template strand:
allele A1        5′. . . ATGCATGTAAGTGCATGA . . . 3′
                 3′. . . TACGTACATTCACGTACT . . . 5′
A1 polypeptide   N–Met–His–Val–Ser–Ala–C
allele A2        5′. . . ATGCAAGTAAGTGCATGA . . . 3′
                 3′. . . TACGTTCATTCACGTACT . . . 5′
A2 polypeptide   N–Met–Gln–Val–Ser–Ala–C
Based on DNA and polypeptide sequences alone, is there any way to determine which allele is dominant and which is recessive? Why or why not?
22. Many human genes are known to have homologs in the mouse genome. One approach to investigating human hereditary disease is to produce mutations of the mouse homologs of human genes by methods that can precisely target specific nucleotides for mutation.
a. Numerous studies of mutations of the mouse homologs of human genes have yielded valuable information about how gene mutations influence the human disease process. In general terms, describe how and why creating mutations of the mouse homologs can give information about human hereditary disease processes.
b. Despite the homologies that exist between human and mouse genes, some attempts to study human hereditary disease processes by inducing mutations in mouse genes indicate there is little to be learned about human disease in this way. In general terms, describe how and why the study of mouse gene mutations might fail to produce useful information about human disease processes.
23. The fluctuation test performed by Luria and Delbrück is consistent with the random mutation hypothesis. Briefly describe their experiment and identify how the results match the prediction of the random mutation hypothesis. What would have to be different about the experimental results for them to agree with the prediction of the adaptive mutation hypothesis?
24. In this chapter, three features of genes or of DNA sequence that contribute to the occurrence of mutational hotspots were described. Identify those three features and briefly describe why they are associated with mutational hotspots.
25. Briefly compare the production of DNA double-strand breaks in bacteria versus the double-strand breaks that precede homologous recombination.
26. During mismatch repair, why is it necessary to distinguish between the template strand and the newly made daughter strand? Describe how this is accomplished.
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
27. Following the spill of a mixture of chemicals into a small pond, bacteria from the pond are tested and show an unusually high rate of mutation. A number of mutant cultures are grown from mutant colonies and treated with known mutagens to study the rate of reversion. Most of the mutant cultures show a significantly higher reversion rate when exposed to base analogs such as proflavin and 2-aminopurine. What does this suggest about the nature of the chemicals in the spill?
28. In an Ames test using his⁻ Salmonella bacteria a researcher determines that adding a test compound plus the S9 extract produces a large number of his⁺ revertants, but mixing the his⁻ strain plus the test compound without adding S9 does not produce an elevated number of his⁺ revertants.
a. What is the reason for the different experimental results described?
b. Is the test compound still considered to be a potential mutagen? Explain why or why not.
29. A wild-type culture of haploid yeast is exposed to ethyl methanesulfonate (EMS). Yeast cells are plated on a complete medium, and 6 colonies (colonies numbered 1 to 6) are transferred to a new complete medium plate for further study. Four replica plates are made from the complete medium plate to plates containing minimal medium or minimal medium plus one amino acid (replica plates numbered 1 to 4) with the following results:
[Figure: complete-medium plate with colonies 1 to 6, and four replica plates: Plate 1, minimal; Plate 2, minimal + histidine; Plate 3, minimal + arginine; Plate 4, minimal + leucine]

Wild-type polypeptide   N . . . Thr–His–Ser–Gly–Leu–Lys–Ala . . . C
Mutant 1                N . . . Thr–His–Ser–Val–Leu–Lys–Ala . . . C
Mutant 2                N . . . Thr–His–Ser–C
Mutant 3                N . . . Thr–Thr–Leu–Asp–C
Mutant 4                N . . . Thr–Gln–Leu–Trp–Ile–Glu–Gly . . .
a. Use the available information to characterize each mutant.
b. Determine the wild-type mRNA sequence.
c. Identify the mutation that produces each mutant polypeptide.
31. Experiments by Charles Yanofsky in the 1950s and 1960s helped characterize the nature of tryptophan synthesis in E. coli. In one of Yanofsky’s experiments, he identified glycine (Gly) as the wild-type amino acid in position 211 of tryptophan synthetase, the product of the trpA gene. He identified two independent missense mutants with defective tryptophan synthetase at these positions that resulted from base-pair substitutions. One mutant encoded arginine (Arg) and another encoded glutamic acid (Glu). At position 235, wild-type tryptophan synthetase contains serine (Ser), but a base-pair substitution mutant encodes leucine (Leu). At position 243, the wild-type polypeptide contains glutamine, and a base-pair substitution mutant encodes a stop codon. Identify the most likely wild-type codons for positions 211, 235, and 243. Justify your answer in each case.
32. Alkaptonuria is a human autosomal recessive disorder caused by mutation of the HAO gene that encodes the enzyme homogentisic acid oxidase. A map of the HAO gene region reveals four BamHI restriction sites (B1 to B4) in the wild-type allele and three BamHI restriction sites in the mutant allele. BamHI utilizes the restriction sequence 5′-GGATCC-3′. The BamHI restriction sequence identified as B3 is altered to 5′-GGAACC-3′ in the mutant allele. The mutation results in a Ser-to-Thr missense mutation. Restriction maps of the two alleles are shown below, and the binding sites of two molecular probes (probe A and probe B) are identified.
Collaboration and Discussion For answers to selected even-numbered problems, see Appendix: Answers.
37. In a mouse-breeding experiment a new mutation called Dumbo is identified. A mouse with the Dumbo mutation has very large ears. It is produced by two parental mice with normal ear size. Based on this information, can you tell whether the Dumbo mutation is a regulatory mutation or a mutation of a protein coding gene? Why or why not?
38. Considering the Dumbo mutation in Problem 37, what kinds of additional evidence would help you determine whether Dumbo is a mutation of a regulatory sequence or of a protein coding gene?
39. Thinking back to the discussion of gain-of-function and loss-of-function mutations in Section 4.1, and putting those concepts together with the discussion of base substitution mutations in this chapter, explain why gain-of-function mutations are often dominant and why loss-of-function mutations are often recessive. Give an example of a type of gain-of-function mutation that is dominant and of a loss-of-function mutation that is recessive.
40. Common baker’s yeast (Saccharomyces cerevisiae) is normally grown at 37°C, but it will grow actively at temperatures down to approximately 25°C. A haploid culture of wild-type yeast is mutagenized with EMS. Cells from the mutagenized culture are spread on a complete-medium plate and grown at 25°C. Six colonies (1 to 6) are selected from the original complete-medium plate and transferred to two fresh complete-medium plates. The new complete plates (shown) are grown at 25°C and 37°C. Four replica plates are made onto minimal medium or minimal plus adenine from the 25°C complete-medium plate. The new plates are grown at either 25°C or 37°C and the growth results are shown.
[Figure: complete-medium plates grown at 25°C and 37°C, and the replica plates made from the 25°C plate, showing colonies 1 to 6]
a. Which colonies are prototrophic and which are auxotrophic? What growth information is used to make these determinations?
b. Classify the nature of the mutations in colonies 1, 2, and 5.
c. What can you say about colony 4?
41. The two gels illustrated contain dideoxynucleotide DNA-sequencing information for a wild-type segment and mutant segment of DNA corresponding to the N-terminal end of a protein. The start codon and the next five codons are sequenced.
[Figure: sequencing gels for the wild-type and mutant segments, each with lanes A, T, C, and G]
a. Write the DNA sequence of both alleles, including strand polarity.
b. Identify the template and nontemplate strands of DNA.
c. Write out the mRNA sequences encoded by each template strand, and underline the start codons.
d. Determine the amino acid sequences translated from these mRNAs.
e. What is the cause of the mutation?
12 Regulation of Gene Expression in Bacteria and Bacteriophage
CHAPTER OUTLINE
12.1 Transcriptional Control of Gene Expression Requires DNA–Protein Interaction
12.2 The lac Operon Is an Inducible Operon System under Negative and Positive Control
12.3 Mutational Analysis Deciphers Genetic Regulation of the lac Operon
12.4 Transcription from the Tryptophan Operon Is Repressible and Attenuated
12.5 Bacteria Regulate the Transcription of Stress Response Genes and Also Translation
12.6 Riboswitches Regulate Bacterial Transcription, Translation, and mRNA Stability
12.7 Antiterminators and Repressors Control Lambda Phage Infection of E. coli
Jacques Monod (left), André Lwoff (middle), and François Jacob (right) on October 14, 1965, following the announcement of the awarding of the Nobel Prize in Physiology or Medicine for their work describing the lactose (lac) operon in E. coli.
ESSENTIAL IDEAS
❚❚ Gene expression in bacteria is controlled primarily through transcriptional regulation, often by regulating groups of genes known as operons.
❚❚ Transcription of lactose (lac) operon genes is induced by lactose and is repressed in the absence of lactose.
❚❚ Transcription of the repressible tryptophan (trp) operon adjusts to the level of available tryptophan.
❚❚ Specialized regulatory processes control transcriptional response to environmental stress and regulate translation.
❚❚ Bacteria can regulate transcription, translation, and the stability of mRNA with mRNA sequences and RNA-binding regulatory proteins.
❚❚ Bacteriophage use transcriptional regulation to express the genes responsible for infecting their hosts.
❚❚ Competition between regulatory proteins determines the course of bacteriophage lambda infection in bacteria.
Take a moment to think about the ever-changing environment endured by the billions of Escherichia coli (E. coli) that populate your intestinal tract. These bacteria are accustomed to a diverse and constantly shifting set of environmental factors and nutritional conditions, as well as to competition from the many other bacterial species in your gut. In all these rapidly changing environmental conditions, bacterial survival depends on the cell’s ability to deal with whatever conditions prevail at the moment. Although certain bacteria engage in quorum sensing, a mechanism causing certain genes to be coordinated among the individuals within a dense population of bacteria, each individual bacterial cell is largely self-reliant when it comes to producing the proteins necessary to carry out
440 CHAPTER 12 Regulation of Gene Expression in Bacteria and Bacteriophage
metabolism and to generate the compounds it needs to stay alive and to reproduce.
What is the best strategy for the survival of E. coli in a rapidly changing environment? Should the organism transcribe and translate all its genes at all times, or should gene transcription and translation be regulated in a closely monitored manner that can respond in a matter of minutes to changes in growth conditions as they arise? Answering these kinds of questions was critically important to understanding how evolution has shaped the processes of gene expression in organisms. On one hand, if bacteria transcribed and translated all their genes at all times, they could be instantly ready for almost any environmental shift that might occur. On the other hand, continuously expressing all genes would be terribly costly in metabolic terms and entail a great deal of unnecessary transcription and translation. Unregulated gene expression could also result in antagonistic interactions between proteins operating in different metabolic systems. Biologists in the 1950s and 1960s hypothesized that energetic and metabolic expenditures associated with regulated gene expression would be evolutionarily favored over the high cost of continuous gene expression. But to demonstrate the validity of that hypothesis, examples of regulated gene expression had to be identified and studied.
The first research describing the gene actions and molecular mechanism for regulated gene expression was by François Jacob, Jacques Monod, André Lwoff, and others, who showed how the lactose (lac) operon system in E. coli was transcriptionally regulated in response to the presence or absence of the milk sugar lactose. This research was a milestone in biology that introduced a new way of thinking about the expression of genes. It opened the door to research on mechanisms that regulate gene expression, research that is just as active today as it has ever been.
In this chapter, the regulatory systems we discuss are principally found in E. coli, the most widely used model bacterium. We begin with a general introduction to regulated gene expression and introduce the concept that the interaction between DNA-binding regulatory proteins and regulatory DNA sequences regulates transcription. Next we explore the organization, function, and regulation of the E. coli lactose (lac) operon system, whose gene transcription is induced (turned on) by the presence of the sugar lactose in the growth medium. This topic is followed by a discussion of mutational analysis and the molecular explanation for the transcriptional control of lac operon genes. We then turn our attention to the genetic structure and molecular control of transcription of the tryptophan (trp) operon that contains the genes needed to synthesize the amino acid tryptophan. After moving on to discussions of posttranscriptional regulation of bacterial genes and of a mechanism that uses regulatory mRNA sequences to control gene expression, we examine the regulatory process that controls infection of bacterial cells by bacteriophage λ (lambda).
12.1 Transcriptional Control of Gene Expression Requires DNA–Protein Interaction
Certain bacterial genes (specifically, those whose products are needed continuously to perform routine tasks) undergo constitutive transcription, a term identifying the genes as being transcribed continuously with no regulatory control. In contrast, the need for agile and calibrated responses to changing environmental conditions has resulted in the evolution of mechanisms for the regulated transcription of many bacterial genes.
Regulation of the transcription of bacterial genes is the predominant mode by which bacteria regulate responses to the environment, and it takes place at two levels. At both levels, control results from interactions between DNA-binding proteins and specific regulatory sequences of DNA. The first level of control regulates the initiation of transcription, determining whether a particular gene or group of genes is transcribed at all. The second transcriptional control level determines the amount of transcription, regulating either the duration of transcription or the amount of mRNA transcript produced from the gene.
Additionally, posttranscriptional regulatory mechanisms are important, controlling mRNA stability, the level of translation of mRNA, or the activity of proteins and enzymes. Table 12.1 provides an overview of bacterial regulatory mechanisms that are described in this chapter.
(a) Effect of allosteric effector compound
Without effector (no transcription): Absence of effector prevents activator protein binding and transcription.
With effector (transcription): Effector binding to the activator protein facilitates transcription by positive regulation.
(b) Effect of allosteric inhibitor compound
With inhibitor (no transcription): Binding of inhibitor to activator protein prevents activator binding and transcription.
Without inhibitor (transcription): Absence of an inhibitor allows binding of activator protein and transcription by positive regulation.
Figure 12.2 Mechanisms of positive control of transcription.
Q Briefly describe the difference between negative control of transcription and positive control of transcription.
nucleotide bases and the sugar-phosphate backbone of DNA. The proteins make their contact with specific base pairs located in the major groove and the minor groove of the DNA helix using the unique patterns of hydrogen, nitrogen, and oxygen atoms that characterize each base pair.
To achieve protein–DNA specificity in these interactions, the protein must simultaneously contact multiple nucleotides. A common motif in the structures of DNA-binding regulatory proteins is the formation of protein secondary structures, most commonly α helices, containing the amino acids that contact regulatory nucleotides. Frequently, two protein segments contact the DNA target sequence. The paired DNA-binding regions of a regulatory protein form in two ways. In one type of interaction, a single polypeptide folds to form two domains that bind specific DNA sequences. In the other type, the regulatory protein consists of two or more polypeptides joined to form a multimeric complex of two (dimeric), three (trimeric), or four (tetrameric) polypeptides. When identical polypeptides join together, the prefix homo- is used. A “homodimer” contains two identical polypeptides in the functional protein. When different polypeptides join together, the complex is identified by the prefix hetero-, as in “heterodimer.”
[Figure 12.3, panel (a): helix-turn-helix motif bound to double-stranded DNA; labels: recognition helix, turn region, recognition helix]
Extensive studies of transcription-regulating proteins in bacteria have identified the characteristic structural features of DNA-binding regulatory proteins and the DNA sequences they bind. Bacterial regulatory DNA sequences frequently contain inverted repeats or direct repeats. Each polypeptide of a homodimeric regulatory protein, or each of the binding regions of a folded polypeptide, interacts with one of the inverted repeat segments. By far the most common structural motif seen in these proteins in bacteria is the helix-turn-helix (HTH) motif (Figure 12.3). In the HTH motif, two α-helical regions in each of two polypeptides in a homodimer interact with inverted repeat regulatory sequences in DNA. In each of the polypeptides, one
of the two α-helical regions is the recognition helix that fits into the major groove of DNA and binds the inverted repeat sequences. The second helix of each polypeptide is the stabilizing helix. It lies across the major groove and contacts the sugar-phosphate backbone, ensuring a strong DNA–protein interaction and properly orienting the recognition helix to sit in the major groove. The recognition helix and the stabilizing helix of each polypeptide are connected by a short amino acid string identified as the “turn,” hence the name of the helix-turn-helix motif. Many different DNA-binding regulatory proteins with the HTH motif have been identified in bacteria, as we discuss in later sections of this chapter.
12.2 The lac Operon Is an Inducible Operon System under Negative and Positive Control
One conclusion evolutionary biologists have reached in comparing the genomes of different forms of life is that evolution has operated to restrict the total size of bacterial genomes compared with most others and to limit the percentage of repetitive (noncoding) DNA in them to less than 15 percent on average. These limitations are imposed by various factors, including the dependence of bacteria on their abilities to reproduce rapidly and respond quickly to environmental changes. Possession of a relatively small genome and small percentage of noncoding DNA speeds the DNA replication process and shortens the time required to replicate the genome during cell division. The need for rapid responsiveness to environmental change and for restricted genome size dictates another evolutionary adaptation in bacteria: the clustering and coordinated transcriptional regulation of genes involved in the same metabolic processes.
Clusters of genes undergoing coordinated transcriptional regulation by a shared regulatory region are called operons. Operons are common in bacterial genomes, and the genes that are part of a given operon almost always participate in the same metabolic or biosynthesis pathway. Besides having a single promoter shared by the operon genes, operons contain additional regulatory DNA sequences that interact with promoters to exert transcriptional control.
In this discussion, we focus on the lactose (lac) operon of E. coli. This operon is responsible for the production of three polypeptides that permit E. coli to utilize the sugar lactose as a carbon source for growth and metabolic energy. In this section, we explain how the lac operon works, describe the circumstances under which its genes are transcribed, and identify the regulatory mechanisms that control operon gene transcription. In the following section, we turn our attention to mutational and molecular analyses of the lac operon to understand the function of operon genes and to explore the molecular interactions that regulate operon gene transcription.
Lactose Metabolism
The monosaccharide sugar glucose is the preferred energy source of E. coli, just as it is for your cells. Glucose is metabolized by the biochemical pathway called glycolysis, a sequence of biochemical reactions that oxidizes glucose, and closely related compounds, to produce pyruvate and ATP (adenosine triphosphate), the compound used universally by cells to store and produce energy. This pathway occurs in virtually all cells as part of fermentation and cellular respiration. Glycolysis is the principal energy-producing reaction in your cells and those of E. coli. But like humans and other organisms, E. coli is capable of metabolizing sugars such as galactose, lactose, and fructose as well. Glucose is the preferred sugar because it can be directly metabolized in glycolysis. The alternative sugars require separate metabolism to first produce glucose or a glucose derivative that can then be processed by glycolysis. Thus, E. coli will consume all available glucose before a genetic switch is flipped that changes the metabolic pathway to one that uses an alternative sugar.
The genetic switch to lactose utilization requires that lactose be present in the cell, but the lactose is not used by the cell until after glucose has been depleted. The lac operon, whose genes and regulatory sequences control lactose utilization in E. coli, is an inducible operon system, meaning that under the specific circumstances of lactose presence in the growth medium and glucose absence, transcription of the operon genes is activated, or induced. The inducible nature of the lac operon and other inducible operons also means that expression of operon genes is limited to the circumstance of the inducer compound being available. Other nutritional requirements may have to be met as well for transcription induction to occur.
Lactose is a disaccharide consisting of two monosaccharides, glucose and galactose, that are joined by a covalent β-galactoside linkage (Figure 12.4). Bacteria that have a lac+ phenotype (“lack plus”) are able to grow on a medium containing lactose as the only sugar. lac+ strains accomplish this growth by producing a gated channel at the cell membrane that allows lactose to enter the cell. The channel is formed by the enzyme permease. On entering the cell (step 1), lactose is processed by the enzyme β-galactosidase, which acts on it in two ways. The principal activity of β-galactosidase is to break the β-galactoside linkage to release glucose and galactose (step 2). Glucose produced by lactose breakdown can immediately enter glycolysis. The molecule of galactose can be further processed to produce glucose. In addition to producing glucose and galactose, β-galactosidase also converts some lactose to an isomer called allolactose (step 3). Allolactose plays a critical role in regulating the transcription of lac operon genes by acting as the inducer compound. Allolactose that is not used for induction can be cleaved by β-galactosidase (step 4). Bacteria that are unable to grow on a lactose-containing medium are identified as having a lac− phenotype (“lack minus”). These strains are either unable to import lactose to the cell, unable to break it down once it is in the cell, or both.
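The induction condition and the lac+/lac− phenotypes described above can be summarized as a small truth table. The sketch below is a simplified illustration rather than a quantitative model: the function names `lac_operon_induced` and `lac_phenotype` are hypothetical, and the code assumes induction is all-or-none, ignoring intermediate levels of expression.

```python
def lac_operon_induced(lactose_present: bool, glucose_present: bool) -> bool:
    """Induction as stated in the text: lactose present and glucose absent."""
    return lactose_present and not glucose_present


def lac_phenotype(imports_lactose: bool, cleaves_lactose: bool) -> str:
    """A strain grows on lactose only if it can both import lactose (permease)
    and break it down (beta-galactosidase)."""
    return "lac+" if imports_lactose and cleaves_lactose else "lac-"


if __name__ == "__main__":
    # Print the full truth table for the two sugars.
    for lactose in (False, True):
        for glucose in (False, True):
            state = "induced" if lac_operon_induced(lactose, glucose) else "not induced"
            print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
    # A strain that imports lactose but cannot cleave it is lac-.
    print(lac_phenotype(imports_lactose=True, cleaves_lactose=False))
```

Only one of the four sugar combinations leads to induction, and a defect in either import or breakdown is enough to give the lac− phenotype.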
[Figure 12.5 panels. (a) Lactose operon: repressor gene, regulatory region, structural gene region. (b) Promoter region with the Shine–Dalgarno sequence, the mRNA sequence, and the fMet start marked; amino acid labels (fMet, Met, Glu, Gln, Thr, Gly, Ser); position scale from −80 to +30; the −35 sequence, −10 sequence, and CAP–cAMP binding region are labeled beneath the sequence.]
5′ GAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATG 3′
3′ CTTTCGCCCGTCACTCGCGTTGCGTTAATTACACTCAATCGAGTGAGTAATCCGTGGGGTCCGAAATGTGAAATACGAAGGCCGAGCATACAACACACCTTAACACTCGCCTATTGTTAAAGTGTGTCCTTTGTCGATACTGGTAC 5′
Figure 12.5 The lactose (lac) operon of E. coli. (a) The repressor protein (lacI) is encoded by a 1040-bp segment under separate transcriptional regulation. The transcription regulatory region consists of a CAP binding site, a promoter consensus sequence region, and an operator sequence. The three structural genes of the lac operon encode the enzymes β-galactosidase (lacZ), permease (lacY), and transacetylase (lacA). (b) The DNA sequence of the regulatory region of the lac operon, including the −10 and −35 consensus sequences, the operator, and the CAP binding site.
Q Describe the position of the lac operator with respect to the positions of the lac promoter and the +1 nucleotide.
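One way to read the two strands in part (b) is to check that they pair base for base and to locate short motifs by position. The sketch below does this using only the two strands printed in the figure. The function names are illustrative, and the two hexamers searched for (TTTACA and TATGTT) are an assumption: they are the sequences commonly cited for the lac promoter −35 and −10 regions, but their positions are not marked on the extracted sequence itself.

```python
# Top (5'->3') and bottom (3'->5') strands as printed in Figure 12.5(b).
TOP = "GAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATG"
BOTTOM = "CTTTCGCCCGTCACTCGCGTTGCGTTAATTACACTCAATCGAGTGAGTAATCCGTGGGGTCCGAAATGTGAAATACGAAGGCCGAGCATACAACACACCTTAACACTCGCCTATTGTTAAAGTGTGTCCTTTGTCGATACTGGTAC"

PAIRS = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return strand.translate(PAIRS)


def find_all(strand: str, motif: str) -> list[int]:
    """0-based start positions of every occurrence of motif (overlaps allowed)."""
    hits = []
    start = strand.find(motif)
    while start != -1:
        hits.append(start)
        start = strand.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    print("strands complementary:", complement(TOP) == BOTTOM)
    # Assumed motifs: hexamers commonly cited for the lac promoter.
    for name, motif in (("-35", "TTTACA"), ("-10", "TATGTT")):
        print(name, motif, find_all(TOP, motif))
```

Positions reported by `find_all` are offsets from the left end of the printed sequence, not the −80 to +30 coordinates used in the figure.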
an inducible operon. With synthesis of β-galactosidase, the production of allolactose occurs. By binding to the allosteric domain of the repressor protein, allolactose forms the inducer–repressor complex. The formation of this complex induces an allosteric change that alters the conformation of the DNA-binding domain of the repressor protein to a form that does not recognize or bind the operator. An essential part of the induction of transcription is the binding of the CAP–cAMP complex to the CAP binding site, which facilitates achievement of the highest level of transcription. The polycistronic mRNA is synthesized, and translation produces β-galactosidase, permease, and transacetylase.
When both glucose and lactose are available, E. coli utilizes glucose. The presence of lactose, however, generates a small amount of allolactose that carries out its normal inducer function by binding to repressor protein. The inducer–repressor interaction opens the promoter region, and RNA polymerase binds.
By itself, however, RNA polymerase is very ineffective at accomplishing transcription of the lac operon genes. This is due to the absence of binding of the CAP–cAMP complex at the CAP binding site (more on this in a moment). RNA polymerase by itself is only able to manage basal transcription (Figure 12.6c), transcription that produces only a small number of polycistronic mRNAs and leads to the translation of a few molecules of β-galactosidase, permease, and transacetylase per cell.
Basal transcription driven solely by RNA polymerase that gains access to the lac promoter through the inducer–repressor complex mechanism is insufficient to generate enough copies of the polycistronic mRNA to drive active lactose metabolism. A second regulatory process featuring positive control of transcription is required to fully activate lac operon gene transcription. Positive control of lac operon transcription lies in a DNA–protein interaction that occurs at the CAP–cAMP binding region of the lac operon
production of a few molecules of β-galactosidase and permease. This small amount of permease and β-galactosidase, amounting to no more than a few molecules per cell, is sufficient to bring a small number of lactose molecules across the cell membrane and to generate allolactose. This trickle of lactose quickly induces more transcription, launching a transcriptional cascade that soon causes the cell to switch its metabolism to lactose utilization.
The second way also involves the production of a tiny amount of permease and β-galactosidase: in this case, through basal transcription that takes place when both glucose and lactose are available to a cell. Basal transcription becomes fully activated transcription when glucose is exhausted and only lactose is available to a cell.
12.3 Mutational Analysis Deciphers Genetic Regulation of the lac Operon
The identification and description of the lac operon began with a series of publications in the early 1960s by François Jacob, Jacques Monod, André Lwoff, and several other colleagues. Their genetic analysis of numerous lac operon mutants led to the identification of each gene and regulatory region, and to the functional description of the operon as provided in the previous section. Jacob, Monod, and Lwoff were awarded the Nobel Prize in Physiology or Medicine in 1965 for this work (see the chapter opener photo). Their work also laid the foundation for a description of lac operon transcription regulation at the DNA sequence level. We discuss several of their analyses of lac operon mutants and elements of the molecular analysis of lac operon transcriptional regulation in this section. As you read this discussion, refer to Tables 12.2 and 12.3 for a list of lac operon genes and regulatory sequences, as well as example genotypes and phenotypes associated with mutations we discuss. You can also refer to Research Technique 6.1, which discusses the determination of the genotype of a bacterial strain based on its pattern of growth and no growth in various media.
Analysis of Structural Gene Mutations
The genetic analysis of the lac operon by Jacob, Monod, and colleagues was made possible by the study of operon mutations. Several dozen lac− mutants were generated by treatment of E. coli with mutagens. The mutants were first subjected to genetic complementation experiments to determine whether the lac− phenotypes of different mutants resulted from mutation of the same gene or from mutations of different genes. Investigations showed that lac− mutants formed two complementation groups, indicating that two genes are responsible for the lac− phenotype. The two complementation groups are today known to correspond to lacZ (β-galactosidase) and lacY (permease).
The complementation analysis was carried out using partial diploid bacterial strains that were produced by conjugation between F′ (lac) and F− bacteria (see Section 6.3). Recall that exconjugants produced by F′ × F− conjugation have two copies of a portion of the genome and are thus partially diploid. In the case of lac operon partial diploids, one copy of the lac operon information resides on the recipient
Table 12.3 Synthesis of β-Galactosidase and Permease by Haploids and Partial Diploids with Structural Gene Mutations
bacterial chromosome, and the second copy of the operon is acquired on the F′ plasmid. The genotype of partial diploids is written with the F′ segment on the left and the recipient chromosome on the right. The homologous chromosomes are separated by a slash (/). For example, the genotype of a partial diploid demonstrating complementation of lac gene mutations can be written as follows:
F′ I+ P+ O+ Z+ Y− / I+ P+ O+ Z− Y+
Analyzed as haploid genotypes, each portion of the partial diploid genotype above would produce the lac− phenotype. The F′ haploid lacks the ability to produce permease (lacY−), and the bacterial haploid is unable to produce β-galactosidase (lacZ−). Genetic complementation occurs in this partial diploid, however, and the resulting phenotype is lac+ (see Table 12.3). The molecular basis of genetic complementation in this case is that the F′ portion of the partial diploid provides β-galactosidase by its lacZ+ gene, and the recipient portion of the partial diploid provides permease by its lacY+ gene. Based on the analysis of structural gene mutations, Jacob, Monod, and colleagues concluded that there are two protein-producing genes required for lac+ growth behavior and that lacZ and lacY wild-type alleles are usually dominant to mutant alleles. Recombination mapping analysis revealed close genetic linkage of the three structural genes of the lac operon, but the order of these structural genes (lacZ–lacY–lacA) was ultimately determined by mutational analysis.
Another type of structural gene mutation that proved useful for understanding the process of translation of the lac polycistronic mRNA was base substitution nonsense mutations that generate stop codons in inappropriate locations. If one of these mutations, known as polar mutations, occurs early in the lacZ portion of the polycistronic mRNA, it has the curious effect of significantly reducing or preventing translation of the other gene sequences in the transcript. How could this be? The answer is that there is just one Shine–Dalgarno sequence in the lac operon mRNA. It occurs upstream of the start codon for the lacZ gene (see Figure 12.5). Normally, individual ribosomes identify the Shine–Dalgarno sequence and translate the entire length of the lac operon polycistronic mRNA, producing three polypeptides. The presence of the polar (nonsense) mutation in the lacZ gene stops translation by the ribosome. As there is no other Shine–Dalgarno sequence in the transcript, the ribosome is unable to translate the lacY or lacA sequences. Thus, when a polar mutation occurs in the lacZ gene, no permease is produced, even if the strain is lacY+.
lac Operon Regulatory Mutations
Mutations of regulatory components of the lac operon alter the inducible response of the operon to the presence of lactose and allolactose in the cell. Certain mutations of the lac operon lead to constitutive mutants, which are unresponsive to the presence or absence of lactose in the growth medium. These mutants continuously transcribe the operon genes, rather than transcribing the genes in an inducible manner. Other regulatory mutations block all response to lactose and render the cell lac−. Eventually, genetic mapping of constitutive mutations identified two distinct sites of constitutive mutations of the lac operon: lacO and lacI. Constitutive mutations of lacO render the operator DNA sequence unrecognizable to the wild-type DNA-binding portion of the repressor protein. On the other hand, constitutive mutations of lacI result in production of a repressor protein with a mutated DNA-binding region that is unable to recognize and bind wild-type operator sequence. Both mutations prevent negative regulation of lac operon transcription.
It was the initial discovery of the existence of two sites of lac operon constitutive mutations that suggested to Jacob and Monod that a negative regulatory system with two components exercises transcriptional control of the structural genes. They postulated that one constitutive mutation site is the gene producing a regulatory protein and the second is the target DNA-binding site for the regulatory protein.
Operator Mutations The genetic evidence indicating that the operator is the DNA sequence binding the repressor protein comes from the finding that lac operator (lacO) mutations are exclusively cis-acting; that is, they influence the transcription of genes only on the same chromosome.
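The genotype notation, complementation, polar mutations, and the cis-acting operator described in this section can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not a model taken from the text: the class `LacCopy` and the functions `enzymes_made` and `phenotype` are hypothetical names, expression is treated as simply on or off, glucose and CAP–cAMP positive control are ignored, and the repressor is assumed to diffuse freely so that one lacI+ copy can act on every operator in the cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacCopy:
    """Alleles carried by one copy of the lac region (F' plasmid or chromosome)."""
    I: str = "+"  # "+" makes working repressor; "-" makes repressor that cannot bind the operator
    O: str = "+"  # "+" wild-type operator; "C" operator-constitutive
    Z: str = "+"  # "+", "-" (no functional enzyme), or "polar" (early nonsense mutation)
    Y: str = "+"  # "+" or "-"


def enzymes_made(copies, lactose):
    """Functional enzymes produced by a haploid (one copy) or partial diploid (two copies)."""
    # Assumption: repressor diffuses, so a single I+ copy supplies every operator.
    repressor_present = any(c.I == "+" for c in copies)
    beta_gal = permease = False
    for c in copies:
        # The operator acts in cis: it governs only the genes on its own DNA molecule.
        blocked = repressor_present and c.O == "+" and not lactose
        if blocked:
            continue
        beta_gal = beta_gal or c.Z == "+"
        # A polar mutation in lacZ also prevents translation of the downstream lacY.
        permease = permease or (c.Y == "+" and c.Z != "polar")
    return {"beta_galactosidase": beta_gal, "permease": permease}


def phenotype(copies, lactose=True):
    """lac+ requires both functional beta-galactosidase and functional permease."""
    made = enzymes_made(copies, lactose)
    return "lac+" if all(made.values()) else "lac-"


if __name__ == "__main__":
    f_prime = LacCopy(Z="+", Y="-")      # F' I+ P+ O+ Z+ Y-
    chromosome = LacCopy(Z="-", Y="+")   # I+ P+ O+ Z- Y+
    print(phenotype([f_prime]), phenotype([chromosome]), phenotype([f_prime, chromosome]))
```

Run as a script, this prints `lac- lac- lac+`: each haploid genotype alone is lac−, while the partial diploid shows complementation. Changing one copy to `O="C"` and calling `enzymes_made(..., lactose=False)` shows which enzymes escape repression only when their genes sit on the same DNA molecule as the mutant operator.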
[Figure 12.8 panels; each panel shows the lacI, lacP, and lacO regions.]
(a) I+ (wild type): Repressor binds operator when the inducer is absent and forms an inducer–repressor complex when inducer is present. Labels: Lac repressor protein, allolactose.
(b) O^C (operator-constitutive mutation): Operator-site mutation prevents repressor protein binding and leads to constitutive synthesis of the lac operon. Label: Lac repressor protein.
(c) I− (repressor mutation): Repressor protein mutation prevents repressor binding to the operator and produces constitutive synthesis of the lac operon. Label: mutant protein.
(d) I^S (super-repressor mutation): Repressor protein mutation blocks binding to the inducer, preventing formation of the inducer–repressor complex. Mutant repressor protein binds to the operator, preventing transcription. Label: super-repressor mutant.
Figure 12.8 Regulatory mutations of lacI and lacO. (a) Wild-type lacI and lacO. (b) Operator-constitutive (lacO^C) mutation. (c) lacI− (operator-binding domain) mutation. (d) lacI^S (super-repressor) mutation of the allosteric binding domain.
Q In which of the mutants shown in (b), (c), and (d) is the allosteric domain wild type, and in which is it mutated?
In the wild-type organism, lacI+ produces repressor protein that has an allosteric (allolactose) binding domain and a functional operator binding domain. Repressor protein uses its operator binding domain to bind the regulatory sequence and block transcription (Figure 12.8a). Bacteria with operator mutations are constitutive for transcription of lac operon genes and have the genotype I+ P+ O^C Z+ Y+ (Figure 12.8b and Table 12.4). The O^C allele designation signifies an “operator-constitutive mutation.” In O^C mutants, the nucleotide sequence of the operator region is altered and is no longer recognized by wild-type repressor protein. In the absence of repressor protein bound to the operator sequence, constitutive transcription of the operon genes takes place and β-galactosidase and permease are produced continuously.
The crucial experiments revealing the cis-acting nature of lacO were performed with partial diploids. First it was shown that creation of partial diploids by conjugation of a constitutive lac+ strain (I+ P+ O^C Z+ Y+) with a lac− strain producing defective β-galactosidase (I+ P+ O+ Z− Y+) does not alter the constitutive transcription of β-galactosidase. Note that lacO^C in the partial diploid appears dominant to lacO+. Dominance on the part of lacO^C arises because transcription of the wild-type lacZ+ allele is exclusively controlled by the lacO^C mutation, since these two alleles are on the same chromosome. The wild-type operator has no effect on the lacZ+ allele because operator DNA is a cis-acting element, not a trans-acting element.
In a second experiment, the lacZ alleles were on different chromosomes (Z+ was with O+ and Z− was with O^C), and the partial diploid genotype F′ I+ P+ O^C Z− Y+ / I+ P+ O+ Z+ Y− was produced using two lac− strains. In this case, the F′ strain is constitutive for permease production but does not produce functional β-galactosidase due to a lacZ mutation. The bacterial recipient strain produces β-galactosidase by the wild-type inducible mechanism, but it does not produce functional permease, due to mutation of lacY. The partial diploid produces permease constitutively, but β-galactosidase is produced only when transcription is induced by lactose. This result could occur only if the operator is a cis-acting element. In this case, because the operator allele in cis to Z+ is wild type, β-galactosidase production
falls under the inducible control of the wild-type operator for lac operon transcription. These mutants produce mutant
sequence. Notice that in this partial diploid, the wild-type repressor protein with an altered allosteric domain. The
operator appears to be dominant to the OC mutant. mutant proteins are unable to bind allolactose and are unre-
The apparent difference in the dominance relationship sponsive to lactose addition or removal from cells. The
of O + and OC alleles is understandable if the lac operator DNA-binding domain is unaffected by the allosteric domain
is a cis-acting element that only controls the transcription mutation, but as a result of the nonfunctional allosteric
of genes on the same DNA molecule. Taken together, the domain, mutant repressor proteins cannot release the operator
two experiments reveal the lac operator to be cis-dominant, even in the presence of allolactose.
meaning that the only genes the operator is able to influence Haploids and partial diploids with mutations of the
are genes located downstream on the same gene. For the lac allosteric domain of the repressor protein are identified
operon, the “dominant” operator allele can differ, depend- as I S mutants and are designated super-repressors. These
ing on the alleles for the structural genes carried on each mutants are noninducible, meaning that operon gene tran-
chromosome. If both wild-type structural genes are in cis scription cannot be induced (Figure 12.8d and Table 12.4).
to lacOC, the mutant operator is dominant because it consti- Haploids with the genotype I S P + O + Z + Y + produce a
tutively transcribes both genes. This is the case in the first repressor protein that binds normally to operator sequence,
experiment. On the other hand, if wild-type structural genes but lacking a functional allosteric domain, the protein is
are on different chromosomes, as in the second experiment, not removed from the operator by lactose in the cell. Such
then the lacO + allele is dominant because it exerts inducible mutants are lac - and cannot be induced to metabolize lac-
transcriptional control on one of the two genes required for tose. Cultures of partial diploid bacteria with the genotype
lactose metabolism. F′ I S P + O + Z + Y +/I + P + O + Z + Y + may initially have some
inducible responsiveness to lactose, but this ability is lost as
Constitutive Repressor Protein Mutations Experimental mutant repressor protein binds to operator sequences. This
evidence supporting the hypothesis that the repressor gene partial diploid reveals the dominance of I S over I +.
produces a regulatory protein comes from the analysis of
mutants that constitutively transcribe lac operon genes Promoter Mutations Mutations of promoter consensus
where the mutant allele is recessive to the wild-type allele. sequences significantly reduce transcription or may elimi-
To see the dominance relationship of these alleles, nate it entirely (see Figure 8.12). To know the specific
let’s first consider a haploid cell with the lac operon effect of a promoter mutation usually requires direct test-
genotype I - P + O + Z + Y +. This cell constitutively tran- ing of transcription in the mutant organism. Promoters, like
scribes and produces both b@galactosidase and permease operators, are cis-acting regulatory sequences, and most
(Figure 12.8c). Similarly, a haploid strain with the genotype mutations of lacP significantly reduce, and may entirely
I - P + O + Z + Y - produces b@galactosidase constitutively, but eliminate, transcription of lacZ and lacY genes, which are
no permease is produced, and bacteria with the genotype located in cis. This reduces b@galactosidase and permease
I - P + O + Z - Y + constitutively produce permease but do not production to such a low point that haploid bacteria with the
produce b@galactosidase. genotype I + P - O + Z + Y + are lac -.
In contrast, a partial diploid with the genotype Table 12.5 summarizes the conditions for lac operon
F′ I + P + O + Z - Y +/I - P + O + Z + Y - expresses both enzymes gene transcription given the presence or absence of glucose
in their normal inducible manner. The I + allele can be on and lactose. Active transcription of operon genes takes place
either the F′ plasmid or the recipient chromosome and have only when glucose is depleted from the cell and lactose is
the same effect, inevitably resulting in the dominance of I + present. Under these conditions, the following events occur:
over I -. This outcome indicates that lacI produces a regu-
latory protein that is trans-acting—capable of influencing 1. Cyclic AMP level rises as a result of the availability
the expression of genes on other chromosomes. In this con- of adenylcyclase.
text, trans refers to a protein capable of diffusing through 2. CAP–cAMP complex forms and binds to the CAP
the cell and binding to a cis-acting target sequence. site of the lac promoter, thus activating transcription.
The molecular explanation of the trans-acting ability 3. Allolactose is produced by a side reaction of the
of the lac repressor protein is that a lacI - mutant alters the metabolism of lactose by b@galactosidase.
DNA-binding domain of the protein, rendering it incapable
4. Repressor protein conformation is modified by
of binding the operator sequence. In the absence of nega-
interaction with allolactose, causing the protein to
tive control, transcription is constitutive. In partial diploids
release from the operator, thus allowing operon gene
that are I +/I -, however, repressor protein with a functional
transcription.
DNA-binding domain is present in the cell and responds
normally to the addition or removal of lactose from the cell. Basal transcription occurs when both glucose and lac-
tose are present, due to the presence of allolactose to bind
Super-Repressor Protein Mutations A second set of repressor protein. When lactose is absent, no inducer–
repressor protein mutations produces a different consequence repressor complex can form, and no transcription takes
12.3 Mutational Analysis Deciphers Genetic Regulation of the lac Operon 451
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem concerns an analysis of patterns of transcriptional regulation and the production of functional β-galactosidase and permease by lac operon genotypes. The answer requires a determination of whether the enzymes are produced inducibly, constitutively, or not at all.
2. Identify the critical information given in the problem.
   The lac operon genotypes of three partial diploids are given.
Deduce
3. Describe the consequences of any mutations in genotype a.
   The I⁻ mutation produces a repressor protein that is unable to bind operator sequence. The Z⁻ mutation will not produce functional β-galactosidase, and the Y⁻ mutation will not produce functional permease.
TIP: Assess regulatory mutations first; then consider the consequences for structural gene transcription in each partial diploid by evaluating the effect of each allele on transcription.
PITFALL: You must understand the wild-type function of each operon component before evaluating genotypes. Do not attempt to memorize patterns of “+” and “−” for operon components in hopes of determining lac⁺ or lac⁻ phenotypes.
4. Describe the consequences of any mutations in genotype b.
   The Oᶜ mutation alters the operator sequence and prevents binding and transcriptional repression by repressor protein. The Z⁻ and Y⁻ mutations block production of functional β-galactosidase and permease.
5. Describe the consequences of any mutations in genotype c.
   The Iˢ mutation produces a super-repressor protein that has an altered allosteric domain and will not interact with allolactose. The Oᶜ and Z⁻ alter function as described above.
Solve
Answer a
6. Determine the expression pattern of functional enzymes for partial diploid a.
   Wild-type repressor protein is trans-active and binds the wild-type operator. This cis-acting operator blocks transcription of Z⁺ and Y⁺ when lactose is not in the cell, but permits transcription when lactose is present. Therefore, both enzymes are produced inducibly.
Answer b
7. Determine the expression pattern of functional enzymes for partial diploid b.
   Oᶜ is cis-active on Z⁺, resulting in constitutive transcription. Y⁺ is under the cis-active transcriptional control of O⁺. Therefore, β-galactosidase is produced constitutively, and permease is produced inducibly.
Answer c
8. Determine the expression pattern of functional enzymes for partial diploid c.
   The Oᶜ sequence is not recognized by either the wild-type repressor or the super-repressor. Both repressors have wild-type DNA-binding sequences. Cis-active Oᶜ constitutively transcribes Y⁺. The super-repressor binds O⁺, and its cis activity renders Z⁺ and Y⁺ noninducible. Therefore, β-galactosidase is noninducible, and permease production is constitutive.
For more practice, see Problems 5, 6, 17, and 18. Visit the Study Area to access study tools. Mastering Genetics
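The cis/trans reasoning used in this section can be written down as a small rule set. The sketch below is a simplified illustration, not a method from the text: the `LacCopy` class, the allele codes, and the function names are hypothetical, glucose and CAP–cAMP control are ignored, and each operon copy is reduced to one of four outcomes. Promoter and operator alleles are evaluated in cis (per copy), while repressor alleles are pooled across the cell because the protein acts in trans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacCopy:
    """Alleles on one DNA molecule (hypothetical encoding for illustration)."""
    I: str = "+"  # repressor gene: "+", "-", or "S" (super-repressor)
    P: str = "+"  # promoter: "+" or "-"
    O: str = "+"  # operator: "+" or "C" (operator-constitutive)
    Z: str = "+"  # beta-galactosidase gene: "+" or "-"
    Y: str = "+"  # permease gene: "+" or "-"


# When two copies both carry a functional structural gene, the higher rank wins.
RANK = {"none": 0, "noninducible": 1, "inducible": 2, "constitutive": 3}


def copy_pattern(copy, repressor_alleles):
    """Transcription pattern of one operon copy.

    P and O act in cis (only this copy's alleles matter); repressor
    proteins act in trans (every I allele in the cell matters).
    """
    if copy.P == "-":
        return "none"          # promoter mutation silences genes in cis
    if copy.O == "C":
        return "constitutive"  # no repressor, wild type or super, binds O^C
    if "S" in repressor_alleles:
        return "noninducible"  # super-repressor stays bound; I^S is dominant
    if "+" in repressor_alleles:
        return "inducible"     # wild-type repressor; I+ is dominant to I-
    return "constitutive"      # only I- present: no functional repressor


def enzyme_pattern(copies, gene):
    """Best pattern among copies carrying a functional allele of gene ("Z" or "Y")."""
    repressors = {c.I for c in copies}
    patterns = [copy_pattern(c, repressors) for c in copies
                if getattr(c, gene) == "+"]
    return max(patterns, key=RANK.get, default="none")


# Second experiment from the text: F' I+ P+ OC Z- Y+ / I+ P+ O+ Z+ Y-
second_experiment = [LacCopy(O="C", Z="-"), LacCopy(Y="-")]
print(enzyme_pattern(second_experiment, "Z"))  # inducible
print(enzyme_pattern(second_experiment, "Y"))  # constitutive
```

In this sketch, "noninducible" and "none" both mean that no functional enzyme is made; they are kept separate only to record whether the cause is a bound super-repressor or a missing promoter or structural gene.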
[Figure: DNase I footprint protection analysis of the lacP and lacO regions and model; Lac repressor protein footprint protection and DNA binding. Labels recovered from the graphic: gel lanes 1, 2, and 3; positions –84, –35 region, –10 region, +1, and +39; RNA polymerase footprint; repressor footprint; secondary operator sequences O3 and O2; lacI and lacZ flanking the region. Note in the graphic: DNA in lane 1 is fully digested since no DNA-binding protein is added.]
454 CHAPTER 12 Regulation of Gene Expression in Bacteria and Bacteriophage
In addition to the negative feedback mechanism, certain repressible operons have a second regulatory capability known as attenuation that has the ability to fine-tune transcription to match the moment-to-moment requirements of the cell, achieving a more-or-less steady state of compound availability. The difference between attenuation and inducibility can be clarified by an analogy. Inducible operons, such as lac, are akin to light switches that provide illumination in one setting (“on”) and no illumination in the alternative setting (“off”). Inducible operons are turned on and off by molecular switches controlled by DNA-binding proteins. Attenuation, on the other hand, works more like a dimmer switch that allows illumination to be incrementally adjusted up or down. For several amino acid operons, the regulation of gene expression has evolved to maintain steady amino acid levels in cells. In such systems, feedback inhibition turns off operon gene transcription when the amino acid is readily available, and attenuation fine-tunes the amino acid level to maintain a steady-state concentration.

Feedback Inhibition of Tryptophan Synthesis

The tryptophan (trp) operon (“trip operon”) in the E. coli genome contains five structural genes that share a regulatory region containing a promoter (trpP), an operator (trpO), and a leader region (trpL) that contains the attenuator region (Figure 12.13). The regulatory region spans 312 base pairs, and the five structural genes span approximately 6800 base pairs. The five structural genes transcribed in the operon are, in order, trpE, trpD, trpC, trpB, and trpA. Together, the protein products of these genes are responsible for synthesis of the amino acid tryptophan. Outside the operon, a sixth gene, trpR, encodes the repressor protein that is not activated until it pairs with tryptophan.
Transcription of trp operon genes is regulated by a feedback inhibition system that responds to free tryptophan in the cell. In this system, tryptophan acts as a corepressor by binding to and activating the Trp repressor protein that is not active without its bound corepressor. Feedback inhibition is the principal mechanism turning on and turning off trp operon gene transcription (Figure 12.14). In the absence of tryptophan, the inactive repressor is unable to bind trpO, and operon gene transcription takes place. When tryptophan is present, however, it binds the repressor to activate it, and the repressor–corepressor complex binds the operator to block transcription. This is an efficient mechanism that shuts down transcription of genes whose expression is not needed at the moment. Such systems have evolved because they save metabolic energy that would otherwise be wasted transcribing unneeded mRNA and later recycling the unused transcript.
Based on this description, and knowing about the feedback inhibition of gene transcription, one might expect that trpR⁻ bacteria that are mutant for the repressor protein would show constitutive transcription of operon genes regardless of whether tryptophan is present. Surprisingly, however, this is not the case. In wild-type bacteria (trpR⁺),
tryptophan synthesis is very low when tryptophan is present in the cell; tryptophan synthesis by trpR⁻ strains is higher under the same conditions, but it is not at 100% of capacity (Table 12.6). Both trpR⁺ and trpR⁻ strains synthesize tryptophan at 100% of capacity when tryptophan is absent. This suggests that a second regulatory mechanism is also affecting transcription of trp operon genes.

Table 12.6 Percentage of Full Tryptophan Expression for trpR⁺ and trpR⁻ Strains
Strain    Tryptophan Present    Tryptophan Absent
trpR⁺     8%                    100%
trpR⁻     33%                   100%

[Figure 12.14 graphic. (a) Tryptophan absent: transcription across trpP, trpO, trpL, trpE, trpD, trpC, trpB, and trpA yields polycistronic mRNA. The inactive repressor does not bind trpO, and transcription of the operon genes occurs. (b) Tryptophan present: no transcription. The repressor is activated by the corepressor tryptophan and binds trpO to block operon gene transcription.]
Figure 12.14 Trp operon transcription regulation by the repressor, with tryptophan absent (a) and with tryptophan present (b).

Attenuation of the trp Operon

The second mechanism regulating trp operon gene transcription is attenuation, controlled by alternative folding of the mRNA synthesized from the 162-bp trpL region. RNA polymerase binds to trpP and initiates transcription of trpL. The trpL region contains four repeat DNA sequences, and the mRNA transcript of this region contains complementary repeats that lead to the folding of mRNA into double-stranded regions. The trp leader region also encodes a start codon, a short polypeptide of 14 amino acids (including the methionine of the start codon), and a stop codon. Translation of this 14-amino acid polypeptide plays a pivotal role in attenuation (Figure 12.15a). Two features of the trpL region are critical to its attenuation function. First, the four repeat sequences, designated 1, 2, 3, and 4, can form different stem-loop structures (Figure 12.15b–d). (Stem-loop structures are discussed in Section 8.2 in connection with intrinsic transcription termination in bacteria; see Figure 8.7.) Second, among the codons for the 14 amino acids encoded by trpL mRNA, there are two back-to-back tryptophan codons (UGG) that function to sense the availability of tryptophan and are essential for attenuation.
The formation of stem loops of trpL mRNA is directly tied to the continuation or termination of transcription of the five trp operon genes. In the trpL region mRNA, region 1 is complementary to region 2, region 2 is complementary to region 3, and region 3 is complementary to region 4. Two of these stem-loop structures, the 3–4 stem loop and the 2–3 stem loop, are central to attenuation. The third type of stem loop, the 1–2 stem loop, plays a minor role in attenuation.
The 3–4 stem loop of mRNA, which is the termination stem loop, signals transcription termination. This is identified as the transcription termination site in Figure 12.15. Formation of the 3–4 stem loop halts RNA polymerase progress along the DNA, terminating transcription in the leader region before it reaches the structural genes of the operon (Figure 12.16a). Notice that region 4 is followed immediately by a poly-uracil sequence (a poly-U tail). This configuration (an mRNA stem loop followed by a uracil string) is the same as one described in connection with intrinsic termination of transcription in bacteria (see Figure 8.7). Formation of a 3–4 stem loop may be accompanied by formation of a 1–2 stem loop, which can induce a pause in transcription, as part of the attenuation process. Formation of the 1–2 stem loop occurs when a ribosome does not affiliate with the nascent trp operon leader mRNA. In the absence of an RNA-bound ribosome, regions 1 and 2 form a double-stranded stem. This leads, in turn, to subsequent formation of a 3–4 stem loop that terminates transcription.
The alternative to the 3–4 stem loop is the 2–3 stem loop, which is the antitermination stem loop. This stem loop forms when region 1 is unavailable for immediate pairing with region 2, a situation that leads region 2 to pair with region 3. In turn, formation of the 2–3 stem loop precludes the formation of a 3–4 stem loop (Figure 12.16b). The antitermination stem loop allows RNA polymerase to continue transcription through the leader region and into the structural genes of the trp operon, beginning with the transcription of trpE. If transcription progresses past region 4, a polycistronic mRNA spanning the five trp genes is produced. Translation of the five enzymes required for tryptophan synthesis follows.
Each mRNA transcribed from the trpL region eventually forms either a 2–3 stem loop or a 3–4 stem loop, but what determines the type of stem loop an mRNA will form? The coupling of transcription and translation that is a prominent feature of bacterial gene expression plays a critical role in deciding this outcome. Transcription of the trpL region begins at the +1 nucleotide after RNA polymerase initiates
12.4 Transcription from the Tryptophan Operon Is Repressible and Attenuated 457
[Figure 12.15 graphic: (a) nucleotide sequence of the trpL mRNA from the beginning of the trpL coding sequence to the beginning of the trpE coding sequence (position 162), showing the 14-amino acid leader polypeptide, regions 1 through 4, and the U-rich termination sequence (3–4 stem loop only); (b) pause stem loop (1–2 stem loop); (c) antitermination stem loop (2–3 stem loop); (d) termination stem loop (3–4 stem loop).]
Figure 12.15 The trpL attenuator region mRNA transcript. (a) The trpL attenuator contains 162 nucleotides that
include a 14-amino acid coding sequence and four inverted repeat sequences that encode regions 1 through 4 in
trpL mRNA. (b–d) Three alternative stem loops can form in trpL mRNA.
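The choice among the three stem loops in Figure 12.15b–d can be summarized as a short decision function. This is a minimal sketch of the qualitative model described in this section, assuming each leader transcript is in exactly one of three states; the function name and the two Boolean inputs are hypothetical simplifications, and real cells contain a mixture of outcomes.

```python
def leader_outcome(ribosome_bound: bool, charged_trp_trna_available: bool) -> dict:
    """Stem loops formed by one trpL leader mRNA and the fate of transcription.

    ribosome_bound: a ribosome is translating the leader polypeptide.
    charged_trp_trna_available: charged tRNA-Trp is on hand when the
        ribosome reaches codons 10 and 11 (ignored if no ribosome is bound).
    """
    if not ribosome_bound:
        # Regions 1 and 2 pair, which leaves 3 and 4 free to pair.
        loops = ["1-2", "3-4"]
    elif charged_trp_trna_available:
        # Ribosome reaches the stop codon and overlies regions 1 and 2,
        # so region 3 can pair only with region 4.
        loops = ["3-4"]
    else:
        # Ribosome stalls at the Trp codons and covers region 1,
        # so region 2 pairs with region 3 and blocks the 3-4 stem loop.
        loops = ["2-3"]

    terminated = "3-4" in loops
    return {
        "stem_loops": loops,
        "transcription": "terminates in leader" if terminated
        else "continues into trpE",
    }


print(leader_outcome(ribosome_bound=True, charged_trp_trna_available=False))
# {'stem_loops': ['2-3'], 'transcription': 'continues into trpE'}
```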
transcription. Transcription across repeat regions 1 and 2 can lead to formation of a 1–2 stem loop that temporarily pauses the progress of RNA polymerase. The pause is only momentary, however; it lasts just long enough for a ribosome to bind at the start codon in trpL and begin translation of the 14-amino acid polypeptide starting with the AUG codon identified in Figure 12.15. Translation initiation breaks the 1–2 stem loop, RNA polymerase resumes transcription, and the ribosome and RNA polymerase begin their coupled progression.
Notice three features of the leader mRNA depicted in Figures 12.15 and 12.16: (1) The polypeptide-coding sequence overlaps the entirety of leader region 1, and the stop codon is immediately adjacent to region 2; (2) codons 10 and 11 of the mRNA specify tryptophan, making completion of translation dependent on tryptophan availability; and (3) region 4 is followed immediately by a poly-U string, a feature associated with intrinsic termination of transcription. As coupled transcription and translation proceed, the relative positions of RNA polymerase and the ribosome are determined by how efficiently the ribosome can progress along the mRNA. This process, in turn, is tied directly to the availability of tryptophan and the rapidity with which tryptophan is inserted into the nascent polypeptide chain. When the cell has an adequate supply of tryptophan, the ribosome makes steady progress along trpL mRNA until it arrives at the stop codon, where it partially overlies region 1 and region 2. Simultaneously, RNA polymerase is transcribing
[Figure 12.16 graphic. Both panels show the ribosome on the 14-amino acid polypeptide region of trpL mRNA, with codons 10 through 14 reading Trp, Trp, Arg, Thr, Ser, followed by the stop codon, and regions 1 through 4 leading to the poly-U string and trpE. Panel (a): the ribosome completes translation of the trpL coding sequence and occupies regions 1 and 2. Regions 3 and 4 pair, and transcription terminates. Panel (b): the ribosome stalls at region 1, and regions 2 and 3 pair. Transcription continues into operon genes.]
… loop terminates transcription after the poly-U string. (b) In tryptophan starvation, the 2–3 (antitermination) stem loop leads to polycistronic mRNA synthesis.
Q If codons 10 and 11 shown in (b) were GGG (Gly) instead of UGG (Trp), would antitermination still occur at a low level of tryptophan? Why or why not?
region 3, followed by region 4. With a portion of region 2 occupied by the ribosome and unavailable for pairing in a stem loop, region 3 forms a stem loop with region 4, the only available complementary segment of the mRNA. The 3–4 stem loop, being immediately followed by a poly-U string, causes transcription to spontaneously terminate at the end of region 4 by the intrinsic process. Formation of the 3–4 stem loop (the termination stem loop) stops transcription of the trp operon in the leader sequence before RNA polymerase reaches the beginning of the trpE gene. Transcription thus ceases only when the system senses that no additional tryptophan is needed to supply translation.
When the cell is starved for tryptophan, the supply of charged tRNAᵀʳᵖ is low. The ribosome is forced to pause momentarily at codons 10 and 11 to await the arrival of a charged tryptophan tRNA that will incorporate tryptophan into the nascent polypeptide. As the ribosome pauses, its mass covers region 1. Meanwhile, RNA polymerase continues to transcribe trpL. As RNA polymerase transcribes region 3, the region 3 mRNA finds a complementary partner in region 2, leading to 2–3 stem-loop formation. Region 3 is not followed by a poly-U string, making intrinsic termination impossible. Transcription continues through region 4 and on into the structural gene region of the operon to
produce the polycistronic mRNA transcript of the operon. Formation of a 2–3 stem loop (the antitermination stem loop) thus permits transcription and translation of the enzymes necessary to synthesize tryptophan when the system senses that the available supply of tryptophan is insufficient to support translation.
Each trpL mRNA makes a molecularly based “decision” about whether to form a 3–4 or a 2–3 stem loop, depending on the availability of charged tRNAᵀʳᵖ at the moment tRNAᵀʳᵖ is needed by ribosomes. It is likely that at any given moment in time, a single bacterial cell contains a mixture of trpL mRNAs with 2–3 stem loops and trpL mRNAs with 3–4 stem loops. The balance shifts in the direction of more 3–4 stem loops and fewer 2–3 stem loops at higher levels of tryptophan concentration and shifts in the opposite direction (more 2–3 stem loops and fewer 3–4 stem loops) as tryptophan concentration falls. The resulting fine-tuning allows each cell to maintain a relatively steady concentration of tryptophan by turning tryptophan synthesis up or down to meet the needs of the cell.

Attenuation Mutations

The attenuation model is supported by mutagenesis experiments. For example, experiments altering one or both of the two adjacent tryptophan codons (in positions 10 and 11 of the trpL mRNA) by missense mutation to specify another amino acid have provided evidence of the importance of the back-to-back tryptophan codons in the trpL transcript. Mutation of one tryptophan UGG codon affects the attenuator responsiveness to tryptophan. If both tryptophan codons are altered by missense mutation, the attenuator no longer senses tryptophan concentration and instead senses the availability of the amino acid encoded by the mutated codons. Mutagenesis experiments have also targeted regions 3 and 4 of the leader sequence (Figure 12.17). Base substitutions that reduce the percentage of complementary base pairs binding these two regions destabilize the termination stem loop and reduce the efficiency of the mutated operon system in repressing structural gene transcription. Genetic Analysis 12.2 examines mutations of the trp operon.

[Figure 12.17 graphic: the 3–4 stem loop of trpL mRNA (region 3 and region 4, nucleotide positions 110 to 140), followed by the poly-U string and the remainder of trpL mRNA, marking mutations that reduce complementarity.]
Figure 12.17 Mutations of trpL. Mutational analyses identify 10 base-pair substitutions in regions 3 and 4 of trpL that each decrease the efficiency of transcriptional regulation in the attenuator region by disrupting formation of the 3–4 stem loop.

Attenuation in Other Amino Acid Operon Systems

Attenuation represses transcription of structural genes in several amino acid operon systems in bacteria such as E. coli and Salmonella typhimurium. Like the trp operon, these other amino acid operons also contain multiple codons for the target amino acid in their leader transcripts (Figure 12.18). For example, the leader polypeptide of the E. coli histidine operon contains a run of seven consecutive histidine residues in the attenuator. Similarly, the phenylalanine leader polypeptide contains seven phenylalanine residues in a span of nine amino acids in the attenuator region. Like the trp operon, these operons use attenuation based on the formation of antitermination stem loops to regulate operon gene transcription.

his operon: Met Thr Arg Val Gln Phe Lys His His His His His His His Pro Asp
leu operon: Met Ser His Ile Val Arg Phe Thr Gly Leu Leu Leu Leu Asn Ala Phe
pheA operon: Met Lys His Ile Pro Phe Phe Phe Ala Phe Phe Phe Thr Phe Pro
thr operon: Met Lys Arg Ile Ser Thr Thr Ile Thr Thr Thr Ile Thr Ile Thr Thr Gly
Figure 12.18 Four bacterial amino acid operons with attenuator control of transcription. The regulatory amino acid for each operon is shown in bright red.

12.5 Bacteria Regulate the Transcription of Stress Response Genes and Also Translation

The need on the part of bacteria to respond rapidly to changing environmental conditions suggests that transcriptional regulation must accommodate both common and rare circumstances, and also that the regulation of translation must be available under certain circumstances. This section presents examples of transcriptional regulation in bacteria under rarely encountered conditions and then describes how bacteria regulate translation.

Alternative Sigma Factors and Stress Response

The operon mechanisms described to this point are examples of the regulatory strategies employed by bacterial cells under conditions they encounter routinely. In response to rare or unusual environmental circumstances, however,
GENETIC ANALYSIS 12.2
PROBLEM Describe the effects on attenuation and on tryptophan synthesis of the following
mutations of the tryptophan codons (UGG) in the attenuator region of the trp operon.
a. The tryptophan codons are mutated to UAGUGG.
b. The tryptophan codons are mutated to UUGUUG.
BREAK IT DOWN: You should be able to define attenuation and to describe how the presence of two tryptophan codons in the trp operon leader transcript participates in determining whether the termination (3–4) stem loop or the antitermination (2–3) stem loop forms in the transcript. See Figure 12.16 (p. 458).
For more practice, see Problems 7, 15 and 25. Visit the Study Area to access study tools. Mastering Genetics
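A first step toward this problem is to read the mutated sequences as codons and classify each change. The sketch below does only that; it is an illustration with hypothetical names (`CODONS`, `split_codons`, `classify_sensor`), and its codon table is deliberately limited to the few standard-code entries needed here. It does not decide the effect on stem-loop formation, which is the reasoning the problem asks for.

```python
# Minimal codon table: only the codons used in this example (standard genetic code).
CODONS = {"UGG": "Trp", "UAG": "Stop", "UUG": "Leu", "GGG": "Gly"}


def split_codons(rna):
    """Split an RNA string into consecutive three-base codons."""
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def classify_sensor(mutant, wild_type="UGGUGG"):
    """Describe codons 10 and 11 of the trpL leader after mutation.

    Returns one (position, codon, amino_acid, kind) tuple per codon, where
    kind is "unchanged", "missense", or "nonsense".
    """
    report = []
    pairs = zip(split_codons(wild_type), split_codons(mutant))
    for position, (wt, mut) in enumerate(pairs, start=10):
        amino_acid = CODONS[mut]
        if mut == wt:
            kind = "unchanged"
        elif amino_acid == "Stop":
            kind = "nonsense"
        else:
            kind = "missense"
        report.append((position, mut, amino_acid, kind))
    return report


print(classify_sensor("UAGUGG"))
# [(10, 'UAG', 'Stop', 'nonsense'), (11, 'UGG', 'Trp', 'unchanged')]
print(classify_sensor("UUGUUG"))
# [(10, 'UUG', 'Leu', 'missense'), (11, 'UUG', 'Leu', 'missense')]
```

For mutation b, the chapter's statement about double missense mutations applies directly: the attenuator would be expected to sense the amino acid encoded by the new codons (here leucine) rather than tryptophan. For mutation a, the classification only flags that translation of the leader polypeptide would stop at codon 10; working out where that leaves the ribosome relative to regions 1 and 2 is the point of the exercise.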
bacteria switch gene transcription patterns to use genes that are not normally expressed. The response of E. coli to heat stress illustrates how expression of an alternative sigma (σ) factor alters gene transcription by activating the transcription of specialized heat stress response genes.
Escherichia coli grow vigorously at 37°C and can tolerate only narrow temperature variation. At low temperatures, their growth slows, an important reason refrigeration is used to preserve foods. At the other extreme, high temperatures kill the bacteria. This is the reason cooking is so efficient at reducing bacterial contamination of food. At the less dramatically elevated temperature of 45°C, E. coli change their pattern of transcription by activating the expression of genes that are part of the heat shock response by the cell. The heat shock response protects E. coli cells from certain kinds of heat-induced damage. Similar mechanisms are common in other microorganisms as well as in fruit flies, plants, and animals, including humans.
Heat shock response in bacteria involves expression of an alternative sigma (σ) subunit that changes the promoter-recognition capacity of the RNA polymerase core enzyme. Recall that the RNA polymerase core enzyme is bound by a sigma subunit to form the holoenzyme (see Section 8.2). Under normal growth conditions, the RNA polymerase holoenzyme recognizes bacterial promoters containing an AT-rich Pribnow box at the –10 site. The common sigma subunit, identified as σ⁷⁰, forms part of this holoenzyme that transcribes a wide array of bacterial genes under normal physiological conditions.
Bacteria grown at 45°C undergo several changes, including initiation of the expression of heat shock proteins, which are expressed only at high temperature, and of chaperone proteins, a class of proteins that either refold or degrade other proteins damaged by high heat. At these higher temperatures, σ⁷⁰ is unstable, and RNA polymerase containing it functions very poorly. To explain the transcription of heat shock proteins in the presence of poorly functioning σ⁷⁰-containing RNA polymerase, researchers proposed and quickly found genetic evidence pointing to an alternative, high-temperature σ subunit.
The evidence came from studies of mutant, temperature-sensitive E. coli that grow normally at 37°C but fail to grow at 45°C. This temperature sensitivity is a conditional lethal mutation affecting a gene called rpoH, which encodes an alternative sigma subunit known as σ³². When σ³² binds an RNA polymerase core enzyme, the resulting holoenzyme recognizes different promoter sequences than are recognized by holoenzymes containing σ⁷⁰ (Figure 12.19). In contrast to the AT richness that characterizes the Pribnow box sequence of bacterial promoters, the −10 region of promoters recognized by σ³²-containing RNA polymerase is rich in G-C base pairs.

[Figure 12.19 graphic]
(a) Promoter sequences recognized by different sigma factors
Recognized by    −35 region    Spacer      −10 region
σ⁷⁰              TTGACA        16–18 bp    TATAAT
σ³²              CTTGAA        13–15 bp    CCCCATNT
(b) Events at elevated temperature: the RNA core enzyme with σ⁷⁰ or σ²⁴ carries out transcription from PrpoH; the rpoH mRNA yields σ³², which transcribes heat shock genes.
Figure 12.19 Alternative sigma factors for heat shock genes. (a) Promoter sequences recognized by σ⁷⁰- and σ³²-containing RNA polymerase. (b) At elevated temperature, σ⁷⁰ and σ²⁴ transcribe rpoH, which encodes σ³² that in turn joins the RNA core enzyme to transcribe heat shock genes.

The promoter for rpoH is recognized by σ⁷⁰-containing RNA polymerase when the temperature is elevated. The sigma factor translated from rpoH mRNA (that is, σ³²) is very active in stimulating transcription of heat shock genes. In addition, transcription of a third sigma subunit known as σ²⁴, which is normally present in E. coli cells at a very low level, is greatly increased at elevated temperatures. The RNA polymerase holoenzyme containing σ²⁴ also recognizes the rpoH promoter and transcribes the gene at elevated temperatures that inactivate σ⁷⁰.
A second transcriptional change that occurs as a consequence of high heat is a change in the chaperone proteins. At normal growth temperatures, several chaperone proteins bind the small amount of σ³² present in the cell to inhibit its ability to form holoenzyme. At high temperatures, chaperone proteins release σ³², leaving it free to join an RNA polymerase core enzyme and form a holoenzyme. Free chaperone proteins are redirected to bind heat-damaged cellular proteins instead. In this role, chaperone proteins either degrade the proteins they bind or assist in refolding the proteins.
Several additional examples of the use of alternative sigma factors in bacteria have been described. For example, Bacillus subtilis is a bacterium that normally propagates by vegetative growth, but poor growth conditions switch the growth mode to sporulation by activating the expression of alternative sigma factors. The gene transcription evidence shows that as growth conditions deteriorate, transcription of the common sigma factor is replaced by the transcription of two alternative sigma factors. The new sigma factors recognize the promoters for genes used only in sporulation. A broad array of evidence shows that switching transcription from the normal sigma factor to alternative sigma factors induces a genome-wide change in the pattern of gene expression that silences previously active genes and initiates transcription of specialized genes that are used only under restrictive or extreme growth conditions. Table 12.7 compares the mechanisms of gene regulation in bacterial systems.

Translational Regulation in Bacteria
Transcriptional regulation is far and away the predominant mode of controlling gene expression in bacteria, but bacteria are also capable of translational regulation. Translational regulation takes place by two mechanisms, one that binds
[Figure 12.21 graphic]
(a) Low TPP: Riboswitch not active. Low TPP concentration generates antitermination and thi operon transcription. With the antitermination stem loop formed in the mRNA, transcription continues into the thi operon; proteins are produced.
(b) High TPP: Riboswitch active. High TPP concentration initiates the riboswitch that terminates transcription by intrinsic termination. TPP bound to the riboswitch sequences produces a termination stem loop followed by UUUUU at the 3′ end of the mRNA; no proteins are produced.
mRNA that produces the enzymes used in TPP synthesis. Alternatively, when TPP concentration is high, TPP binds to the riboswitch regulatory sequence (Figure 12.21b). This generates a termination stem loop that is immediately followed by a poly-U sequence, leading to intrinsic termination of transcription before RNA polymerase reaches the thi operon genes. Because the genes of the thi operon are not transcribed, no polycistronic mRNA is generated, and no protein production occurs. The TPP riboswitch is an attenuation mechanism that is able to sense the concentration of TPP so as to produce more when the concentration is low and less as the concentration rises.

Riboswitch Regulation of Translation
The regulation of TPP synthesis in E. coli also uses a riboswitch, but in this bacterium TPP production is controlled at the level of translation. The thiMD operon in E. coli contains genes used for TPP synthesis. When TPP concentration is low, the 5′ UTR of the mRNA folds into a secondary structure that contains a Shine–Dalgarno antisequestor stem loop (Figure 12.22a). The antisequestor stem loop allows the Shine–Dalgarno sequence to bind to 16S rRNA in the small ribosomal subunit (see Figure 9.7). This places the start codon (AUG) in position to act as the translation
[Figure 12.22 graphic]
(a) Low TPP: Riboswitch not active. The Shine–Dalgarno sequence and the AUG start codon are accessible downstream of the riboswitch sequences. Translation occurs.
(b) High TPP: Riboswitch active. TPP is bound to the riboswitch sequences, and a Shine–Dalgarno sequestor stem loop forms that includes the AUG.
Figure 12.22 A riboswitch mechanism regulating translation of mRNA. (a) TPP is produced by translation of E. coli thiMD mRNA at low TPP concentration. (b) At high TPP concentration, translation is inhibited by TPP binding to riboswitch sequences.
Q In a sentence or two, describe how this mechanism of riboswitch transcriptional regulation differs from the mechanism illustrated in the previous figure (Figure 12.21).
464 CHAPTER 12 Regulation of Gene Expression in Bacteria and Bacteriophage
initiator codon. The proteins of the thiMD operon are produced and TPP synthesis follows.
An alternative thiMD mRNA configuration forms when TPP concentration is high to inhibit translation of the operon genes (Figure 12.22b). In this configuration, TPP bound to the riboswitch sequence induces the formation of an mRNA stem loop that contains the Shine–Dalgarno sequence and the start codon sequence. In this configuration, the Shine–Dalgarno sequence is not available to bind 16S rRNA, nor is the start codon able to initiate translation. No proteins are produced from the thiMD operon genes with this mRNA configuration.

Riboswitch Control of mRNA Stability
The third regulatory riboswitch mechanism affects the stability of mRNA. Figure 12.23a shows that in B. subtilis, transcription and translation of the glmS gene produce the enzyme called glutamine:fructose-6-phosphate amidotransferase. This enzyme participates in the production of a sugar abbreviated as GlcN6P. Transcription of glmS and translation of its mRNA occur when the cellular concentration of GlcN6P is low and more is needed in the cell. The riboswitch regulatory activity occurs when GlcN6P concentration is high and no more need be produced. Under this circumstance, GlcN6P binds to the riboswitch sequences in the 5′ UTR of glmS mRNA (Figure 12.23b). This induces cleavage of the mRNA that prevents it from attaching to a ribosome and undergoing translation.

12.7 Antiterminators and Repressors Control Lambda Phage Infection of E. coli
Bacteriophage (or phage, for short) are viruses that infect bacterial cells. Like all viruses, they must infect host cells to reproduce (see Section 6.4). Their tiny genomes do not contain all the genes necessary for replication, transcription, and translation, so phage are obligate parasites that use an ingenious array of tricks to accomplish these molecular processes. The secret to their reproductive success lies in their ability to commandeer bacterial proteins and enzymes to preferentially express phage genes over bacterial genes.
Given the limited content of phage genomes, some of the most important genes for phage reproduction are those that redirect the activity of bacterial host genes to serve phage requirements. Successful phage infection requires (1) that genetic regulatory switches be controlled through phage gene expression to redirect the action of host genes and (2) that phage gene expression initiate a sequence of events leading the bacterium to participate in the expression of phage genetic information. In no bacteriophage is there a clearer picture of the processes that control regulatory genetic switching than in lambda (λ) phage.
Recall that all bacteriophage are capable of infecting and reproducing within the host bacterial cell. The infection ends with the lysis of the host cell, in a process called the lytic cycle (see Figure 6.16). But certain bacteriophage
[Figure 12.23 graphic]
(a) Low GlcN6P concentration: Riboswitch not active. The glmS mRNA (5′ to 3′) carries the riboswitch sequence, the Shine–Dalgarno sequence, and the AUG start codon. Low GlcN6P concentration permits transcription of glmS, translation of its mRNA, and production of GlcN6P. The translated enzyme, glutamine:fructose-6-phosphate amidotransferase, acts on a precursor to yield GlcN6P.
(b) High GlcN6P concentration: Riboswitch active. High GlcN6P concentration leads to GlcN6P binding to the riboswitch sequences and cleavage of the mRNA.
Figure 12.23 Control of mRNA stability by a riboswitch. (a) Transcription and translation lead to production of GlcN6P in B. subtilis when GlcN6P concentration is low. (b) When GlcN6P is in high concentration, it binds to riboswitch sequences and generates mRNA cleavage.
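
The three riboswitch mechanisms described in this section share one logic: a small molecule bound to the riboswitch sequences shuts off a different step of gene expression in each system. The following is a minimal sketch of that logic, not a quantitative model. The names Block, RIBOSWITCHES, and riboswitch_outcome are hypothetical labels chosen for illustration, and ligand concentration is reduced to a simple low/high switch.

```python
from enum import Enum


class Block(Enum):
    """Step of gene expression shut off when the ligand is bound."""
    NONE = "no block"
    TRANSCRIPTION = "transcription terminated (intrinsic termination)"
    TRANSLATION = "translation initiation blocked (Shine-Dalgarno sequestered)"
    MRNA_STABILITY = "mRNA cleaved"


# system -> (ligand sensed, step blocked when the ligand is bound)
# Assumption: each system is treated as a two-state switch (ligand low/high).
RIBOSWITCHES = {
    "B. subtilis thi operon": ("TPP", Block.TRANSCRIPTION),
    "E. coli thiMD operon": ("TPP", Block.TRANSLATION),
    "B. subtilis glmS": ("GlcN6P", Block.MRNA_STABILITY),
}


def riboswitch_outcome(system, ligand_high):
    """Return (blocked step, protein produced?) for one riboswitch system."""
    _ligand, block_when_bound = RIBOSWITCHES[system]
    if ligand_high:
        # Ligand binds the riboswitch sequences; the riboswitch is active.
        return block_when_bound, False
    # Ligand is scarce; the riboswitch is not active and protein is made.
    return Block.NONE, True


for name, (ligand, _) in RIBOSWITCHES.items():
    for high in (False, True):
        block, protein = riboswitch_outcome(name, high)
        level = "high" if high else "low"
        print(f"{name}: {ligand} {level} -> {block.value}; protein made: {protein}")
```

In all three systems the end product is made only while its concentration is low; what differs is whether the block falls on transcription, on translation initiation, or on the survival of the mRNA.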
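[Figure graphic: partial map of the λ phage genome with labels cIII, N, PL, PR, cro, cII, bet, exo, O, and P, and regions marked "Delayed early genes" and "Late genes."]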
The second molecular decision to be made involves direct competition between the Cro protein and the λ repressor protein. They compete for binding to operator sites, with the winning molecule determining whether the lytic cycle or the lysogenic cycle is established. In the following discussion, we focus on the competitive binding between λ repressor protein and Cro protein.

Cro Protein and the Lytic Cycle
Entry into the lytic cycle requires the transcription of late genes that are regulated by late promoters and late operators. These genes are rightward of PR, and are involved in the synthesis of head and tail proteins, as well as products that lyse the host cell. The genetic switch governing whether
FOUNDATION FIGURE 12.25
[Figure graphic. Each panel shows the λ regulatory region in the map order P1 cIII tL N PL OL cI PRM OR3 OR2 OR1 PR cro tR1 PRE cII O P tR2 Q. Recoverable panel labels, with step numbers as printed:]
2 N protein acts as an antiterminator to extend transcription beyond termination sequences tL, tR1, and tR2.
2 Accumulation of Cro protein.
3 Accumulation of cII/cIII complex leads to lysogenic cycle.
Lysogenic cycle development. Cro and λ repressor undertake competitive binding for operators OR1, OR2, and OR3.
6 Transcription occurs from PRM to transcribe cI, and transcription from PR is blocked. The lysogenic cycle is established.
3 Transcription continues from PL and PR and delayed early and late gene transcription leads to the lytic cycle.
CASE STUDY
Vibrio cholerae—Stress Response Leads to Serious Infection Through
Positive Control of Transcription
Cholera is a severely debilitating and potentially fatal disease caused by infection with the intestinal bacterium Vibrio cholerae. It is a major public health problem in developing countries where sanitation and supplies of clean water are inadequate or following disasters that disrupt normal sanitation and supplies of clean water. The bacterium is transmitted from person to person through contact with infected fecal material. The ingestion of fecal-contaminated water is the most common way of contracting cholera. Many ingested bacteria are killed by the highly acidic environment of the stomach, but V. cholerae in particular can survive in greater numbers than most bacteria by undertaking a rapid switch in gene regulation that shuts down the expression of some genes and activates the expression of stress response genes. Unfortunately for infected humans, the V. cholerae stress response produces toxins that can rapidly lead to degradation of the mucosal cells lining the intestines and to excessive leakage of water from the damaged cells. The leakage of water and electrolytes disturbs the osmotic balance of the cells; to compensate, they secrete more water, initiating a repeating cycle of ion leakage and water release that produces watery diarrhea and severe dehydration. Unless immediate antibiotic treatment and rehydration therapy are started, death can occur within hours.

VIBRIO CHOLERAE TOXINS In V. cholerae, three genes—toxS, toxR, and toxT—exert positive control over the transcription of genes producing virulence (active bacterial growth that causes disease). The expression of toxS and toxR genes is stimulated by the environmental cues encountered by V. cholerae in the hostile environment of the stomach. A protein complex formed by the products of these genes
activates transcription of toxT. The polypeptide product of toxT is a transcription-activating protein that binds to the promoter Pctx that controls transcription of an operon containing the two genes CtxA and CtxB (abbreviations for “cholera toxin A” and “cholera toxin B”). The polypeptide products of CtxA and CtxB are the cholera toxins that initiate the series of actions leading to cholera symptoms.

PREVENTING AND STUDYING THE DISEASE PROCESS Preventing cholera is an obvious public health priority. According to the World Health Organization, between 3 million and 5 million people contract cholera each year, and more than 100,000 deaths are attributed to cholera annually. Vaccines can help prevent some cholera cases, and oral antibiotics can help treat the disease once it has been acquired. Important as well is gaining understanding of how the ToxS–ToxR complex and ToxT operate in promoter recognition, and identifying the other genes they regulate. Similarly, gathering information about the stress response and virulence genes in V. cholerae will help medical practitioners and microbiologists understand how the bacterium produces its lethal effects. Such knowledge may suggest new strategies that can disable the bacterium before it causes disease or new treatments that can prevent the most serious consequences of infection.
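
The positive-control cascade described in this case study can be written out as a chain of dependencies. This is only a qualitative sketch under simplifying assumptions: every step is treated as all-or-none, and the function name ctx_cascade and its parameters are hypothetical labels for illustration, not part of any established model.

```python
def ctx_cascade(stomach_cues, toxS_ok=True, toxR_ok=True, toxT_ok=True):
    """Trace the ToxS/ToxR -> ToxT -> Pctx cascade as booleans.

    stomach_cues: True when the bacterium senses the stomach environment.
    *_ok flags:   False models a loss-of-function mutation in that gene.
    """
    # Environmental cues stimulate expression of toxS and toxR.
    toxS_made = stomach_cues and toxS_ok
    toxR_made = stomach_cues and toxR_ok
    # The ToxS-ToxR complex activates transcription of toxT.
    toxT_made = toxS_made and toxR_made and toxT_ok
    # ToxT binds Pctx and activates the operon containing CtxA and CtxB.
    toxin_operon_on = toxT_made
    return {
        "ToxS-ToxR complex": toxS_made and toxR_made,
        "ToxT": toxT_made,
        "CtxA/CtxB transcribed": toxin_operon_on,
    }


print(ctx_cascade(stomach_cues=True))
print(ctx_cascade(stomach_cues=True, toxR_ok=False))
print(ctx_cascade(stomach_cues=False))
```

Because each activator depends on the one before it, loss of any single component in this sketch leaves the toxin operon untranscribed, which is the behavior expected of a strictly positive-control pathway.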
SUMMARY
Mastering Genetics: For activities, animations, and review quizzes, go to the Study Area.
12.1 Transcriptional Control of Gene Expression Requires DNA–Protein Interaction
❚❚ Regulated genes are under transcriptional control, whereas constitutive genes are not regulated.
❚❚ In negative control of transcription, regulatory proteins bound to DNA reduce or eliminate transcription.
❚❚ Regulatory proteins, also called repressors, have a DNA-binding domain to bind regulatory DNA sequences and an allosteric domain to bind a regulatory molecule.
❚❚ An inducer molecule binds to the repressor molecule at an allosteric site to inhibit its action.
❚❚ In positive regulatory control, activator proteins bind DNA at promoters and other regulatory sequences and initiate or increase transcriptional efficiency.

12.2 The lac Operon Is an Inducible Operon System under Negative and Positive Control
❚❚ Bacterial operons transcribe two or more genes under the coordinated regulatory control of shared promoters, operators, and other regulatory elements.
❚❚ The lactose (lac) operon is an inducible operon system that produces three proteins—β-galactosidase (lacZ), permease (lacY), and transacetylase (lacA)—that are required to metabolize lactose and its by-products. Its regulatory control center contains a promoter and an operator sequence (lacO).
❚❚ Negative control of lac operon gene transcription is exerted by a repressor protein (lacI) that binds to the lacO region to block transcription. Allolactose inactivates the repressor protein by changing its conformation and preventing it from binding to the operator.
❚❚ Positive control of transcription of lac operon genes is exerted by the CAP–cAMP complex that forms in the absence of glucose and binds to the CAP site of the lac promoter.

12.3 Mutational Analysis Deciphers Genetic Regulation of the lac Operon
❚❚ Mutation studies determined the order of lac operon genes as lacZ–lacY–lacA.
❚❚ The analysis of mutant haploid and partial diploid bacteria identified the trans-acting repressor protein that binds the operator sequence.
❚❚ lac operator mutation analysis indicates that the operator is a cis-acting element that controls transcription of immediately adjacent genes on the chromosome.
❚❚ The Lac repressor binding site overlaps the RNA polymerase binding location in the lac promoter.
❚❚ Lac repressor protein binding induces DNA loop formation that prevents RNA polymerase binding at the promoter.
❚❚ The CAP–cAMP complex binds to the CAP binding site of the lac promoter and facilitates RNA polymerase binding.

12.4 Transcription from the Tryptophan Operon Is Repressible and Attenuated
❚❚ The tryptophan (trp) operon is a repressible operon that produces five polypeptides that participate in tryptophan synthesis.
❚❚ trp operon transcription is inhibited by a feedback mechanism involving tryptophan as a corepressor.
❚❚ trp operon gene expression is attenuated to maintain the cellular concentration of tryptophan at a steady state. Many of the amino acid operons are regulated by an attenuation mechanism.
❚❚ The trpL (leader) region contains an attenuator sequence of four DNA repeats that form three alternative mRNA stem loops, two of which are central to attenuation.
❚❚ The 2–3 (antitermination) stem loop formed by mRNA permits transcription of five trp operon structural genes in a polycistronic mRNA.
❚❚ The 3–4 (termination) stem loop of mRNA terminates transcription before RNA polymerase binds to the structural genes of the operon.

12.5 Bacteria Regulate the Transcription of Stress Response Genes and Also Translation
❚❚ Alternative sigma factors are used to generate RNA polymerases that recognize promoters of genes not transcribed by the common bacterial RNA polymerase.
❚❚ Genes transcribed using alternative sigma factors are required only under specialized circumstances, such as in response to heat shock.
❚❚ The translation of bacterial mRNA can be blocked by RNA-binding translation repressor proteins or by antisense RNA that binds to mRNA from specific genes.

12.6 Riboswitches Regulate Bacterial Transcription, Translation, and mRNA Stability
❚❚ A riboswitch is a regulatory mechanism that uses riboswitch sequences located on mRNA to bind small regulatory molecules.
❚❚ Riboswitches can regulate the transcription of specific genes, the translation of certain mRNAs, or the stability and degradation of certain mRNAs.

12.7 Antiterminators and Repressors Control Lambda Phage Infection of E. coli
❚❚ Early genes of the bacteriophage λ genome produce proteins that compete to bind at the same regulatory region. The protein that prevails determines whether the phage infection will follow the lytic cycle or the lysogenic cycle.
❚❚ Completion of the lytic cycle requires the expression of late λ phage genes.
❚❚ Lysogen integration and maintenance requires ongoing expression of the λ repressor protein, which regulates its own transcription.
❚❚ Lysogen integration is reversed by environmental changes that lead to induction and to resumption of the lytic cycle.
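
The negative and positive control of the lac operon summarized under Section 12.2 can be combined into a single decision function. The following is a minimal sketch, assuming all-or-none binding and using hypothetical names (lac_transcription and its parameters) and hypothetical output labels ("none", "repressed", "low", "high") chosen for illustration; real transcription levels vary continuously.

```python
def lac_transcription(lactose, glucose, repressor_ok=True, operator_ok=True,
                      promoter_ok=True, cap_ok=True):
    """Qualitative lac operon transcription level.

    lactose, glucose: True when the sugar is present in the medium.
    repressor_ok:     False models a lacI mutant whose product cannot bind lacO.
    operator_ok:      False models an operator that the repressor cannot bind.
    promoter_ok:      False models a promoter that RNA polymerase cannot use.
    cap_ok:           False models an inactive CAP protein.
    """
    if not promoter_ok:
        return "none"
    # Negative control: repressor binds lacO unless allolactose inactivates it.
    repressor_bound = repressor_ok and operator_ok and not lactose
    if repressor_bound:
        return "repressed"
    # Positive control: CAP-cAMP forms only in the absence of glucose.
    cap_camp_bound = cap_ok and not glucose
    return "high" if cap_camp_bound else "low"


# Wild-type behavior across the four media.
for lactose in (False, True):
    for glucose in (False, True):
        level = lac_transcription(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Setting repressor_ok or operator_ok to False removes the dependence on lactose, which is how this sketch represents constitutive transcription.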
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Understand the functioning and the biological significance of inducible and repressible transcriptional regulatory mechanisms in bacteria.
2. Be prepared to describe the operation of transcriptional regulatory mechanisms.
3. Be prepared to describe the experimental analysis of transcription-regulating mechanisms and to interpret the effects of mutations on the functioning of these mechanisms.
4. Understand the operation of attenuation in the production of proteins.
5. Be prepared to interpret the effects of mutations on attenuation.
6. Understand the mechanisms and effects of antisense regulation on protein production.
7. Know the normal functions of lac operon genes and regulatory sequences and the consequences of their mutation.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. Bacterial genomes frequently contain groups of genes organized into operons. What is the biological advantage of operons to bacteria? Identify the regulatory components you would expect to find in an operon. How are the expressed genes of an operon usually arranged?
2. Transcriptional regulation of operon gene expression involves the interaction of molecules with one another and of regulatory molecules with segments of DNA. In this context, define and give an example of each of the following:
a. operator
b. repressor
c. inducer
d. corepressor
e. promoter
f. positive regulation
g. allostery
h. negative regulation
i. attenuation
3. Why is it essential that bacterial cells be able to regulate the expression of their genes? What are the energetic and evolutionary advantages of regulated gene expression? Is the expression of all bacterial genes subject to regulated expression? Compare and contrast the difference between regulated gene expression and constitutive gene expression.
4. Identify similarities and differences between an inducible operon and a repressible operon in terms of
a. the transcription-regulating DNA sequences.
b. the presence and action of allosteric regulatory molecules.
c. the organization of structural genes of the operon.
5. The transcription of β-galactosidase and permease is inducible in lac+ bacteria with a wild-type lac operon. Explain the mechanism by which lactose gains access to the cell to induce transcription of the genes.
6. Is attenuation the product of an allosteric effect? Is attenuation the result of a transcriptional or a translational activity? Explain your answers.
7. The trpL region contains four repeated DNA sequences that lead to the formation of stem-loop structures in mRNA. What are these stem-loop structures, and how do they affect transcription of the structural genes of the trp operon?
8. The CAP binding site in the lac promoter is the location of positive regulation of gene expression for the operon. Identify what binds at this site to produce positive regulation, under what circumstances binding occurs, and how binding exerts a positive effect.
9. What role does cAMP play in transcription of lac operon genes? What role does CAP play in transcription of lac operon genes?
10. How would a cap− mutation that produces an inactive CAP protein affect transcriptional control of the lac operon?
11. Explain the circumstances under which attenuation of operon gene expression is advantageous to a bacterial organism. Would you expect attenuation to be found in a single-celled eukaryote? In a multicelled eukaryote?
12. Consider the transcription of genes of the lac operon under two conditions: (1) when both glucose and lactose are present and (2) when glucose is absent and lactose is present. Describe the comparative levels of transcription of lac operon genes under these conditions, and explain the molecular basis for the difference.
13. Describe the lytic and lysogenic life cycles of λ bacteriophage. What roles do λ repressor and Cro protein play in controlling transcription from PR and PRM, and how are these roles linked to lysis and lysogeny?
14. Define antisense RNA, and describe how it affects the translation of a complementary mRNA. Why is it more advantageous to the organism to stop translation initiation than to inactivate or destroy the gene product after it is produced?
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
15. Attenuation of trp operon transcription is controlled by the formation of stem-loop structures in mRNA. The attenuation function can be disrupted by mutations that alter the sequence of repeat DNA regions 1 to 4 and prevent the formation of mRNA stem loops. Describe the likely effects on attenuation of each of the following mutations under the conditions specified.

   Mutated Region    Tryptophan Level
a. Region 1          Low
b. Region 1          High
c. Region 2          Low
d. Region 2          High
e. Region 3          Low
f. Region 3          High
g. Region 4          Low
h. Region 4          High

16. In the lac operon, what are the likely effects on operon gene transcription of the mutations described in a–e?
a. Mutation of consensus sequence in the lac promoter
b. Mutation of the repressor binding site on the operator sequence
c. Mutation of the lacI gene affecting the allosteric site of the protein
d. Mutation of the lacI gene affecting the DNA-binding site of the protein
e. Mutation of the CAP binding site of the lac promoter
17. Identify which of the following lac operon haploid genotypes transcribe operon genes inducibly and which transcribe genes constitutively. Indicate whether the strain is lac+ (able to grow on lactose-only medium) or lac− (cannot grow on lactose medium).
a. I+ P+ O+ Z+ Y−
b. I+ P+ OC Z− Y+
c. I− P+ O+ Z+ Y+
d. I+ P− O+ Z+ Y+
e. I+ P+ O+ Z− Y+
f. I+ P+ OC Z+ Y−
g. I+ P+ OC Z+ Y+
18. Complete the accompanying table, indicating whether functionally active β-galactosidase and permease are produced in the presence and absence of lactose. Use “+” to indicate the presence of a functional enzyme and “−” to indicate its absence. Indicate whether the partial diploid strain is lac+ (able to grow on lactose-only medium) or lac− (cannot grow on lactose medium).
19. List possible genotypes for lac operon haploids that have the following phenotypic characteristics:
a. The operon genes are constitutively transcribed, but the strain is unable to grow on a lactose medium. List two possible genotypes for this phenotype.
b. The operon genes are never transcribed above a basal level, and the strain is unable to grow on a lactose medium. List two possible genotypes for this phenotype.
c. The operon genes are inducibly transcribed, but the strain is unable to grow on a lactose medium. List one possible genotype for this phenotype.
d. The operon genes are constitutively transcribed, and the strain grows on lactose medium. List two possible genotypes for this phenotype.
20. Suppose each of the genotypes you listed in parts (a) and (b) of Problem 19 are placed in a partial diploid genotype along with a chromosome that has a fully wild-type lac operon.
a. Will the transcription of operon genes in each partial diploid be inducible or constitutive?
b. Which partial diploids will be able to grow on a lactose medium?
21. Four independent lac− mutants (mutants A to D) are isolated in haploid strains of E. coli. The strains have the following phenotypic characteristics:
Mutant A is lac−, but transcription of operon genes is induced by lactose.
Mutant B is lac− and has uninducible transcription of operon genes.
Mutant C is lac+ and has constitutive transcription of operon genes.
Mutant D is lac+ and has constitutive transcription of operon genes.
A microbiologist develops donor and recipient varieties of each mutant strain and crosses them with the results shown below. The table indicates whether inducible, constitutive, or noninducible transcription occurs, along with lac+ and lac− growth habit for each partial diploid. Assume each strain has a single mutation.

Mating    Transcription and Growth
A × B     lac−
A × C     lac+, inducible
A × D     lac+, constitutive
B × C     lac+, inducible
B × D     lac+, constitutive
C × D     lac+, constitutive

Use this information to identify which lac operon gene is mutated in each strain.
22. Suppose the lac operon partial diploid cap− I+ P+ O+ Z− Y+/cap+ I− P+ O+ Z+ Y− is grown.
a. Will this partial diploid strain grow on a lactose medium?
b. Is transcription of β-galactosidase and permease inducible, constitutive, or noninducible?
c. Explain how genetic complementation contributes to the growth habit of this strain.
23. What is a riboswitch? Describe the riboswitch mechanism that regulates transcription of the thi operon in B. subtilis. What parallels can you see between this mechanism and the regulation of transcription of the trp operon in E. coli?
24. A repressible operon system, like the trp operon, contains three genes, G, Z, and W. Operon genes are synthesized when the end product of the operon synthesis pathway is absent, but there is no synthesis when the end product is present. One of these genes is an operator, one is a regulatory protein, and the other is a structural enzyme involved in synthesis of the end product. In the table below, “+” indicates that the enzyme is synthesized by the operon, and “−” means that no enzyme synthesis occurs. Use this information to determine which gene corresponds to each operon function.

                        Enzyme Synthesis
Genotype                End Product Present    End Product Absent
G+ Z+ W+                −                      +
G− Z+ W+                +                      +
G+ Z− W+                −                      −
G+ Z+ W−                +                      +
G− Z+ W+/G+ Z− W−       +                      +
G+ Z− W+/G− Z+ W−       +                      +
G− Z− W−/G+ Z+ W+       −                      +
G+ Z+ W−/G− Z− W+       −                      +

25. What is the likely effect of each of the following mutations of the trpL region on attenuation control of trp operon gene transcription? Explain your reasoning.
a. Region 3 is deleted.
b. Region 4 is deleted.
c. The entire trpL region is deleted.
d. The start (AUG) codon of the trpL polypeptide is deleted.
e. Two nucleotides are inserted into the trpL region immediately after the polypeptide stop codon.
f. Twenty nucleotides are inserted into the trpL region immediately after the polypeptide stop codon.
g. Ten nucleotides are inserted between regions 2 and 3 of trpL.
h. Two nucleotides are inserted immediately following the polypeptide start codon.
i. The entire polypeptide coding sequence of trpL is deleted.
j. The eight uracil nucleotides immediately following region 4 are deleted.
26. Suppose that base substitution mutations sufficient to eliminate the function of the operator regions listed below were to occur. For each case, describe how transcription or life cycle would be affected.
a. lacO mutation in E. coli
b. OR1 mutation in λ phage
c. OR3 mutation in λ phage
27. Two different mutations affect PRE. Mutant 1 decreases transcription from the promoter to 10% of normal. Mutant 2 increases transcription from the promoter to ten times greater than the wild type. How will each mutation affect the determination of the lytic or lysogenic life cycle in mutant λ phage strains? Explain your answers.
28. How would mutations that inactivate each of the following genes affect the determination of the lytic or lysogenic life cycle in mutated λ phage strains? Explain your answers.
a. cI
b. cII
c. cro
d. int
e. cII and cro
f. N
29. The bacterial insertion sequence IS10 uses antisense RNA to regulate translation of the mRNA that produces the enzyme transposase, which is required for insertion sequence transposition. Transcription of the antisense RNA gene is controlled by POUT, which is more than 10 times more efficient at transcription than the PIN promoter that controls transposase gene transcription.
a. If a mutation reduced the transcriptional efficiency of POUT so as to be equal to that of PIN, what is the likely effect on the transposition of IS10?
b. If a mutation of PIN eliminates its ability to function in transcription, what is the likely effect on the transposition of IS10?
30. For an E. coli strain with the lac operon genotype I+ P+ O+ Z+ Y+, identify the level of transcription of the operon genes in each growth medium listed. Specify transcription as “none,” “basal,” or “activated” for each medium, and provide an explanation to justify your answer.
a. Growth medium contains lactose and glucose.
b. Growth medium contains glucose but no lactose.
c. Growth medium contains lactose but no glucose.
31. How could antisense RNA be used as an antibiotic? What types of genes would you target using this scheme?
32. Section 9.4 describes the function of tRNA synthetases in attaching amino acids to tRNAs (see Figure 9.16). Suppose the tRNA synthetase responsible for attaching tryptophan to tRNA is mutated in a bacterial strain with the result that the tRNA synthetase functions at about 15% of the efficiency of the wild-type tRNA synthetase.
a. How would this mutation affect attenuation of the tryptophan operon? Explain your answer.
b. Would formation of the 3–4 stem loop structure in mRNA be more frequent or less frequent in the mutant strain than in the wild-type strain? Why?
33. The following hypothetical genotypes have genes A, B, and C corresponding to lacI, lacO, and lacZ, but not necessarily in that order. Data in the table indicate whether β-galactosidase is produced in the presence and absence of the inducer for each genotype. Use these data to identify the correspondence between A, B, and C and the lacI, lacO, and lacZ genes. Carefully explain your reasoning for identifying each gene.

                           β-Galactosidase Production
Genotype                   Inducer Present    Inducer Absent
1. A− B+ C+                +                  +
2. A+ B+ C−                +                  +
3. A− B+ C+/A+ B+ C+       +                  +
4. A+ B+ C−/A+ B+ C+       +                  −
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
34. Northern blot analysis is performed on cellular mRNA isolated from E. coli. The probe used in the northern blot analysis hybridizes to a portion of the lacY sequence. Below is an example of the gel from northern blot analysis for a wild-type lac+ bacterial strain. In this gel, lane 1 is from bacteria grown in a medium containing only glucose (minimal medium). Lane 2 is from bacteria in a medium containing only lactose. Following the style of this diagram, draw the gel appearance for northern blots of the bacteria listed below. In each case, lane 1 is for mRNA isolated after growth in a glucose-containing (minimal) medium, and lane 2 is for mRNA isolated after growth in a lactose-only medium.
[Figure: Northern blot gel with lanes 1 and 2]
a. lac+ bacteria with the genotype I+ P+ Oc Z+ Y+
b. lac- bacteria with the genotype I+ P+ O+ Z- Y+
c. lac- bacteria with the genotype I+ P- Oc Z+ Y+
d. lac+ bacteria with the genotype I- P+ Oc Z+ Y+
e. lac- bacteria with the genotype I+ P+ O+ Z- Y+ that has a polar mutation affecting the lacZ gene
f. lac- bacteria with the genotype I+ P+ Oc Z- Y-
g. lac- bacteria with the genotype I+ P+ O+ Z+ Y+ and a mutation that prevents CAP–cAMP binding to the CAP site
35. A bacterial inducible operon, similar to the lac operon, contains three genes (R, T, and S) that are involved in coordinated regulation of transcription. One of these genes is an operator region, one is a regulatory protein, and the third produces a structural enzyme. In the table below, “+” indicates that the structural enzyme is synthesized and “-” indicates that it is not produced. Use the information provided to determine which gene is the operator, which produces the regulatory protein, and which produces the enzyme.
Genotype                        Enzyme Synthesis
R+ S+ T+                        +      -
R- S+ T+                        -      -
R+ S- T+                        +      +
R+ S+ T-                        +      +
R- S+ T+ / R+ S- T-             +      +
R+ S- T+ / R- S+ T-             +      +
R+ S+ T- / R- S- T+             +      -
Regulation of Gene Expression in Eukaryotes

CHAPTER OUTLINE
13.1 Cis-Acting Regulatory Sequences Bind Trans-Acting Regulatory Proteins to Control Eukaryotic Transcription
13.2 Chromatin Remodeling and Modification Regulates Eukaryotic Transcription
13.3 RNA-Mediated Mechanisms Control Gene Expression

ESSENTIAL IDEAS
❚❚ Regulatory DNA sequences bind regulatory proteins to control the initiation or silencing of transcription in eukaryotes.
❚❚ Chromatin remodeling and modification regulates gene transcription by shifting the position or changing the chemical composition of nucleosomes.
❚❚ The structure of chromatin varies among different types of cells and sets the gene-expression program for distinct cell types.
❚❚ RNA-mediated mechanisms regulate eukaryotic gene expression by posttranscriptional interactions with mRNA.

[Chapter-opening photo] Wild-type petunia flowers have solid color due to expression of a chromosomal pigment gene. Transgenic petunias with an extra copy of the pigment gene have colorless (white) regions due to co-suppression, a process in which regulatory RNAs inactivate both the chromosomal copy and the transgenic copy of the pigment gene.

If the 46 chromosomes in a single nucleus from any cell in your body were stripped of their associated proteins and laid end to end, they would span almost 2 meters. Yet in their normal, compacted state, these chromosomes can fit inside a nucleus that is about 5 microns (5 millionths of a meter) in diameter and still leave room for DNA replication, transcription, pre-mRNA processing, and numerous other activities to take place. This efficient packaging and access to DNA are made possible by the chromatin structure of the genome and the dynamic changes of which chromatin is capable throughout the cell cycle.
The genomes of eukaryotic organisms (yours included) are considerably larger on average than those of bacterial and archaeal species, and they are packaged much differently as well. One major packaging difference is the localization of chromosomes in a nucleus in eukaryotic cells. Nuclear localization sequesters the chromosomes and encapsulates DNA replication, transcription, and the various RNA-processing activities. A second difference is the incorporation of DNA into chromatin.
The process of chromatin condensation is initiated at the beginning of prophase and culminates in fully condensed chromosomes in metaphase. This condensation is an essential predecessor of efficient chromosome separation in anaphase. Chromatin condensation also plays a pivotal role in permitting or blocking transcription. No cell in your body expresses all 20,400 or so protein coding genes of the human genome. Instead, most human cell types express only a few thousand genes, while the other genes are transcriptionally silent. In recent decades, cell biologists studying the close connection between structural changes in chromatin and the transcription of eukaryotic genes have succeeded in uncovering many crucial details.
The processes that regulate gene expression in eukaryotes (see Chapters 8 and 9) are more varied and multifaceted than those governing gene expression in bacterial genomes (Figure 13.1). In the present chapter, we point out similarities to prokaryotic gene regulation while giving special attention to elements that do not occur in prokaryotes and yet are central to the regulation of transcription and gene expression in eukaryotes. The latter include (1) the organization of regulatory sequences other than promoters that contribute to the regulation of transcription; (2) mechanisms that remodel chromatin or reconfigure the association between nucleosomes and DNA to regulate transcription; (3) epigenetic mechanisms that exert transcriptional regulatory control over the course of an organism’s development; (4) the transmission of epigenetic states from one generation of cells to another to exercise long-term control of
13.1 Cis-Acting Regulatory Sequences Bind Trans-Acting Regulatory Proteins to Control Eukaryotic Transcription

Despite the considerable differences between eukaryotes and bacteria, the basic mechanisms controlling transcription are broadly similar in both groups of organisms. Gene regulation is dependent on specific DNA–protein interactions to activate or repress transcription. Trans-acting activator proteins bind cis-regulatory sequences to stimulate transcription (positive regulation of transcription), whereas repressor proteins bind other regulatory sequences to hinder transcription (negative regulation of transcription). Unlike their counterparts in bacteria, however, eukaryotic transcription activators and repressors, collectively known as transcription factors, are often found in large complexes composed of a large number of distinct regulatory proteins that bind a wide and diverse array of regulatory sequences. These proteins aggregate in diverse combinations that activate or repress transcription of different patterns of genes in different tissues and at different times in the life cycle.
The complexity of gene regulation in eukaryotes is reflected both in the numbers of different transcription factors and the diversity of the target genes they regulate. For example, the bacterium E. coli has about 270 transcription factors, about the same number as the single-celled eukaryote S. cerevisiae. In contrast, multicellular eukaryotes such as Drosophila, humans, and Arabidopsis have approximately 600, 1400, and 1900 different transcription factors, respectively. As for the targets of individual transcription factors, consider again the example of E. coli: the CAP–cAMP complex that activates lac operon transcription (Section 12.2) regulates only about a dozen loci in the E. coli genome, and the Lac repressor has only a single target locus, the lac operon. In contrast, individual transcription factors in multicellular eukaryotes may regulate tens to hundreds of target genes. This increased complexity in gene regulation is held to be responsible for the evolution and development of multicellular eukaryotes. For example, humans, with only about five times as many genes as E. coli, are able to produce many times more distinct cell types.
Other differences in gene regulation between bacteria and eukaryotes, especially complex multicellular ones, are tied to differences in their ecology and life cycles (Figure 13.2). Genes in bacteria can largely be categorized as either housekeeping (required for basic cellular function and constitutively expressed) or inducible (activated in response to a change in environmental conditions). Multicellular eukaryotes harbor housekeeping and inducible genes like bacteria do, but in contrast to bacteria, they also possess genes that are regulated in a developmental or cell-type–specific manner, with some genes utilized multiple times in precise developmental patterns of expression. (Note, however, that in some bacteria that “differentiate” into dormant spores, a small number of genes are also regulated in a cell-type–specific manner.)

Figure 13.2 Comparison of bacterial and eukaryotic gene expression.
Bacterial                                          Eukaryotic
4200 protein coding genes                          21,000 protein coding genes
One cell type                                      Hundreds to thousands of cell types
Respond to environmental conditions                Respond to environmental conditions + development
Leaky gene expression                              Tight control of gene expression
Types of gene regulatory control:                  Types of gene regulatory control:
housekeeping; inducible                            housekeeping; inducible; developmental;
(cell-type-specific in sporulation)                cell-type-specific

Also related to these differences in ecology and life cycle is the stringency of gene regulatory control exercised in multicellular eukaryotes as compared with bacteria. E. coli, a single-celled organism, depends on being able to change gene expression patterns rapidly to respond quickly to changing environmental conditions. The mechanism giving E. coli this ability requires that even when a gene is “off,” a few transcripts of it are always present in the cell. We saw an example of this “leaky” regulation in the case of the lac operon, where it enables E. coli to sense the presence of lactose (Section 12.2). In contrast, in multicellular eukaryotes with hundreds to thousands of different cell types, genes encoding proteins that are required only in specific cell types need to be tightly regulated. This stricter control, in which
genes that are “off” are essentially transcriptionally silent, is mediated by the packaging of chromatin into an inactive state, a subject we will explore in this chapter, after we first discuss the role of transcription factors in eukaryotic gene regulation.

Overview of Transcriptional Regulatory Interactions in Eukaryotes

To repeat, the regulatory sequences required for eukaryotic gene regulation are similar to those described for bacteria: a binding site for RNA polymerase and regulatory sequences that bind either activators or repressors. RNA polymerase II (pol II) and various general transcription factors (GTFs) are recruited to and bind to the core promoter region (see Section 8.3). This region contains the TATA box along with other sequences and lies immediately adjacent to the start of transcription (Figure 13.3).
Transcriptional activator proteins and transcriptional repressor proteins that bind to enhancer and silencer sequences (or enhancers and silencers) provide both quantitative and qualitative control of gene expression. Enhancers and silencers are typically composed of binding sites for a number of transcription factors, and this allows them to integrate the activities of different sets of transcription factors to produce different outputs. Often such a group of transcription factor binding sites is referred to as an enhancer or silencer module. In a broad sense, enhancer and silencer activity controls the timing and location of eukaryotic gene transcription to help ensure the proper function and development of organisms (for example, by making a polypeptide available at crucial times or in specific cells or tissues).
Unlike core promoter elements, which are invariably located upstream of and close to the genes they regulate, enhancer and silencer modules can be upstream or downstream of genes they regulate and may reside in introns and even, occasionally, within coding regions. In multicellular eukaryotes, some enhancer and silencer sequences are close to the genes they regulate, but others are great distances, thousands to tens of thousands of nucleotides, away from the genes they regulate (Figure 13.3), though DNA loop formation can bring even very distant sequences together. In contrast, enhancers and silencers in yeast are usually situated relatively close to the genes they regulate. Some genes contain various proximal elements that lie upstream of the core promoter and that are often involved in quantitative gene regulation.

Figure 13.3 Cis-element regulatory structures in eukaryotes. (a) Typical cis-regulatory structure of a Saccharomyces cerevisiae gene. (b) Typical cis-regulatory structure of a gene of a multicellular eukaryote.
[Panel (a), Saccharomyces cerevisiae cis-regulatory structure: Cis elements consist of single binding sites and are located close to and generally 5’ of the gene.]

All of the regulatory regions described here are cis-acting regulatory sequences, which means they regulate transcription of genes located on the same chromosome that the regulatory sequence is on. In contrast, all of the proteins that bind these sequences are trans-acting regulatory proteins: they are able to identify and bind target regulatory sequences on any chromosome. RNA polymerase II, for example, is able to bind any core promoter region if the right general transcription factors are also present. Similarly,
transcription activator and repressor proteins can bind their target regulatory sequences and can influence transcription with equal efficiency no matter where the sequence occurs.
In addition to the regulatory proteins that bind regulatory DNA in a sequence-specific manner, there are also many proteins that combine through protein–protein interactions to form larger complexes that then bind to regulatory DNA (as mentioned at the start of this section). At enhancers, for example, aggregation of multiple proteins (proteins binding other proteins and some also binding the enhancer sequence) forms a large protein complex known as an enhanceosome. Enhanceosomes direct DNA bending into loops that bring the enhanceosome into contact with RNA polymerase and transcription factors bound at the core promoter and to proximal promoter elements (see Figure 8.13). The DNA loops can be small or large, in keeping with the observation that enhancers may be close to or quite distant from the genes they regulate. Repressor proteins act in a similar manner, with some proteins binding DNA in a sequence-specific manner and recruiting additional proteins into a larger repressor complex.

Integration and Modularity of Eukaryotic Regulatory Sequences

The overview we presented above described the regulatory sequences that bind activators and repressors as if each sequence must either be an enhancer or a silencer. In reality, many regulatory modules bind both activators and repressors and thus act to integrate both positive and negative signals into a single output. As an analogy, consider the regulatory sequences of the lac operon, which could be viewed as a module consisting of binding sites for a repressor (the Lac repressor) and an activator (the CAP–cAMP complex). Transcription of the lac operon results from integration of the effects of binding the activator and repressor proteins; in this case, the repressor is dominant, since when it is bound, the operon is repressed regardless of the presence of the CAP–cAMP complex. An example of a eukaryotic regulatory module, consisting of multiple binding sites for both activator and repressor proteins, is presented in Figure 13.4. As with the lac operon, the output from the eukaryotic regulatory module represents the integration of the effects of the binding of activators and repressors (and repressors often prevail over activators), but via a different molecular mechanism. However, not all transcription factors are equal: some, called pioneer factors, are the first to bind regulatory modules, and their binding facilitates the binding of additional transcription factors (Figure 13.4b). We will return to the importance of pioneer factors later in this chapter.

Figure 13.4 Eukaryotic enhancer and silencer module. (a) Modules consist of multiple binding sites for both activators and repressors, with the output from the module resulting through integration of the effects of all the bound factors. (b) Pioneer factors bind first, allowing the binding of additional transcription factors.
[Panel labels: (a) Activators; Repressors. (b) Pioneer factor binds first; facilitating binding of other factors.]

A general model of eukaryotic transcription regulation must incorporate the action of enhancers and silencers while taking the variability of their locations and their tissue-specific patterns of regulation into account. The different regulatory proteins present in different types of cells lead to tissue-specific patterns of expression of the target gene, producing a different set of polypeptides in each case. The model depicted in Figure 13.5, for the Sonic hedgehog (SHH) gene, shows two distant enhancers controlling transcription of the same gene in a tissue-specific manner. Similar models depicting the binding of repressor proteins to silencer sequences describe how distant silencers can inhibit transcription of targeted genes in a tissue-specific manner.

Figure 13.5 Tissue-specific enhancer action. (a) The limb-specific enhancer binds different, limb-specific transcription factors to express SHH differently in limb cells. (b) A different brain-specific enhancer is bound by brain-specific transcription factors and activates SHH transcription in brain cells.
[Panel labels: (a) Limb cells: limb-specific transcription factors, Pol II, limb enhancer, brain enhancer, SHH gene. (b) Brain cells: brain-specific transcription factors, Pol II, limb enhancer, brain enhancer, SHH gene.]

In humans and other mammals, SHH directs the development of limbs, including the production of five digits (fingers or toes) on each appendage. It also plays a role in brain organization. Figure 13.5 compares the transcription
of the SHH gene in brain tissue and in limb cells. Transcription in these tissues is controlled by different regulatory proteins and transcription factors produced in each cell type. One combination of regulatory proteins binds one enhancer in brain cells, whereas a different combination of regulatory proteins binds an alternative enhancer in limb cells. The limb enhancer of the SHH gene is 1 million base pairs (1 megabase) away from the gene. Genomic sequencing analysis reveals that this SHH enhancer is actually located in an intron of a neighboring gene (see Figure 16.17).
This model illustrates an important aspect of eukaryotic transcription regulation. Only when all of the necessary transcription factors and regulatory proteins are present in a cell can the assembly of protein complexes required for the tissue-specific or development-stage–specific pattern of transcription take place. The protein complexes assembled at regulatory sequences direct patterns of gene expression by activating transcription of certain genes while blocking transcription of other genes. This modularity of transcriptional regulation in eukaryotes can provide the flexibility that multicellular organisms need for regulation of differential gene expression. The polypeptides that are ultimately produced in each cell or at each stage of development drive the processes that make cells distinctive and lead to the observed developmental changes.
Our previous discussions of mutations have described numerous ways in which changes in DNA can result in abnormal polypeptides or abnormal levels of polypeptide production. The modularity of regulatory sequences means that changes in gene expression can also occur due to mutations in an enhancer module. As an example, the SHH limb enhancer is mutated in certain cases of a condition called polydactyly, in which extra fingers and toes can form during development. The extra digits result from abnormal expression of the SHH gene. In studies of certain human families with polydactyly, single base substitutions in the SHH enhancer have been identified. In addition, studies in mice in which a deletion of the SHH enhancer has occurred reveal significant abnormalities of limb development. Changes in gene regulation are held to be a significant driver in the evolution of morphological complexity. Moreover, the modularity of regulatory elements allows evolutionary changes in gene expression without loss of protein function. For example, since the coding sequences of chimp and human genes are nearly identical, it is likely that most differences between the two species are due to differences in gene regulation rather than to functional differences in protein products.

Locus Control Regions

The human β-globin gene produces the β-globin polypeptide, two copies of which join with two α-globin polypeptides produced by the α-globin gene to form the heterotetrameric hemoglobin molecule. The β-globin gene is, however, only one of six very closely related globin genes forming the β-globin complex on human chromosome 11 (Figure 13.6a). Located close to the β-globin complex is a regulatory region known as a locus control region (LCR). LCRs are highly specialized enhancer elements that regulate the transcription of multiple genes packaged in complexes of related genes. The LCR regulating transcription of genes in the β-globin complex contains four distinct cis-acting regulatory sequences, designated HS1 to HS4. Together these elements orchestrate the sequential developmental expression of the β-globin–complex genes as a fetus develops during gestation. The LCR and the six genes it regulates occupy a little more than 70 kb.
Each gene of the β-globin complex produces a distinct globin polypeptide that imparts a different oxygen-carrying capacity to hemoglobin. During gestation, the oxygen requirements of the developing fetus change as its size increases and its organs develop. As gestation proceeds, transcription of the genes of the β-globin complex is switched from one to the next to produce hemoglobin molecules that have the oxygen-carrying capacity required by the developing fetus. The order of expression of β-globin–complex genes during development matches the order in which they occur on the chromosome. Figure 13.6b shows the expression profile of these genes during development.

Figure 13.6 Locus control and developmental expression of human β-globin–complex genes. (a) The locus control region (LCR) of the human β-globin complex contains four regulatory segments (HS1 to HS4). (b) The LCR regulates the expression of five genes (ψβ is an unexpressed pseudogene) in a developmental pattern matched to gestational age.
[Panel (a), β-globin gene complex and LCR: scale 0 to 70 kb; LCR segments HS4, HS3, HS2, HS1; genes in order ε, Gγ, Aγ, ψβ, δ, β.]
[Panel (b), Developmental expression of β-globin–complex genes: y-axis, % of total β-globin synthesis (0 to 100); x-axis, weeks of gestation, birth, weeks of age; curves for ε, Gγ + Aγ, δ, and β.]

The HS1 to HS4 components of the β-globin–complex LCR bind regulatory proteins that direct the formation of small DNA loops, and these serve as a bridge to the promoters of the β-globin–complex genes (Figure 13.7). The composition of enhanceosomes bound to the LCR varies during development to vary the resulting loops and thus produce the developmentally regulated pattern of gene expression from the β-globin complex. A similar LCR drives transcription of a smaller number of genes in the α-globin complex.
Figure 13.7 Human β-globin–complex locus control region. In combination with regulatory proteins that vary with developmental stage, the LCR forms DNA loops that also vary with developmental stage, allowing the LCR to activate transcription of specific genes of the complex. The RNA polymerase on the left transcribes the δ globin gene and the RNA polymerase on the right transcribes the β globin gene.
[Figure title: Mechanism of transcriptional activation by LCR. Labels: LCR; genes ε, Gγ, Aγ, δ, β; activator proteins; transcription factors; RNA polymerase; promoter; transcription of the δ and β-globin genes.]

Recent genome-wide mapping studies in humans suggest that many disease-susceptibility alleles reside in noncoding sequences that may be regulatory. Some of the known examples are enhancer mutations, such as those causing certain cases of thalassemia, a kind of hereditary anemia in which mutation leads to an imbalance of production of α-globin and β-globin polypeptides. The imbalance reduces the amount of functional hemoglobin, since each hemoglobin molecule needs an equal number of both polypeptides. Distinct types of thalassemia result from different mutations of the α-globin or β-globin genes, but in some thalassemia patients, no mutations of either globin gene or of their promoters are detected. In several of these cases, the thalassemia is due to deletion or chromosome-rearrangement mutations that alter the LCR of one of the globin gene complexes, resulting in enhancer mutations that alter the level of transcription of affected genes and lead to an imbalance of polypeptide production.

Enhancer-Sequence Conservation

Comparisons among species reveal DNA-sequence conservation in some enhancers. This implies that natural selection is operating to retain enhancer function, that is, to retain the capacity to bind specific regulatory proteins by conserving sequence composition. Figure 13.8 shows enhancer sequences for the β-interferon gene in several mammals; the abbreviations at the top of each column represent the enhancer-binding proteins whose binding relies on certain sequences. The species listed on the left side of the figure share a common ancestor from which their different lineages diverged approximately 100 million years ago.
Genomic sequence analysis indicates evolutionary constraint on the diversification of some enhancer sequences. This constraint is demonstrated in enhancer elements that regulate key genes controlling the development of the vertebrate body plan and that have been conserved throughout vertebrate evolution. (We will resume the topic of genomics approaches to identifying conserved regulatory sequences in Chapter 16.) In contrast, certain enhancer module sequences have been observed to evolve quite rapidly and yet not produce significant differences in outcome. Since the output from an enhancer module is a result of the integration of several inputs, different combinations of activators and repressors can still result in similar outputs.

Yeast as a Simple Model for Eukaryotic Transcription

The yeast Saccharomyces cerevisiae provides a simple model for illustrating some principles of eukaryotic transcriptional regulation. For example, the regulation of transcription by enhancer sequences is well understood in Saccharomyces cerevisiae for the transcription of genes involved in the galactose utilization pathway. When the monosaccharide galactose is the only sugar in the growth medium, strains of gal+ yeast will induce the transcription of four enzyme-producing genes, GAL1, GAL2, GAL7, and GAL10, that together import extracellular galactose (the role of GAL2) and, through a short series of biochemical
Figure 13.9 Galactose utilization in S. cerevisiae. Galactose utilization requires the action of products of
each of four galactose-utilization (GAL) genes.
reactions, break down intercellular galactose into glucose- (a) Galactose absent
1-phosphate for glycolysis (GAL1, GAL7, and GAL10; Gal80
Activation domain Gal4 is bound by Gal80 and is
Figure 13.9). Each of the four genes has its own promoter, unable to activate transcription.
but transcription of the genes is regulated by another gene, Gal4
GAL4, which encodes Gal4, a regulatory protein. This is homodimer GAL genes
a transcription activator protein that binds to an enhancer
element—called an upstream activator sequence (UAS) No transcription
UASG
in yeast—located upstream of each of the four GAL
genes. The Gal4 regulatory protein is continuously available in yeast cells and interacts with Gal80, encoded by the GAL80 gene. When Gal80 protein binds to Gal4 protein, it inactivates Gal4 and blocks its ability to activate transcription.

The UASG sequences are cis-acting regulatory elements, and Gal4 protein is a trans-acting regulatory protein. Each UASG element contains two 17-bp repeat sequences that are the binding sites for Gal4 protein. In its active, DNA-binding form, Gal4 is a homodimeric protein composed of two identical polypeptides that form two active domains. The DNA-binding domain, at one end of the Gal4 dimer, targets the 17-bp repeats of UASG. The activation domain, at the opposite end, is a target for binding by the protein Gal80. Since Gal4 and Gal80 are each constitutively produced, they are normally bound to one another at the UASG of Gal4. In this configuration, the activation domain of Gal4 is inactive, and transcription of GAL genes is blocked (Figure 13.10a). Conversely, when galactose is present, galactose and Gal3, the protein product of another GAL gene, bind to Gal80. Binding of the galactose–Gal3 complex alters Gal80 and causes it to release Gal4. The free Gal4 dimer then activates GAL gene transcription (Figure 13.10b).

In the GAL gene system, Gal4 acts as an activator protein, initiating transcription. Its target DNA sequence is UASG, an enhancer sequence that is separated from GAL gene promoters by a large number of nucleotides. Gal4 binding leads to the formation of a multiprotein complex known as Mediator, which is an enhanceosome that forms after Gal4 binds UASG. Mediator induces the formation of a DNA loop, and in so doing makes contact with the general transcription apparatus, including TFIID (transcription factor II D) and RNA polymerase II (pol II), at a GAL gene promoter (see Figure 8.13). In sum, the transcription of GAL genes by RNA polymerase II is dependent on transcription activation by Gal4 binding to UASG elements and causing the formation of Mediator. Distant silencers use the same kind of DNA loop formation to regulate transcription of targeted genes.

A common mode by which repressor proteins inhibit transcription in bacteria is to bind to operator sequences that overlap promoters, blocking the binding of RNA polymerase (see Chapter 12). In eukaryotes, this mechanism of transcription inhibition is not seen. However, among the mechanisms by which eukaryotic repressors do inhibit transcription is the binding of eukaryotic repressors to silencer sequences, indirectly preventing enhancer-mediated transcription. The galactose-utilization genes in yeast offer an example of this mechanism of transcription repression. When glucose is present in the yeast growth

Figure 13.10 Regulation of GAL gene transcription. (a) When galactose is absent, Gal80 protein binds the activation domain of Gal4 to inactivate that protein and block GAL gene transcription. (b) When galactose is present, Gal3 protein binds Gal80 protein to prevent it from binding Gal4 protein. The activation domain of Gal4 protein is then available to initiate GAL gene transcription.
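The Gal4/Gal80/Gal3 switch described in this section can be summarized as a small truth function. The sketch below is only an illustration of the logic stated in the text: the function name and the three `*_functional` parameters are hypothetical, and the behavior assigned to loss-of-function mutants is an assumption inferred from the roles the text gives each protein, not a result reported here.

```python
def gal_genes_transcribed(galactose_present: bool,
                          gal4_functional: bool = True,
                          gal80_functional: bool = True,
                          gal3_functional: bool = True) -> bool:
    """Return True if GAL gene transcription is activated (simplified model)."""
    # Gal4 is the activator that binds UASG; without it nothing is activated.
    if not gal4_functional:
        return False
    # Galactose plus Gal3 binds Gal80 and makes it release Gal4.
    gal80_released = galactose_present and gal3_functional
    # Gal80 masks the Gal4 activation domain unless it has been released.
    # Assumption: a cell lacking functional Gal80 cannot mask Gal4 at all.
    gal4_masked = gal80_functional and not gal80_released
    return not gal4_masked


if __name__ == "__main__":
    for galactose in (False, True):
        state = "ON" if gal_genes_transcribed(galactose) else "OFF"
        print(f"galactose present={galactose}: GAL genes {state}")
```

The model deliberately ignores glucose repression and Mediator assembly; it captures only whether the Gal4 activation domain is free.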
484 CHAPTER 13 Regulation of Gene Expression in Eukaryotes
[Figure residue; caption not recovered. Labels: (a) enhancer, promoter, gene A; enhancer activity helps initiate transcription of gene A (transcription ON). Further labels: Mig1, Tup1 repression, UASG, GAL1, no transcription.]
and heterochromatic, and are referred to as facultative heterochromatin. These latter regions often contain genes that are active only at specific times or in certain tissues: genes involved in development or active in specific cell types. When DNA that is normally euchromatic is placed, through induced or accidental mutation, in the vicinity of heterochromatin, the heterochromatic character may spread into the normally euchromatic region, silencing gene expression, a phenomenon called position effect variegation (PEV; see Section 10.6). Analysis of mutations that affect the frequency or intensity of PEV in Drosophila provided the first insights into how euchromatic and heterochromatic states are established and maintained.

PEV Mutations

Genetic analysis of eukaryotic genomes reveals PEV to be a widespread phenomenon, suggesting that mechanisms controlling chromatin structure are important in the control of gene expression. In Drosophila, mutations modifying PEV have led to the identification of several genes and proteins that play a direct role in establishing and maintaining chromatin structures associated with gene expression and gene silencing. The starting point was a mutant line in which the eye color is variegated wild-type red and mutant white, due to an inversion placing the white gene in the vicinity of centromeric heterochromatin (see Figure 10.28). Mutations in which the variegation is either enhanced or suppressed were then identified. Mutations known as E(var) mutations, where E(var) is short for enhancers of position effect variegation, increase or enhance the appearance of the mutant white-eye phenotype by encouraging the spread of heterochromatin beyond its normal boundaries. (Note that the use of “enhancer” in this context refers to a genetic interaction, and is different from the concept of enhancers as regulatory sequences.) The effect of E(var) mutation is to produce a greater number of eye cells lacking pigment (Figure 13.13). In contrast, Su(var) mutations, where Su(var) is short for suppressors of position effect variegation, restrict the spread of heterochromatin or interfere with its formation. Su(var) mutations increase the extent of normally pigmented regions of the eye by suppressing the emergence of white patches.

Several dozen E(var) and Su(var) mutations are known in Drosophila, and they have proven especially valuable in the identification of genes and proteins that modulate chromatin structure. Genetic analysis of E(var) and Su(var) mutations supports the hypothesis that chromatin structure is dynamic and is associated with gene expression. In fact, chromatin structure appears to oscillate: Sometimes it is in a highly condensed state in which gene transcription is silenced (i.e., heterochromatic), and sometimes it is in a more loosely condensed state that allows transcription (i.e., euchromatic), but it can also exist in an intermediate state of condensation.

The analysis of one prominent group of Su(var) mutations exemplifies how the detection of defective proteins can elucidate normal functions. Some Su(var) mutations are loss of function of heterochromatin protein-1 (HP-1), a protein found in association with centromeres, telomeres, and other constitutively heterochromatic chromosome locations in Drosophila. Comparison of Su(var) mutants with wild types reveals that HP-1 is a nucleosome-binding protein that binds lysine amino acids in position 9 of histone H3 if they carry a methyl group. Methylation of lysine 9 of H3 is one of the most common epigenetic modifications of histones in constitutively heterochromatic regions. The absence of HP-1 interferes with heterochromatin formation and suppresses variegation.

A second group of Su(var) mutations affects genes encoding histone methyltransferases (HMTs), enzymes responsible for catalyzing the addition of methyl groups to amino acids of histone proteins. Histone methyltransferases appear to target methylation-specific basic amino acids (e.g., arginine and lysine) in nucleosomes, attaching methyl groups to these amino acids as part of epigenetic marking of histones. As noted above, the lysine residue in position 9 of histone protein H3 is a frequent target for methylation. Upon methylation, this location is described as H3K9me, which is short for histone 3, lysine (one-letter abbreviation K), position 9, and methylation. If HMTs are not functioning properly, epigenetic methylation is not established, and heterochromatin formation is inhibited.

The identification of the activities affected by these two groups of Su(var) mutations led to a simple model of HP-1 and HMT function predicting that specific histone locations in nucleosomes (e.g., H3K9) are methylated by HMTs and then act as sites of HP-1 binding that helps condense chromatin structure to silence gene expression (Figure 13.14). According to this model, the Su(var) mutants, defective in their silencing of w+, could carry an HMT gene mutation

Figure 13.13 E(var) and Su(var) mutations. Mutations in genes whose protein products participate in chromatin modification are detected by enhancement or suppression of position effect variegation. Variegated eye: red patches are produced by cells in which w+ is transcribed, and white patches by cells in which w+ is inactivated by heterochromatin spread. Su(var) mutations: mutations block efficient formation of heterochromatin and leave most cells with active w+ transcription. E(var) mutations: mutations enhance heterochromatin formation and restrict w+ expression to small patches.
The poly A/T tract contains enhancer sequences (ES) that attract transcription activators (ACT). This binding region is usually flanked by sequences that help position two nucleosomes, one upstream and one downstream, of the NDR. The downstream nucleosome, identified as the +1 nucleosome, is placed at the transcription start site. This +1 nucleosome contains a variant histone 2A protein known as H2A.Z that is readily modified for removal from the transcription start site at transcription initiation, allowing RNA polymerase II to bind and access the transcription start sequence.

Covered promoters, on the other hand, are characteristic of genes whose transcription is regulated, in either an inducible, a developmental, or a cell-type–specific manner. Transcription of these genes is blocked until nucleosomes are displaced or removed from the promoter to allow transcription activators to bind to the necessary sequences, an event that leads in turn to RNA polymerase II binding and transcription initiation (Figure 13.15b). These promoters generally contain TATA boxes. At covered promoters, there is active competition between nucleosomes and transcription-activating factors for binding. As a result, regulatory mechanisms are required that remodel chromatin to give activator proteins access to binding sequences to initiate transcription.

Figure 13.15 Transcription of open and covered promoters. (a) Open promoters have a nucleosome-depleted region (NDR) and no TATA box. Activator proteins (ACT) are attracted to enhancer sequences (ES) to recruit RNA polymerase II for transcription. (b) With covered promoters, transcription is activated by activator-protein binding and displacement of nucleosomes. Closed chromatin (1) is inaccessible to transcription factors and insensitive to DNase I digestion, whereas following activator binding and nucleosome displacement (2), the resulting open chromatin (3) binds transcription factors and is DNase I hypersensitive.

Mechanisms of Chromatin Remodeling

Chromatin remodeling refers to chromatin modifications that reposition nucleosomes in such a way as to open or close promoters and other regulatory sequences (e.g., enhancer modules). Moving nucleosomes off regulatory sequences improves the availability of those sequences to transcription-activating regulatory proteins. Open chromatin is chromatin in which the association of DNA with nucleosomes is relaxed in regions containing regulatory sequences, allowing access by regulatory proteins. Modifications that cause regulatory DNA to be covered by nucleosomes, thus restricting the access of regulatory proteins to the sequences, produce closed chromatin. In closed chromatin, regulatory sequences cannot be efficiently accessed by regulatory proteins, and genes are transcriptionally silent.

Molecular biologists can determine experimentally whether a region of DNA contains closed chromatin or open chromatin by assessing the sensitivity of the region to the DNA-digesting enzyme DNase I. This enzyme randomly cuts DNA in open chromatin regions but is not able to do so where chromatin is closed. Regions of open chromatin, sensitive to DNase I digestion, are known as DNase I hypersensitive sites. Where DNase I hypersensitivity is detected, genes are potentially transcribable. The experimental analysis of DNA for DNase I hypersensitivity is
much like DNA footprint protection analysis described in Research Technique 8.1 (pages 288–289). Fragments of DNA created by exposure to DNase I are separated and analyzed by gel electrophoresis.

DNase I hypersensitivity occurs in the immediate vicinity of transcribed genes and can also appear 1000 bp or more upstream or occasionally downstream of actively transcribed genes. Hypersensitive regions surround promoters, enhancers, and other transcription-regulating sequences. The open chromatin complexes detected by DNase I hypersensitivity are the sites for binding by transcription-activating proteins and for transcription (Figure 13.15b). Genetic Analysis 13.1 guides you through an analysis for the presence of DNase I hypersensitivity in a region of DNA.

Another, more direct technique for identifying where proteins are bound to DNA is a process called chromatin immunoprecipitation (ChIP). Transcription factors attached to chromatin and associated DNA are isolated from living cells by first chemically cross-linking the proteins and DNA together and then, using an antibody specific to a transcriptional regulatory protein of interest, causing the DNA–chromatin combination attached to that protein of interest to precipitate. Next, the DNA from the precipitated chromatin is released by reversing the cross-linking, after which the isolated DNA is amplified by PCR (Chapter 7) and sequenced. The sequences obtained will correspond to the DNA to which the transcriptional regulatory protein of interest was bound in the cells. This approach is not only applicable to specific activator or repressor proteins but also can be performed using antibodies targeting specific chromatin modifications described later in this chapter. ChIP can be targeted to determine whether a protein of interest is bound to a specific DNA locus or can be used to determine all the sites in the genome to which a particular protein is bound, a concept that we will return to in Chapter 16.

Chromatin remodelers are the protein complexes that carry out chromatin remodeling by moving nucleosomes in three principal ways (two are seen in Figure 13.16). One type of chromatin-remodeling enzyme changes nucleosome organization by either sliding nucleosomes along the chromosome or removing them from the DNA. These enzymes usually work by uncovering enhancers or promoters and thus are associated with gene activation. A second type of chromatin-remodeling enzyme reorganizes nucleosomes by inducing nucleosome repositioning to a different DNA region. These enzymes usually act to repress transcription. The third type of chromatin-remodeling enzyme changes the composition of histone octamers, replacing specific histone proteins with variant proteins. These changes are associated with gene activation.

A number of distinct chromatin remodelers are known. Three of the best-understood categories, classified by their main functions, are the SWI/SNF complex, which both slides and relocates nucleosomes; the ISWI complex, which helps direct the placement of nucleosomes; and the SWR1 complex, which substitutes the variant histone protein H2A.Z in nucleosomes in place of the more common H2A protein.

The SWI/SNF Complex
The SWI/SNF complex (pronounced “swee-sniff” or “swy-sniff”) was first described in yeast and is now known to operate in all eukaryotes. Its name comes from yeast mutants unable to switch (SWI) mating types and from sucrose-nonfermenting mutants (SNF). The composition of this complex varies somewhat among eukaryotic species, but in each species it functions to open chromatin structure by displacing or ejecting nucleosomes. These actions expose promoter and other regulatory sequences to allow binding of transcription factors or activators that help initiate transcription (Figure 13.17, part 1).

The ISWI Complex
Chromatin remodelers of the ISWI (imitation switch) complex primarily function to control the placement of nucleosomes into an arrangement that causes the region to be transcriptionally silent. These proteins have the ability to “measure” the length of linker DNA between bound nucleosomes in order to place the nucleosomes at regular intervals where they will cover promoters, thus preventing regulatory proteins from having access to the TATA box and other regulatory sequences (see Figure 13.17, part 2). There is some evidence that certain nucleosome modifications can block ISWI activity, by a process that could be related to the opening of promoter and chromatin structure.

The SWR1 Complex
The SWR1 complex (switch remodeling 1) is responsible for replacing the common histone 2A protein of nucleosomes with a variant form known as H2A.Z that differs from the more common form by amino acid differences internal to the protein and in the amino terminal (N-terminal) protein tail. The differences found in H2A.Z alter its pairing with other H2A proteins and its interactions with H3/H4 tetramers in the nucleosome. H2A.Z is found primarily at the so-called +1 nucleosome that is affiliated with the start of transcription. Functional analyses in several species suggest that the role of H2A.Z is in the creation of unstable nucleosomes that might then be displaced, ejected from DNA, or modified to regulate transcription (see Figure 13.17, part 3).

It is important to note that chromatin remodeling complexes do not bind to DNA on their own but are recruited to specific chromosomal locations by sequence-specific binding activator or repressor transcription factors. Recruitment of chromatin remodelers can lead to the transition of closed chromatin to open chromatin and vice versa (Figure 13.15).

Chemical Modifications of Chromatin

In contrast to chromatin remodelers that move nucleosomes, the proteins called chromatin modifiers chemically modify histone proteins in the nucleosomes by adding or removing
13.2 Chromatin Remodeling and Modification Regulates Eukaryotic Transcription 489
Figure 13.17 The actions of chromatin-remodeling complexes. (1) The SWI/SNF family opens chromatin structure and helps initiate transcription by either displacing nucleosomes away from regulatory sequences or ejecting nucleosomes. (2) ISWI assembles and organizes nucleosomes in a regular pattern and contributes to transcription repression. (3) SWR1 inserts the modified histone protein H2A.Z into nucleosomes to help facilitate displacement.
Q How does the ISWI complex work to prevent binding of regulatory proteins?
GENETIC ANALYSIS 13.1

PROBLEM The tissue enzyme TE2 is expressed in various mouse tissues at different times during the life cycle. Identical chromosome segments were isolated at different times in the cycle from a region immediately upstream of TE2 and analyzed for DNase I hypersensitivity. The chromosome segments were collected from embryonic (E) and adult (A) mouse heart, kidney, and thymus gland. In the analysis, a radioactive label was attached to one end of each chromosome fragment, and the samples from each tissue were exposed to DNase I to determine if the regions upstream of TE2 were DNase I hypersensitive. When the resulting fragments from each sample were separated by gel electrophoresis, the pattern shown at right was obtained.

BREAK IT DOWN: DNase I cuts in regions of open chromatin but not condensed chromatin (p. 487–488).

a. Based on the gel results, is there evidence that chromatin remodeling plays a role in the expression of TE2? Explain your reasoning.

BREAK IT DOWN: Chromatin remodeling is the process by which nucleosome position or identity is altered (p. 490).

b. In which tissue(s) and at what times during development do the results indicate the expression of TE2 was most likely taking place?

[Figure: TE2 upstream chromosome fragment with a radioactive label at one end, treated with DNase I. Electrophoresis gel with lanes Heart E, A; Thymus E, A; Kidney E, A; migration from − to +.]

Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
   This problem concerns an experimental analysis for DNase I hypersensitivity in the region upstream (i.e., the promoter region) of TE2. The answers require interpretation of experimental results with respect to chromatin structure and gene expression.
2. Identify the critical information given in the problem.
   Gel electrophoresis results are given for identical chromosome fragments from embryonic and adult heart, thymus, and kidney. All chromosome fragments were exposed to DNase I.
   TIP: DNase I hypersensitivity is detected when chromatin structure is open and potentially accessible to transcription-activating proteins. Closed chromatin is not hypersensitive to DNase I.

Deduce
3. Compare and contrast the meaning of the continuous series of bands in some lanes of the gel versus lanes in which gaps are seen between bands.
   A continuous series of DNase I–digested bands indicates DNase I hypersensitivity. Hypersensitivity correlates with open chromatin that is accessible to transcription. Gaps between gel bands indicate that certain regions of chromosomes are not fragmented by DNase I treatment. This result signals the absence of DNase I hypersensitivity in those regions and suggests closed chromatin structure and no transcription.
4. Evaluate the gel, and describe the patterns of DNase I–digestion bands for each sample.
   Discontinuous band patterns are observed in adult heart and embryonic thymus gland DNA. This absence of DNase I hypersensitivity suggests closed chromatin structure. Each of the other DNA samples indicates hypersensitivity to DNase I.

Solve
Answer a
5. Determine whether the gel data indicate chromatin modification near TE2.
   The DNase I hypersensitivity results indicate differential patterns of TE2 expression in different tissues and at different times of development due to chromatin modifications. DNase I hypersensitivity resulting from open chromatin appears in embryonic and adult kidney, in embryonic heart, and in adult thymus chromosomal material. Hypersensitivity is not seen in adult heart or in embryonic thymus chromosomal material, indicating closed chromatin.
Answer b
6. Name the tissues in which TE2 is expressed, and describe the developmental timing.
   TE2 expression is likely to occur at embryonic and adult stages in the kidney, in the embryonic heart, and in the adult thymus gland. TE2 expression is unlikely to occur in adult heart or in embryonic thymus gland.

For more practice, see Problem 20.
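The reasoning in steps 3 through 6 amounts to a lookup from band pattern to chromatin state to likely expression. The sketch below encodes that chain for the six lanes. The `BAND_PATTERNS` table restates the patterns reported in step 4 (it is not read from a gel image), and the names `interpret_lane` and `likely_expression` are hypothetical.

```python
# Band pattern per lane, as described in step 4 of the solution:
# "continuous" = unbroken ladder of DNase I fragments, "gapped" = bands missing.
BAND_PATTERNS = {
    ("heart", "embryonic"): "continuous",
    ("heart", "adult"): "gapped",
    ("thymus", "embryonic"): "gapped",
    ("thymus", "adult"): "continuous",
    ("kidney", "embryonic"): "continuous",
    ("kidney", "adult"): "continuous",
}


def interpret_lane(pattern: str) -> dict:
    """Map a band pattern to hypersensitivity, chromatin state, and expression."""
    if pattern == "continuous":
        # Continuous bands: DNase I cut throughout, so chromatin is open.
        return {"hypersensitive": True, "chromatin": "open",
                "expression_likely": True}
    if pattern == "gapped":
        # Gaps: some regions were protected from DNase I, so chromatin is closed.
        return {"hypersensitive": False, "chromatin": "closed",
                "expression_likely": False}
    raise ValueError(f"unknown band pattern: {pattern!r}")


def likely_expression(patterns: dict) -> list:
    """Return the (tissue, stage) lanes in which TE2 expression is likely."""
    return sorted(lane for lane, pattern in patterns.items()
                  if interpret_lane(pattern)["expression_likely"])


if __name__ == "__main__":
    for tissue, stage in likely_expression(BAND_PATTERNS):
        print(f"TE2 expression likely in {stage} {tissue}")
```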
Euchromatin: active gene expression; constitutive and inducible genes and active developmental/cell-type-specific genes; H3K9ac and H3K4me.
Facultative heterochromatin: repressed gene expression; developmental/cell-type-specific genes; H3K27me3.
Constitutive heterochromatin: repressed gene expression; repetitive elements; H3K9me3.
Figure 13.18 Chromatin states, gene content, and characteristic histone modifications.
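The mark-to-state associations in Figure 13.18 can be read as a lookup table. The sketch below is a deliberately crude illustration, assuming each listed mark points to exactly one chromatin state. The names `STATE_BY_MARK` and `classify_chromatin` are hypothetical, and the "mixed" result is a simplification for regions carrying both active and repressive marks.

```python
# Characteristic histone marks per chromatin state, as listed in Figure 13.18.
STATE_BY_MARK = {
    "H3K9ac": "euchromatin",
    "H3K4me": "euchromatin",
    "H3K27me3": "facultative heterochromatin",
    "H3K9me3": "constitutive heterochromatin",
}


def classify_chromatin(marks) -> str:
    """Classify a region from the set of histone marks observed on it."""
    states = {STATE_BY_MARK[mark] for mark in marks if mark in STATE_BY_MARK}
    if not states:
        return "unclassified"          # no characteristic mark observed
    if len(states) == 1:
        return next(iter(states))      # all marks agree on one state
    return "mixed"                     # active and repressive marks together


if __name__ == "__main__":
    print(classify_chromatin({"H3K9ac", "H3K4me"}))
    print(classify_chromatin({"H3K27me3"}))
    print(classify_chromatin({"H3K4me", "H3K27me3"}))
```

Real chromatin-state calls integrate many marks quantitatively; this table only mirrors the three categories named in the figure.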
adherence to negatively charged DNA. Acetylation neutralizes the positive charge and relaxes the tight hold the nucleosomes have on DNA. Thus, acetylation of K9 of histone 3, designated H3K9ac, is associated with an opening of the chromatin and active transcription (Figure 13.20). HATs are recruited to the chromatin by activator proteins (1), leading to the formation of euchromatin and active transcription (2). Conversely, HDACs are recruited by repressors (3), resulting in the formation of transcriptionally inactive heterochromatin (4).

The addition of methyl groups is accomplished by chromatin-modifying histone methyltransferases (HMTs), which act as writers. Again, lysine is the frequent target for methylation, and residues can be mono- (me), di- (me2), or tri-methylated (me3). Depending upon the K residue, methylation can play a role in converting open euchromatin to closed heterochromatin in conjunction with deacetylation; H3K9 is the residue methylated in the case of constitutive heterochromatin

[Figure residue; caption not recovered. Panel (a) labels: chromatin readers, chromatin writers, chromatin erasers, methyl (Me) marks; writers and erasers are recruited to chromatin by trans-acting transcription factors. Panel (b) labels: euchromatin, heterochromatin, H3K4, H3K27, deacetylation, demethylation, methylation.]
In summary, the chromatin state can be reversibly converted between euchromatin (active) and heterochromatin (inactive) through the combined action of transcription factors and chromatin modifiers. Multiple chemical modifications of N-terminal amino acids are required to convert from a closed to an open structure and vice versa. No single acetylation or methylation event changes chromatin structure; rather the change is accomplished through a coordinated set of events localized to a gene or regions of a gene.

Thus, the alternation of facultative heterochromatin between an open euchromatic state and a closed heterochromatic state is driven by an interplay of chromatin-modifying enzymes recruited by activator or repressor proteins. In many eukaryotes, this interplay between the opposing activities of writers and erasers involves a protein complex called the Polycomb group (PcG) acting in gene repression and another protein complex called Trithorax (Trx) acting to maintain gene expression. PcG and Trx complexes are recruited to specific loci by repressors and activators, respectively. The PcG complex acts to maintain a chromatin state that is marked with H3K27me3 and not acetylated; that is, it has an H3K27 HMT and an HDAC. In contrast, the Trx complex has a HAT and an H3K27 HDMT. (We will explore how these complexes work in an example below.)

Perhaps you recall from the original description of PEV in Section 10.6 that the white gene was relocated next to centromeric constitutive heterochromatin. In contrast to facultative heterochromatin, this type of heterochromatin is characterized by H3K9me3. We will return to the question of how constitutive heterochromatin is maintained later in this chapter.

At this point, you might be wondering: If chromatin is in an inaccessible heterochromatic state, how do factors bind to its DNA to initiate the transition to euchromatin? The transition can occur through the activity of a special class of transcription factors called pioneer factors, which can access and bind DNA even in heterochromatin (Figure 13.21). Pioneer factors may be a single protein. In other cases, a combination of factors that on their own are not pioneer factors can sometimes form a pioneer complex. One role of pioneer factors is to open up heterochromatin by first binding to DNA and then recruiting chromatin modifier and remodeling complexes. Another role is to bind to DNA to prepare the chromatin in such a way that a gene can be rapidly induced when additional transcription factors become available.

Finally, although it is convenient for the sake of discussion to divide chromatin states into active euchromatin and inactive heterochromatin, many genes do not fit neatly into those categories but instead are found along a continuum. Genes expressed in developmental and cell-type–specific patterns are tightly regulated and reside at the ends of the spectrum, whereas constitutive genes are always euchromatic. Other genes may carry both active and inactive chromatin marks that keep them poised to be expressed, allowing for rapid changes in gene expression.

Figure 13.20 Acetylation and deacetylation in open and closed chromatin structure. Histone deacetylases (HDACs) deacetylate amino acids in N-terminal histone protein tails and close the chromatin structure. Histone acetyltransferases (HATs) acetylate N-terminal amino acids and help open the chromatin structure to activate transcription.
Q How are the HAT and HDAC complexes directed to specific chromosomal loci?

An Example of Inducible Transcriptional Regulation in S. cerevisiae

To illustrate the role of chromatin modifications in the

[Figure 13.22 panel labels: (a) High phosphate results in transcription repression. (b) Low phosphate results in transcription activation. Other labels: NuA4, Pho2, Pho4, PHO5, UASp1, UASp2, TATA, AC, nucleosome numbers –5 to +1.]
of the nucleosomes -1 to -4 blocks access of activator protein and transcription factors to PHO5 regulatory sequences.

Transcription of PHO5 occurs when phosphate level falls. The activator, Pho4, is translocated to different locations within the cell depending on the level of phosphate: Under high-phosphate conditions Pho4 is phosphorylated and exported from the nucleus, whereas under low-phosphate conditions Pho4 is unphosphorylated and imported into the nucleus. Under low-phosphate conditions, the nuclear-localized Pho4 protein binds to Pho2, forming a protein complex that begins transcription activation (Figure 13.22b). Additional acetylation of the -1 to -4 nucleosomes takes place under the direction of NuA4. The Pho4–Pho2 complex then initiates chromatin modification by displacing nucleosome -2, making UASp2 available for binding by the Pho4 protein. The SWI/SNF protein complex assembles, and additional chromatin modification displaces nucleosomes -1 (that previously covered the TATA box), -3, and -4. With chromatin opened by nucleosome displacement, general transcription factor proteins and RNA polymerase II are able to bind the promoter and initiate transcription of the PHO5 gene.

Facultative Heterochromatin and Developmental Genes

For an example of developmental regulation of facultative heterochromatin we turn to Drosophila. As mentioned previously, facultative heterochromatin can be converted to euchromatin and vice versa via the activities of large protein complexes known as Trithorax and Polycomb. Components of the complexes are encoded by genes known, respectively, as the Trithorax group (TrxG) genes and the Polycomb group (PcG) genes. Both the TrxG and PcG protein complexes are recruited to specific DNA sequences by sequence-specific DNA-binding factors (activators and repressors), and each complex possesses a distinct type of histone-3-methyltransferase activity in which the activity of the TrxG complex is opposite to the activity of the PcG complex. The PcG complexes repress target gene expression by recruiting histone-modifying protein complexes capable of histone deacetylation. In contrast, TrxG complexes recruit protein complexes that acetylate histone, leading to maintenance of active gene expression (Figure 13.23a). These two types of modification, we have seen, are associated with transcriptionally inactive heterochromatin and transcriptionally active euchromatin, respectively. As with chromatin remodelers, TrxG and PcG complexes are recruited to the cis-acting regulatory sequences of Hox genes by activators and repressors, respectively, to “lock” the chromatin into a particular form, allowing maintenance of either active or silent states of gene expression. Hox genes will be described in detail in Chapter 18, but for the present discussion, it is sufficient to understand that these genes are involved in patterning the anterior-posterior axis of animal embryos.

For example, during Drosophila embryogenesis, the Ubx gene is initially activated in specific cells toward the posterior end of the embryo (Figure 13.23b). In wild-type embryos this pattern is maintained throughout embryogenesis. However, in PcG mutants, the Ubx gene later on becomes activated in cells of the anterior part of the embryo, where normally it is not expressed. This indicates that expression of the PcG complex is required for the continued normal repression of the gene in those cells later on during embryogenesis. Conversely, in TrxG mutants, Ubx gene expression fails to be maintained in the posterior region, where it is normally expressed in wild-type embryos. It is thought that the initial posterior activators and anterior repressors regulating Ubx expression recruit TrxG and PcG complexes, respectively, to maintain expression of Ubx through later stages of embryogenesis, even after the initial regulatory transcription factors are no longer present.

How does this occur mechanistically? Once the chromatin has been demarcated as heterochromatin, the H3K27me3 reader within the PcG complex can recognize the mark in heterochromatin, and the H3K27 methylase of the complex can write the mark on nearby octamers. The euchromatic state can be maintained by the TrxG complex in a similar manner. In this way, these proteins provide a type of epigenetic cellular memory that is propagated through cell divisions occurring long after the initial activators of Hox gene expression patterns have disappeared. We will revisit the role of these complexes in the development of a multicellular organism in Chapter 18.

Epigenetic Heritability

Activating the transcription of an individual gene requires a confluence of regulatory proteins that remodel or modify chromatin to provide enhancer and promoter access to transcription factors that initiate and carry out transcript synthesis. Mechanisms controlling differential chromatin-state formation and maintenance produce patterns of gene expression in different types of cells that are required for the growth and development of complex organisms. In a broad sense, these regulatory processes are the reason a single fertilized egg can develop and produce many distinct types of cells (liver cells, muscle cells, brain cells, and so on).

Among the trillions of somatic cells in your body are scores of different cell types, and yet all these cells contain the same genetic information. The differences of morphology and function between cell types are genetically controlled, as evidenced by the fact that daughter cells have the same structures and functions as parental cells, but DNA sequence variability is not the reason for those differences. Instead, the differences between somatic cells
13.2 Chromatin Remodeling and Modification Regulates Eukaryotic Transcription 495
[Figure 13.23 labels. (a) Transcription activated: activator, RNA pol II, H3K4me3, acetyl (AC) marks, mRNA 5′. No transcription: H3K27me3, acetyl (AC) marks. (b) Early embryo and later embryo: wild type, Polycomb group (PcG) mutant, Trithorax (TrxG) mutant.]
Figure 13.23 Antagonistic activities of PcG and TrxG complexes in facultative heterochromatin. (a) Activator proteins recruit the TrxG
complex to the chromatin, resulting in erasing of repressive histone marks and writing of positive histone marks. HAT complexes are also
often recruited to add positive acetyl marks. Conversely, repressor proteins recruit the PcG complex to the chromatin, resulting in erasing of
positive histone marks and writing of repressive histone marks. HDAC complexes are also often recruited to erase positive acetyl marks.
(b) Ubx expression (blue) is activated and maintained posteriorly in wild-type Drosophila, but its repression is lost in PcG mutants and its
maintenance is lost in TrxG mutants.
are epigenetic, resulting from the distinct chromatin states affecting gene transcription in specific types of cells.
Epigenetic patterns are often heritable through mitosis from one generation of cells to the next, causing daughter cells to have the same patterns of gene expression as their parent and sibling cells (a cellular memory). Some epigenetic changes occur in the course of normal growth and development, in some cases resulting from different physiological conditions. These changes are potentially reversible and variable during the life cycle of an organism; the
496 CHAPTER 13 Regulation of Gene Expression in Eukaryotes
transcription of certain genes may be turned on and later off again, or vice versa.
An example we encountered earlier of mitotically heritable variation of gene expression with an epigenetic basis is position effect variegation (PEV) in Drosophila, which results from the movement of the transcriptionally active w+ allele into the centromeric region of the fruit-fly X chromosome (see Figure 10.28). The DNA sequence of the gene is not altered. Instead, the spread of heterochromatin closes chromatin structure and blocks gene transcription by an epigenetic mechanism. The repressed transcriptional state is then maintained in daughter cells through mitotic division. The result is patches of cells descended from original progenitor cells that share the same pattern of inactivation of w+ expression. These cells form regions of white in the eye of the fly.
How is epigenetic control maintained in cells? For cellular memory to be maintained, any acetyl and methyl groups that are present on histones before DNA replication must be maintained or established on both the old and new histones after DNA replication. The specific molecular mechanics of this process are not entirely clear, but the partial disassembly and subsequent reassembly of nucleosomes is an essential component. Recall that chromatin structure is broken down as the replication fork passes (see Figure 10.27). Nucleosomes are separated from the parental DNA strands so the latter can serve as templates for the synthesis of daughter strands. The nucleosomes partially break apart, and old nucleosome segments along with newly synthesized nucleosome segments are reassembled on both new duplexes.
Immediately after DNA replication, the newly formed nucleosomes carry only part of their previous epigenetic information. The original epigenetic state must be quickly reestablished by epigenetic marking of the newly synthesized histones. Old histones are able to modify new histones to have the same pattern of epigenetic marks through the activities of the readers and writers of PcG and TrxG complexes. This process takes place among adjacent nucleosomes, thus preserving local epigenetic control of gene transcription. The interaction must also occur over long distances so as to maintain higher-order chromatin structure, such as that characterizing inactivated X chromosomes (see below).
In contrast to the formation and differentiation of specialized tissues and cells in the body, the formation of germ-line cells (cells that give rise to the next generation) must clear the replicating chromatin of the majority of accumulated epigenetic marks. Thus, most epigenetic marks added during the lifetime of an organism are erased during meiosis, resetting the epigenetic landscape for the next generation. However, there is evidence that some epigenetic differences can be heritable through meiosis, passing from one generation of the organism to the next, a topic we will explore in the Case Study.
lncRNAs and Inactivation of Eutherian Mammalian Female X Chromosomes
It is becoming increasingly apparent that a class of RNA molecules in eukaryotic cells called long noncoding RNAs (lncRNAs) plays critical roles in gene regulation. As their name implies, they are long RNAs without substantial open reading frames. A study of lncRNAs expressed in embryonic stem cells in mice suggests that many lncRNAs may act as scaffolds linking chromatin regulatory proteins to affect gene expression. Given that the genomes of mammals encode a large number of lncRNAs, this may be a critical mechanism of gene regulation in the mammalian lineage. The best-known example of a lncRNA regulating gene expression is Xist, which is involved in X chromosome inactivation in eutherian female mammals.
X-inactivation, as we discussed in Section 3.6, is the dosage compensation mechanism by which eutherian mammalian females achieve the correct balance of X-linked gene expression. Mammalian females undergo random X inactivation in each nucleus early in gestational development, the precise timing being species specific. Recall that random X inactivation leaves one active X chromosome that is largely euchromatic and one inactive X chromosome that is almost entirely heterochromatic in each nucleus. The heterochromatic X chromosome is almost completely silent with respect to gene expression. This highly heterochromatic X chromosome forms a Barr body in the nucleus. All cells descending from the ones that originally underwent random X inactivation maintain the same active (euchromatic) and inactive (heterochromatic) X chromosomes, leading to the mosaic pattern of cells characteristic of eutherian mammalian females (see Figure 3.26).
Extensive studies of X inactivation in mice and humans have detected about a dozen genes on the heterochromatic (inactive) X chromosome that escape silencing. One of these genes is critically important to the establishment and maintenance of X-inactivation. The gene, called X-inactivation-specific transcript (Xist), is active on the heterochromatic X chromosome and is inactive on the euchromatic chromosome. It is located in the X-inactivation center, or XIC, of the X chromosome (Figure 13.24). The Xist gene is transcribed only on the heterochromatic chromosome, where it is active; it is not transcribed on the euchromatic X chromosome, where it is inactive. The gene transcript is a specialized RNA transcript called Xist RNA that never leaves the nucleus and is never translated. Instead, Xist RNA exclusively coats the X chromosome that produces it.
One idea of how the modification is accomplished is that the Xist RNA may act as a molecular bridge between the inactive chromatin and the repressive chromatin-modifying complexes such as PcG, whose associated HMTs and HDACs methylate (H3K27me3) and deacetylate histones, respectively. These epigenetic modifications are
activator proteins bind the enhancer sequence and direct transcription of H19 by interacting with transcription factors and RNA polymerase II at the promoter. The ICR in the maternal chromosome is bound by an insulator protein that blocks the enhancer from affecting IGF2. On the paternal chromosome, on the other hand, extensive methylation of the ICR and H19 prevents insulator protein binding and blocks transcriptional protein binding at the H19 promoter. In the absence of the insulator protein, the enhancer stimulates transcription of IGF2.
Genomic imprinting silences expression of paternal H19 and maternal IGF2 and directs transcription of paternal IGF2 and maternal H19 in all somatic cells. This pattern is essential for normal development, and any other pattern produces profound abnormalities. A genetic condition called Beckwith–Wiedemann syndrome, characterized by an overgrowth of tissues, results if both the maternally and paternally inherited chromosomes display the expression patterns normally associated with the paternally inherited locus. Conversely, if both inherited chromosomes display the typical maternal expression pattern, a genetic condition called Russell–Silver syndrome, characterized by underweight infants that fail to grow appropriately, results.
Why might these genes be imprinted in mammals? The reason may be related to the reproductive biology involving placentation, whereby the female bears the physiological burden for nurturing the young. IGF2 encodes a growth factor promoting development; its expression promotes growth of the embryo. One hypothesis is that male mammals profit from promoting maximum growth in all their offspring, whereas female mammals profit more from balancing the growth of multiple offspring over the mother’s lifetime. Thus, the active IGF2 allele inherited from the father promotes embryo growth, while the female’s inactive allele counterbalances the excess activity provided by the male. The evolution of imprinting in both mammals and flowering plants is likely due to their both being placental organisms, with different selective pressures for the male and female parents.
Given the importance of imprinting for certain genes and considering the different imprinting patterns of gene expression in maternally derived versus paternally derived chromosomes, how does the inheritance of correctly imprinted chromosomes occur? The answer in the case of H19 and IGF2 is that in primordial germ-line cells, the inherited imprinting patterns are first erased and then are reestablished in the sex-specific pattern of the germ line early in gametogenesis. In the female germ line, methylation of the paternal chromosome is reversed by demethylase activity, and the insulator protein is removed from the ICR on the maternal chromosome. Both chromosomes are then re-imprinted with the female-specific pattern. In the male germ line, both chromosomes have their imprinting erased and then reestablished in the male-specific pattern. These processes ensure that each parent passes a properly imprinted chromosome during reproduction.
Nucleotide Methylation
The methylation pattern identified in genomic imprinting of the ICR and H19 gene is a type of methylation that is associated with repression of gene expression in many plants and vertebrates, particularly mammals, and that differs from the methylation of amino acids in N-terminal histone protein tails. In this case, methyl (CH3) groups are attached to specific DNA nucleotides, not to amino acids in histone protein tails. Nucleotide methylation is performed by specialized DNA methyltransferases that add methyl groups primarily to cytosines located in CpG dinucleotides, side-by-side cytosine and guanine nucleotides in the same DNA strand. The p in CpG represents the single phosphoryl group in the phosphodiester bond connecting the nucleotides. Complementary strands of DNA containing CpG dinucleotides each have 5’-CG-3’. In plants, other C nucleotides may be methylated, for example the ones in 5’-CNG-3’ and 5’-CNN-3’ configurations.
Much of the cytosine-methylated DNA in eukaryotic genomes is in transposable element sequences and noncoding sequences and is associated with a transcriptionally silent chromatin state. Just as with chromatin-remodeling enzymes, the DNA methyltransferases are recruited to specific loci by transcription factors when DNA methylation is being established. Also paralleling nucleosome modification, the pattern of cytosine-methylated sites is usually mitotically stable but can be reset during meiosis. A simple modification of Sanger sequencing in which the DNA is first treated with bisulfite, which converts cytosine to uracil but leaves methylcytosine untouched, allows the direct determination of the methylation status of DNA.
Recall from Section 11.2 that deamination of a methylated cytosine creates a thymine, which generates a mismatch that is repaired either to a C-G or a T-A base pair at approximately equal frequencies. Thus, in organisms with a significant amount of cytosine methylation, such as in vertebrates, where most of the cytosines in CpG dinucleotides are methylated, over time the number of CpG dinucleotides is reduced. In these species, sequences rich in CpG, called CpG islands, are regions of the genome in which there is strong selection for maintenance of cytosines, reflecting a functional role for such regions. As a result, CpG islands can be used to identify potentially functional genomic regions such as gene regulatory sequences.
13.3 RNA-Mediated Mechanisms Control Gene Expression
In the past several years, RNA has emerged as a key component in the regulatory control of eukaryotic gene expression. Largely unknown before the mid-1990s, RNA-mediated regulatory mechanisms have rapidly become a major focus of research in plants and animals. This important area of
inquiry emerged unexpectedly from experiments designed to produce a more colorful petunia.
In the early 1990s, Richard Jorgensen and his colleagues were attempting to deepen the color of petunias by introducing into the petunia genome a pigment-producing gene under the control of an active promoter. The researchers hoped that active transcription of this recombinant gene would dramatically deepen flower color. To Jorgensen’s surprise, however, rather than exhibiting more intense color overall, many of the resulting flowers were variegated (see the chapter opener photo). Some flowers had stripes of deep pigment and stripes lacking pigment, and some flowers were almost entirely white. The researchers called this phenomenon cosuppression because expression of both the introduced pigment gene and the petunia’s natural pigment-producing gene was suppressed.
By 1995, similar gene-silencing phenomena had been documented in numerous plant species, in the fungus Neurospora crassa, in the nematode worm Caenorhabditis elegans, and in the fruit fly Drosophila. The fundamental mechanism behind this form of regulation was identified in 1998 by a research team led by Andrew Fire and Craig Mello. Fire and Mello found that double-stranded RNA (dsRNA) molecules were taking part in a posttranscriptional regulatory mechanism now known universally as RNA interference (RNAi). Fire and Mello received the Nobel Prize in Physiology or Medicine in 2006 for their work.
Gene Silencing by Double-Stranded RNA
RNA interference silences gene expression either by blocking transcription of targeted genes or by blocking gene expression posttranscriptionally. Posttranscriptional silencing occurs following binding of small regulatory RNAs to mRNA targets by complementary base pairing. The binding of these regulatory RNAs either can lead to the destruction of the target mRNAs or can block their translation. Alternatively, some regulatory RNAs enter the nucleus, where they bind DNA to block transcription of targeted genes. Any of these regulatory processes first require that small regulatory RNA molecules use complementary base pairing to bind their targets.
The regulatory RNAs in RNAi are derived from various sources that produce double-stranded RNAs. An enzyme known as Dicer (Figure 13.26) cuts the double-stranded RNA into 21- to 25-bp fragments. These fragments are then bound by a protein complex called the RNA-induced silencing complex (RISC) that denatures the double-stranded RNAs into single strands of 21 to 25 nucleotides. The RNA single strands produced by RISC are identified as the guide strand, which is biologically active, and the passenger strand, which is usually degraded. The guide strand remains bound to RISC, and the complex directs one of three gene-silencing processes (numbers 1 through 3 in the figure): (1) the complex uses complementary base pairing to attach the guide strand to mRNA, and the mRNA is destroyed; (2) the RISC–guide RNA binds to complementary mRNAs and blocks their translation; or (3) the complex directs chromatin-modifying enzymes to the nucleus, where they silence transcription of selected genes.
What is the origin of the dsRNA? It can be produced from endogenous genes or from the transcription of other endogenous nongene sequences (e.g., transposons), or it can come from exogenous sources. In many eukaryotes, genes encode precursors of dsRNA (4) that are processed into 21- to 24-nucleotide microRNAs (miRNAs) at a Dicer complex. Most genes encoding miRNAs are transcribed by RNA polymerase II, and the resulting transcript folds back on itself into a dsRNA. The targets of miRNAs are endogenous mRNAs that are then either cleaved or have their translation blocked subsequent to activity mediated through RISC.
Another type of dsRNA is small interfering RNA (siRNA). In contrast to miRNAs, siRNAs are usually not derived from genes but rather come from exogenous sources or from other endogenous transcription. For example, if both strands of a genomic region happen to be transcribed, dsRNA can form. Transcription from opposite strands of repetitive elements, such as transposons, can also lead to dsRNA production (5). In the latter case, the two strands do not have to be derived from the same genomic location. Some eukaryotes possess RNA-dependent RNA polymerases, which can produce dsRNA using single-stranded RNA as a template. The endogenous sources of dsRNAs can direct either posttranscriptional silencing, through the destruction of target mRNAs or inhibition of their translation, or transcriptional silencing of target genes that takes place by chromatin-modifying processes. Finally, exogenous sources of dsRNA can include RNA viruses (6) that trigger virus-induced gene silencing.
Cleaving dsRNA The general mechanism of action by which Dicer cleaves dsRNA into fragments of the proper size involves the enzyme’s dsRNA-binding site (called PAZ) and its two RNase domains, separated from the PAZ site by a distance corresponding to the length of the resulting dsRNA fragments. Dicer repeats the cleaving action, each time behaving as a molecular ruler measuring off precisely sized dsRNAs. The spacing between the PAZ site and RNase domains varies among species and appears to correlate with species-specific differences in the lengths of siRNAs produced by subsequent RISC processing of dsRNAs.
Precursor transcripts of miRNAs and siRNAs are synthesized in the nucleus of a cell and are processed into miRNAs and siRNAs by Dicer activity. In the case of miRNAs, the precursor transcript is called a primary microRNA (pri-miRNA). The pri-miRNA folds to form a double-stranded stem typically containing 65 to 70 nucleotides and having free ends on one side and a single-stranded loop on the other side (Figure 13.27). In animals, the Drosha enzyme complex cuts pri-miRNA near the middle of the stem and produces
[Figure 13.26 labels. In the nucleus: (4) transcription of microRNA genes produces pri-miRNA; (5) bidirectional transcription (e.g., of repetitive DNA sequences) produces dsRNA and pre-siRNA. The siRNA or miRNA binds to RISC; RISC denatures the RNA into a guide strand and a passenger strand (degraded). The guide strand binds to mRNA by complementary base pairing, and (1) the mRNA is destroyed or (2) translation is blocked; (3) transcriptional silencing of targeted genes occurs on DNA in the nucleus.]
two segments, one of which, now called precursor microRNA (pre-miRNA), contains the remainder of the upper stem, which is approximately 21 to 25 bp, and the terminal loop (1). The pre-miRNA is transported to the cytoplasm, where Dicer removes the terminal loop, leaving dsRNA of approximately 21 to 25 bp (2). RISC then binds the dsRNA and separates the strands to create miRNAs (3). In contrast to animals, plants use a single Dicer enzyme to perform all the miRNA processing activities. The creation of siRNA is similar.
RISC and Argonaute The newly produced siRNA or miRNA remains bound by RISC to act as a guide strand. Within the RISC is a protein of the Argonaute gene family that plays a central role in how the RISC–guide strand silences gene expression. Many species encode multiple Argonaute proteins (humans encode eight, for example), and each seems to direct a somewhat different activity by RISC–guide strand.
The best-understood mechanism of gene silencing by RISC–guide strand involves complementary binding of the guide strand to a target mRNA. If the percentage of base-pair complementation is high enough, this binding forms a structure that allows an RNase domain of Argonaute to cut the targeted mRNA strand near the middle of the guide strand–mRNA duplex, thus causing cleavage of the mRNA. When the guide strand–mRNA base pairing is less well
[Figure 13.27 labels. Pri-miRNA with 5′ and 3′ ends: lower stem (~11 bp), upper stem (~22 bp), terminal loop; cleavage by Drosha; cleavage by Dicer; (1) lower stem cleavage by Drosha.]
[Figure 13.28 labels. Centromere repeat with acetyl (AC) marks; (1) transcription of centromeric DNA produces …]
methylation of H3K9 and does not have gene silencing around the centromere. The explanation for these additional deficiencies is that in S. pombe, both strands of the centromeric repeat sequences are transcribed by RNA polymerase II (Figure 13.28 (1)). The resulting mRNAs are complementary
and form double-stranded RNAs that Dicer cuts (2). The siRNA fragments produced by this process are then separated into single strands that bind to Argonaute, which then joins a protein known as Chp1 and other proteins to form a RISC-like complex called the RNA-induced transcriptional silencing (RITS) complex (3) that carries the siRNA into the nucleus. The siRNA–RITS complex is attracted to the centromere, where the siRNA appears to use complementary base pairing to form a duplex with nascent transcripts of the centromeric repeat sequences (4). This pairing attracts other proteins that promote the deacetylation of histones and the methylation of H3K9 to close the chromatin structure and spread constitutive heterochromatin outward from the centromere (5).
The Evolution and Applications of RNAi
RNAi is widespread in eukaryotes, and the mechanism of transcriptional silencing in S. pombe is thought to be related to RNAi-mediated transcriptional silencing in other eukaryotic species. But how did RNAi evolve? The answer is still under investigation, but the operating hypothesis is that RNAi evolved by helping organisms protect their genomes against the mutational effects of transposable genetic elements (described in Section 11.7).
Transposable elements are diverse and make up large percentages of the genomes of complex eukaryotes. For example, almost half the human genome is composed of transposable elements. In the human genome and in other eukaryotic genomes, most of these transposons are located in heterochromatin and are silent; however, researchers have discovered that mutations in the RNAi machinery of an organism can reactivate normally quiescent transposons by reversing transcriptional silencing. This can lead to the movement of some transposable elements around the genome and potentially to the production of new mutations. The evidence suggests that RNAi plays a role in silencing the transcription of transposons.
RNAi also plays a protective role in response to viral infection. In plants, the infection of one leaf by a virus can generate an RNAi response that blocks viral replication and prevents the infection from spreading throughout the plant. In support of this observation, plants with Dicer or Argonaute mutations are much more susceptible to the spread of viral infections than are plants without Dicer or Argonaute mutations. These findings are consistent with the idea that RNAi evolved as a genome-protection mechanism against transposable genetic elements and viral infection.
Both plant and animal genomes encode miRNAs, but the mode of action of miRNAs differs slightly between the two taxa. In plants, miRNAs display near-complete sequence complementarity with their mRNA targets and usually cleave the target rather than block translation. In contrast, miRNAs in animals are usually only complementary to their targets at one end of the miRNA and usually repress translation rather than cleave the target. These differences suggest that miRNAs may have evolved independently in the two lineages, from an RNAi-like mechanistic precursor.
RNAi is also a powerful research tool that can be used in a multitude of ways. One frequent application of RNAi in research is the use of siRNAs to “knock down,” or obstruct, the expression of selected genes. Researchers can then examine how phenotype is altered in the absence of the obstructed genes and in this way discover the genes’ usual effects. We discuss other experimental applications of RNAi in Section 14.3.
CASE STUDY
Environmental Epigenetics
Here’s a seemingly simple question: How are traits passed from one generation to the next? The first answer that came to your mind was probably (and not incorrectly) that traits are passed by the transmission of genes from parents to offspring. But over the past decade or so, the answer to that question has expanded in an unexpected direction. Emerging evidence suggests that in certain cases, an organism’s nutrition and diet may lead to epigenetically controlled modifications of gene expression and that in a few select instances, the affected genes can be transmitted to the organism’s offspring in their epigenetically modified form. More surprisingly, the data also indicate that the epigenetically modified state of the genes may persist in later generations. In other words, it may be possible for the nutritional experience of grandparents to affect gene expression in their grandchildren, an idea reminiscent of the theories of Lamarck, who proposed the inheritance of traits acquired within a lifetime!
HONEYBEE DESTINY Three lines of evidence suggest a role for nutrition and dietary history in the epigenetic modification of gene expression. The first comes from studies in honeybees, where it has been shown that genetically identical larvae can develop into either fertile queens or sterile worker bees following differential feeding with royal jelly, the compound fed to larvae that become queens. Experimental analysis led by Ryszard Maleszka in 2008 reveals that silencing the expression of the DNA methyltransferase Dnmt3 by knocking down translation of the Dnmt3 transcript by RNA interference leads to the development of fertile queens. In other words, blocking a major DNA methylation pathway led to the expression of genes that are typically expressed only when a larva is fed royal jelly. The implication is that methylation is an important epigenetic mechanism for repressing gene expression and directing the development of worker bees. Methylation and the resulting transcriptional repression are subverted by feeding royal jelly to produce the development of fertile queen bees.
Summary 503
EVIDENCE IN MICE The second line of evidence comes from multiple studies of the connection between environmentally generated methylation of genes and variation in gene expression in rats and mice. In one study, genetically identical mice carry a modified agouti gene that produces yellow coat color and extreme obesity when the gene is expressed, whereas the normal brown coat color and normal body weight are produced if the modified gene is not expressed. The coat color and body weight of genetically identical mouse pups carrying this modified gene are determined by the diet of the mother in the weeks before impregnation and during pregnancy and lactation.
In controlled experiments, mothers that will transmit the modified agouti gene to their pups are fed either a diet enriched with three compounds that each act as donors of methyl groups to DNA (folic acid [vitamin B9], choline chloride, and anhydrous betaine) or a diet without these compounds. The controlled dietary period begins 2 weeks before mating and continues through pregnancy and lactation. The pups produced are genetically identical, and after they are weaned, they are all fed the same diet. At 3 weeks of age, however, the appearance of the pups is dramatically different. Mice produced by mothers who were fed the enriched diet have brown coat color and normal body weight, whereas genetically identical mice produced by mothers not fed the enriched diet have yellow coat color and are obese. The difference indicates that the modified agouti gene is expressed when it is transmitted from mothers that were not fed the diet enriched with methyl donors. If the modified gene is transmitted from mothers receiving the enriched diet, however, the modified agouti gene is methylated and silenced.
INHERITANCE OF FAMINE EFFECTS The third line of evidence comes from an unfortunate event during World War II. A severe famine occurred in the German-occupied Netherlands between November 1944 and May 1945. The famine reduced caloric intake to 500 to 800 calories per day, much less than the body needs to fuel its normal metabolic activities. Long-term studies have been performed on Dutch people who were conceived or born during the famine and on their descendants. Studies of the health effects of the famine find that so-called famine babies were often born severely underweight. As the famine babies grew into adults and aged, they suffered increased risk of cardiovascular disease, diabetes, and obesity compared with peers who had not been affected by the famine. The proposed explanation is that the restricted nutritional conditions in the womb caused alterations of gene expression, producing an energetically “thrifty” metabolism. More surprising, however, was that among the children of the famine babies, there is also an elevated risk of cardiovascular and other diseases. The explanation proposed for this second-generation effect is epigenetic modification of gene expression that is transmitted through multiple generations.
A 2008 study by Bastiaan Heijmans on the methylation pattern of the IGF2 gene on chromosome 11 confirms the epigenetic control mechanism that we discussed previously in connection with genomic imprinting, Prader–Willi syndrome, and Angelman syndrome. Heijmans and colleagues found that IGF2 in certain famine babies (now in their sixties) still bears the marks of famine. The IGF2 genes of those exposed to famine during the first 10 weeks of gestation are marked by significantly fewer methyl groups than are the genes of their same-sex siblings not exposed to famine conditions. These results support the idea that prenatal conditions can impart specific epigenetic patterns to genes and that environmental factors contributing to epigenetic patterns may play an important role in modifying gene expression over multiple generations.
SUMMARY

13.1 Cis-Acting Regulatory Sequences Bind Trans-Acting Regulatory Proteins to Control Eukaryotic Transcription
❚❚ Promoters, proximal elements, and enhancer modules are cis-acting DNA sequences that bind trans-acting regulatory proteins to regulate transcription.
❚❚ The effects of activators and repressors binding to enhancer/silencer modules integrate to produce an output, with repressors often dominant.
❚❚ Enhancer sequences can be strongly conserved, indicating they perform essential functions.
❚❚ Upstream activator sequences (UASs) in yeast are enhancer-like elements that regulate the expression of genes such as those involved in galactose utilization.
❚❚ Locus control regions (LCRs) are specialized enhancers that control the sequential expression of sets of genes such as those in the developmentally regulated human β-globin gene complex.
❚❚ Insulators block enhancer influence on nearby genes and direct that influence to other genes.

13.2 Chromatin Remodeling and Modification Regulates Eukaryotic Transcription
❚❚ Heterochromatin has a closed chromatin structure and is transcriptionally silent, whereas euchromatin has an open structure that is transcriptionally active.
❚❚ Open promoters are constitutively transcribed (often housekeeping genes), whereas transcription from covered promoters is regulated.
❚❚ Chromatin-remodeling complexes displace nucleosomes to allow transcription initiation by RNA pol II and general transcription factors.
❚❚ Chromatin is modified by writers and erasers, and read by readers. Writers and erasers are recruited by transcription factors to open and close the chromatin by adding and removing acetyl and methyl groups at specific amino acids in the N-terminal tails of histone proteins.
❚❚ Polycomb group and Trithorax group complexes act to transform facultative heterochromatin into euchromatin and vice versa.
❚❚ Epigenetic states of chromatin are heritable in somatic cells that divide by mitosis and may be reset in germ-line cells that divide by meiosis.
❚❚ Genomic imprinting in mammalian genomes involves nucleotide methylation and the action of enhancer and insulator sequences.
❚❚ A specific form of regulatory RNA directs mammalian X-inactivation.
❚❚ Small interfering RNAs (siRNAs) and microRNAs (miRNAs) are principal regulatory RNA molecules.
❚❚ The Dicer protein complex processes dsRNAs into small RNAs.
❚❚ RISC carries regulatory RNAs to RNAs targeted for destruction or for blockage of translation.
❚❚ RITS acts to maintain constitutive heterochromatin.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help with solving problems.
1. Familiarize yourself with the mechanistic differences between bacterial and eukaryotic gene expression.
2. Understand that enhancer/silencer modules integrate inputs of several transcription factors into a single output.
3. Review the roles of chromatin remodelers and chromatin-modifying enzymes in eukaryotic gene expression.
4. Familiarize yourself with the functions of TrxG and PcG complexes in facultative heterochromatin.
5. Review the different classes of chromatin and their relation to gene expression, for example, the types of genes they are likely to contain.
6. Acquaint yourself with the sources and processing of dsRNAs and their subsequent roles in modulating gene expression.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.
1. Devoting a few sentences to each, describe the following structures or complexes and their effects on eukaryotic gene expression:
a. promoter
b. enhancer
c. silencer
d. RISC
e. Dicer
2. Describe and give an example (real or hypothetical) of each of the following:
a. upstream activator sequence (UAS)
b. insulator sequence action
c. silencer sequence action
d. enhanceosome action
e. RNA interference
3. What is meant by the term chromatin remodeling? Describe the importance of this process to transcription.
4. What general role does acetylation of histone protein amino acids play in the transcription of eukaryotic genes?
5. Describe the roles of writers, readers, and erasers in eukaryotic gene regulation.
6. Outline the roles of RNA in eukaryotic gene regulation.
7. What are the roles of the Polycomb and Trithorax complexes in eukaryotic gene regulation?
8. Most biologists argue that the regulation of gene expression is considerably more complex in eukaryotes than in bacteria. List and describe the four factors that in your view make the largest contribution to this perception.
9. Compare and contrast the transcriptional regulation of GAL genes in yeast with that of the lac genes in bacteria.
10. The term heterochromatin refers to heavily condensed regions of chromosomes that are largely devoid of genes. Since few genes exist there, these regions almost never decondense for transcription. At what point during the cell cycle would you expect to observe the decondensation of heterochromatic regions? Why?
11. Compare and contrast promoters and enhancers with respect to their location (upstream versus downstream), orientation, and distance (in base pairs) relative to a gene they regulate.
12. What are the different chromatin classifications, and what is their relationship to gene expression?
13. Define epigenetics, and provide examples illustrating your definition.
14. What is one proposed role for lncRNAs?
15. What are the sources of dsRNA? Diagram the mechanisms by which dsRNAs are produced and processed into small RNAs.
16. How does dsRNA lead to posttranscriptional gene silencing?
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
17. A hereditary disease is inherited as an autosomal recessive trait. The wild-type allele of the disease gene produces a mature mRNA that is 1250 nucleotides (nt) long. Molecular analysis shows that the mature mRNA consists of four exons that measure 400 nt (exon 1), 320 nt (exon 2), 230 nt (exon 3), and 300 nt (exon 4). A mother and father with two healthy children and two children with the disease have northern blot analysis performed in a medical genetics laboratory. The results of the northern blot for each family member are shown here.
[Figure: Pedigree with generation I (individuals 1 and 2) and generation II (individuals 1 through 4); northern blot with lanes I-1, I-2, II-1, II-2, II-3, II-4 and band sizes of 1250 nt and 1020 nt.]
a. Identify the genotype of each family member, using the sizes of mRNAs to indicate each allele. (For example, a person who is homozygous wild type is indicated as “1250/1250.”)
b. Based on your analysis, what is the most likely molecular abnormality causing the disease allele?
18. The UG4 gene is expressed in stem tissue and leaf tissue of the plant Arabidopsis thaliana. To study mechanisms regulating UG4 expression, six small deletions of DNA sequence upstream of the gene-coding sequence are made. The locations of deletions and their effect on UG4 expression are shown here.
[Figure: Map of the upstream region, promoter, transcription start, and UG4 gene, with deletion regions labeled in the order E, D, A, C, B, F.]
                  Transcription (%)
Deletion          Stem    Leaf
None (control)    100     100
A                 100     100
B                 <1      <1
C                 100     100
D                 100     163
E                 98      <1
F                 >1      >1
a. Explain the differential effects of deletions B and F on expression in the two tissues.
b. Why does deletion D raise UG4 expression in leaf tissue but not in stem tissue?
c. Why does deletion E lower expression of UG4 in leaf tissue but not in stem tissue?
19. Diagram and explain how the inducibility of a gene (for instance, in response to an environmental cue) could be mediated by an activator. Then show how it could be mediated by a repressor.
20. A muscle enzyme called ME1 is produced by transcription and translation of the ME1 gene in several muscles during mouse development, including heart muscle, in a highly regulated manner. Production of ME1 appears to be turned on and turned off at different times during development. To test the possible role of enhancers and silencers in ME1 transcription, a biologist creates a recombinant genetic system that fuses the ME1 promoter, along with DNA that is upstream of the promoter, to the bacterial lacZ (β-galactosidase) gene. The lacZ gene is chosen for the ease and simplicity of assaying production of the encoded enzyme. The diagram shows bars that indicate the extent of six deletions the biologist makes to the ME1 promoter and upstream sequences. The blue deletion labeled D is within the promoter whereas the gray bars span potential enhancer/silencer modules. The table displays the percentage of β-galactosidase activity in each deletion mutant in comparison with the recombinant gene system without any deletions.
[Figure: Map of the ME1 upstream region and ME1 promoter fused to the lacZ gene, with bars marking the deletion regions A through F.]
Deletion          lacZ activity (%)
None (control)    100
A                 100
B                 100
C                 4
D                 <1
E                 170
F                 5
a. Does this information indicate the presence of enhancer and/or silencer sequences in the ME1 upstream sequence? If so, where is/are the sequences located?
b. Why does deletion D effectively eliminate transcription of lacZ?
c. Given the information available from deletion analysis, can you give a molecular explanation for the observation that ME1 expression appears to turn on and turn off at various times during normal mouse development?
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
21. Using the components in the accompanying diagram, design regulatory modules (i.e., enhancer/silencer modules) required for “your” gene to be expressed only in differentiating (early) and differentiated (late) liver cells. Answer the three questions presented below by describing the roles that activators, enhancers, repressors, silencers, pioneer factors, insulators, chromatin remodeling complexes, and chromatin readers, writers, and erasers will play in the regulation of expression of your gene, that is, what factors will bind and be active in each case? Specify which transcription factors need to be pioneer factors.
a. How will the gene be activated in the proper cell type?
b. How will its expression be maintained?
c. How will expression be prevented in other cell types?
[Figure: Components available for Problem 21: Polycomb complex, Trithorax complex, HAT complex, HDAC complex, reader, writer, eraser, H-acetyl modifiers, insulator, histone octamer, remodeler, SWI/SNF complex, H3K27me, H3K4me, H3K9me, and RNA pol II, together with the activators (ACT) and repressors (REP) listed below.]
ACT
A = Expressed everywhere
B = Expressed early in L, K, S, G
C = Expressed early in L, K, N
D = Expressed late in K, L
REP
1 = Expressed everywhere
2 = Expressed early in L, K, S, G
3 = Expressed early in S, G
4 = Expressed early in K, S
5 = Expressed late in S, G
22. The majority of this chapter focused on gene regulation at the transcriptional level, but the quantity of functional protein product in a cell can be regulated in many other ways as well (see Figure 13.1). Discuss possible reasons why transcriptional regulation or posttranscriptional regulation may have evolved for different types of genes.
Thomas Hunt Morgan’s fly room (he is at far right, back row) was the site of the original mutagenesis experiments. The first screens for mutations were limited by their reliance on spontaneous mutants, but the discovery by Hermann Muller (second from right, back row) that X-rays are mutagenic turned genetic screens into routine and powerful tools to uncover gene function. Also visible in this photo are Calvin Bridges (third from left, back row), who used observations of nondisjunction to prove the chromosome theory of heredity, and Alfred Sturtevant (middle front row), who constructed the first genetic map.

ESSENTIAL IDEAS
❚❚ Forward genetic screens induce mutations to identify genes involved in a biological process; subsequent cloning sheds light on their molecular function.
❚❚ DNA sequences of specific genes can be discovered using recombinant DNA technology.
❚❚ Reverse genetics techniques start with a gene sequence and then proceed to the identification of a mutant phenotype.
❚❚ Phenotypes of transgenic organisms can provide information on gene function.

A central goal of biology is to understand the molecular and genetic bases of physiology and development. Beginning with Mendel and resuming in the first part of

the development of tools to manipulate DNA in vitro. With these tools, collectively referred to as recombinant DNA technology, geneticists could for the first time obtain the precise DNA sequences of specific genes and alleles, thus identifying the molecular basis of phenotypic differences.
The exploration of how genes control physiological and developmental processes is approached in two ways that attack the problem from diametrically opposite directions. These opposite approaches are known as forward genetic analysis and reverse genetic analysis. The goals of forward and reverse analysis are the same: to identify the genes responsible for hereditary variation, to determine the structure and function of wild-type alleles controlling traits, and to describe how mutant alleles generate abnormal phenotypes. However, the two strategies begin at different ends of the process of gene identification.
Forward genetic analysis starts with a genetic screen that identifies specific phenotypic abnormalities in a population of organisms that have been mutagenized (mutagenesis being the intentional introduction of mutations into the genome of an organism). The abnormal phenotype is then studied to identify the nature of the hereditary abnormality and, by inference, the normal functions of an associated gene. Ultimately, the sequence of the gene responsible for the abnormality is determined and may suggest the molecular function of the corresponding gene product (Figure 14.1a).
In contrast to forward genetics approaches, which begin genetic investigation with a mutant phenotype and proceed toward the identification of a gene sequence, reverse genetics approaches begin with a gene sequence and seek to identify the corresponding mutant phenotype (Figure 14.1b). In a reverse genetics experiment, loss-of-function alleles of specific genes are created by a variety of techniques, and the resulting phenotypes are examined to see how they differ from the wild type. Reverse genetic analysis has risen to prominence as a result of the enormous quantity of DNA sequence data made available since the late 1990s and of the ability of recombinant DNA technology to manipulate DNA sequences in vitro and in vivo.
In this chapter, we discuss forward and reverse genetic analyses from a conceptual viewpoint, and in Chapter 15 we present details of the recombinant DNA technology used to conduct this research.
[Figure: Reverse genetics example using the mouse Hoxa10 gene. (1) Isolate mouse gene similar to Drosophila Ultrabithorax gene. (2) Generate mutant allele: the sequence ATG ACG GGG AAA GCG GGG GAA GCG CTG AGC AAG CCC GAC ATG GCT TAG is changed so that the codon AAG becomes TAG. (3) Identify mutant phenotype (lumbar and sacral regions labeled).]
14.1 Forward Genetic Screens Identify Genes by Their Mutant Phenotypes

With the discovery by Hermann Muller that ionizing radiation induces mutations (see Section 11.3), geneticists realized that mutant organisms could be generated at will and systematically screened for phenotypes of interest. Mutant phenotypes provide information on the function of the wild-type allele and insight into biological processes. The earliest example of this logic is the work of Archibald Garrod, who in 1908 connected the human autosomal recessive hereditary condition alkaptonuria to the lack of a specific biochemical activity, the metabolism of homogentisic acid (see Figure 4.17b). He suggested that the wild-type version of the gene encodes the enzyme responsible for this biochemical activity. After Muller brought the mutagenic powers of X-rays to their attention (see Section 11.3), geneticists began to employ systematic genetic screens to dissect other biological processes, and the genetic bases for entire biochemical pathways were elucidated.
The designing of genetic screens to identify genes involved in specific biological processes is limited only by the imagination of the geneticist. An example is the research by Seymour Benzer that led to the field of behavioral genetics in the 1970s. Benzer believed mutations could be identified that specifically affect behavioral processes, such as one you are using now, the process of learning and memory. At the time, behavior was thought by many to be too complex to be dissected genetically. However, Chip Quinn, a graduate student in Benzer’s lab, built on previous ideas and designed an ingenious screen to identify learning- and memory-deficient mutants in Drosophila. Wild-type flies could be taught that a pulse of odor would be followed by a shock; later, when the flies smelled the odor, they would take evasive action. When Quinn and Benzer subjected a mutagenized population of Drosophila to this genetic screen, they identified mutant strains of flies that could perceive the odor but seemed unable to associate the odor with the stimulus; either they did not learn or could not remember.
Two mutant genes identified in the study, dunce and rutabaga, were later shown to encode proteins involved in the production or degradation of the small signaling molecule cyclic adenosine monophosphate (cAMP). At the time, signaling via a cAMP pathway was known to be required for learning in the sea hare, Aplysia. Since both Drosophila mutants were defective in cAMP physiology, other genes that encoded proteins involved in cAMP signaling and response were also investigated for roles in learning. Ultimately, a transcription factor called creb (cAMP response element–binding protein), which activates or represses genes in response to cAMP signaling, was shown to be critical for storing memories in flies. Remarkably, creb is widely conserved in animal species, and mouse mutants lacking creb activity also fail to remember. A similar gene is found in our genome, and there is great interest in the role of this gene in human memory.
A great strength of forward genetic screens is that they are unbiased; no prior knowledge of the molecular function of the encoded gene product is required. In a sense, by performing a mutagenesis, the geneticist is allowing the organism to reveal how its biological processes operate. Once genes in particular physiological or developmental processes have been identified by mutation, clues to the molecular function of the gene product can be obtained using recombinant DNA technology.

General Design of Forward Genetic Screens
Forward genetic screens often require the mutagenesis of thousands of individuals, followed by screening large numbers of their progeny for mutant phenotypes. Each progeny may contain multiple mutations, but only a small fraction of the progeny will have a mutant phenotype of interest. For example, in their screens to identify auxotrophs, Beadle, Tatum, and colleagues screened many thousands of individual mutant lines to find the few arginine auxotrophs that were produced. Although some screens necessitate the visual inspection of all progeny, others are specifically designed to highlight certain mutants of interest against the background of all other mutants. The designing of such screens is an art.
Perhaps the most dramatic screen is one in which application of a simple selection technique allows mutants of interest to survive while those not of interest die. Examples include the isolation of bacteria resistant to antibiotics, insects resistant to insecticides, and plants resistant to herbicides. Similarly, isolation of mutants resistant to analogs of cellular chemicals or to high levels of naturally occurring hormones has proven useful in genetic screens. Often in such cases, mutations identify genes encoding proteins involved in the metabolism or signaling pathways of the respective chemicals.
Even when strong selection criteria cannot be applied, knowledge of the biological process of interest can influence the design of the screen. For example, in research on the genetic control of embryonic development (described in Section 18.2), Eric Wieschaus and Christiane Nüsslein-Volhard designed a screen for Drosophila embryogenesis mutants based on the assumption that the mutations of interest were all likely to be lethal to the larva. Thus they could limit their intensive analysis to mutant lines in which larval lethality was evident.

Specific Strategies of Forward Genetic Screens
Forward genetic screens begin with a mutagenesis: An organism is treated with a mutagen to create mutations randomly throughout the genome. A typical goal is to induce mutations in every gene in a population of mutagenized individuals, an
510 CHAPTER 14 Analysis of Gene Function by Forward Genetics and Reverse Genetics
approach called saturation mutagenesis. The mutagenized population is then screened for phenotypic defects in whatever biological process is being studied, and the mutants are collected and propagated for further analysis. Strategies for mutagenesis depend on the biological process of interest, which dictates the experimental organism to use, the choice of mutagen, and the screening procedure to identify mutations.

Choosing an Organism The attributes that make an organism a good genetic model (see back endsheets) also make it a good choice for a mutagenesis experiment: The organism must be able to progress through its entire life cycle in the laboratory, have a short generation time (for eukaryotic models, the time it takes to produce sexually mature progeny and complete the sexual life cycle), and produce a reasonable number of progeny. In addition, researchers must be able to manipulate it to produce specific genetic crosses. Organisms that are diploid usually have a starting genotype (the genotype to be mutagenized) that is inbred, that is, for the most part homozygous at all loci. Such a genotype allows newly induced mutations to be readily identified, without interference from the confounding effects of polymorphisms. Finally, it is advantageous to use the simplest organism possible for the biological process under study. Because Saccharomyces cerevisiae has a rapid life cycle and is easily manipulated in the laboratory, it is often used to investigate biological processes common to all eukaryotes. The principles elucidated in S. cerevisiae can often be extended to other eukaryotes, including humans.

Choosing a Mutagen The choice of mutagen is dictated by both the organism and the type of mutant alleles desired; different mutagens have different advantages and disadvantages (Table 14.1). Mutagens inducing different types of changes in DNA sequences were described in Section 11.3.
Treatment with chemical mutagens can induce hundreds of mutations in a single individual, allowing saturation to be reached with only a few thousand mutagenized individuals. However, the cloning of genes identified by chemical mutagenesis can be laborious. In contrast, mutagens that result specifically in insertions of DNA, such as transposons, result in far fewer mutations per individual, making saturation difficult. But these mutagens have the advantage of being able to provide a DNA “tag” that facilitates finding and cloning the mutated genes.
In all mutageneses used for forward genetic screens, care must be taken to “outbreed” mutants of interest by crossing them with the wild-type progenitor strain. This will ensure that the collected mutant lines have only the mutation of interest and not others that were also induced during the mutagenesis.

Strategy for Identifying Dominant and Recessive Mutations The overall goal of mutagenesis is to identify multiple independent mutant alleles of each gene involved in the biological process of interest. Let us consider the identification of dominant and recessive mutations in a typical animal example.
Most animals spend most of their life cycle in the diploid state. Their germ cells are set aside early in development and do not contribute to the somatic development of the remainder of the animal body. When animals are treated with a mutagen, for example by feeding males ethyl methanesulfonate (EMS), a potent mutagen that causes a spectrum of mutant alleles (see Table 14.1), only the mutations induced in the germ cells are heritable and will be passed to the progeny of the mutagenized animals.
Breeding these mutagenized males with wild-type females will allow newly induced dominant mutations to be identified in the resulting F1 generation (Figure 14.2a). However, only a small fraction of the F1 progeny will harbor a dominant mutation, since such mutations are rare. This rarity is due to the low probability that any change in the DNA sequence of a gene will produce a gain in function for the encoded gene product, either qualitatively or quantitatively.
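The contrast drawn above between chemical and insertional mutagens, in terms of how many individuals are needed to approach saturation, can be made concrete with a rough Poisson calculation. The sketch below is illustrative only. The gene count and the number of genes disrupted per individual are assumed round numbers, not values from this chapter, and the model assumes that mutations fall uniformly and independently across genes, which real mutagens do not strictly do. The function names are hypothetical.

```python
import math

def fraction_of_genes_hit(n_individuals, gene_hits_per_individual, n_genes):
    """Expected fraction of genes carrying at least one induced mutation.

    Poisson approximation: hits are assumed to fall uniformly and
    independently across genes.
    """
    mean_hits_per_gene = n_individuals * gene_hits_per_individual / n_genes
    return 1.0 - math.exp(-mean_hits_per_gene)

def individuals_needed(target_fraction, gene_hits_per_individual, n_genes):
    """Smallest population size whose expected coverage reaches the target."""
    mean_hits_needed = -math.log(1.0 - target_fraction)
    return math.ceil(mean_hits_needed * n_genes / gene_hits_per_individual)

# All three numbers below are assumed round values for illustration only.
N_GENES = 15_000       # hypothetical number of genes in the genome
CHEMICAL_HITS = 20     # hypothetical genes disrupted per chemically mutagenized individual
INSERTIONAL_HITS = 1   # hypothetical genes disrupted per insertion-mutagenized individual

for label, hits in [("chemical", CHEMICAL_HITS), ("insertional", INSERTIONAL_HITS)]:
    n = individuals_needed(0.95, hits, N_GENES)
    print(f"{label}: about {n} individuals for 95% expected gene coverage")
```

Under these assumed inputs, the chemical mutagen needs about 20-fold fewer individuals than the insertional mutagen to reach the same expected coverage. Recovering multiple independent alleles of each gene, which is the usual goal, would require larger populations than this single-hit estimate.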
Radiation
Fast-neutron, X-ray, Gamma-ray    Rearrangements (deletions, inversions, translocations)    Moderate    Usually loss-of-function (often null), but can be gain-of-function
Insertional
Transfer DNA, Transposons    Insertions    Low    Usually loss-of-function (often null)
[Figure 14.2: Screens for dominant and recessive mutations.
(a) F1 screen identifies dominant mutations. (1) Mutagenize sperm cells. (2) Mate with wild-type female. (3) Identify dominant mutations in F1 individuals. Dominant mutations segregate in a 1:1 ratio.
(b) F3 screen identifies recessive mutations in organisms that cannot self-fertilize. (1) Mutagenize sperm cells. (2) Mate with wild-type female. Since each mutagenized sperm is unique, each F1 individual carries distinct induced mutations. (3) Isolate F1 progeny and individually mate to wild type to produce separate F2 families. (4) Interbreed F2 individuals to produce F3 progeny. (5) Identify recessive mutations in F3 individuals.
(c) F2 screen identifies recessive mutations in organisms capable of self-fertilization. (1) Mutagenize germ-line progenitors. (2) Allow F1 individuals to self-fertilize. Newly induced mutations should be present in both male and female gametes. (3) Identify recessive mutations in F2 individuals. Homozygous mutants may not segregate 3:1 in F2 generation if F1 individuals are mosaics with some wild-type cells and some heterozygous mutant cells, as is the case when plant seeds are the starting material for mutagenesis.]
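The segregation ratios behind the F3 and F2 screens in panels (b) and (c) of Figure 14.2 can be checked by enumerating gametes. The following is a minimal sketch for a single induced recessive mutation m. It assumes random pairing of F2 siblings, equal viability of all genotypes, and non-mosaic F1 individuals. The function and variable names are illustrative and do not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Genotype frequencies among progeny of two diploid parents.

    Each parent is a pair of alleles; every gamete carries one of the two
    alleles with probability 1/2.
    """
    freqs = Counter()
    for allele1, allele2 in product(parent1, parent2):
        freqs[tuple(sorted((allele1, allele2)))] += Fraction(1, 4)
    return dict(freqs)

WT = ("+", "+")    # wild type
HET = ("+", "m")   # heterozygous for the induced recessive mutation m
HOM = ("m", "m")   # homozygous mutant

# F3 screen: an F1 heterozygote is mated to wild type, then F2 siblings interbreed.
f2_family = cross(HET, WT)
productive_pairs = f2_family[HET] ** 2   # both siblings must be heterozygous
f3_homozygous = sum(
    p1 * p2 * cross(g1, g2).get(HOM, Fraction(0))
    for (g1, p1), (g2, p2) in product(f2_family.items(), repeat=2)
)

# F2 screen: an F1 heterozygote self-fertilizes.
f2_selfed_homozygous = cross(HET, HET)[HOM]

print("F2 heterozygotes in an F3 screen:", f2_family[HET])
print("F2 sibling pairs that can yield m/m progeny:", productive_pairs)
print("m/m among all F3 progeny of random sibling pairs:", f3_homozygous)
print("m/m among F2 progeny after selfing:", f2_selfed_homozygous)
```

Under these assumptions only one in four random F2 sibling pairs can produce homozygous mutants, which is one way to quantify why the F3 screen is less efficient than an F2 screen in a self-fertilizing organism.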
Mutations that result in a loss of function are more common, but loss-of-function mutations are usually recessive and do not result in an observable phenotype in the F1 generation. Therefore, further breeding must be performed to produce homozygous loss-of-function mutants. Specifically, recessive mutations are identified in an F3 screen (Figure 14.2b). In this screen, each F1 individual derived from the mating of mutagenized males with wild-type females carries unique mutations. The F1 individuals are then crossed individually with wild-type mates, producing an F2 generation in which half of the individuals will carry the newly induced mutations. The F2 siblings are interbred, producing an F3 population segregating for individuals that are homozygous for the induced mutation. The interbreeding of the F2 to produce homozygous mutant F3 is inefficient, since only half of the F2 are heterozygous for the induced mutation. Nonetheless, such mutagenesis strategies are employed with many species, such as mice and zebrafish.
Identification of recessive mutations is somewhat simpler in organisms that self-fertilize, such as Caenorhabditis elegans and many plants (e.g., Arabidopsis and maize). In these organisms, F1 individuals are self-fertilized to produce an F2 generation from which recessive mutations can be identified. An example of an F2 screen is shown in Figure 14.2c. In either an F2 or F3 screen, mutations resulting in homozygous lethality can be maintained in heterozygous siblings.

Use of Balancer Chromosomes for Tracking Mutations The inefficiency of an F3 screen can be circumvented using chromosomes that are marked so they can be followed through generations. Balancer chromosomes developed in Drosophila allow specific chromosomes to be transmitted intact and followed through multiple generations.
Balancer chromosomes have three general features: (1) one or more inverted chromosomal segments, within which meiotic recombinants are not transmitted (see Section 10.5 for a review); (2) a recessive allele that results in lethality, so an individual cannot be homozygous for the balancer chromosome; and (3) a “mark” in the form of a dominant mutation conferring a visible nonlethal phenotype, so the segregation of the chromosome can be followed through generations. An example of a balancer chromosome is the ClB chromosome used by Hermann Muller to demonstrate that X-rays induce mutations (see Experimental Insight 10.1, page 382).
Balancer chromosomes are available for all of the Drosophila chromosomes and can be used to identify mutations on specific chromosomes (Figure 14.3). Male flies are fed EMS to induce mutations and then are mated with females containing a balancer chromosome. Note that while mutations are induced throughout the genome, only those on the homolog of the balancer chromosome are analyzed. Male F1 progeny are selected that inherit a mutagenized chromosome from their father and the balancer chromosome from
[Figure 14.3: Balancer chromosome screen in Drosophila. CyO is a balancer chromosome with a dominant allele resulting in curly wings and a recessive lethal allele. The cinnabar mutation (cn) is included to help follow the balancer and mutagenized chromosomes. (1) Mutagenize sperm cells. (2) Mate with a female fly carrying a balancer chromosome. (3) Select male F1 progeny with curly wings, carrying the CyO mutation, and mate with a female fly carrying the balancer chromosome. (4) Select F2 progeny with curly wings and cinnabar eyes, carrying dominant CyO and homozygous for the recessive cn allele. (These have inherited the original mutagenized chromosome and a balancer chromosome.) Progeny homozygous for the balancer chromosome die. (5) Interbreed within selected F2. In the F3, flies homozygous for the m mutation have straight wings and cinnabar eyes, and flies heterozygous for the m mutation, carrying the balancer, have curly wings and cinnabar eyes.]

What happens if the new mutation results in lethality when it is homozygous? In that case, all surviving F3 individuals will carry the dominant allele of the balancer chromosome. When a lethal mutation is identified in this way, the mutant allele can be propagated from the heterozygous siblings. This was the mutagenesis strategy used by Eric Wieschaus, Christiane Nüsslein-Volhard, and colleagues in their screen to identify Drosophila mutations that disrupt pattern formation during embryogenesis. The research is described in detail in Section 18.2.

Screening for Conditional Alleles in Haploid Organisms The use of haploid organisms in a forward genetic screen has the advantage of allowing both recessive loss-of-function mutations and dominant mutations to be identified directly. With single-celled haploid organisms, a population of mitotically active cells can be mutagenized, and mutants with an altered phenotype can be selected directly in the colonies derived from the mutagenized cells. A disadvantage is that mutations disrupting essential processes in growth and physiology are often lethal, interfering with the propagation of alleles and thus complicating genetic screening. Fortunately, it is often feasible to design a screen to identify conditional mutant alleles of essential genes. In conditional mutants, the encoded gene product is either functional or not needed under one environmental condition (the permissive condition) but is required and either inactive or absent under another (the restrictive condition).
With some lethal mutations, the mutant phenotype can be rescued by addition of a needed substance to the growth medium. For example, histidine auxotrophic mutants can grow only when histidine is present in the growth medium. In a screen for conditional mutants of this type, the mutagenized
population is initially grown under permissive conditions—
cnCyO homozygote in this case, in a medium containing histidine—so that both
Curly wings, dies mutant and wild type will grow. This mutagenized population
cinnabar eyes is then replica plated, and the population is screened for phe-
notypic defects (e.g., lethality) when grown under the restric-
If no straight-winged flies are present in
F3 progeny, the new mutation is lethal.
tive condition (e.g., a lack of histidine). Such genetic screens
were performed by Beadle and Tatum to identify auxotrophs
Figure 14.3 Identifying recessive mutations in Drosophila using in Neurospora in the research that established biochemical
a balancer chromosome. genetics and produced the one gene–one enzyme theory (see
Section 4.3).
Some kinds of mutants can be rescued not by supply-
their mother. Next, the selected males are mated to females ing a certain substance to the medium but by altering other
of the balancer stock, producing F2 progeny. The F2 genera- kinds of environmental conditions instead. In temperature-
tion consists of both males and females heterozygous for the sensitive mutants, the stability of the polypeptide product of
induced mutation and can be interbred to produce F3 prog- a mutant allele differs with temperature (see Section 4.1),
eny. In the F3 generation, 25% should be homozygous for often as a result of a missense mutation.
the induced mutation and will not carry the dominant allele This type of conditional lethal allele in the yeasts
of the balancer chromosome; 50% will be heterozygous for S. cerevisiae and Schizosaccharomyces pombe led to a
the newly induced mutation and also carry the dominant molecular genetic understanding of the cell cycle, a bio-
allele; and the remaining 25% will die due to homozygos- logical process shared by all eukaryotes. Mutagenized
ity for the balancer chromosome. The homozygous progeny yeast were grown at a permissive temperature to allow
lacking the dominant allele from the balancer chromosome propagation, and then the mutant lines were exposed to
can be screened for an aberrant phenotype. a restrictive temperature, causing an arrest in growth
14.1 Forward Genetic Screens Identify Genes by Their Mutant Phenotypes 513
Figure (caption not recovered). Haploid yeast colonies grown on a plate at 23°C are replica plated, and the replicas are grown at 23°C and at 36°C. Temperature-sensitive (ts) mutants grow at 23°C but not at 36°C. Panel (b): yeast cell cycle.
Analysis of Mutageneses
Typically, the initial analysis of mutants obtained by mutagenesis will focus on three key questions: (1) Are mutant alleles dominant or recessive with respect to the wild-type allele? (2) How many different genes have been identified in the mutagenesis? (3) How many different mutant alleles of each gene have been identified?
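Questions (2) and (3) can be illustrated with a small bookkeeping sketch. It assumes that every pair of recessive mutants has already been crossed and scored, and that two mutants failing to complement each other are alleles of the same gene. The mutant names, the test results, and the function name `complementation_groups` are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def complementation_groups(mutants, fails_to_complement):
    """Group recessive mutants into complementation groups (putative genes).

    mutants: list of mutant names.
    fails_to_complement: pairs (a, b) whose trans-heterozygote shows the
        mutant phenotype, taken here to mean a and b affect the same gene.
    """
    parent = {m: m for m in mutants}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in fails_to_complement:
        parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(list)
    for m in mutants:
        groups[find(m)].append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical screen: five mutants, two failed complementation tests.
mutants = ["m1", "m2", "m3", "m4", "m5"]
failures = [("m1", "m3"), ("m3", "m5")]
groups = complementation_groups(mutants, failures)
for genes in groups:
    print(f"{len(genes)} allele(s): {genes}")
```

The number of groups estimates how many genes were hit, and the size of each group gives the number of alleles recovered per gene. Many single-allele groups would suggest the screen is far from saturation.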
can be obtained by mutation of the gene in question (see Section 4.1). The recovery of multiple alleles for each gene also provides information on the saturation of the genetic screen; in other words, it suggests what percentage of the genes that could be identified have in fact been identified. When a mutagenesis experiment is shown to have produced multiple independent mutations in each gene identified, most genes in the process of interest have likely been mutated.
Genetic Analysis 14.1 challenges you to design a screen that identifies genes involved in a particular biological process.
Identifying Interacting and Redundant Genes Using Modifier Screens
Generally, mutant phenotypes reflect the response of the organism to a loss or change of a particular gene product. However, individual genes do not act in isolation. The activity of other genes may modify, by either enhancing or suppressing, the phenotypic defects caused by the loss of a gene product. One approach to discovering genetic interactions is to carry out a genetic modifier screen to see if mutations in a second gene can enhance or suppress the phenotype of the first mutation. For example, starting with a Drosophila mutant with slightly curled wings, a modifier screen could be carried out to identify second-site mutations that result either in more severely curled wings or in a wing morphology that is restored to a wild-type phenotype. Genes identified in modifier screens are often involved in the same or closely related genetic pathways. An enhancer screen is a modifier screen in which mutations in a second site enhance the phenotype of the initial mutant. A suppressor screen is a modifier screen designed to identify second-site mutations that suppress the phenotype of the initial genotype. Note that both types of screens can be performed simultaneously. Enhancer–suppressor screening strategies are almost limitless in number and sophistication and have the potential to identify genes that function in interacting genetic pathways.
Modifier screens can identify double mutants that display an unexpected phenotype, one that is not simply the combination of the phenotypes of the two single mutants. In perhaps the most dramatic form of enhancement, termed synthetic lethality, the two single mutants are viable but the double mutant is inviable.
Figure 14.5 Synthetic enhancement.
(a) Sturtevant's cross identifying synthetic lethality. P: pn/pn; +/+ females × pn+/Y; K-pn/K-pn males. F1: pn+/pn; K-pn/+ females, and pn/Y; K-pn/+ males, which all die. The dominant allele Prune-killer (K-pn) in combination with loss of prune (pn) function results in lethality.
(b) Possible mechanisms for synthetic enhancement.
Between-pathway interactions: Pathway A and Pathway B both contribute to an essential biological function. If two pathways both perform the same essential function, mutation of either alone may be inconsequential, but mutations in both result in a loss of the essential function.
Within-pathway interactions:
C1 wild type, C2 wild type: full pathway activity; wild type.
c1 mutant, C2 wild type: reduced pathway activity; viable.
C1 wild type, c2 mutant: reduced pathway activity; viable.
c1 mutant, c2 mutant: insufficient pathway activity; lethal.
Partial loss-of-function mutations in C1 or C2 alone reduce function, but the organism is still viable. However, if both components are mutated, the pathway may become nonfunctional.
Q What kind of modifier screen can uncover genetic redundancy?
Synthetic lethality, or synthetic enhancement, was first noted by Drosophila geneticists who observed that some pairwise combinations of mutant alleles were inviable. For example, when Alfred Sturtevant crossed prune (pn) mutant females (pn is on the X chromosome) with males from a stock of separate origin called S/E-S, he noted that the progeny consisted solely of pn+ females and no viable males (Figure 14.5a). Sturtevant determined that the S/E-S males carried an autosomal dominant mutation, which he called Prune-killer (K-pn), that in combination with pn results in lethality, but he noted that flies homozygous for the K-pn mutation alone did not have an aberrant phenotype. In his cross, all male progeny inherited a pn allele from their mother and a K-pn allele from their father, and therefore these progeny died. In contrast, the female progeny were viable, since despite inheriting a K-pn allele from their father, they also inherited a pn+ allele from their father. In this example, both pn mutants and K-pn mutants are viable, but the pn, K-pn double mutant results in lethality.
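The outcome of Sturtevant's cross can be checked by enumerating gametes. The sketch below assumes the parental genotypes pn/pn; +/+ (mother) and pn+/Y; K-pn/K-pn (father), and it models lethality as "no pn+ allele present together with at least one K-pn allele". The function names and the string encoding of alleles are illustrative assumptions.

```python
from itertools import product


def gametes(sex_chromosome_alleles, autosome_alleles):
    """All equally likely gametes: one sex-chromosome allele, one autosomal allele."""
    return list(product(sex_chromosome_alleles, autosome_alleles))


def is_viable(x_genotype, autosome_genotype):
    """Lethal only when prune function is absent AND dominant K-pn is present."""
    lacks_prune_function = "pn+" not in x_genotype
    has_killer = "K-pn" in autosome_genotype
    return not (lacks_prune_function and has_killer)


# Assumed parents: mother pn/pn ; +/+   father pn+/Y ; K-pn/K-pn
mother = gametes(["pn", "pn"], ["+", "+"])
father = gametes(["pn+", "Y"], ["K-pn", "K-pn"])

progeny = {}
for (mx, ma), (fx, fa) in product(mother, father):
    sex = "male" if fx == "Y" else "female"
    viable = is_viable((mx, fx), (ma, fa))
    progeny[(sex, viable)] = progeny.get((sex, viable), 0) + 1

for (sex, viable), count in sorted(progeny.items()):
    print(sex, "viable" if viable else "dies", count)
```

Under these assumptions every son is pn/Y with one K-pn allele and dies, while every daughter carries the paternal pn+ allele and survives. Calling `is_viable` on either single mutant returns `True`, which matches the definition of synthetic lethality.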
GENETIC ANALYSIS 14.1
PROBLEM In all eukaryotic organisms, proteins to be secreted from the cell or embedded in the plasma membrane are translated at the endoplasmic reticulum and travel via the Golgi apparatus to reach the plasma membrane. Outline a genetic screen for identifying genes involved in protein secretion.
BREAK IT DOWN: The posttranslational processing steps can be reviewed in Section 9.4 (p. 336).
BREAK IT DOWN: In planning a mutagenesis, consider what type of organism and mutagen are appropriate.
Evaluate
1. Strategy: Identify the topic this problem addresses and the nature of the required answer.
Solution step: This problem is about designing a genetic screen to find a certain type of gene. The answer should describe a genetic screen to identify mutations in genes that function in protein secretion.
2. Strategy: Identify the critical information given in the problem.
Solution step: Information is given about protein secretion in cells, a universal process among eukaryotes.
Deduce
3. Strategy: Consider any information given about genes involved in the secretory process.
TIP: Consider experimental approaches that do not require prior knowledge of gene function.
Solution step: Since we have not been given any information about the genes involved in protein secretion, a forward genetic screen would be a good approach, because forward genetic mutageneses do not depend on prior knowledge about biochemical functions or gene sequences.
4. Strategy: Based on the chapter discussion of forward genetic screens, choose an appropriate organism.
TIP: In which organisms does the biological process occur?
Solution step: Since secretory systems in all eukaryotes are similar, they are likely to be homologous, that is, inherited from a common ancestor. Thus we can choose any eukaryote amenable to genetic analysis. Saccharomyces cerevisiae would be a good choice because many genetic tools already exist for this model genetic organism.
5. Strategy: Based on the chapter discussion of designing a forward genetic screen and on the phenotypic consequence of a loss of protein secretion, pick a strategy for identifying desirable mutant alleles.
PITFALL: Avoid the possibility of mutations that are lethal under all growth conditions.
Solution step: Because complete loss of a functioning secretory system is likely to be lethal to any organism, we should use a strategy to identify conditional mutant alleles. Thus we should use a mutagen that induces point mutations.
Solve
6. Strategy: Design an approach for a genetic screen based on Solution Steps 3–5.
Solution step: A good design would be one similar to the procedure used to identify temperature-sensitive mutant alleles in genes of the cell cycle in S. cerevisiae. Mutagenesis of haploid cells could be performed at a permissive temperature (e.g., 25°–30°C), followed by screening for mutant phenotypes at a restrictive temperature (e.g., 39°C).
7. Strategy: Describe how you would identify mutations specifically affecting secretion.
Solution step: A method to monitor secretion is required. One approach would be to select a protein known to be secreted into the growth media of wild-type S. cerevisiae and look for mutants that do not secrete that protein (i.e., the protein is not detected in the medium in which they are growing).
For more practice, see Problems 17, 20, 21, 22, 25, 26, and 28.
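The scoring logic of Solution Steps 6 and 7 can be sketched as a filter over colony records. This is only an illustration: the `Colony` fields, the idea of assaying the secreted marker after a shift to the restrictive temperature, and the example data are assumptions, not part of the problem statement.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    name: str
    grows_permissive: bool             # master plate, e.g., 25-30 C
    grows_restrictive: bool            # replica plate, e.g., 39 C
    secretes_marker_after_shift: bool  # hypothetical secretion assay result


def secretion_candidates(colonies):
    """Return names of colonies that are conditional AND fail to secrete the marker."""
    picks = []
    for colony in colonies:
        conditional = colony.grows_permissive and not colony.grows_restrictive
        if conditional and not colony.secretes_marker_after_shift:
            picks.append(colony.name)
    return picks


# Hypothetical replica-plating results.
colonies = [
    Colony("c1", True, True, True),     # behaves like wild type
    Colony("c2", True, False, False),   # temperature sensitive, secretion blocked
    Colony("c3", True, False, True),    # temperature sensitive, secretion unaffected
    Colony("c4", False, False, False),  # not recovered on the master plate
]
print(secretion_candidates(colonies))
```

Requiring both criteria separates candidate secretion mutants (here `c2`) from temperature-sensitive mutants in unrelated essential processes, such as `c3`.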
Figure 14.5b shows two possible mechanisms to explain synthetic lethality. In one mechanism, the two genes in question act in parallel complementary pathways. In this scenario, mutations resulting in the loss of either pathway can be compensated for by the activity of the remaining pathway. However, when both pathways are disrupted, a dramatic enhancement in mutant phenotype is observed. An alternative mechanism is possible when both genes are acting in the same pathway: A reduction in function of one component of the pathway results in a mild phenotype, but when two components are disrupted, the pathway no longer functions effectively. Note that in the latter scenario, hypomorphic alleles can result in synthetic enhancement, but null alleles cannot.
The first scenario, where two genes act in parallel, is an example of genetic redundancy, where the loss of the function of either gene alone is compensated for by the activity of the other, nonmutant gene. Only when both genes are mutant would a conspicuous mutant phenotype be evident. In such a case, a 15:1 segregation ratio could be expected in the F2 of a cross between the two recessive single mutants (see the discussion of duplicate gene action in Section 4.3). In the most obvious case of genetic redundancy, two genes encode very similar proteins that can function interchangeably. In many instances, the activities of the two genes do not fully compensate for one another, so that single mutations, in either gene alone, result in a mild phenotype, while a severe phenotype is seen when both genes are mutant. Genetic redundancy caused by the presence of duplicate genes can arise in a species through small-scale duplications or through whole-genome duplications. As we explore in detail in Chapter 16, genome sequences of eukaryotes show such duplications to be very common.
Genetic redundancy can also arise from the compensatory action of genes that have little or no sequence similarity and encode biochemically different activities. This type of genetic redundancy is difficult to predict on the basis of the DNA sequences of the genes, but it too can be uncovered by enhancer–suppressor screens. Enhancer–suppressor screens have been performed on many organisms, including Drosophila, C. elegans, Arabidopsis, and mice (see Section 14.3), and are extremely successful at identifying interacting genetic pathways (see Section 18.3).
14.2 Genes Identified by Mutant Phenotype Are Cloned Using Recombinant DNA Technology
Although genes can be identified by genetic screens, determination of the specific DNA sequences of the wild-type and mutant alleles requires the use of recombinant DNA techniques to manipulate DNA molecules in vitro and in vivo. In this section, we discuss the theoretical foundations of how cloning of specific genes is achieved. Recombinant DNA technology is touched on in this chapter but discussed in detail in Section 15.1.
To appreciate the magnitude of the task of cloning a specific gene, consider that the goal is to single out the particular gene responsible for the mutant phenotype from among the thousands (or tens of thousands, in the cases of many eukaryotes) in the organism’s genome, the proverbial needle in a haystack. Because both the biology and the ease of manipulation vary depending on the organism, different approaches have been developed for different species. In this section, we describe two of those approaches.
We begin by identifying two fundamental aspects of recombinant DNA technology that are required for cloning genes. First, gene sequences created in vitro can be introduced into the genome of a living organism. Such genes are termed transgenes, and the resulting organism is a transgenic organism. Because this process is similar to the transformation of bacteria (that is, the uptake of free DNA from outside the cell to inside the cell; see Section 6.3), the creation of a transgenic organism is also referred to as transformation. The ease with which this process is accomplished varies significantly between organisms and thus influences strategies for gene cloning and subsequent analyses of gene function.
A second key aspect is the creation of libraries, collections of clones of DNA fragments, derived from the total DNA or mRNA isolated from an organism. A library is a set of recombinant DNA molecules that collectively includes clones of all the relevant DNA sequences of an organism. Genomic libraries are collections of cloned DNA fragments that as a group represent the entire genome of an organism, including repetitive and noncoding sequences. Genomic libraries usually consist of tens to hundreds of thousands of clones, each carried within an individual cloning vector, usually a plasmid (see Section 6.1) or bacteriophage (see Section 6.5) that has been modified to accommodate the insertion of exogenous fragments of DNA and that can be stably maintained in a host, such as E. coli. Some cloning vectors are specialized for carrying small (2–10 kb) pieces of genomic DNA and other cloning vectors are specialized for carrying large pieces (greater than 100 kb). After the fragments of a genome have been inserted into vectors, the vectors containing the genomic DNA are propagated in bacteria. A collection of many thousands to millions of bacterial colonies, each of which harbors copies of a different piece of genomic DNA, makes up the genomic library.
In contrast to genomic libraries, complementary DNA libraries (cDNA libraries) are collections of cloned DNA fragments that represent mRNA produced by an organism or cell type. In other words, only the portion of the genome that is transcribed is represented in a cDNA library. The clones of a cDNA library are also placed in cloning vectors, such as specially modified plasmids, and introduced into bacteria so that the complete cDNA library is composed of a large number of bacterial colonies, each of which harbors a different cDNA clone derived from the mRNA population.
Within a library, clones containing specific DNA sequences can be identified through complementary base pairing. With awareness of these tools, we can now consider the two approaches that are the focus of this section and whose purpose is to physically identify specific genes.
❚ First, genes can be identified by introducing a wild-type copy of a gene to complement a recessive mutant phenotype.
❚ Second, advances in DNA sequencing technology have made it feasible to find genes identified in genetic screens by directly comparing the genome sequence of the mutant with that of the wild-type strain from which it was derived.
Cloning Genes by Complementation
The most direct approach to identifying specific genes is to detect genetic complementation of a mutant phenotype by an introduced wild-type gene. This approach is restricted to cases in which large numbers of transgenic organisms can be generated. Consider the yeast temperature-sensitive
cell-cycle mutants described in Section 14.1. If clones of a yeast cDNA expression library are transformed into a yeast cell-cycle mutant, any clones that complement the mutant phenotype so that the cells grow normally should contain wild-type alleles of the mutated gene (Figure 14.6).
In a procedure of this type, the yeast strain would first be transformed and grown at the permissive temperature. The resulting yeast colonies would then be transferred to an environment maintained at the restrictive temperature. Only the yeast colonies receiving a clone encoding a wild-type version of the mutant gene in question would be able to continue growth at the restrictive temperature; in those colonies, the mutant phenotype would have been complemented by the added gene.
Figure 14.6 An example of cloning by complementation.
Starting material: temperature-sensitive cdc2 mutants of Schizosaccharomyces pombe.
1 Transform with S. pombe cDNA expression library designed so that the cDNA sequences are transcribed and translated in the host. Each S. pombe cell receives a different cDNA clone from the library. (Label: CDC2-containing plasmid.)
2 Plate at 23°C.
3 Replica plate and grow at 36°C. Only colonies harboring a cDNA clone that can complement the cdc2 mutant will grow at the restrictive temperature.
Complementation experiments can also be used to identify similar genes from other species, if there is sufficient conservation of protein function. For example, research in which a yeast cell-cycle mutant was transformed using a human cDNA expression library (one in which the human cDNA clones were first fused with sequences allowing for their transcription and translation in yeast) has led to identification of human genes similar in function to the mutated yeast genes. The fact that both human and plant genes can complement these yeast mutants demonstrates the universality of the cell-cycle machinery and indicates that such proteins were present in the common ancestor of eukaryotes.
Genome Sequencing to Determine Gene Identification
Cloning genes by complementation is not applicable to all organisms, as it relies on a high efficiency of transformation, that is, a high frequency of successful transformation events in a host population (available in many bacteria and some fungi). When this is not feasible, as in most multicellular eukaryotes, how do biologists find the DNA sequence for a gene that is known only by its mutant phenotype? The most direct way to identify the molecular nature of mutations might seem to be to compare the genome sequence of the mutant line with that of the wild-type strain from which it was derived.
In theory, comparison of wild-type and mutant sequences should be straightforward, but there are both technical and physiological obstacles. First, in organisms like humans, it is difficult to distinguish between causative mutations and widespread polymorphisms. Second, even in inbred laboratory animals, typical mutagenesis protocols produce up to several hundred new mutations in each mutagenized gamete, introducing the need to backcross new mutant lines with their wild-type parental strain, as described earlier in this chapter, to isolate the causative mutation from the background of other mutations induced during the mutagenesis.
These obstacles can be overcome in inbred laboratory organisms by examining the genome sequences of many mutant individuals simultaneously after backcrossing. The details of how genome sequencing is accomplished are described in Section 16.1, but a conceptual outline of its application to identify a gene originally defined by a mutant phenotype is presented in Figure 14.7. First, the newly identified mutant line is backcrossed with the wild-type strain from which it was derived. The resulting F1 individuals are interbred to produce an F2 generation from which homozygous mutants can be selected. DNA is isolated from a number of homozygous mutants in the F2 and is then pooled and sequenced in amounts sufficient to ensure that, on average, every nucleotide in the genome of each individual will be sequenced. The idea is that the causative mutation will be homozygous in all F2 individuals selected, while other mutations will not. Mutations that are not linked to the causative mutation will segregate in a Mendelian fashion in the F2, and this pattern will be reflected in the genome sequences. Mutations that are linked will segregate according to how closely they are linked to the causative mutation.
The concept behind using a large number of F2 progeny is that, although in a single F2 individual the probability of
recombination between the causative mutation and another, closely linked mutation will be low, in a large population some level of recombination will occur between the causative mutation and most linked mutations. For example, if 50 homozygous mutant F2 individuals are examined, 100 meiotic events are being assayed (since meiosis will have occurred to produce each of the gametes in the F1 parents), providing a resolution of approximately 1 cM. Knowing the genome sizes of the model genetics organisms and their genetic map length (see back endsheets), a researcher can approximate the likelihood of identifying only a small number of candidate mutations. Due to inexpensive DNA-sequencing technologies, this approach for going from mutant phenotype to gene identification is becoming commonplace in Drosophila, C. elegans, and Arabidopsis.
How does one prove that the causative mutation has been identified? In organisms amenable to transformation, the “gold standard” of gene identification is to complement the mutant phenotype by introducing a copy of the wild-type allele into the mutant background. This approach is similar to cloning by complementation described earlier, except the number of candidate genes is reduced from the entire set of genes in the genome to only the candidate gene(s) identified by genome sequencing. Transformation experiments are routine in many model genetic organisms and are described in more detail in Section 15.2.
In organisms not amenable to transformation (e.g., humans), other approaches must be used to identify causative mutations conclusively. For example, having multiple independent mutant alleles can facilitate gene identification. Genome sequencing of each independent mutant may reveal many candidate genes, but when the genome sequences are compared, they should all be seen to contain mutations of the same gene. However, if there is only a single mutant allele available, it may be difficult to tell whether differences in the DNA sequences of candidate genes are the cause of the mutant phenotype or simply polymorphisms
Figure 14.7 Genomics approach to gene identification following mutagenesis. (a) Strategy to identify a mutant gene via genomic sequencing. (b) Example from Arabidopsis.
Panel (a) steps:
1 Cross new homozygous mutant (+m+/+m+) with wild-type strain (+++/+++) from which it was derived. The only differences in DNA sequence should be those introduced during mutagenesis.
2 Interbreed F1 individuals (+m+/+++).
3 Select a large number of homozygous mutant F2 individuals.
4 Isolate DNA from 25–100 homozygous mutant F2 individuals.
5 Pool DNA, and sequence such that, on average, every nucleotide is sequenced for each of the pooled individuals.
Panel (b) labels: Chromosome 1; Chromosome 2; unlinked chromosome difference from reference (sequence alignments not recovered).
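The expectation behind the pooled-sequencing strategy can be written down as a short calculation. This is a sketch under simplifying assumptions: each induced variant starts on the mutagenized chromosome set together with the causative mutation, selection of F2 individuals is only for homozygosity at the causative site, and sequencing is error free. The function names are hypothetical.

```python
def expected_pool_frequency(recomb_fraction):
    """Expected frequency of an induced variant in the pooled F2 DNA.

    Every chromosome in a selected F2 individual carries the causative
    mutation, so a variant at recombination fraction r from it is retained
    on a chromosome with probability 1 - r
    (r = 0: completely linked, r = 0.5: unlinked).
    """
    if not 0.0 <= recomb_fraction <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return 1.0 - recomb_fraction


def meioses_assayed(n_f2_individuals):
    """Each F2 individual comes from two gametes, hence two F1 meioses."""
    return 2 * n_f2_individuals


def approx_resolution_cM(n_f2_individuals):
    """One recombinant among N meioses corresponds to roughly 100 / N cM."""
    return 100.0 / meioses_assayed(n_f2_individuals)


for r in (0.0, 0.01, 0.10, 0.5):
    print(f"r = {r:4.2f}  expected variant frequency = {expected_pool_frequency(r):.2f}")
print("meioses assayed with 50 F2 individuals:", meioses_assayed(50))
print("approximate resolution (cM):", approx_resolution_cM(50))
```

In this model the causative mutation appears in all pooled reads, unlinked induced variants appear in about half of them, and linked variants fall in between. Candidate mutations are therefore the variants whose frequency in the pool is at or near 1.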
existing in the population. In this case, candidate mutations must be assessed by additional approaches, such as examining whether the corresponding gene is expressed in a pattern consistent with the mutant phenotype.
14.3 Reverse Genetics Investigates Gene Action by Progressing from Gene Identification to Phenotype
Forward genetics was for a long time the primary (and for much of the 20th century, the only) approach to uncovering gene function. Now, however, the development of molecular methods for gene and genome manipulation and advances in sequencing technologies are making reverse genetics approaches increasingly valuable and common.
The reasons for shifting toward reverse genetics are twofold. First, the enormous amount of genomic sequence available has increased by orders of magnitude the number of known gene sequences, and only a fraction of them have been assigned a function by forward genetics. For example, when the E. coli genome was fully sequenced, 4288 protein-coding genes were identified, only 1853 of which had been previously identified through forward genetic screens. Second, genomic sequencing and reverse genetic screens have uncovered a degree of gene duplication not previously suspected. Gene duplications often result in genetic redundancy. In forward genetic screens, such duplicated genes would not be identified, since mutation of only one of the genes would not usually result in a conspicuous mutant phenotype. However, reverse genetics approaches, where the functions of both duplicates can be disrupted in an individual organism, are particularly suited in these situations to provide evidence of gene function.
Reverse genetics begins with the creation of a mutant allele for a gene identified only by its sequence (see Figure 14.1). The selection of mutational tools is largely dependent on the biology of the experimental organism. We describe here four technologies for reverse genetics, including one that is presently revolutionizing the field of genetics.
Genome Editing
You may not realize it, but you are living through a revolution in genetics due to advances in technologies to manipulate DNA sequences in the genomes of living cells. A dream of geneticists for many decades was to have the ability to “edit” the genome, precisely changing the nucleotide sequence at a specific chromosomal locus to any desired sequence. Remarkably, this dream has become reality in the past few years. The general concept is to design a DNA endonuclease to target a specific genomic location. The endonuclease creates a double-strand break at the site, which can be subsequently repaired by endogenous repair mechanisms, either through nonhomologous end joining (NHEJ) or homologous recombination (see Section 11.5). If the double-strand break is repaired by NHEJ, then small deletions often remain at the site of the break, leading to possible loss- or gain-of-function alleles, depending on what sequences are lost. Alternatively, the break may be repaired by homologous recombination, either with endogenous sequence from the homologous chromosome in a diploid cell or with exogenously supplied DNA sequences. In the latter case, if the exogenously supplied DNA has been constructed in such a way that it contains the desired change, a specific sequence change in the chromosome may be accomplished.
Two different approaches have been designed to cause the nuclease to target a specific site in the genome of living cells. First, the nuclease can be translationally fused to a sequence-specific DNA binding domain that recognizes only the site in the genome to be targeted (translational fusion is discussed in Section 14.4). Second, the nuclease can be incorporated into a complex with an RNA molecule, which provides specificity via complementary base pairing with the target sequence of interest. This latter approach is based on reengineering a bacterial system called CRISPR–Cas9, which has become the system of choice due to its ease of use, its flexibility, and the fact that it is inexpensive.
CRISPR–Cas9
The CRISPR story begins in the 1990s in the salt marshes along the Mediterranean coast of Spain, where scientists were investigating an extremely salt-tolerant archaeal microbe, Haloferax mediterranei. They noted that an enigmatic array of repetitive DNA in its genome (unique spacer sequences alternating with a repeat sequence) seemed to change with changing environmental conditions. It soon became obvious from studies by numerous other scientists that related archaeal species and bacteria also possessed similar arrays but with distinct sequences. The repeats were termed CRISPR, for clustered regularly interspaced short palindromic repeats, describing the nature of the repetitive sequences (Figure 14.8a). Additionally, in each case, adjacent genomic loci encoded related sets of genes, termed CRISPR-associated (cas) genes. The cas genes encode a DNA endonuclease, either as a single protein or as a protein complex depending on the species. Given that genes in prokaryotes are often organized into operons, it became apparent that the repeats and associated genes (CRISPR–cas) had a common function, but it was not until the early 2000s that the function was determined.
A range of accumulated experimental evidence indicated that the CRISPR–cas system acts as a defense mechanism against invading nucleic acids. The unique spacer sequences in the CRISPR repeat were found to be derived from the genomes of phage (see Section 6.5) and to act as guides directing the Cas endonuclease to a specific sequence of an invading phage. The CRISPR sequences are transcribed into a noncoding RNA and processed into individual repeat
520 CHAPTER 14 Analysis of Gene Function by Forward Genetics and Reverse Genetics
[Figure residue. Recoverable labels: crRNA + tracrRNA; sequence of guide RNA gene designed to target locus of interest; RNA pol III promoter; guide RNA; Cas9; transcription; transcription and translation; target; cleavage.]
target multiple chromosomal loci simultaneously. Target-site selection must take into account that the length of the guide RNA sequence that is complementary to the target is about 20 base pairs. Any particular 20-bp sequence has the probability of occurring at random approximately once every 10^12 base pairs (4^20, assuming equal base pair composition in a genome). This may seem sufficiently rare to be acceptable even in the human genome of 3 × 10^9 base pairs, but genome sequences are not random, and therefore a target site should be chosen that will reduce the binding at “off-targets” as much as possible. Having the genome sequences for many organisms available to be searched makes the task of choosing appropriate target sites simpler (see Chapter 16).

The ability of CRISPR technology to create specific mutations in the genome of a live cell has revolutionized reverse genetic approaches to the study of gene function and given rise to a rapid proliferation of applications. One obvious application that we explore further in Section 15.3 is gene therapy, in which a mutant allele in the cells of an individual is “corrected” to a functional state.

Applications in agriculture that modify the genotype and hence phenotype of domesticated plants and animals have the potential to accelerate creation of new breeds and varieties for specific purposes or for adaptation to changing climates. Such technologies can be designed to ensure that the resulting organisms do not carry any exogenous genes and thus are nontransgenic. The technology has been used, for example, to create hornless cattle, obviating the need for painful “dehorning,” and to produce pigs that are resistant to swine flu. Further demonstrating what is feasible using CRISPR systems, geneticists targeted 62 endogenous retrovirus loci in a pig embryo for simultaneous mutation, thereby eliminating all of these elements from the pig genome. The rationale behind this experiment was to generate a pig breed suitable for temporary xenotransplantation of organs into humans while they await a more permanent human donor organ. Also pushing the present boundaries of accomplishment, DNA from a woolly mammoth, a species that went extinct 4000 years ago, was spliced into the DNA in a cell of an elephant, raising the prospect of one day recreating now-extinct plants and animals. As can be seen from these applications, there are significant ethical considerations to examine before such modified organisms can be released from the confines of the laboratory.

Use of Homologous Recombination in Reverse Genetics

Although CRISPR–Cas has dramatically transformed how reverse genetics is approached, other previously established methods are still in use. Another powerful technique for producing loss-of-function alleles, for example, is to utilize the endogenous mechanisms of recombination to integrate exogenous DNA fragments into the genome. Conceptually, the simplest way to construct a loss-of-function allele would be to delete the gene of interest from the genome, but the deletion of a specific sequence from the genome requires techniques, such as homologous recombination, that precisely manipulate the genomes of living organisms. These techniques are very efficient in certain microorganisms, such as bacteria, archaea, and some simple eukaryotes, but they are much less efficient in more complex eukaryotes like plants and animals. Thus, different approaches are used in reverse genetics, depending on the nature of the organism (Table 14.2).

Reverse genetics approaches for most of the commonly used model genetic organisms utilize knockout libraries, collections of mutants in which most or all genes have been mutated by inactivating, or “knocking out,” their expression. Most knockout mutants are produced by the insertion of exogenous pieces of DNA into the genome to generate loss-of-function alleles; thus, most alleles in the libraries are null alleles. Saccharomyces cerevisiae and E. coli geneticists have, for example, systematically generated loss-of-function alleles of all known S. cerevisiae and E. coli genes by homologous recombination. In these knockout library collections, each strain has a single mutation in a different gene. In this subsection we discuss the use of homologous recombination, and in the next we discuss applications in which the DNA is integrated at random locations in the genome.

If DNA that is introduced into an organism has no origin of replication, it undergoes one of two fates: enzymatic degradation or integration into the host genome. Enzymatic degradation, accomplished by nucleases that are common in cells, will eliminate the introduced DNA. Integration of DNA into the host genome, in contrast, allows the introduced nucleic acid to persist in the host cell. Integration is accomplished by either of two distinct mechanisms of recombination: illegitimate recombination or homologous recombination.

Table 14.2 Reverse Genetics Approaches in Model Genetic Organisms

Species                     Reverse Genetics Tools
Escherichia coli            Knockouts by homologous recombination
Saccharomyces cerevisiae    Knockouts by homologous recombination
Arabidopsis thaliana        CRISPR; random T-DNA and transposon insertions; TILLING; RNAi
Drosophila melanogaster     CRISPR; random P element insertion lines; RNAi
Caenorhabditis elegans      CRISPR; RNAi loss-of-function alleles
Mus musculus                CRISPR; knockouts by homologous recombination; RNAi
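The chance-occurrence estimate given for a 20-bp guide sequence can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes equal base frequencies (as the text does) and counts a single strand, and the function names are hypothetical.

```python
# Rough check of how often one specific 20-bp sequence is expected by chance.
# Assumes equal frequencies of A, C, G, and T, as stated in the text.

GUIDE_LENGTH = 20        # bases of guide RNA complementary to the target
HUMAN_GENOME_BP = 3e9    # approximate human genome size used in the text


def possible_sequences(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def expected_chance_matches(genome_bp: float, length: int) -> float:
    """Expected number of chance occurrences of one specific sequence."""
    return genome_bp / possible_sequences(length)


if __name__ == "__main__":
    print(f"4^{GUIDE_LENGTH} = {possible_sequences(GUIDE_LENGTH):.2e}")
    matches = expected_chance_matches(HUMAN_GENOME_BP, GUIDE_LENGTH)
    print(f"expected chance matches in the genome: {matches:.4f}")
```

Under these assumptions the expected number of chance matches is far below one. Because real genome sequences are not random, this figure is only a lower bound on the off-target problem, which is why candidate target sites are still checked against the genome sequence.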
14.3 Reverse Genetics Investigates Gene Action by Progressing from Gene Identification to Phenotype 523
Illegitimate recombination integrates introduced DNA at a random, nonhomologous location. This form of recombination does not require any homology between the introduced DNA and the genomic DNA into which the former is integrated. In contrast, the second mechanism for integration of introduced DNA, homologous recombination between the introduced DNA and the host genomic sequence, requires a significant length of DNA sequence in common between the two recombining molecules. The relative frequencies with which these mechanisms occur depend on the species into which the DNA is introduced. In most plant and animal species, illegitimate recombination is the more common fate, but techniques exist to select for individuals in which homologous recombination has occurred (as described later in this chapter). In bacterial and fungal species, introduced DNA is often recombined in the genome in a homologous manner.

For example, fragments of DNA introduced into the yeast, S. cerevisiae, have a propensity to undergo homologous recombination with the yeast chromosome if sequence homology is present. An introduced circular molecule of DNA can recombine by either a single crossover or a double crossover (Figure 14.10a). In a single crossover, the entire molecule of introduced circular DNA is integrated into the yeast genome with no loss of any genomic DNA. If recombination of a circular molecule occurs by double crossover, however, only DNA between the homologous flanking sequences is integrated into the recipient genome, and the integration is accompanied by a concomitant loss from the genome of the DNA between the homologous sequences. Thus, recombination with two crossovers results in replacement of the genomic DNA with the introduced DNA flanked by the homologous sequences.

Introducing a linear rather than circular molecule of DNA favors retrieval of recombinants produced by double crossover, since a single crossover will cause a deletion event resulting in recombinant molecules lacking a large portion of the original chromosome and therefore likely
[Figure: homologous recombination between a yeast chromosome carrying target gene+ and an introduced plasmid carrying a selectable marker and homology with the target gene (crossover sites 1 and 2).
(a) Homologous recombination with circular DNA molecule. Single crossover results in integration of introduced DNA without replacement of target gene. Double crossover results in replacement of target gene.
(b) Homologous recombination with linear DNA molecule. Single crossover results in integration of introduced DNA and loss of the chromosome distal to the integration site (part of chromosome is lost). Double crossover results in replacement of target gene.]
to be lethal (Figure 14.10b). Linearized DNA molecules recombine at a higher frequency than circular ones, making the introduction of linear molecules the method of choice for homologous recombination experiments.

Taking advantage of this tendency for homologous recombination to occur in yeast, yeast geneticists create recombinant yeast through both gene insertion and gene replacement. Loss-of-function alleles are created by replacing the target gene with heterologous DNA, often a selectable marker gene, thus eliminating the production of functional wild-type protein by the target gene. Gene insertions that result in a deletion of the entire coding region of the gene create null alleles that produce no protein product. Such insertion alleles are often called gene knockouts because the insertion “knocks out” the function of the gene (as explained above in the definition given for knockout libraries), creating a recessive loss-of-function allele. Conversely, inserting a functional gene, often creating a gain-of-function allele, is called a knock-in.

The ease with which homologous recombinants are generated in S. cerevisiae has allowed the production of a large number of yeast strains for genetic analysis of biological processes in this organism. Loss-of-function alleles of every gene in the S. cerevisiae genome have been generated and can be ordered from a stock center. Such stocks have greatly facilitated genetic research by relieving scientists of the need to produce mutations in the genes of interest at the start of every new genetic experiment.

Use of Insertion Mutants in Reverse Genetics

In many model genetic organisms, homologous recombination frequencies are very low, and thus it is not technically simple or economically feasible to systematically generate loss-of-function mutants for all the genes. However, if an organism is easy to transform, populations of random mutant organisms can be generated by transposon insertions or, in the case of plants, T-DNA insertions (see Section 15.2 for details). These populations can then be screened for mutations in specific genes, using PCR-based techniques with a primer that is specific to the gene of interest and a primer that is specific to the insertional mutagen used (Figure 14.11).

For some model genetic systems, such as Drosophila and Arabidopsis, the precise genomic locations of thousands of random insertions have been identified, permitting mutations in specific genes to be ordered directly from stock centers. In Drosophila, P elements (Section 11.7) have been used as an insertional mutagen. These P elements can be mobilized by crossing flies possessing a nonautonomous P element with flies possessing an active transposase
[Figure: PCR-based screening of a population of insertion lines (1, 2, 3, ... 100,000) from which DNA is isolated. Gene-specific primers (g1 and g2) are used in conjunction with transposon/T-DNA–specific primers (t1 and t2) in PCR reactions. If a gene does not have an insertion (wild type), only the combination of primers g1 + g2 results in a product. If a gene has an insertion, specific combinations of g and t primers (in this case g1 + t1 and g2 + t2) will yield a product. In addition, the g1 + g2 primers should yield a larger product as compared to wild type.]
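The primer logic described for screening insertion lines can be written out as a small model. This is a minimal sketch, not a laboratory protocol: the coordinate scheme, the `border_offset` parameter (distance of t1 and t2 from the insert ends), and the function name are assumptions made for illustration.

```python
from typing import Dict, Optional


def pcr_products(g1: int, g2: int,
                 insertion_site: Optional[int] = None,
                 insert_length: int = 0,
                 border_offset: int = 50) -> Dict[str, Optional[int]]:
    """Expected product size (bp) for each primer pair; None means no product.

    g1 and g2 are the positions of the gene-specific primers (g1 < g2).
    t1 and t2 are assumed to sit border_offset bp inside each end of the
    insert, pointing outward toward g1 and g2, respectively.
    """
    has_insertion = insertion_site is not None and g1 < insertion_site < g2
    if not has_insertion:
        # Wild type: only the gene-specific pair amplifies.
        return {"g1+g2": g2 - g1, "g1+t1": None, "g2+t2": None}
    return {
        # The gene-specific product now spans the whole insert.
        "g1+g2": g2 - g1 + insert_length,
        "g1+t1": insertion_site - g1 + border_offset,
        "g2+t2": g2 - insertion_site + border_offset,
    }


if __name__ == "__main__":
    print(pcr_products(0, 1000))                    # wild type
    print(pcr_products(0, 1000, insertion_site=400,
                       insert_length=5000))         # insertion allele
```

In this model, a product from either gene-plus-insert primer pair signals an insertion between g1 and g2, and the sizes of those products indicate where in the gene the insertion lies.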
targeted induced local lesions in genomes (TILLING). In a TILLING protocol, a population of organisms of an inbred strain is randomly mutagenized throughout the genome. Enough independent lines are produced to bring the level of mutagenesis to near saturation, so that, ideally, each gene is represented by multiple mutant alleles in the mutagenized population. Often, the mutagen employed in the development of the mutagenized lines is a chemical such as EMS (Table 14.1). DNA from the mutagenized lines is screened systematically using PCR-based methods to search for mutations in a particular gene of interest.

For each individual of the mutagenized population, both progeny and DNA are collected. The generation derived from the mutagenized population is often referred to as the M1 generation (Figure 14.13a). DNA is isolated from M1 individuals or from M2 families of organisms. Any DNA carrying a mutation induced in the mutagenesis will be either heterozygous (if the DNA was derived from an M1 individual) or segregating (if the DNA was derived from an M2 family). A region of the target gene is chosen for PCR-based amplification. The PCR products generated in this analysis are expected to contain both the wild-type sequence and mutant sequence. Those that consist solely of the wild-type allele can be distinguished from those consisting of a mixture of the wild-type allele and a mutant allele as follows.

The PCR products are first denatured and allowed to reanneal, creating some homoduplex DNA, in which the strands are fully complementary because they are derived from the same allele, and some heteroduplex DNA (Figure 14.13b). Heteroduplex DNA is composed of strands that are largely complementary but contain one or more mismatched base pairs, indicating that the strands are derived from DNA containing different alleles. Heteroduplex DNA can be distinguished from homoduplex DNA by either a difference in migration of the products during electrophoresis or by differential susceptibility to an endonuclease that cleaves heteroduplex DNA at mismatched base pairs. Heteroduplex DNA forms only in DNA samples in which a mutation in the target gene is present. Screening progeny from several thousand mutagenized individuals often allows identification of multiple mutant alleles of the target gene. Individuals homozygous or heterozygous for the mutant allele can then be identified in the appropriate M2 family.

When chemical mutagenesis is used to produce TILLING alleles, it results in both null alleles and partial loss-of-function alleles. The spectrum of phenotypes produced by alleles obtained through TILLING approaches is often of use in dissecting gene function, even in organisms where gene knockouts are available. Although TILLING was developed for studies in model genetic species, it is suitable for any organism that can be mutagenized and genetically analyzed. It is currently being applied to several crop plants.

Genetic Analysis 14.2 tests your understanding of the reverse genetics analytical techniques discussed in this section.

Figure 14.13 Reverse genetics by TILLING.
(a) Seeds are mutated to produce the M1 generation. Each M1 plant is heterozygous for mutations in different genes (shown as colors in the original figure). Each M1 individual is propagated to produce an M2 family. Each M2 family is segregating for mutations in different genes (+/+, +/–, and –/– individuals; homozygous mutants in color). Seed stock and DNA samples are collected from each M2 family. Seed stocks represent a repository of mutants.
(b) Mutations in specific genes are identified by analyzing DNA isolated from each M2 family. For example, one representative M2 family has a red mutant segregating. DNA is collected and screened for mutations in the target gene by PCR amplification. PCR products from the wild-type allele (G·C) and the mutant allele (A·T) reanneal into homoduplex DNA and heteroduplex DNA. Endonuclease (Cel1) cleaves a single strand at mismatches in heteroduplex DNA. After denaturing electrophoresis of M2 families 1 through 5, most M2 families have only uncut (wild-type) DNA. One family (red) has cut DNA, indicating a mutation in the gene of interest.
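The heteroduplex screening step can be illustrated with a toy model that predicts which fragment lengths appear after mismatch cleavage and denaturing electrophoresis. This is a simplified sketch under stated assumptions: the sequences are invented, only substitutions are modeled, and a cut at a mismatch is treated as splitting the strand at that position.

```python
from typing import List


def mismatch_positions(allele_a: str, allele_b: str) -> List[int]:
    """Positions at which two equal-length alleles differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be the same length")
    return [i for i, (a, b) in enumerate(zip(allele_a, allele_b)) if a != b]


def expected_fragments(wild_type: str, family_alleles: List[str]) -> List[int]:
    """Fragment lengths expected on the gel for one M2 family.

    The full-length (uncut) product is always present. Each mismatch in a
    wild-type/mutant heteroduplex adds the two cleavage fragments.
    """
    full_length = len(wild_type)
    lengths = {full_length}
    for allele in family_alleles:
        for pos in mismatch_positions(wild_type, allele):
            if pos > 0:  # a cut at position 0 would leave nothing to resolve
                lengths.update({pos, full_length - pos})
    return sorted(lengths, reverse=True)


if __name__ == "__main__":
    wild_type = "ACGTGGCATT"   # hypothetical PCR product
    mutant = "ACGTAGCATT"      # same product with a G-to-A change
    print(expected_fragments(wild_type, [wild_type]))          # uncut only
    print(expected_fragments(wild_type, [wild_type, mutant]))  # cut + uncut
```

A family whose predicted pattern contains bands shorter than the full-length product corresponds to the family with cut DNA in Figure 14.13b.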
GENETIC ANALYSIS 14.2

PROBLEM In searching the mouse genome, you identify three mouse orthologs similar to the single hedgehog gene of Drosophila. (Orthologs are genes descended from a single gene in the common ancestor of two or more species and therefore often have similar functions in those species; for more detailed discussion see Section 16.2.) The mouse genes are Sonic hedgehog, Indian hedgehog, and Desert hedgehog. Describe the research design you would use to learn the function of each of the genes and whether that gene function is unique or redundant in the mouse.

BREAK IT DOWN: When genes in different species are highly similar, they are likely to have originated from a single ancestral gene in a common ancestor.

BREAK IT DOWN: You are starting with gene sequences and wish to know gene functions. Which genetics approach, forward or reverse, is most appropriate?

Evaluate

1. Identify the topic this problem addresses and the nature of the required answer.
   This problem is about designing research to identify the functions of genes known only by sequence and to discover whether those functions are unique or redundant.

2. Identify the critical information given in the problem.
   While only one hedgehog gene exists in Drosophila, three “hedgehog” gene sequences exist in the mouse, raising the question of whether the three mouse genes have different functions or whether there is any sharing of function.

Deduce

3. Consider possible approaches to discovering the functions of genes known only by sequence.
   Functions of genes known only by sequence can be determined by reverse genetics approaches.
   TIP: Reverse genetics approaches can be used for functional analysis (p. 521).

4. Consider possible approaches to reverse genetics available for use with mice.
   CRISPR–Cas9 approaches can be used to produce loss-of-function mutations in mice. Other reverse genetics approaches, such as homologous recombination or RNAi, could also be used, but CRISPR–Cas9 is the preferred method.
   TIP: Consider the methods appropriate for creating mutations in mice (see Table 14.2).

Solve

5. Describe a genetics approach to determine whether the genes have unique or redundant functions.
   First, use CRISPR–Cas9 to create loss-of-function knockout alleles of each of the three genes. Homozygous mutant lines can then be bred and the phenotypes of each of the three single knockouts examined. Interbreeding the single-mutant lines will lead to the creation of strains in which combinations of two or more genes are inactive. Comparison of phenotypes of single mutants with those of multiple mutants allows an assessment of whether the genes exhibit unique or redundant functions.

For more practice, see Problems 14, 16, 29, and 31.
14.4 Transgenes Provide a Means of Dissecting Gene Function

Transgenes have other uses in the study of gene function, in addition to the creation of loss-of-function alleles. For example, chimeric genes, which are transgenes composed of regulatory sequences from one gene and coding sequences from a second gene or of coding sequences from two different genes, provide a means to create gain-of-function alleles, as well as to monitor gene expression patterns. This section describes in greater detail the ways transgenes can reveal genetic function.

Although an almost limitless array of transgenes can be constructed for genetic analysis, many fall into one of two categories. One category consists of reporter genes, used to investigate gene regulation because they produce a visual output of gene expression patterns. Fusion of the regulatory sequences of a gene of interest to coding sequences of a reporter gene provides information about where, when, and how much a gene is expressed. Some reporter genes facilitate live imaging and monitoring of gene expression in real time.

The second category of transgenes useful for genetic analysis consists of gain-of-function alleles generated by placing coding regions from one gene under control of regulatory sequences derived from another gene. An allele constructed in this way often results in ectopic expression, meaning expression observed at times or in places where the gene is not normally expressed. The use of either or both of these types of transgenes can complement analyses of loss-of-function alleles by providing information on how genes are normally expressed and the phenotypic consequences of changing their normal expression pattern.
Monitoring Gene Expression with Reporter Genes

A gene can act as a reporter if its product can be detected directly or is an enzyme that produces a detectable product. The regulatory sequences of a gene under study are used to drive the expression of the reporter gene. Two types of reporter gene fusions can be constructed: transcriptional and translational (Figure 14.14).

In a transcriptional fusion, regulatory sequences directing transcription of the gene of interest are fused with the reporter gene so as to direct transcription of the coding sequences of the reporter gene. In this case, the reporter gene will be transcribed in the pattern directed by the regulatory sequences to which it is fused. Note that the transcriptional fusion shown in Figure 14.14 is idealized and that regulatory sequences may reside in regions other than immediately 5′ upstream of the gene of interest. In a translational fusion, not only the regulatory sequences but also the coding sequence of the gene of interest are fused to the reporter gene in such a way that the reading frame for translation is maintained for both the gene of interest and the reporter gene. As a result, the reporter protein will be translationally fused with the protein of interest, and the location of the reporter protein will provide information not only on the spatial and temporal transcriptional expression pattern but also on the subcellular location of the fusion protein. In translational fusions, care must be taken to ascertain whether the fusion protein is still functional, since the addition of the reporter protein could interfere with the proper folding or activity of the protein of interest.

Some frequently used reporter genes are represented in Figure 14.15. The choice of reporter gene depends on the biological question being addressed. With some reporter genes, the assay to monitor gene expression requires sacrificing the organism, whereas the expression of other reporter genes can be traced in a living organism. To be detected, reporter gene products sometimes require substrates that must penetrate into the tissues or cells where the reporter genes are expressed. In addition, reporter genes vary in their sensitivity.

One of the first reporter genes to be developed emerged from research on the lac operon in E. coli (see Section 12.2). To purify and study the activity of β-galactosidase, encoded by the lacZ gene, a number of β-galactosides were synthesized and tested as substrates. Two β-galactosides, abbreviated X-gal and ONPG, were found to be useful. β-Galactosidase cleaves the colorless substrate, ONPG, into a yellow product. This assay is typically used for in vitro measurement of β-galactosidase activity. In contrast, X-gal, also colorless, is cleaved by β-galactosidase into a blue product. This assay can be used in bacteria in vivo, since bacterial cells can take up the X-gal substrate without a reduction in viability.

The lacZ gene can be used in conjunction with the substrate X-gal as a reporter gene in animal systems (Figure 14.15a). However, because plants have an endogenous β-galactosidase activity, lacZ is not suitable for studying plant systems. An alternative option is the E. coli uidA gene encoding β-glucuronidase, which enzymatically cleaves a colorless precursor, X-gluc, into a blue product (Figure 14.15b). Conversely, because animals have endogenous β-glucuronidase activity, the uidA gene cannot be
[Figure 14.14 residue: reporter gene fusions.
Transcriptional fusion: 5′ upstream regulatory sequences; transcription start site; 5′ UTR; ATG; reporter gene; STOP; 3′ UTR; 3′ downstream sequences.
Translational fusion: 5′ upstream regulatory sequences; transcription start site; 5′ UTR; ATG; exons 1 to 3 and introns 1 and 2 of the gene of interest; reporter gene; STOP; 3′ UTR; 3′ downstream sequences.]
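The requirement that a translational fusion keep the reading frame intact can be expressed as a simple check. The sketch below is a minimal illustration, assuming the gene's coding sequence (with introns and its own stop codon already removed) is placed directly upstream of the reporter; the sequences and function names are hypothetical.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(sequence: str) -> List[str]:
    """Split a coding sequence into complete codons from position 0."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def fusion_is_in_frame(gene_cds: str, reporter_cds: str) -> bool:
    """True if the reporter is read in its own frame after the gene.

    Two conditions are checked: the upstream coding sequence is a whole
    number of codons, and it contains no stop codon that would terminate
    translation before the reporter is reached.
    """
    if not reporter_cds or len(gene_cds) % 3 != 0:
        return False
    return not any(codon in STOP_CODONS for codon in codons(gene_cds))


if __name__ == "__main__":
    reporter = "ATGAGTAAAGGA"                          # invented reporter start
    print(fusion_is_in_frame("ATGGCCAAA", reporter))   # whole codons, no stop
    print(fusion_is_in_frame("ATGGCCAA", reporter))    # frameshift
    print(fusion_is_in_frame("ATGTAAGCC", reporter))   # premature stop
```

Passing this check says nothing about whether the fusion protein folds or functions correctly, which still has to be tested experimentally as the text notes.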
[Figure 14.15 panels:
(a) Lin-3 regulatory sequences driving lacZ reporter gene in C. elegans.
(b) PHABULOSA regulatory sequences driving uidA reporter gene in Arabidopsis.
(c) CaMV 35S regulatory sequences driving luciferase reporter gene in tobacco.
(d) RHODOPSIN regulatory sequences driving GFP reporter gene in Mus musculus.
(e) Mus musculus neurons expressing three different fluorescent reporter genes, derived from modifying GFP.]
used as a reporter in animals. A limitation of both of these reporter genes in organisms other than bacteria is that in order for the substrate to be taken up effectively into internal tissues, the tissue to be stained must be bathed in a solution that kills the cells.

Research into reactions that cause the natural emission of light in some animals has led to the development of reporter genes that cause light to be produced in living cells. For example, luciferase, the enzyme responsible for the glow of fireflies, catalyzes a reaction between the substrate luciferin and ATP that results in the emission of light. Transgenic plants expressing the luciferase gene will emit a yellow-green glow if supplied with the substrate (Figure 14.15c). However, luciferin is not delivered to all cells of the plant in equal measure, which in many cases limits the usefulness of the luciferase gene as a reporter.

The development of green fluorescent protein (GFP) led to great strides both in genetics and cell biology by providing a noninvasive means of visualizing gene and protein expression patterns in living organisms (Figure 14.15d). The GFP gene, derived from the jellyfish Aequorea victoria, is the source of the natural bioluminescence of this species. Its wild-type protein product, consisting of 238 amino acids, fluoresces green (a 509-nm wavelength) when illuminated with UV light (a 395-nm wavelength), which in this case is the “substrate,” delivered by laser.

Because UV light, with its short wavelength, can be harmful to organisms (e.g., causing thymine dimers to form in DNA, as described in Section 11.3), the wild-type GFP gene was mutated to produce variants that respond to lower-energy wavelengths. A major improvement was a mutation that shifted the excitation wavelength to 488 nm, corresponding to blue light and minimizing the potential
damage to cells being illuminated. Subsequent modification of the GFP protein sequence has led to the production of variants that emit other colors (e.g., yellow, cyan, blue). Genes encoding fluorescent reporter proteins have also been isolated from marine corals and other jellyfish. The availability of multiple fluorescent reporter genes makes it possible to visualize the expression of several genes simultaneously in a single organism (Figure 14.15e). Osamu Shimomura, Martin Chalfie, and Roger Y. Tsien received the 2008 Nobel Prize in Chemistry for their discovery and development of GFP.

Reporter genes can be used to dissect regulatory DNA sequences and identify specific sequences required for particular aspects of gene regulation. The general approach is to start with a clone in which all the regulatory sequences required for proper gene expression are present and then
A series of transcriptional fusions with a lacZ reporter gene (lacZ coding region) are created using restriction enzymes to remove parts of the regulatory sequence, and assayed for expression in stripes 2 and 3.

Deletion constructs

Fusion construct    Expression in stripe 2    Expression in stripe 3
5′A                 +                         +
5′F                 +                         +
5′G                 +                         +
5′H                 +                         –
5′I                 –                         –
∆B                  +                         +
∆C                  +                         –
∆D                  +                         +
∆E                  +                         +
∆F                  –                         +
∆G                  +                         +
∆I                  +                         +
∆J                  –                         –

Figure 14.16 Use of reporter gene in promoter analysis of the even-skipped (eve) gene.
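The expression results listed for the deletion constructs can be tabulated programmatically to see which constructs lose expression in each stripe. The sketch below simply encodes the +/– values from Figure 14.16; the data structure and function name are illustrative choices, and the extent of each deletion is not recoverable from the text, so no map positions are modeled.

```python
from typing import Dict, List, Tuple

# (expression in stripe 2, expression in stripe 3) for each fusion construct,
# transcribed from the +/- values in Figure 14.16.
EXPRESSION: Dict[str, Tuple[bool, bool]] = {
    "5'A": (True, True),
    "5'F": (True, True),
    "5'G": (True, True),
    "5'H": (True, False),
    "5'I": (False, False),
    "ΔB": (True, True),
    "ΔC": (True, False),
    "ΔD": (True, True),
    "ΔE": (True, True),
    "ΔF": (False, True),
    "ΔG": (True, True),
    "ΔI": (True, True),
    "ΔJ": (False, False),
}

STRIPE_INDEX = {"stripe 2": 0, "stripe 3": 1}


def constructs_losing(stripe: str) -> List[str]:
    """Constructs in which reporter expression is absent from the stripe."""
    index = STRIPE_INDEX[stripe]
    return [name for name, result in EXPRESSION.items() if not result[index]]


if __name__ == "__main__":
    for stripe in STRIPE_INDEX:
        print(stripe, "lost in:", ", ".join(constructs_losing(stripe)))
```

Constructs that lose only one stripe (for example ΔF for stripe 2, or 5′H and ΔC for stripe 3) point to deleted regions that plausibly carry sequences needed specifically for that stripe, whereas constructs that lose both remove sequences required more generally.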
to assay the effects of deleting or changing specific por- of random insertion mutants with the expression of a
tions of the clone. An example of such an analysis of the reporter gene (Figure 14.17). In its simplest application,
Drosophila even-skipped (eve) gene, which is expressed a population of transgenic organisms is generated by ran-
in seven stripes in the segmentation pattern of the embryo, dom insertion of a transposon (or T-DNA) containing the
is shown in Figure 14.16. Overlapping deletions spanning coding sequence of a reporter gene fused with a core pro-
large regions are assayed first. Then regions identified as moter region for RNA polymerase II transcription (see
important for gene regulation are dissected with smaller Section 13.1). If the insertion occurs near enhancer or
deletions. The concept is similar to that described earlier silencer regulatory sequences that can act in conjunc-
for deletion mapping (see Sections 6.5 and 10.4). When tion with the minimal promoter of the reporter gene, the
specific sequences required for proper gene expression are reporter can be expressed in a pattern that reflects the reg-
deleted, expression of the reporter gene will be correspond- ulatory capability of the nearby genomic DNA sequences.
ingly altered. The enhancers (or silencers) of the adjacent genomic
If genomic sequence is available from two or more DNA are co-opted, or “trapped,” by the insertion to drive
related species, regulatory elements may be predicted by expression of the reporter gene. Thus, from the expres-
searching for sequences that are conserved between the sion patterns of the inserted reporter gene, researchers
related species, using a method known as phylogenetic foot- can infer the existence of regulatory sequences, presum-
printing (discussed in Chapter 16). Such initial genomic ably from adjacent genes, that drive gene expression in
sequence analyses can direct subsequent experimental tests the observed patterns. While reporter gene expression
that use reporter genes to analyze expression in transgenic may not precisely reflect the expression of the adjacent
organisms. gene, the expression of the reporter often at least partially
reflects the normal gene expression pattern of the adjacent
gene. Enhancer trapping techniques were pioneered
Enhancer Trapping
in Drosophila and have now been adapted to other sys-
Enhancer trapping uses a variation of an insertional tems. Because they identify genes by gene expression pat-
library to identify genes based on expression patterns. terns, enhancer trapping techniques complement forward
This approach combines the generation of a large number genetic screens.
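The phylogenetic footprinting idea mentioned above, in which related genomes are scanned for conserved stretches that may be regulatory, can be illustrated with a short Python sketch. It is a toy illustration only: it assumes the two sequences have already been aligned to equal length (gaps written as "-"), and the function name, window size, and identity threshold are arbitrary choices made for this example rather than values from the text.

```python
def conserved_windows(seq_a, seq_b, window=8, min_identity=0.9):
    """Return (start, end) spans where two pre-aligned sequences are highly similar.

    Hypothetical helper for illustration; real footprinting tools first align
    the sequences and use more careful statistics.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    spans = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # Count identical, non-gap positions in this window.
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        if matches / window >= min_identity:
            end = start + window
            # Merge with the previous span when windows overlap or touch.
            if spans and start <= spans[-1][1]:
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((start, end))
    return spans


if __name__ == "__main__":
    # Two made-up "species" sequences sharing conserved blocks at both ends.
    species_1 = "ACGTTTTTACGT"
    species_2 = "ACGTGGGGACGT"
    print(conserved_windows(species_1, species_2, window=4, min_identity=1.0))
```

Spans reported by such a scan would only be candidates; as the text notes, they would still need to be tested experimentally with reporter genes in transgenic organisms.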
[Figure 14.17 labels: The enhancer trap construct carries a reporter gene (β-gal) and a selectable marker; a TATA box is used as a minimal promoter to recruit the basal transcriptional machinery. The enhancer trap is inserted randomly into the genome via a transposon or T-DNA vector. If the enhancer trap DNA is integrated near endogenous regulatory elements, the reporter gene will be expressed in a pattern driven by the adjacent regulatory sequences. If the enhancer trap disrupts the coding region of endogenous gene X, a loss-of-function allele is created. However, insertion of the vector may occur 5′ or 3′ to a gene and still “trap” enhancers without causing a loss-of-function mutation.]
Figure 14.17 Enhancer trapping to reveal expression patterns of endogenous genes. Strategy for
generation of enhancer trap lines.
Investigating Gene Function with Chimeric Genes
A chimeric gene, as mentioned earlier, is one in which regulatory and coding sequences derived from two or more different genes are recombined in a novel manner. For example, combining the regulatory sequences from one gene with the coding sequences from another gene often results in a gain-of-function allele due to ectopic expression of the gene represented by the coding sequences.
Figure 14.18 shows one way experimenters can take advantage of this potential to obtain information on gene function. This example makes use of the eyeless gene of Drosophila, so named because recessive loss-of-function mutations in this gene result in a failure of eyes to develop in the fly.
The eyeless gene is normally expressed in the eye imaginal discs during Drosophila development. Imaginal discs are groups of precursor cells that are set aside during embryonic development. They grow by mitotic proliferation during larval life and later differentiate into adult body tissues during metamorphosis. However, a gain-of-function eyeless allele can be created by constructing a chimeric gene in which expression of the eyeless coding sequences is driven by regulatory sequences active in all imaginal discs. If the eyeless gene is ectopically expressed in noneye imaginal discs, such as those that would normally give rise to the antennae or legs, the imaginal discs will differentiate as eye tissue instead. This outcome indicates that cells in any imaginal disc are capable of differentiating into eyes and that the eyeless gene product can promote the development of eyes from any imaginal disc. Thus, when the eyeless allele is ectopically expressed as a gain-of-function mutation in inappropriate imaginal discs, the resulting phenotype is the converse of the phenotype of the loss-of-function eyeless allele: ectopic eyes as opposed to an absence of eyes.
[Figure 14.18 labels: Wild-type Drosophila has red eyes. Gain-of-function eyeless mutants, in which the eyeless gene is ectopically expressed in the wrong imaginal discs, develop ectopic eyes on antennae, legs, and wings; the ectopic eyes are anatomically normal despite their ectopic locations. Loss-of-function eyeless mutants lack eyes entirely.]
In cases where the gain-of-function and loss-of-function phenotypes are complementary, interpretation of the effects of ectopic expression is straightforward. Thus, in the preceding example, eyeless is revealed to be a master control gene for the differentiation of eyes in Drosophila. However, ectopic expression of genes can also lead to enigmatic phenotypes that are more difficult to interpret. For example, ectopic expression of eyeless during embryogenesis leads to embryonic lethality, a phenotype that is not easily reconciled with the loss-of-function phenotype. Therefore, when considering gain-of-function alleles generated by ectopic expression, we must remember that the phenotypes represent what the gene is capable of doing when expressed in particular contexts and may not reflect the normal function of the gene.
CASE STUDY
Reverse Genetics and Genetic Redundancy in Flower Development
In this case study, we see an example of how forward genetics and reverse genetics work together to provide a broader view of both gene function and evolution. The story begins with forward genetics: the isolation of a mutation that alters flower development and the subsequent identification of the mutant gene sequence using recombinant DNA technology. The gene is then cloned and used as a probe for cloning genes of similar sequence. Finally, reverse genetics approaches are applied to identify mutant alleles of related genes, and their biological function is inferred based on the mutant phenotypes.
FORWARD GENETICS REVEALS GENES OF INTEREST In flowering plants, the types of floral organs that develop are decided by the expression of a set of transcription factors. (For further description of this activity, see Section 18.5.) The identity of Arabidopsis reproductive organs (stamens and carpels) is determined in part by the activity of the AGAMOUS gene. Recessive null loss-of-function agamous alleles lead to the development of petals in the positions usually occupied by stamens and of an additional flower in the position usually occupied by carpels. Homozygotes for agamous are sterile and do not produce gametes (hence the name AGAMOUS). In forward genetic screens aimed at identifying genes involved in Arabidopsis flower development, agamous mutant alleles induced by either EMS or T-DNA have been isolated (Figure 14.19, step 1).
The T-DNA–induced allele proved a useful tool for cloning the AGAMOUS gene because the T-DNA “tagged” the gene (step 2). Since the mutation of the AGAMOUS gene was caused by the insertion of T-DNA, the presence of a T-DNA sequence in a region of Arabidopsis DNA was an indication that the sequences encoding the AGAMOUS gene were adjacent. Recombinant DNA techniques described in Section 15.1 were used to find those sequences.
Subsequently, the genomic clone encoding AGAMOUS could be used to identify AGAMOUS cDNA clones from a library constructed with mRNA from wild-type flowers. Sequencing of the AGAMOUS cDNA clones revealed that the encoded protein had a similarity to known eukaryotic transcription factors. This conclusion was based on the similarity between a 60–amino acid domain of the AGAMOUS protein and DNA-binding domains in yeast and mammalian transcription factors (step 3).
IDENTIFICATION OF HOMOLOGOUS GENES When the AGAMOUS cDNA is used to probe a Southern blot of restriction-enzyme–digested Arabidopsis genomic DNA, sequences related to the AGAMOUS gene sequence can be identified (step 4; see Section 1.4 to review Southern blotting). The same AGAMOUS cDNA can be used as a probe on the flower cDNA library to identify clones of related genes. Genes related to AGAMOUS were called AGAMOUS-LIKE, or AGL, genes. These related genes possess the same highly conserved DNA-binding domain but differ in the rest of their protein sequences. To determine how the AGL genes are related to AGAMOUS and to each other, a phylogenetic tree can be constructed (step 5; see Section 1.5 to review phylogenetic trees).
REVERSE GENETICS REVEALS FUNCTIONS OF HOMOLOGOUS GENES Since the related genes are known by gene sequence only, a reverse genetics approach can be undertaken to determine gene function. CRISPR– or T-DNA–induced mutant alleles of many of the AGL genes in Arabidopsis can be identified in available knockout libraries (step 6; see Section 14.3).
Researchers were initially surprised to find that plants homozygous for loss-of-function alleles of many single genes did not display an aberrant phenotype. Hypothesizing that the more closely related the genes, the more similar their functions would be, researchers crossed mutants to obtain organisms containing multiple loss-of-function alleles of closely related genes (step 7). For example, sep1 mutants (having mutations of the SEPALLATA1 gene) were crossed with sep2 mutants, after which sep1 sep2 double mutants were identified in the F2 generation. Disappointingly, the sep1 sep2 double mutants did not differ significantly from wild-type plants. However, sep1 sep2 sep3 triple-mutant plants proved to have flowers consisting solely of sepals, which indicates that these genes have a function related to floral organ specification but distinct from the role of AGAMOUS.
Genetic redundancy due to gene duplications is extensive in most eukaryotic genomes (see Section 16.3). Immediately following an occurrence of gene duplication, the duplicate genes often have identical DNA sequences and expression patterns, and they are therefore genetically redundant. Over time, however, the functions of the two genes may diverge due to the accumulation of mutations that lead to changes in protein sequence and expression pattern. Yet, because the genes are evolutionarily related, they often function in similar biological processes. Reverse genetics approaches can facilitate the analysis of closely related genetically redundant genes.
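The case study mentions that sep1 sep2 double mutants were identified in the F2 generation. A short Python sketch can show why such multiple mutants are rare and how many plants a screen might need. It rests on assumptions not stated in the text: the genes are unlinked and assort independently, the mutant alleles are fully recessive, and the F2 comes from selfing an F1 that is heterozygous at every gene considered. The function names and the 95% confidence level are hypothetical choices for this example.

```python
import math
from fractions import Fraction


def f2_multiple_mutant_fraction(n_genes):
    """Expected F2 fraction homozygous mutant at n unlinked genes: (1/4)^n."""
    if n_genes < 1:
        raise ValueError("n_genes must be at least 1")
    return Fraction(1, 4) ** n_genes


def plants_to_screen(n_genes, confidence=0.95):
    """Smallest F2 sample with at least `confidence` chance of one multiple mutant."""
    p = float(f2_multiple_mutant_fraction(n_genes))
    # Solve 1 - (1 - p)^N >= confidence for the smallest whole number N.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, f2_multiple_mutant_fraction(n), plants_to_screen(n))
```

Under these assumptions, a double mutant is expected in 1/16 of F2 plants and a triple mutant in 1/64, so the number of plants to examine rises quickly with each additional redundant gene. Linkage between the genes would change these expectations.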
[Figure 14.19 labels: Forward genetics: wild-type and agamous flowers. Step 1: Generate agamous mutant by T-DNA mutagenesis. Southern blot of genomic DNA probed with AGAMOUS: related sequences cross-hybridize as shown on the blot.]
Figure 14.19 Use of forward and reverse genetics to determine gene function.
SUMMARY
Mastering Genetics: For activities, animations, and review quizzes, go to the Study Area.
14.1 Forward Genetic Screens Identify Genes by Their Mutant Phenotypes
❚❚ Forward genetic screens are designed to identify genes by creation of a mutant phenotype, often allowing researchers to infer the biological function of a gene.
❚❚ Complementation tests are used to discover the number of alleles and the number of genes affected in a forward genetic screen.
❚❚ Mutations resulting in lethality can be identified in genetic screens for conditional alleles.
❚❚ Enhancer and suppressor genetic screens identify genes that act in related or redundant pathways.
14.2 Genes Identified by Mutant Phenotype Are Cloned Using Recombinant DNA Technology
❚❚ Some genes can be cloned by complementation of a mutant phenotype.
❚❚ Advances in sequencing technologies facilitate direct identification of mutant genes.
❚❚ Candidate genes can also be identified by expression analyses, DNA sequence analyses of multiple mutant alleles, or complementation experiments.
14.3 Reverse Genetics Investigates Gene Action by Progressing from Gene Identification to Phenotype
❚❚ Reverse genetics approaches, in which determination of biological function proceeds from gene sequence to mutant phenotype, make use of collections consisting of mutants that are each defective in a different defined gene.
❚❚ CRISPR–Cas9–mediated genome editing has revolutionized reverse genetic approaches and enabled unprecedented manipulation of genome sequences in vivo.
❚❚ Collections of insertion alleles, the TILLING process, and RNAi-mediated gene silencing all contribute to the reverse genetics analysis of model organisms.
14.4 Transgenes Provide a Means of Dissecting Gene Function
❚❚ Reporter genes are used to monitor gene-expression patterns in transgenic organisms and for the dissection of regulatory sequences. Some reporter genes, such as the green fluorescent protein, can be visualized in real time in living organisms.
❚❚ Chimeric genes represent novel alleles that can provide clues to gene function.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Be familiar with mutagenesis strategies employed in forward genetics.
2. Know general strategies for analyzing collections of mutants generated in forward genetic screens.
3. Review the approaches to cloning genes known only by a mutant phenotype.
4. Review the different approaches employed in reverse genetics and the reasons for choosing one approach over another.
5. Know different ways in which CRISPR–Cas can be utilized for genome editing.
6. Be acquainted with different types of reporter genes.
7. Understand how chimeric genes can be used to investigate gene function.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.
1. What are the advantages and disadvantages of using GFP versus lacZ as a reporter gene in mice, C. elegans, and Drosophila?
2. You conduct a study in which the transcriptional fusion of regulatory sequences of a particular gene with a reporter gene results in relatively uniform expression of the reporter gene in all cells of an organism. A translational fusion with the same gene shows reporter gene expression only in the nucleus of a specific cell type. Discuss some biological causes for the difference in expression patterns of the two transgenes.
3. Discuss the similarities and differences between forward and reverse genetic approaches, and when you would choose to utilize each of the approaches.
4. Using the data inside the back cover of the book, calculate the average number of kilobase (kb) pairs per centimorgan in the six multicellular eukaryotic organisms. How would this information influence strategies to clone genes known only by a mutant phenotype in these organisms?
5. What are the advantages and disadvantages of using insertion alleles versus alleles generated by chemicals (as in TILLING) in reverse genetic studies?
6. You have cloned the mouse ortholog (see Genetic Analysis 14.2 for definition) of the gene associated with human Huntington Disease (HD) and wish to examine its expression in mice. Outline the approaches you might take to examine the temporal and spatial expression pattern at the cellular level.
7. Diagram the mechanism by which CRISPR–Cas functions in the immune system of bacteria and archaea.
8. Describe how CRISPR–Cas has been modified to create a genome-editing tool.
9. Discuss the advantages (and possible disadvantages) of the different approaches to reverse genetics.
10. Discuss the advantages (and possible disadvantages) of the different mutagens in Table 14.1.
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
11. You have identified a gene encoding the protein involved in the rate-limiting step in vitamin E biosynthesis. How would you create a transgenic plant producing large quantities of vitamin E in its seeds?
12. You have identified a recessive mutation that alters bristle patterning in Drosophila and have used recombinant DNA technology to identify a genomic clone that you believe harbors the gene. How would you demonstrate that your gene is on the genomic clone?
13. The CBF genes of Arabidopsis are induced by exposure of the plants to low temperature.
a. How would you examine the temporal and spatial patterns of expression after induction by low temperature?
b. Can you design a method that would reveal these changes in gene expression in a way that a farmer could recognize them by observing plants growing in the field?
14. When the S. cerevisiae genome was sequenced and surveyed for possible genes, only about 40% of those genes had been previously identified in forward genetic screens. This left about 60% of predicted genes with no known function, leading some to dub the genes fun (function unknown) genes.
a. As an approach to understanding the function of a certain fun gene, you wish to create a loss-of-function allele. How will you accomplish this?
b. You wish to know the physical location of the encoded protein product. How will you obtain such information?
15. Translational fusions between a protein of interest and a reporter protein are used to determine the subcellular location of proteins in vivo. However, fusion to a reporter protein sometimes renders the protein of interest nonfunctional because the addition of the reporter protein interferes with proper protein folding, enzymatic activity, or protein–protein interactions. You have constructed a fusion between your protein of interest and a reporter gene. How will you show that the fusion protein retains its normal biological function?
16. In humans, Duchenne’s muscular dystrophy is caused by a mutation in the dystrophin gene, which resides on the X chromosome. How would you create a mouse model of this genetic disease?
17. How would you perform a genetic screen to identify genes directing Drosophila wing development? Once you have a collection of wing-development mutants, how would you analyze your mutagenesis to learn how many genes are represented and how many alleles of each gene? How would you discover whether the genes act in the same or different pathways, and if in the same pathway, how do you discover the order in which they act? How would you clone the genes?
18. In enhancer trapping experiments, a minimal promoter and a reporter gene are placed adjacent to the end of a transposon so that genomic enhancers adjacent to the insertion site can act to drive expression of the reporter gene. In a modification of this approach, a series of enhancers and a promoter can be placed at the end of a transposon so that transcription is activated from the transposon into adjacent genomic DNA. What types of mutations do you expect to be induced by such a transposon in a mutagenesis experiment?
19. In Genetic Analysis 14.1, we designed a screen to identify conditional mutants of S. cerevisiae in which the secretory system was defective. Suppose we were successful in identifying 12 mutants.
a. Describe the crosses you would perform to determine the number of different genes represented by the 12 mutations.
b. Based on your knowledge of the genetic tools for studying baker’s yeast, how would you clone the genes that are mutated in your respective yeast strains? What is an approach to cloning the human orthologs (see Genetic Analysis 14.2 for definition) of the yeast genes?
20. How would you design a genetic screen to find genes involved in meiosis?
21. The eyes of Drosophila develop from imaginal discs, groups of cells set aside in the fly embryo that differentiate into the adult structures during the pupal stage. Despite their importance in nature, eyes are dispensable for fruit-fly life in the laboratory.
a. Devise a genetic screen to identify genes directing development of the fly eye.
b. What complications might arise from genetic screens targeting an organ that differentiates late in development?
22. Given your knowledge of the genetic tools for studying Drosophila, outline a method by which you could clone the dunce and rutabaga genes identified by Seymour Benzer’s laboratory in the genetic screen described at the beginning of this chapter.
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
30. How would you edit a specific nucleotide in a genome?
31. Through a forward genetics screen in Arabidopsis you have identified a mutation that results in leaves curling upward, rather than being flat as in wild type. You have cloned the corresponding gene and note that it is a member of a small gene family composed of three additional members in Arabidopsis. How will you determine if the other three members of the gene family have similar or distinct functions as compared with the gene you first identified?
32. The CRISPR–Cas9 complex directs the Cas9 endonuclease to a specific genomic locus. If the endonuclease domain is inactivated and replaced with a transcriptional activator (or repressor) domain, what would be the functional consequence of directing such a complex to a specific chromosomal location?
33. Describe how enhancer screens can be used to uncover genetic redundancy.
34. How might you use CRISPR–Cas9 to create a large deletion?
APPLICATION C
The Genetics of Cancer
In summer 2015, former President Jimmy Carter was diagnosed with metastatic
melanoma, an aggressive and potentially lethal cancer that often starts
on the skin but can occur in other tissues as well. In Carter’s case, the mela-
noma had metastasized to his liver and his brain. (Metastasize means to spread
from the original tumor to one or more new locations.) Carter underwent sur-
gery to remove the liver tumors and received radiation treatments that focused
directly on his brain tumors. He was also given the drug that goes by the trade
name Keytruda (its compound name is pembrolizumab) that targets a process
cancer cells often use to evade detection by the immune system. Fortunately for
Carter, his case was caught early enough in the metastatic process to be treat-
able. According to some cancer treatment specialists, the combination of surgery
and radiation might have been sufficient to control the cancer; but the addition of
Keytruda, which Carter has continued to take since his diagnosis, may also have played
a significant role.
Keytruda is one of a class of new drugs known as checkpoint inhibitors that have
been approved to treat cancer since 2014. Working similarly to another drug in this
class, Opdivo (compound name nivolumab), Keytruda operates on a checkpoint inhibi-
tor receptor protein known as PD-1 residing on the immune system cells known as
T-cells. T-cells have the ability to attack cancer cells. Cancer cells can evade destruction
by T-cells, however, by producing the protein PD-L1, which binds to PD-1 and prevents
T-cell recognition of cancer cells. Keytruda and Opdivo prevent the binding of PD-L1 to
PD-1, allowing T-cells to attack cancer cells.
Since the first approval of these checkpoint inhibitor drugs for cancer treatment,
they have proven effective in treating a wide range of cancers, including those affect-
ing the lung, stomach, colon, and skin. With variation depending on the type of cancer
treated, they are effective in about 25 to 40% of cases, prolonging life beyond what
would be expected with chemotherapy and surgery alone. They are not universally
effective, however, and cancer may recur even with successful initial treatment. In addi-
tion, treatment with these drugs can be very expensive—up to $150,000 per year. On
balance, however, these drugs represent the start of a new wave of cancer treatments
that target the immune system in various ways as a defense mechanism against exist-
ing cancer. The goal of these immune system–based cancer treatments is to stimulate
immune system cells, most often T-cells, to fight cancer the way they fight invading
bacterial cells in an infection: identify an abnormal cell, attach to it, and destroy it.
Several avenues of investigation have converged to aid in developing and tar-
geting these immune system–based cancer therapies. From a genetic perspective,
research over the past 25 years that has investigated mutations in cancer cells and
more recently has focused on deciphering the genome sequences of cancer cells has
proven enormously helpful both in understanding the biology of cancer and in helping
to devise effective treatment approaches. Former President Carter is one of a large
number of cancer patients alive today who have benefitted from this recent research
and from new therapeutic approaches that seek to change the way medicine treats
and manages cancer.
[Figure C.1 labels: panels compare benign growth and cancer in skin tissue. Labeled structures: dead skin cells (shed), squamous cells, cell migration, basal layer (dividing cells), basal lamina, and underlying tissue.]
Figure C.1 Abnormal tissue growth and cancer development. Abnormal tissue growth commonly
follows a pattern of increasing, but noncancerous, abnormality beginning with hyperplasia that can
progress to dysplasia. Neoplasia (cancer) can follow, and metastasis can occur if neoplastic growth invades
normal tissue.
Before they become cancerous, cells begin to look abnormal and grow abnormally. The abnormality can first appear as hyperplasia, meaning extra growth, and progress to dysplasia, meaning disorganized growth. Hyperplastic and dysplastic cells can form tumors, masses of abnormal cells, but in these early stages the tumors are classified as benign tumors, meaning they are noncancerous and are usually well encapsulated by surrounding tissues or membranes. Benign tumors are considered “precancerous.” They are composed of abnormal cells that can grow excessively, but they do not invade surrounding normal tissue. If they are accessible, benign tumors can often be removed relatively easily by surgical or other treatments. For example, abnormal growths on the skin can be removed quickly and simply by spraying the growth with liquid nitrogen (−321°F, or −196°C) to kill the abnormal cells by freezing. Abnormal growths in the colon detected during colonoscopy can be removed surgically during the screening procedure.
Dysplastic cells can progress, however, to neoplasia, a state of growth in which they are now cancer cells proliferating in large numbers and in a highly disorganized manner. In this state, the masses are classified as malignant tumors, which are not confined in their growth. If malignant tumor growth continues, it can enter metastasis, a state in which the tumor invades normal tissues (such as the basal lamina and underlying tissue seen in Figure C.1) and in which cancer cells may be carried in blood or lymphatic circulation to new locations, where they can seed new tumors.
New cancer research, driven by genetics, has altered the understanding of the nature of tumors and the characteristics of cancer. In years past, a tumor was thought to be a mass of millions of cells that were essentially genetically identical to one another, having been generated as a cell lineage derived by mitotic division from an original cancer cell. In this sense, a tumor was thought to be clonal. Today, cancer biologists understand that a tumor is a complex mixture of cells, some malignant but many others normal, containing cancer cells that can have different genetic profiles, and with a biology as complex as that of most other tissues. The current view describes cancer as a progressive disease that develops through stages proceeding from normal to malignant, as Figure C.1 illustrates.
The Hallmarks of Cancer Cells and Malignant Tumors
Cancer cells are profoundly abnormal cells, malignant tumors are profoundly abnormal tissues, and cancer is a profoundly abnormal biological state, the endpoint of a long series of genetic and biological changes that have occurred within the affected cell lineage over the life span of a person. Despite the many differences distinguishing the various types of cancer, these extreme genetic and biological abnormalities of cancer cells and malignant tumors do have certain hallmark features.
As an introduction to the hallmarks of cancer, it helps to be familiar with two conceptual categories of genes that have been used to describe how mutations often contribute to cancer development. These categories classify some genes as proto-oncogenes and some genes as tumor suppressor genes. Both categories contain large numbers of genes, and all the genes in each category are genes we all carry that perform essential functions in cells. It is when these genes are mutated and have aberrant function, no function, or excessive activity that they contribute to cancer development.
The proto-oncogenes are a broad array of normal genes stimulating cell division and progression through the cell cycle (see Chapter 3 for a discussion of the cell cycle). As a group, proto-oncogenes encode transcription factor proteins and cell-cycle regulating proteins. You can think of proto-oncogenes as the “gas” that propels the transcription of other genes in cells or drives the cell cycle forward.
Recall from Chapter 8 that transcription factors are part As currently understood, the ten hallmarks of cancer
of the complex protein machinery that binds to the promot- cells outlined by Hanahan and Weinberg are
ers of genes to help initiate transcription. They are required
1. Sustained Cell Proliferation: Cancer cells are in a
for normal and proper control of gene transcription. Simi-
chronic state of growth and division, unlike normal
larly, cell-cycle regulatory proteins are required for normal,
cells that undergo controlled proliferation. Sustained
controlled progression through the cell cycle. Proteins that
proliferation can be produced by gene mutations that
fail to function or that function incorrectly owing to muta-
drive excessive growth.
tions of proto-oncogenes result in inappropriate progression
of the cell cycle. The mutated versions of proto-oncogenes 2. Evasion of Normal Growth Suppression: Gene
are called oncogenes. In their oncogene forms, transcription mutations that eliminate the function of growth-
factor or cell-cycle regulatory genes function abnormally, suppressing proteins or render cells insensitive to
giving too much gas to the process and acting something growth-control signals enable cancer cells to circum-
like a stuck accelerator on a car. The consequence of onco- vent the protein signals and regulatory proteins that
gene action can be an overproduction of cells without the normally regulate cell proliferation.
normal controls on the process. 3. Resistance to Cell Death: Normal cells are generated
Tumor suppressor genes are a large and varied group through mitotic division, age during their active phase,
of normal genes whose protein products largely function and then enter senescence and undergo a process known
at cell cycle checkpoints, such as the transition from G1 to as apoptosis, during which they die. Cancer cells in
S phase or from S phase to G2, or function in other ways contrast generally live much longer than normal cells,
during the cell cycle to pause it until conditions are right to owing to gene mutations that, by interfering with the
continue. Tumor suppressor genes can also express proteins normal mechanisms and signals leading to apoptosis,
that function in the normal process for bringing on the death enable cancer cells to delay or bypass cell death.
of aged or damaged cells. Tumor suppressor genes can be 4. Cellular Immortality: In addition to bypassing
thought of as the “brake” that controls the speed and pace of induced cell death, cancer cells also live much longer
cell proliferation. Mutations of tumor suppressor genes are than is normal for cells that do not undergo apoptosis.
like brake failure in a car. In this case, the normal controls Many are effectively rendered immortal by mutations
on cell proliferation are missing, and either the cell cycle that stabilize cells or modify the indicators of cell
moves forward too quickly or cells that should undergo cell aging in a manner that allows them to grow and divide
death evade the process. perpetually.
The maintenance of normal tissue and organ size,
5. Angiogenesis Induction: Angiogenesis is the devel-
boundaries, and cell numbers is achieved by a balance
opment of new blood vessels. Malignant tumors
between the mitotic production of new cells and the death
require blood vessels to supply the growing tumor with
of old cells. The many genes in the proto-oncogene and
oxygen and compounds needed for growth. A number
tumor suppressor gene categories interact in complex ways
of normal cell types are recruited by the tumor to form
to preserve that balance. If important players in that balanc-
blood vessels. These are among of the cadre of normal
ing process are mutated, causing excess cell proliferation
cells that are part of a tumor.
and the insufficient elimination of old cells by cell death, the
balance can break down. 6. Activation of Invasion and Metastasis: The growth
The concepts of proto-oncogenes and tumor suppres- of normal cells usually requires the presence of other
sor genes are helpful but have proven to be incomplete for cells, partly because contact with other cells exercises
describing the genetic abnormalities driving cancer develop- control over that growth, keeping each tissue confined
ment and progression. A major advancement in understand- to a limited area. In cancer, a succession of gene muta-
ing the cancer process has come from the identification of tions alters normal growth restrictions, allowing tumors
ten hallmarks of cancer that represent the various ways in to grow in size and invade surrounding tissues. Addi-
which the biological and genetic controls required in normal tional gene mutations, coupled with cellular immortal-
cells are lost or altered in cancer cells. ity, can enable single cancer cells to break away from
In 2000, Douglas Hanahan and Robert Weinberg syn- the original tumor, plant themselves in a new location,
thesized the large amount of research literature on cancer and proliferate to produce a new malignant tumor. This
and created a list of six hallmarks of cancer. Their paper is the process of metastasis.
outlined supporting data and examples and provided cancer 7. Reprogramming of Energy Metabolism: The active
researchers with a well-organized way to view and investi- proliferation of malignant tumors requires a dispropor-
gate the biology of cancer. In 2011, Hanahan and Weinberg tionate amount of energy. Thus, in addition to stimu-
added four additional hallmarks developed largely through lating angiogenesis to supply itself with oxygen, the
the collection and analysis of cancer cell genomic sequences tumor must reprogram its cellular metabolism to meet
and the assessment of gene mutations in cancer cells. its energy needs.
8. Immune System Avoidance: The immune system is responsible for detecting and eliminating foreign microbes and cells that may do harm to the body. In addition, the immune system monitors the body for abnormal cells and helps eradicate precancerous and cancer cells before they develop into tumors. For newly forming tumors to proliferate, it is now thought that they must evade immune system detection. This notion is related to another emerging theory that links inflammatory processes in the body to the proliferation of cancer cells and malignant tumor formation.
9. Tumor-promoting Inflammation: Cancerous tumors attract immune system cells deployed by the immune system to attack and eradicate cancer cells. This causes an inflammatory reaction within tumors that, paradoxically, helps promote some aspects of tumor growth, such as angiogenesis. Inflammation can also help supply the tumor with growth factors that in turn promote growth and survival factors, helping cancer cells evade destruction.
10. Genome Instability and Mutation: Cancer cells are highly unstable and rapidly acquire new mutations of various kinds. This frequently gives them a growth advantage that allows them to proliferate much faster than surrounding normal cells. Large numbers of individual gene mutations are present in cancer cells, and a great deal of research activity is devoted to identifying which of these mutations are “drivers” of cancer cell proliferation (i.e., which mutations actively promote tumor growth) and which mutations are “passengers” (i.e., mutated due to cancer cell genome instability but not essential for tumor growth). These mutations can be identified by cancer cell genome sequencing.
Cancer cell genome instability can be observed visually. Cancer cell chromosomes typically contain large numbers of duplications, deletions, and chromosome rearrangements, in addition to frequent changes in chromosome number (see Chapter 10 for discussion of chromosome mutations). The chapter opener micrograph shows the chromosomes of a cancer cell stained by a method that produces a distinct fluorescent color for each homologous chromosome pair. Normally, each chromosome should be a single color, but notice that many of them instead contain two or three colors. This is direct evidence of chromosome translocations, and further inspection of these chromosomes would reveal chromosome deletions, duplications, and inversions.
Underlying many of the hallmarks of cancer is another layer of abnormality, consisting of the disruption of normal epigenetic regulation in cancer cells and of mutations that disrupt epigenetic writers, readers, and erasers (see Section 13.2). Among the cancer hallmarks to which epigenetic changes are known to contribute are the effects on cancer cell metabolism and cancer immunology. In hematologic cancers, where the data are strongest, associations have been found between three epigenetics-regulating genes and cancer. The DNA methyltransferase gene DNMT3A is one of several genes whose protein products help to methylate chromatin as part of gene silencing. Mutations of DNMT3A appear to occur early in certain leukemias, altering gene expression patterns and causing genome instability in the form of chromosome deletions and rearrangements. Other DNA methylation genes, including TET2 and IDH, are also associated with abnormal methylation patterns in cancer cells. In addition, mutations of these genes are associated with disruption of the expression of certain chromatin modifier genes. As a whole, the information on epigenetic alterations in cancer suggests that epigenetic dysregulation is a major contributor to cancer development and proliferation.
C.3 The Genetic Basis of Cancer
To review, most cases of cancer result from the accumulation in somatic cells of multiple and diverse gene mutations that combine to gradually but progressively transition normal cells into cancerous ones. These cases are classified as sporadic because they can potentially affect anyone and because they result from mutations that occur at random during the lifetime of the affected individual. Based on the current state of knowledge, and taking into account the many different types of cancer under study, 90% or more of all cases of cancer are thought to fall into the sporadic category.
In this section, however, we shift our focus away from the large majority of cancers that are sporadic toward those that either have a simpler pathway to malignancy or that develop in part through the inheritance of a mutation that significantly increases the likelihood that an individual will develop cancer. We examine certain rare cancers for which the disease is the result of mutation of a single gene, and we look at inherited susceptibility to cancer through the inheritance of germ-line mutations, meaning mutations that occur in sperm or eggs and are passed to offspring during reproduction. Germ-line mutations that predispose to the development of cancer tend to cluster in families as a result of hereditary transmission. This pattern of cancer is identified as a familial or hereditary cancer. Certain of these cancers develop through the mutation of a single gene by de novo mutation (new mutation). These mutations that hit a critical gene are followed by a second mutation that leads to cancer.
Single Gene Mutations and Cancer Development
In this section, we describe two types of cancer that usually result from de novo mutations and two other cancers that can be due either to de novo mutations or to the inheritance of a predisposing mutation from a parent. All these rare cancers can be traced to changes in a single gene, but the first two arise from chromosome rearrangements that lead to the
544 APPLICATION C The Genetics of Cancer
Figure C.2 Reciprocal translocation in chronic myelogenous leukemia. Reciprocal translocation between chromosomes 9 and 22 [t(9;22)] moves the c-ABL gene from chromosome 9 into the BCR gene region on chromosome 22, forming a c-ABL–BCR fusion gene on the shortened copy, called the Philadelphia chromosome, of chromosome 22. The chimeric c-ABL–BCR protein produces CML.
Figure C.3 Example of reciprocal translocation in Burkitt’s lymphoma. Translocation of the c-MYC gene from chromosome 8 that has lost a portion of its long arm (8q–) to the IgV immunoglobulin gene region on chromosome 14 that has gained a portion from chromosome 8 containing c-MYC (14q+). The translocation overproduces c-MYC protein to cause Burkitt’s lymphoma.
altered gene expression. In both of these cancers, the chromosome rearrangement occurs so frequently that it is effectively diagnostic for the particular type of cancer.
Chronic Myelogenous Leukemia
Figure C.2 shows a reciprocal translocation between one copy of chromosome 9 and one copy of chromosome 22 that is seen in most cases of chronic myelogenous leukemia (CML). This mutation is usually a de novo mutation. Leukemias (there are many types) are cancers of the blood in which the bone marrow produces certain white blood cells in an uncontrolled manner. In CML, the white blood cells known as granulocytes are overproduced. The chromosome translocation that is typical of CML leads to the production of an abnormal protein.
The nuclei of cancer cells in patients with CML have one normal copy of each of the chromosomes 9 and 22 along with a copy of chromosome 9 and a copy of chromosome 22 that have undergone reciprocal translocation. The translocation produces a short version of chromosome 22 known as the Philadelphia chromosome. It is named after the city in which Peter Nowell and David Hungerford were working when they discovered the chromosome in cell samples from CML patients in 1960; cancer researcher Janet Rowley later showed that it is the product of a reciprocal translocation. A gene known as c-ABL, located on chromosome 9, is translocated into the chromosome 22 region containing the gene BCR. The translocation produces a c-ABL–BCR “fusion gene.” Expression of this fusion gene produces a chimeric BCR–c-ABL protein.
Normal BCR protein is part of a cell signaling pathway. It normally transfers cell growth signals from the external environment to the cell nucleus to stimulate cell proliferation. The chimeric BCR–c-ABL protein continuously stimulates cell division, even in the absence of an external growth signal. The capability for sustained growth is an example of cancer hallmark 1 (sustained proliferation) described above.
Since the specific cause of CML is known, it was an early focus of targeted cancer therapy, involving a drug treatment aimed at controlling the aberrant chimeric protein activity. This effort has been successful, and today CML can be effectively treated, as discussed in Section C.4.
Burkitt’s Lymphoma
Another cancer resulting from a de novo chromosome rearrangement is Burkitt’s lymphoma (Figure C.3). In Burkitt’s lymphoma, a reciprocal translocation
bears repeating that the vast majority of cancers result from the accumulation of a large number of somatic mutations. As the mutations accrue, the once-normal cells are gradually converted to an abnormal state and eventually develop into cancer cells.
One of the clearest examples of this process comes from the study of gene mutations in the development of colon and rectal cancer. Studies of the genetic abnormalities in this type of cancer offer both a glimpse into the process of somatic mutation leading to cancer development and a lesson on how the inheritance of germ-line mutations can predispose individuals to develop cancers like colon and rectal cancer. Colorectal cancer is a good example of a condition brought on by multiple somatic mutations because most cases progress very slowly, through stages of progressive cellular abnormality over several decades. The abnormal cells that develop prior to the formation of colon and rectal cancer are not cancerous. They occur in clusters on the epithelial surface lining the colon and rectum. These early abnormal growths are known clinically as “adenomas” or, more commonly, as “polyps.” They can easily be visualized by colonoscopy, and their removal prevents the potential development of colon or rectal cancer (see the progression of benign and cancerous stages depicted in Figure C.1).
The genetic progression that takes cells from a normal to a malignant state is not a fixed series of specific steps in any cancer. What is often observed, however, is that mutations in certain genes are found much more commonly than mutations in other genes. These often-mutated genes are likely to be the “drivers” of cancer development, i.e., the mutations that are most directly tied to the development of cancer and to the hallmarks of cancer. The acquisition of these commonly occurring mutations correlates with progression from one stage of abnormality to the next, as Figure C.6 illustrates for different stages of colorectal cancer development. The process begins with the excessive proliferation of abnormal tissue. It then progresses to the production of adenomas (polyps), the larger, easily detected clusters of abnormal tissue. A small proportion of adenomas that continue to grow can become cancerous and produce colon or rectal cancer.
The figure identifies four specific genes that frequently, but not universally, are found to be mutated in association with the transition from one particular stage of abnormality to the next. These mutations are common, occurring in about 25 to 75% of cases, but progression can also occur without mutation of these genes. Mutation of the APC gene is a common first step in the transition from normal colon epithelium to abnormally proliferating epithelium. The protein product of the APC gene limits the growth of epithelial cells that are in contact with other cells. As adenomas form and advance, mutation of the KRAS gene frequently occurs. This gene normally produces a cell division signal transduction protein that responds to external signals and conveys a message to the nucleus that drives cell division. The deletion of a gene known as DCC results in the loss of a protein that suppresses cell growth. This mutation allows adenomas to generate finger-like outgrowths (villi) that advance the spread of the adenoma. The transition to a cancerous state often occurs with the mutation of the TP53 gene. As was described above for Li–Fraumeni syndrome, mutation of TP53 leads to failure of cell cycle pausing for DNA damage repair and also severely impairs the initiation of apoptosis in heavily damaged and aged cells. These gene mutations, common but not always present in colorectal cancer, are usually accompanied by mutations in other genes as well; in fact, other genes must mutate if the colorectal cancer lesion is to become metastatic.
About 75 to 80% of people developing colorectal cancer have sporadic disease that occurs as a result of the acquisition of these or other gene mutations in somatic cells. The remaining 20 to 25% of colorectal cancers are linked to inheritance of a germ-line mutation that predisposes a person to develop cancer. These inherited mutations do not by themselves lead to cancer, since several additional somatic mutations must still occur to drive the progression of tissue through the adenomatous stages to cancer.
Many of the gene mutations inherited in colorectal cancer–prone families are not known or are not fully characterized, but one gene, APC, is known to be transmitted in mutated form in some families prone to colorectal cancer. About 1 to 2% of colorectal cancer cases result from genetic predisposition to a cancer known as familial adenomatous polyposis (FAP). FAP is a hereditary form of cancer in which affected family members inherit a mutated copy of APC. As with sporadic colorectal cancer, germ-line transmission of APC mutations leads to the development of adenomas in the colon. Since all the cells of a person inheriting one mutated copy of APC are heterozygous for the mutation, the formation of colonic polyps in such a person is prolific. In some cases, hundreds to thousands of polyps may form as early as the teenage years or in a person’s early twenties.
Figure C.6 Mutation acquisition in familial adenomatous polyposis. FAP features the development of
hundreds of colonic polyps that can become cancerous. Mutational analysis identifies at least four genes
that are frequently, but not universally, mutated as polyps progress toward malignancy.
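The reasoning used in this section, that genes found mutated in a large fraction of tumors are more likely to be “drivers” than genes mutated only occasionally, can be illustrated with a simple tally. The sketch below is a toy illustration only. The tumor names, the placeholder genes GENE_X, GENE_Y, and GENE_Z, the mutation assignments, and the 50% cutoff are all hypothetical, and real studies use statistical models that also account for factors such as background mutation rate and gene length.

```python
from collections import Counter

# Hypothetical toy data: each tumor is listed with the set of genes found
# mutated in it. GENE_X, GENE_Y, and GENE_Z are made-up placeholders that
# stand in for sporadically mutated "passenger" genes.
tumors = {
    "tumor_1": {"APC", "KRAS", "TP53", "GENE_X"},
    "tumor_2": {"APC", "TP53", "GENE_Y"},
    "tumor_3": {"APC", "KRAS", "DCC"},
    "tumor_4": {"KRAS", "TP53", "GENE_Z"},
}


def mutation_frequencies(tumor_genes):
    """Return the fraction of tumors in which each gene is mutated."""
    counts = Counter(gene for genes in tumor_genes.values() for gene in genes)
    n_tumors = len(tumor_genes)
    return {gene: count / n_tumors for gene, count in counts.items()}


def candidate_drivers(tumor_genes, min_fraction=0.5):
    """List genes mutated in at least min_fraction of tumors (assumed cutoff)."""
    freqs = mutation_frequencies(tumor_genes)
    return sorted(gene for gene, frac in freqs.items() if frac >= min_fraction)


if __name__ == "__main__":
    for gene, frac in sorted(mutation_frequencies(tumors).items()):
        print(f"{gene}: mutated in {frac:.0%} of tumors")
    print("Recurrently mutated genes:", candidate_drivers(tumors))
```

A gene that recurs across many tumors is only a candidate driver. As the text notes, progression can also occur without mutation of any one of these genes.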
cancer, neither mutation guarantees the development of cancer. Other mutations and, perhaps, specific nongenetic events must also occur for cancer to develop. Stated another way, a woman with one of these mutations has about a 40% chance of not experiencing breast cancer, a roughly 40 to 84% chance of not experiencing ovarian cancer, and a roughly 17 to 38% chance of not having a second breast cancer in the healthy breast after the other breast has become diseased. The involvement of other genes and, perhaps, of nongenetic factors is a principal reason why cancer cell genomes have been so aggressively investigated. One outcome of this avenue of investigation for breast and ovarian cancer is that genetic testing is now available for more than two dozen other genes whose mutations make small but meaningful contributions to the overall risk of breast and ovarian cancer development.
C.4 Cancer Cell Genome Sequencing and Improvements in Therapy
With continuous advances improving the accuracy, reducing the cost, and dramatically increasing the speed of genome sequencing, several major studies have sequenced the genomes of multiple types of cancer cells and identified the mutations present in each of them. Collectively, these studies have sequenced the genomes of several thousand malignant tumors of more than two dozen kinds of cancer.
Major goals of these studies are to discover the identity of mutated genes, the frequency of individual gene mutations, and the driver mutations likely to be of significance to the disease process, separating them from the “passenger” mutations that do not make a significant contribution to cancer development and proliferation (see hallmark 10 of the hallmarks of cancer cells listed earlier in this chapter). For example, one large study in 2013 sequenced the genomes of 3,281 tumors from 12 major cancer types. The study identified more than 617,000 somatic mutations in the tumors examined. The number of mutations varied widely across the tumor types. On average, each tumor had two to six mutations in genes judged significant to cancer development, although some had many more than that. The relatively low average number of such mutations suggests that the number of driver mutations is relatively small. By comparing the genomes of sequenced tumors and taking into account known gene functions, the study identified 127 genes carrying driver mutations that appear to play a significant role in the development of one or more tumor types.
Of these 127 significantly mutated genes, the most commonly mutated gene in cancer cells was found to be TP53. Approximately 42% of all tumors studied carried a TP53 mutation. Given the dual functions of TP53 protein in pausing the cell cycle for DNA damage repair and the role of TP53 in initiating apoptosis in heavily damaged cells, it makes sense that a mutation of TP53 would frequently play an important role in cancer development. The KRAS gene and a second gene known as PTEN are commonly mutated in some, but not all, tumor types studied. KRAS mutation permits excessive cell proliferation, as described above for colorectal cancer. PTEN produces a protein product that normally acts similarly to TP53. It helps regulate cell cycle progression and also participates in apoptosis. Mutations of numerous other genes were found to be more or less common in individual types of tumors. This and other studies like it make three features of cancer mutations clear: (1) no two tumors of the same type have exactly the same profile of mutations, (2) some mutations are common to multiple types of cancer, and (3) specific types of cancer often, but not always, contain certain mutations.
The Cancer Genome Atlas
A comprehensive approach to cancer genome sequencing has emerged in recent years as part of an international effort to understand the genetic basis of cancer. This program, called the Cancer Genome Atlas (TCGA), is compiling genome sequence and analysis of somatic genetic mutations in thousands of tumors of many types, with the goals of achieving a complete understanding of cancer genetic abnormalities, identifying different categories of cancer occurring within a single organ or tissue, and helping to develop more effective detection and treatment options. The 2013 study described above is part of TCGA.
Pancreatic cancer is among the most lethal of all malignancies. In 2015, researchers participating in TCGA accomplished the complete genome sequencing analysis of 100 patients with pancreatic cancer. Genomes from pancreatic tumor cells and normal cells from the same patients were sequenced to fully identify mutations present in cancer cells. Chromosome rearrangements were common in cancer cells, and mutations of several genes, including TP53, BRCA1, and BRCA2, were frequently detected. Based on the complete set of gene mutations and chromosome rearrangements, the researchers were able to classify the pancreatic cancers examined into four subtypes. Each subtype had its own set of commonly occurring mutations and chromosome rearrangements, although there was some overlap of mutations in various subtypes. One of the subtypes, called “unstable,” had an array of mutations that suggested the disease might be responsive to a particular type of chemotherapy. Of five patients with the unstable type of pancreatic cancer who received this chemotherapy, four showed substantial responsiveness to the treatment.
Epigenetic Irregularities
Other recent genomic research on cancer has focused on epigenetic dysregulation and the role it plays in the development and proliferation of cancers, especially hematologic cancers (as described in Section C.2). The largest of these studies to date found that the integrity of epigenetic processes is disrupted in cancer in two ways. First, epigenetic regulation can be abnormal, and second, mutation of epigenetic readers, writers, and erasers can alter epigenetic patterns in cells, leading to irregularities of gene expression. Mutations affecting methylases and demethylases, for
example, can lead to alterations of normal methylation patterns of the genome in cancer cells.
Several studies identify global hypomethylation as a cause of genome instability, including chromosome deletions and rearrangements. The studies also find a significant degree of hypomethylation of microRNA (miRNA) genes. In addition, studies find that hypermethylation in cancer appears to play an important role in silencing tumor suppressor genes, particularly by hypermethylating CpG islands in promoters. Examples of cancer hypermethylation have been identified in tumor suppressor genes such as RB1, BRCA1, and MutL.
Targeted Cancer Therapy
The 2015 study of pancreatic cancer genomics adds to a growing list of cancers that can be classified by their patterns of mutations. This information can then be used to target the cancer with specific kinds of chemotherapy. The first successful example of the use of chemotherapy to target the specific malfunction in a cancer was for chronic myelogenous leukemia (CML).
Recall that CML is caused by a chromosome translocation that forms a fusion c-ABL–BCR gene. The resulting c-ABL–BCR chimeric protein continuously activates cell proliferation. The problem with the chimeric protein is that it is always in an active state and it cannot be inactivated. In the late 1990s, researchers looking for chemicals that might be able to block the continuous activation of c-ABL–BCR tried a drug named imatinib and found that it bound to c-ABL–BCR in a manner that could inhibit the protein from activating cell proliferation. Under the trade name Gleevec, the drug was initially administered to 54 CML patients in the early 2000s, and the cancer almost or completely vanished in 53 of the 54 patients. Gleevec has now been used for more than 15 years and has proven to be a highly effective targeted treatment for CML. Similar kinds of success are now achieved with targeted cancer chemotherapy in some lung cancers, melanoma skin cancers, colon cancers, and breast cancers. The targeted cancers carry mutations of specific genes, making them responsive to targeted cancer therapy by chemical treatment.
Most recently, targeted cancer therapy has turned in a new direction, using genetically modified cells from a cancer patient’s own immune system to attack cancer cells that otherwise evade immune system detection. A study published in 2015 reported the results of treating patients diagnosed with terminal acute lymphoblastic leukemia (ALL) and non-Hodgkin lymphoma. Each of the patients in the study had been nonresponsive to other chemotherapy approaches or had a relapse of cancer. Life expectancy under either circumstance is a few months.
Researchers first isolated a type of immune system cell called T cells from each patient. T cells normally carry molecules on their surfaces that target specific proteins on foreign cells and destroy those foreign cells. The isolated T cells were then genetically modified with a chimeric antigen receptor (CAR), a protein that allows modified T cells to target leukemia cells with the antigen CD19 on the surface. The modified T cells, called CAR-T cells, were grown in the laboratory and injected back into the patient, where they attack and destroy cells with the CD19 antigen.
The preliminary results of this targeted cancer therapy were very encouraging. Of 29 ALL patients, 27 were free of all traces of cancer following treatment. Additional studies were undertaken, and a total of 63 children with treatment-resistant ALL or ALL that had relapsed were treated with the CAR-T therapy. Fifty-two of the 63 children (83%) had cancer remission within three months, a high rate given that these cases are usually quickly fatal. In August 2017 the U.S. Food and Drug Administration approved this CAR-T cell therapy for the treatment of ALL in patients up to age 25 with treatment-resistant ALL or ALL relapse. The therapy is named Kymriah, and the genetically modified CAR-T cells are made by Novartis.
Kymriah therapy can have severe potential side effects and it is very expensive. Patients are treated just once with Kymriah therapy, and the cost is approximately $475,000. The long-term survival of these patients remains to be determined. Despite these drawbacks, targeted cancer therapies made possible by cancer genome sequencing point to positive advances and new directions in the understanding and treatment of cancer. The development of such approaches is one of the goals of genome sequencing, which seeks to make personalized medicine a reality in the coming decades. This progress, along with the apparent success of Keytruda and other drugs that improve patients’ immune response to cancer, indicates that continued pursuit of immune system stimulation may offer new avenues of cancer treatment.
Other new additions to the anticancer arsenal of drugs are those that aim at counteracting mutations affecting cancer cell epigenetics. To date, epigenetic therapies have been limited to the use of DNA methyltransferase inhibitors and histone deacetylase inhibitors. The U.S. Food and Drug Administration has recently approved drugs in both categories for use in treating T-cell lymphoma and multiple myeloma.
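The treatment outcomes quoted in this section (53 of 54 CML patients on Gleevec, 27 of 29 ALL patients in the first CAR-T group, and 52 of 63 children in the later CAR-T studies) are simple proportions, and it can help to see how they are computed and how much uncertainty small patient groups carry. The sketch below is an illustration and not part of the studies themselves. The Wilson score interval is one commonly used approximate 95% interval, chosen here as an assumption, and the labels in `trials` are descriptive names made up for this example.

```python
import math


def response_rate(responders, treated):
    """Fraction of treated patients who responded."""
    return responders / treated


def wilson_interval(responders, treated, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    p = responders / treated
    denom = 1 + z**2 / treated
    center = (p + z**2 / (2 * treated)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / treated + z**2 / (4 * treated**2)
    ) / denom
    return center - half_width, center + half_width


# (responders, treated) pairs taken from the figures quoted in the text.
trials = {
    "Gleevec, CML": (53, 54),
    "CAR-T, first ALL group": (27, 29),
    "CAR-T, 63 children with ALL": (52, 63),
}

if __name__ == "__main__":
    for name, (responders, treated) in trials.items():
        rate = response_rate(responders, treated)
        low, high = wilson_interval(responders, treated)
        print(f"{name}: {rate:.0%} (approx. 95% interval {low:.0%} to {high:.0%})")
```

The interval widens as the number of treated patients shrinks, which is one reason the text describes these early results as preliminary.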
1. Identify the normal functions of the following genes whose mutations are associated with the development of cancer.
a. RB1 (retinoblastoma)
b. c-MYC (Burkitt’s lymphoma)
c. p53 (Li–Fraumeni syndrome)
d. APC (familial adenomatous polyposis)
e. Which of these genes would you classify as a proto-oncogene and which as a tumor suppressor gene? Explain your categorization for each gene.
Problems 551
2. A tumor is a growing mass of abnormal cells. 8. The inheritance of certain mutations of BRCA1 can make
a. Describe the difference between a benign tumor and a it much more likely that a woman will develop breast or
malignant tumor. ovarian cancer in her lifetime.
b. Give an example from this chapter of a benign tumor a. Can you say with certainty that a woman inheriting a
that becomes a malignant tumor. mutation of BRCA1 will definitely develop breast or
c. What must happen for a benign tumor to become ovarian cancer in her lifetime? Why or why not?
malignant? b. In addition to inheriting a BRCA1 mutation, what else
3. For the retinal cancer retinoblastoma, the inheritance of must happen for a woman to develop breast or ovarian
one mutated copy of RB1 from one of the parents is often cancer?
referred to as a mutation that produces a “dominant pre- 9. Go to the website http://www.cancer.gov and scroll down
disposition to cancer.” This means that the first mutation to the box labeled “Find a Cancer Type.” Select “B” and
does not produce cancer but makes it very likely that can- then select “Breast Cancer.” Scroll down to “Causes and
cer will develop. Prevention” and then select “BRCA1 and BRCA2: Cancer
a. Define the “two-hit hypothesis” for retinoblastoma. Risk and Genetic Testing.” Use the information on this
b. Explain why cancer is almost certain to develop with page to answer the following questions.
the inheritance of one mutated copy of RB1. a. What are the approximate percentage increases in risk
c. Using RB1+ for the normal wild-type allele and RB1- of having breast cancer and of having ovarian cancer
for the mutant allele, identify the genotype of a cell in for women inheriting harmful mutations of BRCA1
a retinoblastoma tumor. and BRCA2 compared with the risks in the general
d. What is the genotype of a normal cell in the retina in population?
a person who has sporadic retinoblastoma? What is b. What features of family history increase the likelihood
the normal cell genotype if the person has hereditary that a woman will have a harmful mutation of BRCA1
retinoblastoma? Explain the reason for the difference or BRCA2?
between the genotypes. c. With regard to the results of genetic testing for BRCA1
4. Explain the following processes involving chromosome and BRCA2 mutations, what is meant by a “positive
mutations and cancer development. result”?
a. How the chromosome mutation producing the d. Are there measures a woman with a positive result can
Philadelphia chromosome leads to CML. take to lessen her chances of developing cancer or to
b. How the chromosome mutation producing Burkitt’s catch a cancer early in its development?
lymphoma generates the disease. e. As a special project, instead of selecting “Breast
5. In March 2011 an earthquake measuring approximately Cancer” from the list of types of cancer select another
9.0 on the Richter scale struck Fukushima, Japan. Several cancer you would like to know more about and
nuclear reactors at the Fukushima Daiichi nuclear plant were produce a short summary of what you find.
damaged, and nuclear core meltdown occurred. A massive 10. What kind of information will be made available by the
release of radiation accompanied damage to the plant, and Cancer Genome Atlas (TCGA)? What sort of role do you
5 years later the incidence of thyroid cancer in children think TCGA information will play in cancer diagnosis and
exposed to the radiation was determined to be well over 100 cancer treatment in the future?
times more frequent than expected without radiation expo-
sure. DNA damage and mutations resulting from radiation 11. Go to the website http://www.ncbi.nlm.nih.gov/omim
exposure are suspected of causing this increased cancer rate. and enter “Lynch syndrome” in the Search box at the
a. What gene discussed in this chapter might be respon- top of the page. From the list of options given, select
sible for pausing the cell cycle of dividing cells long “#120435—Lynch Syndrome.” Use the information you
enough for radiation-induced damage to be repaired in retrieve to answer the following questions.
cells? a. There are two types of Lynch syndrome, what are they?
b. Do you think it is possible that significant increases in b. What genes are most commonly mutated in Lynch
the incidence of other types of cancer will occur in the syndrome?
future among people who were exposed to the Fuku- c. Provide a brief summary of the normal functions of the
shima radiation? Why? protein products of these genes.
6. Radiation is frequently used as part of the treatment of d. What are the approximate rates of cancer that develop
cancer. The radiation works by damaging DNA and com- in people carrying a mutation of one of these genes?
ponents of the cell.
12. Genetic counseling has not been discussed in this chap-
a. How can radiation treatment control or cure cancer?
ter, but it is a service provided by trained professional
b. Is there a risk of damage to noncancer cells?
counselors who also have detailed knowledge of medical
c. Under what circumstances do you think radiation treat-
genetics, as described in Application Chapter A. Genetic
ment is a good choice to treat cancer?
counselors provide details about gene mutations and have
7. Based on what you read in this chapter knowledge of most of the details of diseases associated
a. Can a tumor arise from a single mutated cell? Are all with genetic abnormalities. With regard to genetic test-
the cells in a tumor identical? ing to identify one’s personal risk of cancer, what are the
b. Why do most cancers require the mutation of multiple three or four topics you think are most important to be
genes? able to discuss with a genetic counselor?
15 Recombinant DNA Technology and Its Applications

CHAPTER OUTLINE
15.1 Specific DNA Sequences Are Identified and Manipulated Using Recombinant DNA Technology
15.2 Introducing Foreign Genes into Genomes Creates Transgenic Organisms
15.3 Gene Therapy Uses Recombinant DNA Technology
15.4 Cloning of Plants and Animals Produces Genetically Identical Individuals

ESSENTIAL IDEAS
❚❚ DNA can be amplified by either molecular cloning or the polymerase chain reaction.
❚❚ In molecular cloning, DNA fragments are inserted into a cloning vector, which in turn is replicated in a live host.
❚❚ Libraries are collections of clones of DNA fragments, derived from the DNA or mRNA isolated from cells or an organism.
❚❚ Transgenic organisms are created by harnessing biological vectors to introduce genes into organisms.
❚❚ Recombinant DNA technology in humans is a pathway to the development of gene therapy.
❚❚ Cloning of plants and animals produces genetically identical individuals.

The writing in this image consists of transgenic E. coli expressing the genes for the carotenoid biosynthetic pathway, derived from plants. Carotenoid pigments, responsible for the red and orange colors of tomatoes, peppers, and oranges, act as a buffer system to absorb excess electrons and radicals produced during photosynthesis.

The advent of recombinant DNA technology for recombining, copying, and analyzing genetic sequences opened the way to studying gene function at the molecular level. This aspect of genetic exploration began with a set of basic strategies for the in vitro manipulation of DNA and for identifying the sequence of any given gene. The next step after that achievement was to invent methods for the precise manipulation of gene action in living organisms.
One of the central technical developments propelling that latter advance was development of the ability to create transgenic organisms—organisms that have had genes from
15.1 Specific DNA Sequences Are Identified and Manipulated Using Recombinant DNA Technology 553
other organisms inserted into their genomes. The methodology, now routine in genetic analysis, can be adapted to an almost limitless number of experimental approaches. It is a powerful tool for manipulating the activity of specific genes, observing the resultant phenotypes, and in this way acquiring new insight into biological processes.
Collectively, the techniques of recombinant DNA technology have permitted the sequencing of the entire genomes of many species, including our own, providing an unprecedented view of life. Increasingly sophisticated techniques have enabled both in vitro and in vivo manipulation of DNA sequences, shedding light on the molecular basis for development and physiology and for genetic variation both within and between species. Recently, genome-editing techniques have transformed both the study of gene function and the development of applications for specific medical, agricultural, or industrial purposes. If used wisely, this knowledge can be applied to better the human condition as well as the condition of the planet.
In this chapter, we discuss these applications of recombinant DNA technology while focusing on the methods used to create transgenic organisms and manipulate gene activity. These discussions furnish the nuts-and-bolts details of how reverse genetics is accomplished in different model organisms.

15.1 Specific DNA Sequences Are Identified and Manipulated Using Recombinant DNA Technology

Recombinant DNA technology is the set of techniques developed for amplifying, maintaining, and manipulating specific DNA sequences in vitro and also in vivo. This technology, which is based on advances in microbiology—particularly in understanding the life cycles of bacteria and their viruses, the bacteriophages—has revolutionized the study of genetics. With the ultimate goal of studying specific genes and their functions, biologists use recombinant DNA techniques to (1) fragment DNA into easily managed pieces and then separate and purify these fragments; (2) create many copies of DNA molecules of identical sequence; (3) combine DNA fragments to construct chimeric, or recombinant, DNA molecules; (4) determine the exact sequence of specific DNA molecules; (5) identify fragments of DNA containing complementary sequences; (6) introduce specific DNA molecules into living organisms; (7) precisely edit the genomes of organisms; and (8) assay the phenotypic effects of the genetic changes.
The major challenges of recombinant DNA technology are the identification of specific DNA sequences and their manipulation in vitro. To see these challenges in perspective, consider that each of your cells contains two copies each of 22 autosomes and 2 sex chromosomes. Collectively, a haploid set of 23 chromosomes contains 3 billion base pairs and carries some 20,400 or so genes. A typical gene encodes an mRNA transcript consisting of a few thousand bases, although the mRNA may be transcribed from a region that spans millions of base pairs. Molecular analysis of genes and of allelic variation is possible only by distinguishing a gene of interest from others in the genome.
Recombinant DNA technology allows researchers to divide the genome into smaller segments that can then be analyzed and reassembled to provide a molecular view of genes and the genome. In the following sections we survey the development of recombinant DNA technology tools and their application to identify specific DNA sequences. We begin with discoveries in the 1970s that have led to increasingly sophisticated methods for manipulating genomic sequences.

Restriction Enzymes

Restriction enzymes, which cut DNA at specific sequences, have become a basic tool of recombinant DNA technology. Each type of restriction enzyme recognizes a particular sequence at which it cuts both strands of the sugar-phosphate backbone of the DNA, cleaving the restriction sequence in the same way each time it is encountered. Restriction enzymes were originally discovered in bacterial cells, where they protect the bacteria from invasions of nucleic acids, such as the injected genomes of bacteriophages, by digesting foreign DNA. They were given the name restriction enzymes because they restrict the growth of the bacteriophages. Bacterial cells also contain restriction–modification systems, which modify the restriction sequences in the bacterial DNA by the addition of methyl groups and thus protect the bacteria’s own DNA from being digested by endogenous restriction enzymes. Experimental Insight 15.1 explains how restriction enzymes and restriction–modification systems were identified and how they became an indispensable part of molecular biology.
Restriction enzymes are common in bacteria. The names given these enzymes are generally derived from the first letter of the bacterial genus and first two letters of the species moniker, followed by a Roman numeral. For example, EcoRI is derived from Escherichia coli (E. coli); the letter R denotes the strain from which the enzyme was
554 CHAPTER 15 Recombinant DNA Technology and Its Applications
EXPERIMENTAL INSIGHT 15.1
From Bacteriophage to Restriction Enzymes: Basic Research Spawned a Biological Revolution

Basic biological research aims to discover and understand phenomena from every part of the spectrum of life. Thousands of biologists engage in this research every day, and most have specialties that may seem obscure or trivial to nonscientists. Nevertheless, their discoveries can not only revolutionize research but affect how we view the world.
In the mid-1960s, Werner Arber was studying a bacterial phenomenon called host-controlled restriction and modification, which acts as a simple immune system for bacteria invaded by bacteriophages. He showed that E. coli produces two enzymes that affect the same short palindromic DNA sequence (meaning a sequence that has the same 5′-to-3′ base sequence in both of its antiparallel DNA strands). One enzyme, called a restriction endonuclease, cleaves DNA at that sequence, like a pair of molecular scissors. The second enzyme, called a modification enzyme, adds methyl groups (CH₃) to DNA, thereby preventing restriction endonucleases from binding to and cleaving the DNA.
In 1970, Hamilton Smith extended Arber’s work by studying a restriction endonuclease from Haemophilus influenzae. Smith isolated the restriction endonuclease, now called HindII, and determined that it cleaves at the sequence

5′-GTPyPuAC-3′   →   5′-GTPy   PuAC-3′
3′-CAPuPyTG-5′       3′-CAPu   PyTG-5′

HindII cleaves both strands of its target sequence between the central purine (Pu = A or G) and pyrimidine (Py = T or C), leaving blunt ends on either side of the cut (blunt ends are discussed on page 559).
Smith’s work on HindII identified some important characteristics of restriction enzymes. First, HindII cleaves foreign DNA into large fragments, but it does not affect H. influenzae DNA. This confirmed Arber’s idea that bacterial DNA is protected from the action of the bacteria’s own restriction enzymes. Second, each resulting DNA fragment has the same three base pairs at its ends, indicating that cleavage occurs only at the target sequence. Smith also discovered that restriction enzymes cleave every copy they encounter of their target sequence.
In 1971, Daniel Nathans pioneered the use of restriction endonucleases to address genetic and genomic questions. Nathans used HindII to digest the small genome of the simian virus SV40 and found that 11 DNA fragments were formed. In 1973, Nathans digested SV40 with two newly discovered restriction endonucleases. He then used the three sets of restriction fragments to create the first restriction map of the SV40 genome, by determining the number of restriction sites for each enzyme and their order in the genome and assembling the information into a map (as demonstrated elsewhere in this chapter).
By the time Nathans completed his SV40 genome map, biologists were already looking for other restriction enzymes. Within 5 years, more than 100 more restriction enzymes were discovered. Many formed “sticky” ends on digested DNA (described on this page), and Paul Berg realized that DNA fragments from different organisms could be joined together if they had complementary sticky ends. Berg had created the first recombinant DNA molecule in 1972.
Arber, Smith, and Nathans shared the Nobel Prize in Physiology or Medicine in 1978 for their work on restriction enzymes, and Berg won a share of the Nobel Prize in Chemistry in 1980 for the development of recombinant DNA. Since then, restriction enzymes have become a ubiquitous tool in genetic and genomic research. Arber’s initial study of an obscure event in bacteria had spawned a revolution as momentous as Watson and Crick’s description of DNA structure or Mendel’s description of the laws of heredity.
obtained (RY13), and the numeral (I) indicates it was the first enzyme identified. EcoRI recognizes the palindromic sequence

5′-GAATTC-3′
3′-CTTAAG-5′

Recall that a palindrome has the same 5′-to-3′ base sequence in both of its antiparallel DNA strands. Most restriction enzymes recognize palindromic sequences. EcoRI cuts the sugar-phosphate bond between the G and the adjacent A residues in both strands, and the staggered cut results in two products, each ending with a four-base, single-stranded sequence:

5′-G     AATTC-3′
3′-CTTAA     G-5′

The single-stranded segments at the ends of each EcoRI fragment are referred to as sticky ends because they can “stick” to a complementary base-pair sequence by hydrogen bonding. Production of sticky ends facilitates the combining of DNA fragments generated with restriction enzymes, and complementary base pairing plays a role in almost all recombinant DNA techniques. The principle is that if two DNA molecules produced by restriction enzyme digestion have complementary sticky ends, they can be combined by complementary base pairing.
Another enzyme, EcoRI methylase, protects the E. coli genome from being itself digested by the EcoRI endonuclease. EcoRI methylase does this by adding a methyl group to the A adjacent to the T in both strands of the DNA. This is the “modification” performed by the EcoRI restriction–modification system.
Hundreds of restriction enzymes have been isolated from bacteria and are commercially available. Although many restriction enzymes produce sticky ends, either with 5′ overhangs (as produced by EcoRI) or with 3′ overhangs, some restriction enzymes leave blunt ends that lack a single-stranded segment. Blunt-ended DNA molecules can also be recombined, by techniques discussed later in this chapter (see page 559).
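The palindrome property and the staggered EcoRI cut described above can be illustrated with a short Python sketch. It is only an illustration under simplifying assumptions: DNA is modeled as a top-strand string, the function names are hypothetical, and the test sequence is made up for the example.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """True if both strands read the same in the 5'-to-3' direction."""
    return site == reverse_complement(site)

def five_prime_overhang(site, cut_offset):
    """Single-stranded bases left by a staggered cut made `cut_offset`
    bases into a palindromic site on each strand (5' overhangs only)."""
    return site[cut_offset:len(site) - cut_offset]

def digest_top_strand(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top strand after `cut_offset` bases of every site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

# Made-up sequence containing two EcoRI sites.
example = "TTGAATTCAAGAATTCGG"
ecori_fragments = digest_top_strand(example)
ecori_overhang = five_prime_overhang("GAATTC", 1)
```

Here `ecori_overhang` is the four-base AATT segment, and each internal fragment in `ecori_fragments` begins with AATTC, matching the sticky ends drawn above.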
Some restriction enzymes recognize 4-bp sequences, others recognize sequences of 5 bp or 6 or 8 bp. The length of the recognition sequence influences how frequently a given enzyme will cut DNA. If an organism had DNA consisting of 25% A, 25% T, 25% G, and 25% C and the bases were randomly distributed, then a restriction enzyme that had a 4-bp recognition sequence would be expected to cut the DNA once every 256 bp $\left(\frac{1}{4} \times \frac{1}{4} \times \frac{1}{4} \times \frac{1}{4} = \frac{1}{256}\right)$. Likewise, a restriction enzyme that recognized a 6-bp sequence would cut the DNA once every 4096 bp $\left(\frac{1}{4^{6}}\right)$ on average, and a restriction enzyme that recognized an 8-bp sequence would cut the DNA once every 65,536 bp $\left(\frac{1}{4^{8}}\right)$ on average. In reality, genomes of most organisms do not consist of equal amounts of each of the four bases. For example, most genomes of multicellular eukaryotes are AT-rich (that is, their genomes have a higher content of A and T than of G and C), and so restriction enzymes that recognize a GC-rich sequence would cut less frequently on average than would enzymes that recognize an AT-rich sequence.
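The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes independent, randomly distributed bases; the function names, the 40% GC content, and the AT-only site TTTAAA are illustrative assumptions, not values from the text.

```python
def site_probability(site, gc_content=0.5):
    """Probability that a random position matches `site`,
    assuming independent bases with the given GC fraction."""
    p_gc = gc_content / 2          # P(G) = P(C)
    p_at = (1 - gc_content) / 2    # P(A) = P(T)
    prob = 1.0
    for base in site:
        prob *= p_gc if base in "GC" else p_at
    return prob

def expected_spacing(site, gc_content=0.5):
    """Average distance in bp between occurrences of the site."""
    return 1 / site_probability(site, gc_content)

def expected_cut_count(genome_bp, site, gc_content=0.5):
    """Expected number of sites in a genome of the given size."""
    return genome_bp * site_probability(site, gc_content)

four_cutter = expected_spacing("GATC")        # 4-bp site
six_cutter = expected_spacing("GAATTC")       # 6-bp site
eight_cutter = expected_spacing("GCGGCCGC")   # 8-bp site

# In a hypothetical AT-rich genome (40% GC), a GC-only site becomes
# rarer and an AT-only site becomes more common.
gc_site_at_rich = expected_spacing("GCGGCCGC", gc_content=0.4)
at_site_at_rich = expected_spacing("TTTAAA", gc_content=0.4)
```

With equal base frequencies the three spacings come out as 256, 4096, and 65,536 bp. Changing `gc_content` shows the direction of the composition effect, although real genomes also depart from the independence assumption.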
Scientists use data from restriction experiments, including the number of restriction sites and the number of base pairs between the sites, to create maps of specific DNA sequences. These restriction maps provide a foundation for further manipulation of the DNA fragments—for example, by suggesting where to further subdivide cloned fragments in order to clone still smaller fragments in a process known as subcloning.

Figure 15.1 Restriction mapping of lambda phage. [Map of the 48,502-bp lambda chromosome with restriction sites ApaI (10090), XbaI (24508), XhoI (33498), and HindIII (23130, 25157, 27479, 36895, 37459, 44141), and an agarose gel with lanes for uncut λ and for HindIII, XhoI, ApaI, and XhoI + ApaI digests.]

Let’s use the genome of E. coli lambda phage in an example of the restriction mapping process. The DNA of the phage genome can be isolated by purifying the phage and removing its protein coat. If this is done gently, the isolated nucleic acid will be the entire lambda chromosome, which is a linear molecule 48,502 bp in length. Electrophoresis of the chromosome in an agarose gel containing a fluorescent stain for DNA would reveal a single fluorescent 48.5-kb band (first lane in Figure 15.1). If the purified lambda chromosome is first digested with ApaI, two fragments, one measuring 10.1 kb and the other 38.4 kb, are generated, indicating that ApaI must cut the genome once. This allows us to begin drawing the restriction map as shown below

            ApaI
λ  |---------|------------------------------|
     10.1 kb             38.4 kb

If we digest the purified lambda chromosome with XhoI, two fragments, one 33.5 kb and one 15 kb, are generated, indicating that XhoI must also cut the genome once:

                               XhoI
λ  |-----------------------------|----------|
              33.5 kb               15 kb

However, two orientations are possible for the XhoI restriction map relative to the ApaI restriction map drawn above. It could also be drawn as shown below.

            XhoI
λ  |----------|-----------------------------|
      15 kb              33.5 kb

To determine which order is correct, we need to perform a double digest, in which both enzymes are used simultaneously to cut the lambda genome. This experiment generates three pieces: 10.1 kb, 15 kb, and 23.4 kb. Since the 15-kb XhoI fragment remained intact but the 33.5-kb XhoI fragment was cut into two fragments (10.1 kb and 23.4 kb) by ApaI, we conclude that the map must be:

            ApaI                XhoI
λ  |---------|-------------------|----------|
     10.1 kb        23.4 kb         15 kb

The other possible map can be eliminated as incorrect since it would generate fragments of 4.9, 10.1, and 33.5 kb:

            ApaI  XhoI
λ  |---------|----|-------------------------|
     10.1 kb 4.9 kb         33.5 kb
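The double-digest reasoning for lambda can be expressed as a short Python sketch: predict the fragments for each candidate XhoI position and keep the one that matches the observed bands. The fragment sizes come from the text; the function and variable names are hypothetical, and sizes are rounded to 0.1 kb as an assumed gel resolution.

```python
def linear_digest(length_kb, sites_kb):
    """Fragment sizes (kb, largest first) from cutting a linear
    molecule at the given distances from its left end."""
    bounds = [0.0] + sorted(sites_kb) + [length_kb]
    sizes = (round(b - a, 1) for a, b in zip(bounds, bounds[1:]))
    return sorted(sizes, reverse=True)

LAMBDA_KB = 48.5
APAI_SITE = 10.1  # ApaI map drawn with the 10.1-kb fragment on the left

# XhoI gives 33.5-kb and 15-kb fragments, so its single site lies
# either 33.5 kb or 15 kb from the left end.
xhoi_candidates = {"XhoI at 33.5 kb": 33.5, "XhoI at 15 kb": 15.0}

observed_double_digest = [23.4, 15.0, 10.1]

predictions = {
    label: linear_digest(LAMBDA_KB, [APAI_SITE, position])
    for label, position in xhoi_candidates.items()
}
consistent = [label for label, frags in predictions.items()
              if frags == observed_double_digest]
```

Only the candidate with XhoI 33.5 kb from the left end reproduces the observed 23.4-, 15-, and 10.1-kb bands; the alternative predicts 33.5, 10.1, and 4.9 kb.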
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   1. This problem is about restriction mapping and asks you to construct a restriction map of a plasmid.
2. Identify the critical information given in the problem.
   2. Electrophoresis results are given for three single digests and the three possible double-digest combinations.

Deduce
3. Identify the sizes of each of the fragments of the single digests, and determine how many times each enzyme cuts the plasmid.
   3. BamHI—A single 7-kb fragment. Since plasmids are circular, BamHI must cut the plasmid only once.
      EcoRI—A single 7-kb fragment. One site in the plasmid.
      NotI—Two fragments: 3 kb and 4 kb. NotI must cut the plasmid at two sites.
   TIP: Compare the sizes of fragments in the sample lanes with the sizes of the standards in the outer lanes.
4. Identify the sizes of each of the fragments of the double digests.
   4. NotI + BamHI—Three fragments: 3 kb, 2.3 kb, 1.7 kb.
      NotI + EcoRI—Two fragments: 4 kb, 3 kb.
      BamHI + EcoRI—Two fragments: 5.3 kb, 1.7 kb.
5. Compare single- and double-digest results for similarities and differences.
   5. NotI + BamHI—Three fragments, with the 3-kb NotI fragment intact, suggesting the BamHI site is within the 4-kb NotI fragment.
      NotI + EcoRI—Two fragments, with both the 4-kb and 3-kb NotI fragments intact, suggesting the EcoRI site is adjacent to one of the NotI sites.
      BamHI + EcoRI—Two fragments, indicating the two sites are separated by 1.7 kb (or 5.3 kb the long way around the plasmid).
   TIP: In analyzing double digests, the relative position of restriction sites can be determined by observing which fragments remain intact and which are cut into smaller fragments.
   PITFALL: If two sites are very close to one another, there will be fewer fragments than expected in the double digest.

For more practice, see Problems 16, 18, 19, 20, and 21.
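One way to test a proposed plasmid map against the digest results listed above is to simulate each digest on a circle. The sketch below is a hypothetical check, not the worked solution: the site coordinates in `CANDIDATE_MAP` are one arrangement assumed for illustration (the EcoRI site is placed 10 bp from a NotI site), and fragments under 100 bp are assumed to be invisible on the gel.

```python
def circular_digest(length_bp, sites_bp):
    """Fragment sizes (bp) after cutting a circular molecule at every site."""
    cuts = sorted({s % length_bp for s in sites_bp})
    if not cuts:
        return [length_bp]  # an uncut circle stays in one piece
    sizes = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # Distance to the next site, wrapping past the origin;
        # a single site yields one full-length linear fragment.
        sizes.append((nxt - cut) % length_bp or length_bp)
    return sorted(sizes, reverse=True)

def visible_bands_kb(fragments_bp, min_bp=100):
    """Band sizes in kb (0.1-kb resolution), dropping tiny fragments."""
    return sorted((round(f / 1000, 1) for f in fragments_bp if f >= min_bp),
                  reverse=True)

PLASMID_BP = 7000
# Assumed candidate map: positions in bp from an arbitrary origin.
CANDIDATE_MAP = {"NotI": [0, 3000], "EcoRI": [3010], "BamHI": [4700]}

def digest(*enzymes):
    """Predicted gel bands (kb) for a digest with the named enzymes."""
    sites = [s for name in enzymes for s in CANDIDATE_MAP[name]]
    return visible_bands_kb(circular_digest(PLASMID_BP, sites))
```

For this candidate, `digest("NotI", "BamHI")` predicts 3.0-, 2.3-, and 1.7-kb bands, and `digest("NotI", "EcoRI")` predicts only two visible bands because the 10-bp fragment falls below the assumed detection limit, which is the situation the PITFALL note describes.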
To analyze DNA from organisms with large genomes, researchers must fragment the genomes into more manageable pieces. For example, the genome of the moss Physcomitrella patens consists of 400 million base pairs, so digestion with a restriction enzyme like EcoRI that cuts on average every 4096 bp produces approximately 100,000 different DNA fragments. When this digested DNA is electrophoresed through an agarose gel, the fragments making up the resulting “smear” seen in Figure 15.2 range from greater than 20 kb down to smaller than 100 bp. The smeared appearance results because, although the enzyme cuts every 4096 bp on average, the distances between EcoRI sites will vary due to variation in the genome sequence, and the resolving power of agarose gel electrophoresis is not sufficient to separate all of the different-sized fragments into discrete bands. This lack of resolution is compounded with larger genomes, such as ours, where digestion with EcoRI produces approximately 730,000 pieces (3,000,000,000/4096).

Figure 15.2 Restriction-enzyme digestion of genomic DNA. [Agarose gel of genomic DNA from Physcomitrella patens (4 × 10^8 bp) digested with BamHI, HindIII, EcoRI, BglII, NotI, and AluI; size standards at 23.1, 9.4, 6.6, 4.4, 2.3, and 2.0 kb. Prominent bands are chloroplast DNA (123 kb), present in hundreds to thousands of copies per cell.]

Recognition sequences (* marks the cleavage position)
NotI      5′ GC*GGCCGC 3′
BamHI     5′ G*GATCC 3′
BglII     5′ A*GATCT 3′
EcoRI     5′ G*AATTC 3′
HindIII   5′ A*AGCTT 3′
AluI      5′ AG*CT 3′

Molecular Cloning

After a genome under study has been reduced to smaller pieces by restriction enzymes, the individual pieces must be reproduced in large amounts—generally, either by molecular cloning or by the polymerase chain reaction (PCR)—so that each of them can be analyzed in greater detail. Molecular cloning arose from discoveries in bacterial enzymology and utilizes bacteria and their plasmids or phages to amplify and propagate specific fragments of DNA.
In molecular cloning, isolated DNA fragments are inserted into a vector, a carrier fragment of DNA with … biological system. Then the recombinant DNA molecule is introduced into a biological system (a living organism) that amplifies the DNA, making many identical copies called DNA clones. Molecular cloning produces a large quantity of identical DNA molecules that can be analyzed by a variety of techniques, including restriction enzyme analysis and DNA sequencing.
Molecular cloning has three general steps:
1. The joining together of the cloning vector and a donor DNA fragment to produce a recombinant DNA molecule
2. Screening to select recombinant vectors containing copies of the DNA segment of interest
3. Amplification (cloning) of the recombinant DNA molecule in a biological system
In the rest of this section, we first describe how DNA fragments are combined in vitro, the attributes of some common cloning vectors, and the means of their amplification. We then describe how DNA libraries—collections of cloned DNA fragments, usually derived from a single DNA source—are constructed.

Creating Recombinant DNA Molecules One common method of producing recombinant DNA for cloning is to digest DNA from the donor source and DNA of the cloning vector with the same restriction enzyme. The resulting linear fragments from the two DNA sources can then be annealed at their complementary sticky ends. Figure 15.3 illustrates restriction digestion by EcoRI of both the vector DNA—a plasmid, in this case—and DNA from the human genome. Mixing the two DNAs in a test tube allows the sticky ends to hybridize to one another by complementary base pairing, after which the remaining single-stranded nicks are sealed (“ligated”) with DNA ligase (see Section 7.4), resulting in a recombinant DNA molecule. In this case, a recombinant plasmid containing human DNA is formed.
Although it is common to cut both source and vector DNA with the same enzyme, variations on this theme are frequently employed. For example, two different restriction enzymes that create complementary sticky ends are sometimes used. When different restriction enzymes are used to digest vector and donor DNA, complementary sticky ends are called cohesive compatible ends. For example, BamHI recognizes the 6-bp sequence

5′-GGATCC-3′
3′-CCTAGG-5′

and leaves sticky ends

5′-G     GATCC-3′
3′-CCTAG     G-5′

Sau3A recognizes the 4-bp sequence

5′-GATC-3′
3′-CTAG-5′
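Whether two enzymes leave cohesive compatible ends can be worked out from their recognition sequences and cut positions. The sketch below is a simplified illustration with hypothetical names. The cut positions for EcoRI, BamHI, BglII, and HindIII follow the recognition-sequence list in Figure 15.2; the Sau3A cut position (before the G of GATC) is an assumption not shown in this excerpt. The helper handles only palindromic sites that leave 5′ overhangs.

```python
# name: (recognition sequence 5'->3', bases before the top-strand cut)
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "HindIII": ("AAGCTT", 1),
    "Sau3A": ("GATC", 0),  # assumed cut position, see lead-in
}

def overhang(name):
    """5' single-stranded overhang left by a staggered cut
    on a palindromic site."""
    site, offset = ENZYMES[name]
    return site[offset:len(site) - offset]

def compatible(enzyme_a, enzyme_b):
    """Ends can anneal if both enzymes leave the same overhang."""
    return overhang(enzyme_a) == overhang(enzyme_b)

# All pairs of distinct enzymes in the table with matching overhangs.
compatible_pairs = sorted(
    (a, b) for a in ENZYMES for b in ENZYMES
    if a < b and compatible(a, b)
)
```

Under these assumptions BamHI, BglII, and Sau3A all leave a GATC overhang and so can be joined to one another, while EcoRI (AATT) and HindIII (AGCT) ends pair only with ends cut by the same enzyme.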
other rather than incorporating a donor insert, producing a nonrecombinant vector. Because neither nonrecombinant vectors nor clones with multiple inserts are desired results, techniques to favor the production of single-insert clones have been developed. For example, the occurrence of nonrecombinant vectors can be reduced by removal of the 5′ phosphates on the vector DNA, so that the vector DNA cannot ligate to itself to produce nonrecombinant clones.

[Figure 15.3: plasmid vectors and human DNA are each digested with EcoRI, giving identical, complementary sticky ends (5′-G and AATTC-3′ on the top strand); combining the fragments yields recombinant plasmids and nonrecombinant vector.]

A feature of experiments using a single restriction enzyme or using two enzymes with cohesive compatible ends is that the insert DNA can be ligated into the vector in either orientation. One way to ensure that DNA to be cloned is inserted into a vector in a specific orientation is to use two restriction enzymes that each cut a different sequence, thus creating two different sticky ends on the vector that are compatible with the same nonidentical sticky ends of the insert, a process called directional cloning (Figure 15.4). Directional cloning has three desirable features. First, only insert-DNA fragments possessing the two different compatible ends will be efficiently inserted into the vector. Second, the inserted fragments are ligated in a particular orientation dictated by the cohesive compatible ends. And third, due to the incompatibility of the two ends of the digested vector DNA, the vector cannot re-ligate to itself, thus minimizing the creation of nonrecombinant vectors.

[Figure 15.4: vector DNA and insert DNA are each cut with EcoRI and BamHI to create nonidentical, complementary sticky ends (directional cloning).]

Although hundreds of restriction enzymes are commercially available, cohesive compatible ends are not always possible to produce at the positions necessary for constructing the desired recombinant DNA molecules. One approach to creating compatible ends in such a case is to generate blunt ends—ends without any overhang—that can then be ligated to form a recombinant molecule.
Some restriction enzymes naturally create blunt ends, but any restriction enzyme site can be converted into a blunt end. There are two general strategies (Figure 15.5). For example, DNA polymerase (see Section 7.2) can use a 5′ overhang as a template and add dNTPs to the recessed 3′ end until a blunt end has been produced. Alternatively, 3′ overhangs can be made blunt by a DNA exonuclease (see Section 7.4) that degrades only single-stranded DNA and “chews back” the 3′ overhang. Some procedures use shearing force rather than restriction enzymes (for example, by passing DNA through a fine needle), producing random DNA fragments whose ends can then be blunted by treatment with a DNA polymerase and exonuclease. Conversely, blunt ends can be converted into sticky ends by ligation of short oligonucleotides (nucleic acid molecules composed of a relatively small number of nucleotides) onto the blunt-ended DNA molecules. The oligonucleotides can be synthesized to have sequences for any restriction enzyme desired, thus adding any specific restriction site to the end of any DNA molecule. Oligonucleotides of this type are called linkers.

[Figure 15.5: a plasmid vector is digested with EcoRI and the insert DNA with KpnI; the 5′ overhang is filled in with DNA polymerase and the 3′ overhang is removed with exonuclease; the blunt-ended fragments are combined to give a recombinant vector.]
an insertion of DNA into the MCS (Figure 15.6b). Although the normal substrate for β-galactosidase is lactose, the enzyme can also cleave lactose analogs, such as X-gal. When the colorless substrate X-gal is added to the growth medium, bacteria with a functional lacZ gene producing β-galactosidase will convert X-gal to a blue product (see Section 14.4). When a fragment of DNA is inserted into the MCS, the lacZ gene is disrupted and rendered nonfunctional. Bacteria harboring such a vector then appear as white colonies, whereas bacteria harboring a cloning vector that does not contain a fragment of DNA inserted in the MCS are blue. This difference allows rapid identification of colonies harboring vectors with inserts in the MCS. Thus, selection based on antibiotic resistance allows identification of bacteria that have been transformed, and blue versus white screening allows identification of bacteria harboring plasmid vectors with an insertion of recombinant DNA.

Amplifying Recombinant DNA Molecules For amplification—that is, replication of the recombinant DNA molecules in large numbers—the recombinant molecules are introduced into E. coli by transformation, the same process described by Griffith and by Avery, MacLeod, and McCarty in their early investigations of the hereditary function of DNA (Figure 15.7; also see Section 6.3). In modern laboratories, DNA is mixed with E. coli in a test tube. The bacteria are treated either chemically, with divalent cations (such as Ca²⁺), or with an electrical shock to open pores in their membranes, thus making the bacteria “competent” to take up exogenous DNA by transformation. For safety purposes, the bacterial strains used in recombinant DNA experiments are chosen for characteristics that do not allow them to survive well outside of the laboratory.
The concentrations of DNA used to transform competent bacteria are those determined empirically to be concentrations at which individual bacterial cells are likely to take up no more than one DNA molecule. After transformation, the bacteria are allowed to recover for a short period of time and are then plated on growth medium that selects for cells containing the selectable marker gene, conferring resistance to an antibiotic, encoded on the DNA vector. When the transformed bacteria are plated on media containing the
[Figure: Recombinant plasmids (a, b, c), each carrying ori and AmpR, are transformed into E. coli and selected on ampicillin-containing medium. Plasmids enter only about 1 in 1000 cells, so the probability of a cell having two independent plasmids is 10^-6. In bacteria, plasmids are amplified by DNA replication and transmitted to progeny by cell division.]

within the bacteria. Since the recombinant vector has an origin of replication, it will amplify by autonomous replication using bacterial enzymes. After that, the next time the bacterium divides, each of its progeny will receive copies of the recombinant DNA molecule. Because a single bacterium with a recombinant DNA molecule can grow into a colony consisting of some 10^8 bacteria, each with multiple copies of the recombinant DNA molecule, billions of identical copies of DNA molecules are made.

The use of plasmid vectors for cloning large DNA fragments is limited, mainly because large plasmids (more than 20 kb) are not efficiently maintained in a high copy number. This limitation restricts the usefulness of plasmids in cloning eukaryotic genomic DNA. Eukaryotic genomes can be large (the human genome is 3 × 10^9 bp), with individual genes that are often much longer than 20 kb and therefore cannot be cloned in a single plasmid. To overcome these limitations of plasmids, vectors capable of handling larger clones have been developed. Two general approaches have been employed to propagate larger DNA fragments. In one approach, vectors based on the life cycle of bacteriophages accommodate larger fragments of DNA. The second approach harnesses chromosomal origins of replication to efficiently propagate larger recombinant DNA molecules. We look now more closely at this second approach.
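
The numbers quoted in this passage can be checked with a few lines of arithmetic. The sketch below is only an illustration: the per-cell copy number of 50 is an assumed, hypothetical value (the text says only "multiple copies"), and the count of 20-kb inserts needed to cover the human genome once ignores any overlap between clones.

```python
# Back-of-the-envelope checks for plasmid transformation and amplification.
# Values marked "assumed" are hypothetical and are not taken from the text.

UPTAKE_ODDS = 1000               # plasmids enter about 1 in 1000 cells
CELLS_PER_COLONY = 10**8         # a colony contains some 10^8 bacteria
COPIES_PER_CELL = 50             # assumed copy number ("multiple copies")
HUMAN_GENOME_BP = 3 * 10**9      # size of the human genome in base pairs
MAX_PLASMID_INSERT_BP = 20_000   # plasmids above ~20 kb are poorly maintained

# Probability that one cell independently takes up two plasmids.
p_one = 1 / UPTAKE_ODDS
p_two = p_one ** 2

# Identical plasmid copies produced when one transformed cell forms a colony.
copies_per_colony = CELLS_PER_COLONY * COPIES_PER_CELL

# Minimum number of 20-kb inserts needed to cover the genome a single time.
min_inserts = HUMAN_GENOME_BP // MAX_PLASMID_INSERT_BP

print(f"P(two independent plasmids in one cell) = {p_two:.0e}")
print(f"Plasmid copies in one colony = {copies_per_colony:.1e}")
print(f"20-kb inserts needed to cover the human genome once = {min_inserts:,}")
```

Under these assumptions a single colony holds billions of plasmid copies, and the insert count shows why plasmids alone are impractical for cloning a whole eukaryotic genome.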
to be cloned to span the entire genome. As we discuss in Section 16.1, however, a set of genomic libraries that each have a different-sized insertion can be useful for determining the sequence of an entire genome.

Constructing cDNA Libraries The starting material for a cDNA library is mRNA, often derived from a specific tissue or cell type. Messenger RNA cannot be cloned directly because it is single stranded and is of course RNA, not DNA. Cloning of mRNA sequences can be accomplished by synthesizing a double-stranded cDNA copy of the mRNA and then ligating the cDNA into a vector. cDNA libraries are especially useful for working with eukaryotic organisms whose gene sequences are interrupted by many long introns.

[Figure: cDNA synthesis steps. Synthesize first strand cDNA using reverse transcriptase. Partially degrade mRNA using RNase H. Synthesize second strand cDNA using DNA polymerase and remaining mRNA fragments as primers. S1 nuclease blunts ends. Protect EcoRI sites in cDNA from digestion using EcoRI methylase.]

The concept and development of cDNA libraries required advances in understanding the life cycle of retroviruses and the movement of retrotransposons (see Section 11.7). The availability of the enzyme reverse transcriptase, found
The recombinant DNA technology described in this section, combined with the techniques of DNA sequencing and the polymerase chain reaction (PCR) described in Section 7.5, enables sophisticated in vitro manipulation and characterization of DNA molecules. However, biology is “in vivo,” and the questions geneticists ask pertain to how genes behave in the context of the living cell or organism. Thus, techniques have been developed to introduce in vitro–constructed DNA molecules into living organisms.

15.2 Introducing Foreign Genes into Genomes Creates Transgenic Organisms

The introduction of a gene from one organism into the genome of another organism creates a transgenic organism. The introduced gene is known as a transgene; if the introduced gene comes from a different species, it is a heterologous transgene. The two principal challenges to creating a transgenic organism are (1) the need to introduce DNA into a cell in such a way that the DNA integrates into the genome and (2) the need to provide appropriate regulatory sequences so that the transgene will be properly expressed.

Because cells of different organisms differ in the ability to import DNA from their environment and in their propensity to recombine exogenous DNA into their genomes, protocols for introducing transgenes vary according to the organism. Nevertheless, the production of transgenic organisms is surprisingly straightforward, perhaps because naturally occurring mechanisms have evolved in most lineages of life for the uptake or delivery of DNA. Many organisms or cells will absorb DNA from their environment, and once inside the cell, one potential fate of the DNA is to recombine into the genome. Recall our discussion of certain naturally occurring versions of this process, including gene transfer by Hfr donors into recipient bacteria, transduction of genes from a bacterial donor to a recipient, and gene transfer between and within species by transformation (see Chapter 6).

Although the designing of transgenes utilizes techniques of recombinant DNA technology, the expression of transgenes is like the expression of any gene: The gene sequence must first be transcribed into mRNA and then translated into a polypeptide. The universality of the genetic code permits the translation of coding sequences even when they have been transferred between the most distantly related organisms, even when one of them is bacterial or archaeal and the other a eukaryote. However, regulatory sequences and their molecular interactions with transcriptional and translational machinery vary significantly among organisms, and they are not interchangeable between distantly related organisms. Thus, for transgenes to be efficiently expressed, they must be combined with host regulatory sequences.

In contrast to situations where exogenous DNA is introduced into the genome of an organism, creating a transgene, the use of CRISPR–Cas9–mediated genome editing (see Section 14.3) often does not involve the introduction of exogenous DNA. In cases where no exogenous DNA has been added, the United States Department of Agriculture has deemed the resulting organisms to be nontransgenic. The discussions that follow in this section primarily describe methods to construct transgenic organisms, but they will also at times refer to CRISPR–Cas9–mediated genome editing.

Expression of Heterologous Genes in Bacterial and Fungal Hosts

Bacterial transformation by a recombinant plasmid is the primary method for generating transgenic bacteria. As seen in Section 15.1, foreign DNA can be introduced into bacteria, such as E. coli, using a plasmid vector possessing sequences required for DNA replication and also possessing a selectable marker, such as antibiotic resistance, to facilitate the identification of transformants.

Expression vectors are vectors that have been furnished with sequences capable of directing efficient transcription and translation of transgenes (Figure 15.11). For transgenes to be properly expressed in E. coli, regulatory sequences compatible with the transcription and translation machinery in E. coli need to be present in the vector.

[Figure 15.11 diagram labels: E. coli expression vector with promoter (–35 and –10 elements); regulatory sequences to control transcription of the inserted gene in E. coli (regulatory sequences from the lac operon); Shine–Dalgarno sequence in the 5′ UTR for efficient translation in bacteria; multi-cloning site (MCS); transcription terminator; bacterial origin of replication; bacterial selectable marker.]
Figure 15.11 Typical features of expression vectors for E. coli.
Q How would you design a eukaryotic expression vector, that is, what regulatory elements would you need to include?
566 CHAPTER 15 Recombinant DNA Technology and Its Applications
Expression vectors for use in E. coli are constructed from plasmids that have been equipped with promoter sequences that bind RNA polymerase upstream of the multi-cloning site (MCS) of the plasmid. Recall that the MCS is a cluster of unique restriction sites into which the gene to be expressed is inserted in recombinant clones. Efficient translation of mRNA in E. coli also requires the presence of a Shine–Dalgarno sequence in the 5′ untranslated region of the mRNA, another feature that is built into E. coli expression vectors. In addition, since mRNA-splicing machinery does not exist in bacteria, eukaryotic transgenes must be free of introns if they are to be properly translated in bacteria. This requirement necessitates the use of cDNAs as eukaryotic transgenes in E. coli expression systems.

Expression of the heterologous gene carried by an expression vector can be either constitutive (“on” all the time) or regulated by the addition or removal of inducer compounds. An example of the latter approach is the use of the regulatory apparatus of the lac operon of E. coli to induce expression of transgenes: Fusion of the lac operator and CAP binding sites of the lac operon to the RNA polymerase binding site allows the transgene to be controlled in the same inducible manner as the genes of the lac operon (the lac operon is described in Section 12.2).

Two kinds of variation in the genetic mechanisms of living organisms can hamper the efficient production of functional transgenic products. The first complication affects the efficiency of translation. Although the universal genetic code does indeed allow the translation of heterologous transgenes, organisms vary in the degree to which they use specific codons when the genetic code contains more than one for a given amino acid or signal. In most species, synonymous codons are not used with equal frequency. For example, glycine is encoded by GGN, with N representing any nucleotide, but GGA and GGG are rarely used in E. coli, whereas these codons are commonly used in the other organisms listed in Table 15.1. The tRNAs corresponding to frequently used codons are expressed at higher levels than are the tRNAs for rarely used codons. This preferential use of codons is called codon bias. Thus, for efficient production of heterologous proteins in E. coli, the codon usage within the heterologous gene sequences may have to be altered to approximate the codon bias in E. coli. Note that such changes do not alter the amino acid sequence of the encoded protein; they only alter the efficiency with which translation occurs in E. coli. Codon bias can affect the expression of heterologous transgenes in any case where genes are being transferred between distantly related species.

A second possible obstruction to the production of functional heterologous proteins in E. coli is presented by the posttranslational modifications many proteins must undergo to function. Posttranslational modifications of proteins differ between species, in particular between eukaryotes and bacteria. For example, carbohydrate and lipid groups are added to many kinds of eukaryotic proteins. In addition, the functions of proteins may be modified by phosphorylation, acetylation, or methylation of amino acid residues; other posttranslational polypeptide processing; and specific protein-folding activities. Most of these processes either do not occur in bacterial cells or they occur but with significant differences. In such cases, eukaryotic cells, such as yeast or cells in tissue culture, and eukaryotic expression vectors must be used. Eukaryotic expression vectors have the eukaryotic features analogous to the features found in bacterial expression vectors, including sequences for the regulation of transcription (such as a TATA box for binding of RNA polymerase II), enhancer sequences for qualitative and quantitative control of transcription, and polyadenylation and transcription-termination signals.

Production of Human Insulin in E. coli. A gene encoding insulin was among the first human genes to be expressed in E. coli, and human insulin was the first protein manufactured from recombinant DNA technology for therapeutic use in humans. Insulin, a protein hormone, regulates sugar metabolism in animals by stimulating liver and muscle cells to take in glucose, and fat cells to take in lipids, from the blood. Individuals who are unable to produce insulin, or whose cells cannot respond to it, have diabetes, an often debilitating disease that affects millions of people worldwide.

Insulin is cyclically produced in the pancreas by specialized cells in the islets of Langerhans and is released into circulating blood in response to the ingestion of sugar-containing carbohydrates. The pancreatic cells initially synthesize a 110–amino acid precursor protein called preproinsulin that is not secreted and does not have hormonal function until it is proteolytically processed. Twenty-four N-terminal amino acids (the “pre” amino acids of preproinsulin) are cleaved from the precursor to produce proinsulin, an event followed by the cleavage of an additional 35 amino acids (called the “pro” segment) from the middle of the protein (Figure 9.17). Further cleavage generates two amino acid chains, called the A chain and the B chain, that are 21 and 30 amino acids, respectively, in length. The A chain is joined to the B chain by disulfide bonds between cysteine residues to produce insulin.

The amino acid sequence of insulin was determined by Fred Sanger in the early 1950s (Figure 15.12, step 1), but the human gene encoding insulin was not identified until the late 1970s. Even before the human insulin gene was cloned, however, molecular biologists began experiments
1 Amino acid sequence of human insulin B chain was determined by peptide sequencing.
Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Thr
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
2 A nucleotide sequence was created by reverse translation of the amino acid sequence. Two successive stop
codons were added following the open reading frame.
Coding 5′ TTCGTCAATCAGCACCTTTGTGGTTCTCACCTCGTTGAAGCTTTGTACCTTGTTTGCGGTGAACGTGGTTTCTTCTACACTCCTAAGACTTAATAG 3′
Template 3′ AAGCAGTTAGTCGTGGAAACACCAAGAGTGGAGCAACTTCGAAACATGGAACAAACGCCACTTGCACCAAAGAAGATGTGAGGATTCTGAATTATC 5′
3 A methionine codon was inserted at the beginning of the insulin B coding sequence to facilitate
subsequent isolation of the insulin B protein.
5′ ATGTTCGTCAATCAGCACCTTTGTGGTTCTCACCTCGTTGAAGCTTTGTACCTTGTTTGCGGTGAACGTGGTTTCTTCTACACTCCTAAGACTTAATAG 3′
3′ TACAAGCAGTTAGTCGTGGAAACACCAAGAGTGGAGCAACTTCGAAACATGGAACAAACGCCACTTGCACCAAAGAAGATGTGAGGATTCTGAATTATC 5′
4 EcoRI and BamHI sites were added to the ends of the DNA to facilitate cloning into a vector.
5′ GAATTCATGTTCGTCAATCAGCACCTTTGTGGTTCTCACCTCGTTGAAGCTTTGTACCTTGTTTGCGGTGAACGTGGTTTCTTCTACACTCCTAAGACTTAATAGGATCC 3′
3′ CTTAAGTACAAGCAGTTAGTCGTGGAAACACCAAGAGTGGAGCAACTTCGAAACATGGAACAAACGCCACTTGCACCAAAGAAGATGTGAGGATTCTGAATTATCCTAGG 5′
piB1
AmpR
Figure 15.12 Producing human insulin in E. coli. This strategy was used in the late 1970s by the City
of Hope National Medical Center and the biotechnology company Genentech to produce human insulin in
E. coli. The entire DNA fragment was chemically synthesized.
designed to produce human insulin in E. coli by constructing recombinant plasmids containing chemically synthesized DNA encoding human insulin. An experimental strategy called the two-chain method utilized two synthetic genes, one encoding the A chain and the other encoding the B chain. Each synthetic gene was constructed from oligonucleotides whose sequence was based on the reverse translation of the amino acid sequences of the human insulin chains (Figure 15.12, step 2).

The synthetic genes were cloned into separate plasmid vectors. In each case the chain was fused, in the same reading frame, to the 3′ terminus of the lacZ gene encoding β-galactosidase. Genetic constructs like this, consisting of two or more genes or gene segments joined together to form a new, artificial gene, are called chimeric (Section 14.4) or fusion genes. Transcription and translation of a fusion gene produce a fusion protein, which in each of these cases contained the polypeptide of one insulin chain fused to the carboxyl terminus of β-galactosidase (the protein product of the lacZ gene). To separate the insulin peptides from β-galactosidase peptides and to form functional insulin molecules, a methionine residue was engineered into the fusion protein at the junction between the N-terminal end of the insulin peptides (step 3) and the C-terminal end of the β-galactosidase peptides to serve as a peptide cleavage site (step 4).

In the recombinant plasmid, transcription is under control of the lac operator regulatory sequences. Gene transcription is induced by lactose in the absence of glucose (step 5; see also Section 12.2). Under appropriate growth conditions, up to 20% of the total protein produced by the recombinant E. coli strains is the fusion protein. Treatment of proteins with cyanogen bromide (CNBr) cleaves peptide bonds at the carboxyl end of methionine residues (step 6). Apart from the methionine that was inserted at the junction of the two peptides, there are no other methionine residues in the fusion protein, so CNBr treatment releases the insulin chains from the β-galactosidase peptides without causing any other breaks. When the A and B chains are purified from their recombinant host strains and mixed together under oxidizing conditions, disulfide bonds form to link the A and B chains and produce active insulin molecules (step 7).

The recombinant human insulin molecules originally produced by this method were identical to naturally occurring human insulin. Since the implementation of this synthetic process in the 1980s, however, more-efficient methods for producing recombinant human insulin have been developed. Some of these methods have introduced amino acid changes in the recombinant human insulin, to create proteins that have different desired effects on the uptake of glucose by targeted cells. These various forms of recombinant human insulin are used every day around the world by millions of people with insulin-dependent diabetes.

The ease and economy of working with bacteria compared with eukaryotes have made it practical to produce many eukaryotic proteins in bacteria for medical, industrial, and agricultural applications. For example, in addition to human insulin, proteins such as human growth hormone (HGH) and erythropoietin (which induces red blood cell formation) are produced in bacterial systems. The recombinant systems used to produce these and many other pharmaceutical and industrial agents are safe and effective sources of otherwise scarce material. For example, before the production of human insulin by recombinant DNA technology, insulin was extracted from pig and cow pancreases collected as a by-product of the meat industry. Pig and cow insulin are very similar to human insulin, but not identical to it; as a result, allergic reactions compromised their use by people with diabetes. Insulin extractions from animals also carry a risk of contamination from the source tissues. Likewise, HGH extracted from the pituitary glands of human cadavers carries a risk of transmitting neurological disease (e.g., Creutzfeldt–Jakob disease) due to the possible presence of contaminating proteins. Both recombinant human insulin and recombinant HGH have proven safe and effective over decades of use.

Many proteins used in industrial processes as well as in everyday household products are produced in bacteria. For example, proteases are protein-degrading enzymes added to laundry detergents to aid in removing stains from clothing. Isolation of genes encoding proteases from psychrophilic, or cold-loving, bacteria has allowed the industrial production of proteases that act in cold water, leading to substantial savings in energy costs stemming from household hot water usage.

Bacteria are also utilized to produce many food-processing enzymes and food additives, such as vitamins. A complex of enzymes called rennet, which is produced in mammalian stomachs, has traditionally been used in cheese production to form curds in milk. Due to the limited supply of rennet derived from stomachs isolated primarily from young calves, alternative sources have been developed. Although some microbes naturally produce enzymes that curdle milk, the primary source today is a process using genetically engineered microbes, either bacteria or fungi, that produce the curdling enzyme chymosin from genes originally derived from animals. Likewise, many vitamins, such as A, D, B12, and B2, that are added to “fortified” cereals and breads are produced in genetically modified microbes. Perhaps to the detriment of Americans’ nutrition, the addition of these microbially produced vitamins to breakfast cereals was halted in the United States.

The genetic engineering of E. coli and other microbes to produce proteins or compounds used in industry, agriculture, and health care is an active field that will flourish in the coming years as more microbial systems are investigated at the genomic and physiological levels. An example of the transfer of an entire biochemical pathway into E. coli to produce a medically important compound is described in Experimental Insight 15.2.
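
The cyanogen bromide step of the two-chain method described earlier in this section lends itself to a small string model. The sketch below is illustrative only: `cnbr_cleave` is a hypothetical helper, and `CARRIER_PLACEHOLDER` is an invented, methionine-free stand-in for the β-galactosidase portion of the fusion protein, not its real sequence. The B-chain sequence is the one given in Figure 15.12.

```python
def cnbr_cleave(protein):
    """Split a one-letter protein sequence after every methionine (M).

    Models CNBr cleavage at the carboxyl side of methionine. The real
    reaction also converts the methionine to homoserine lactone, which
    this simple string model ignores.
    """
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        if residue == "M":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


# Hypothetical methionine-free placeholder for the carrier portion.
CARRIER_PLACEHOLDER = "ASTLKDEVNQRG"
# Insulin B chain from Figure 15.12 (one-letter code).
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"

# Fusion protein: carrier, engineered methionine at the junction, insulin chain.
fusion = CARRIER_PLACEHOLDER + "M" + B_CHAIN
fragments = cnbr_cleave(fusion)

print("Fragments:", fragments)
print("Intact B chain released:", fragments[-1] == B_CHAIN)
print("B chain is methionine-free:", "M" not in B_CHAIN)
```

The model makes the design constraint visible: the approach releases an intact chain only because the insulin chain itself contains no methionine, so the engineered junction residue is the only cleavage site within it.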
From the bacterial perspective, the outcome of this natural transformation event is the expression of genes in the T-DNA that encode proteins causing plant cells to (1) divide in an uncontrolled manner and (2) produce amino acids only the bacterium can utilize as an energy source. Agrobacterium essentially reprograms the plant cells into food factories for the bacteria. Bacterial genes encoding plant-hormone–biosynthesizing enzymes cause transformed plant cells to produce high levels of two plant hormones, auxin and cytokinin, which in turn cause uncontrolled division of plant cells, resulting in tumor formation (Figure 15.13c). The other genes on the T-DNA encode opine-biosynthesizing enzymes. Opines, such as nopaline and octopine, are amino acids that do not naturally occur in plants; therefore, plants do not produce any enzymes capable of metabolizing opines. Agrobacterium does have such enzymes, however; consequently, the opines produced by the plant cells can be used as carbon and nitrogen sources by the bacteria. Other genes on the Ti plasmid, but not located within the T-DNA region, encode enzymes required for the transfer of the T-DNA to the plant cell. In nature, the transfer of T-DNA into a plant genome usually occurs in somatic cells and is thus not transmitted to the next generation. However, we know of at least one case in which the T-DNA has entered the germ line and now forms part of the genome of a species: the sweet potato, Ipomoea batatas. It is estimated that the transfer occurred at least 8000 years ago, since the T-DNA is found in both cultivated and wild varieties, making the sweet potato a naturally transgenic food crop.

Sequence analysis has revealed that the genes involved in the transfer of T-DNA are evolutionarily related to those involved in the transfer of the F-factor in E. coli (see Section 6.2). Thus, Agrobacterium has evolved a mechanism to transfer DNA into plant cells by adapting genes originally involved in bacterial conjugation. A striking aspect of this cross-kingdom gene transfer is that the genes on the T-DNA have evolved to be transcribed and translated efficiently in plant cells instead of in bacterial cells. In nature, Agrobacterium normally transforms plants only; but in the laboratory, the bacterium has the ability to transfer DNA into almost any eukaryotic cell, including human cells.

Creating Transgenic Plants Scientists can use Agrobacterium to transfer any gene of interest into plants. To do so, they remove the opine- and tumor-producing genes normally found in the T-DNA and replace them with DNA encoding the gene of interest. The T-DNA then transfers the gene of interest into the plant cell, where it becomes integrated into the genomic DNA of the plant.

To repeat, the general strategy for modifying the Ti plasmid for transformation procedures starts with deletion of the tumor-inducing and opine genes, producing what is called a “disarmed” Ti plasmid. Then the gene of interest is inserted between the two ends of the T-DNA region, referred to as the left and right borders. These border regions contain sequences required for efficient transfer. Proteins encoded by genes of the Ti plasmid outside of the T-DNA recognize specific sequences in the left and right border and catalyze the transfer of a single strand of T-DNA from the bacterium to the plant cell; when this occurs, the gene of interest that has been inserted between the two border sequences will be transferred as well. As with any other protocol for constructing transgenic organisms, a selectable marker is included (between the left and right borders) in addition to the gene of interest to allow efficient selection of transformed plants. For experiments with plants, genes conferring resistance to either antibiotics (which inhibit translation in the chloroplast) or herbicides may be employed as selectable markers. The selectable marker genes are usually expressed using a promoter that confers constitutive expression, so that transgenic plants can be selected at any stage of their development.

Because the Ti plasmid is too large to be easily manipulated, most experimental protocols that use Agrobacterium construct a strain harboring two plasmids: One is a disarmed Ti plasmid, and the second is a plasmid that contains left and right border sequences flanking the DNA of interest (Figure 15.14a). This strategy, separating the functional elements of the Ti plasmid into two plasmids, is referred to as the binary approach. It results in the efficient transfer of the DNA of interest into the plant cell and its subsequent integration into the plant genome (Figure 15.14b).

Unlike bacteria and yeast, which are single-celled organisms, transformed plant cells must be regenerated into an entire plant to reveal the effects of transgenes on the plant phenotype. Traditionally, scientists have taken advantage of a unique feature of plant development, the totipotency of most plant cells: Under the appropriate environmental and hormonal conditions, an entire normal plant can be regenerated from a single isolated plant cell. Thus, after infection of plant cells with the modified Agrobacterium strain and selection of transformed cells on the basis of the selectable marker gene, progeny plants can be regenerated from the individual transformed cells (Figure 15.14c). This technique has been successfully applied to a wide variety of flowering plant species, including crop species such as rice, maize, and tomatoes.

Plant researchers using Arabidopsis as a model system for studying basic biological processes sought an easier method of transformation that would not require regeneration from a single transformed cell. After several different techniques were attempted, they discovered that the simple technique of dipping Arabidopsis flowers into a culture of Agrobacterium works surprisingly well. It allows the T-DNA to be transferred directly from Agrobacterium to the egg cell of the female gametophyte. In this protocol, transgenic plants are selected from seed produced by the plant exposed to Agrobacterium.

Many plant species are susceptible to Agrobacterium-mediated transformation. If they are not, DNA can be directly introduced into their cells. The cell walls of isolated plant cells are first removed enzymatically, after which the
[Figure: binary vector approach. Diagram labels: ori; bacterial selectable marker; AmpR (bacterial selectable marker); disarmed Ti plasmid (T-region removed); transformation vector. The “disarmed” plasmid contains genes required for virulence and conjugative transfer; lacking the T-region, it is no longer able to induce crown gall disease. The transformation vector contains the T-region flanked by right and left border sequences. Genes on the disarmed plasmid produce conjugative and virulence proteins that act in trans on the T-DNA border sequences of the transformation vector to effect transfer of the T-DNA, which contains the inserted gene of interest, into the plant cell.]
cells are mixed with heterologous DNA and given a heat or electrical shock to depolarize the membrane and facilitate the entry of DNA. Once in the cell, the DNA has the same fate as described above for DNA transferred into fungi. In plants, homologous recombination is rare relative to illegitimate recombination, so the most common outcome is the insertion of the heterologous DNA into a random location in the genome. In another technique, DNA is introduced into plant cells by particle gun bombardment, the use of high pressure to fire microscopic particles coated with DNA into plant cells. The particles are propelled with enough force to penetrate the cell wall and plasma membrane. Both of these techniques can be applied to any plant species.

Transgenic Plants in Agriculture The two most common traits engineered into transgenic crops grown today are herbicide resistance and insect resistance. With herbicide-resistant crops (for example, the varieties sold as Roundup Ready), farmers can apply herbicide to a field to clear the ground of weeds and other noncrop plants without damaging the crop itself. This reduces the amount of tilling done to plow weeds under at the beginning of the season. Less tilling results in less soil loss and also saves on the use of fossil fuels.

Cotton and maize crops resistant to insect herbivory are two of the most widely grown transgenic crops. Insect resistance is usually conferred by the expression of genes derived from the bacterium Bacillus thuringiensis. Genes encoding approximately 100 insect toxins, known as Bt toxins, have been identified in different strains of B. thuringiensis. The toxins work by perforating the guts of different insect species, and different toxins have different “host” specificity. Transgenic plants expressing genes encoding Bt toxins are less palatable to insects and exhibit reduced insect herbivory. As a consequence, transgenic plants expressing Bt toxin genes require significantly less application of insecticides than do nontransgenic plants, thus reducing the insecticide load in the environment.

Although Bt toxins are clearly toxic to insects, other herbivores, such as humans, are impervious to the compounds. The properties of Bt toxins have been appreciated for some time. Organic farmers routinely spray B. thuringiensis directly on their crops to act as a “natural” insecticide. Millions of acres of transgenic maize, cotton, and potatoes expressing Bt genes and of herbicide-resistant soybeans are presently cultivated in the United States and several other countries.

Golden Rice Although many transgenic crops thus far used in agriculture have primarily benefited farmers in the developed world, the humanitarian potential for crop modification in aid of subsistence farmers in developing countries is exemplified by techniques for biofortifying staple foods with vitamins or minerals. In some crops, an increase in nutritional content can be accomplished by conventional breeding or by genome editing of endogenous genetic pathways, but in other cases, a transgenic approach, exemplified by Golden Rice, is required.

Rice (Oryza sativa) is the major staple food for much of the world. Because oil tends to become rancid, especially in tropical climates, rice is often milled until its oil-rich outer layer has been removed. Unfortunately, the remaining edible grain, the endosperm, lacks several micronutrients, including provitamin A, a vitamin A precursor. (Vitamin A can be obtained directly through consumption of animal products or indirectly from plants that produce carotenoids, which are converted to vitamin A after ingestion and are therefore termed provitamin A.)

Vitamin A deficiency results in blindness and increased disease susceptibility, thus contributing to childhood mortality in many developing countries. It is estimated that vitamin A deficiency affects between 140 million and 250 million preschool children worldwide, leading to 250,000 to 500,000 cases of blindness per year. Because no wild or domesticated cultivars of rice produce provitamin A in the endosperm, recombinant technologies, rather than a conventional breeding program, are required to produce rice that has an endosperm containing provitamin A.

Scientists knew that rice endosperm synthesizes geranylgeranyl diphosphate (GGPP), a precursor in the synthesis of carotenoids. Study of the carotenoid biosynthetic pathway in plants suggested that five plant-derived enzymes are needed to convert GGPP to β-carotene. However, the discovery that a single bacterial enzyme (CRTI) could replace three of the plant enzymes (PDS, ZDS, CRTISO) simplified the genetic engineering strategy (Figure 15.15a). Then, in 2000, Ingo Potrykus, Peter Beyer, and colleagues reported that the addition of only two genes, a daffodil-derived gene called PSY and the bacterial gene called CRTI, resulted in the production of β-carotene in rice endosperm (Figure 15.15b). This outcome was surprising because a gene called LCY was expected to be necessary as well, but apparently the endogenous rice LCY gene is already expressed in endosperm.

Subsequently, work has focused on tailoring the process so that (1) the transgenes would be expressed only during endosperm formation and only in endosperm, (2) the β-carotene synthesis could be increased using different versions of the genes, (3) the selectable marker could be removed from the transgenic lines, and (4) the transgenes could be introduced into rice cultivars that are typically used by subsistence farmers in southeast and south central Asia and Africa. These improvements have led to transgenic lines that could provide part of the required daily intake of provitamin A (Figure 15.15c).

The funding for the research to produce Golden Rice was public, in part from the Rockefeller Foundation, but patents on many of the techniques and tools used to generate the transgenic rice are held by biotech companies. Fortunately, these companies agreed to license the inventors of Golden Rice to provide the technology free of charge for humanitarian use in developing countries. Golden Rice is an example of how customized crops could be developed
574 CHAPTER 15 Recombinant DNA Technology and Its Applications
to address specific nutritional needs and public health problems caused by dietary deficiencies.
One potential hurdle for the introduction of genetically modified biofortified crops is that they are usually, at least initially, produced in only a single genetic background, or genotype, that may or may not be suited to local growing conditions or tastes. This obstacle, which is also often a problem with conventionally bred new varieties, can be overcome by conventional breeding to cross the desired trait into different genetic backgrounds, or by introduction of the transgene directly into locally favored genotypes.
Transgenic plants have been largely accepted in some parts of the world, but many concerns have been raised about their introduction. Some critics fear that transgenes could be adverse to human health—for example, that people may have allergic reactions to the protein product of a transgene. Another concern is that the transgenes may “escape” into the environment if transgenic crop plants interbreed with related species growing nearby. The likelihood of this occurrence can be reduced by not growing transgenic crops in environments harboring related species that have potential to interbreed. Transgenic crops must be tested to allay these concerns, but we must also recognize that, although the concerns about transgenic agricultural crops are valid, they are equally applicable to the cultivation of crops developed by traditional breeding methods.
Figure 15.15 (a) Synthesis of beta-carotene from geranylgeranyl diphosphate (GGPP). In bacteria: 1 PSY, 2 CRTI (to lycopene), 3 βLCY, 4 β-HYD. In plants: 1 PSY, 2 PDS, 3 ZDS, 4 CRTISO (to lycopene), 5 αLCY and βLCY, 6 α-HYD and β-HYD.
(b) Recombinant plasmids. GRI plasmid, first-generation golden rice (GRI): Daffodil phytoene synthase gene (PSY) and bacterial CRTI gene from Erwinia uredovora are driven with rice glutelin-1 (Gt1) endosperm regulatory sequences (green). GRII plasmid, second-generation golden rice (GRII): A maize PSY gene was exchanged for the daffodil PSY gene, boosting the production of β-carotene. Labels on both plasmids: T-region, left T-DNA, right T-DNA, selectable marker.
(c) Appearance of wild-type and transgenic rice: GRII and GRI (β-carotene produced in endosperm) and wild type (no β-carotene).
Transgenic Animals
Protocols for the generation of transgenic animals are similar to those described for fungi and plants, but as with plants, homologous recombination occurs much less frequently than illegitimate recombination (i.e., recombination not based on sequence homology). Totipotency is not characteristic of most animal cells; thus, methods to produce transgenic animals rely on the injection of DNA into eggs, embryos, or cells that will give rise to gametes, with the hope that the injected DNA will be integrated into the genome either by homologous or illegitimate recombination.
Where injection directly into gametes is not feasible, DNA can be injected into isolated cells that are subsequently transplanted into an embryo. The embryo then develops as a genetic mosaic, an organism in which some cells have a different genotype than others, and will transmit transgenes to progeny only if the embryonic germ cells carry a copy of the transgene.
As with the protocols utilized in fungi and plants, methods for the production of transgenic animals vary depending on the biological characteristics specific to each type of organism. Here we provide examples of the various methods available for creation of transgenic animals. We focus on Drosophila melanogaster and Mus musculus (mice), two widely used genetic model animals.
transposons, offered an efficient means of creating transgenic Drosophila, in most cases inserting only one copy of the DNA being transferred (see Section 11.7 for a description of P elements). Their idea was to use the endogenous activity of P elements to transpose transgenes into the genome (Figure 15.16).
Based on their knowledge of P element transposition, Rubin and Spradling reasoned that they could replace much of the P element DNA with exogenous DNA as long as (1) transposase, the enzyme that controls P element movement, was provided; and (2) the P element ends were retained, since these are required for recognition by the transposase. In their method, two DNA molecules, one a modified P element and the other a DNA molecule encoding the transposase but lacking the sequences required for transposition, are co-injected into a Drosophila embryo. The modified P elements are induced to insert into the genome at random positions by the action of the transposase. Typically, only a single P element is inserted. This strategy resembles the use of Agrobacterium to transform plants in that it too utilizes a biological system that has evolved to recombine DNA into a host genome.
Since P elements transpose only in the germ-line cells of Drosophila, the injection is made into an early-stage embryo, targeting those cells that will give rise to the germ line. Early-stage Drosophila embryos are syncytial (consisting of a single, multinucleate cell; Section 18.2), and nuclei at the posterior end of the syncytium are most likely to give rise to the germ cells. The fly derived from the injected embryo is therefore a mosaic in which most soma (the parts of the organism other than germ cells) and some gametes are wild type, but some soma and gametes are transgenic. When the injected fly is mated with an uninjected fly of the same strain, gametes into whose genomes a P element was inserted will produce transgenic progeny.
A commonly used selectable marker in Drosophila is the rosy (ry) gene. In the procedure under discussion, the embryos to be injected are ry–/ry– and have rosy eyes, rather than the wild-type red eyes. A wild-type, ry+, copy of the gene is included in the modified P element, in addition to the DNA to be transformed into the fly. Although flies derived from the injected embryos will have rosy eyes, some of the progeny of those flies, derived from transgenic gametes of the injected embryo, will have red eyes due to the action of the dominant ry+ allele on the inserted P element. As is characteristic of transposons, P elements insert into the genome at random locations.
Figure 15.16 (labels recovered from the figure): The P element used as a vector contains the gene of interest and also the rosy+ gene conferring wild-type eye color but lacks a functional transposase. A second plasmid supplies the transposase activity in trans. Vector plasmid: 5′ and 3′ P element inverted repeat end sequences, gene of interest, 3′ transcriptional terminator, rosy+ gene, AmpR, ori. Second plasmid: P element transposase, AmpR, ori. AmpR and ori are for propagation of plasmids in bacteria. Co-inject plasmids into rosy– embryos (rosy–/rosy–). Transposase activity inserts the P element.
Figure 15.17 Creation of transgenic salmon through injection of DNA into salmon eggs. Endogenous sockeye salmon growth hormone gene: enhancer elements of the sockeye salmon growth hormone gene are responsive to light and active only in spring and summer. Engineered sockeye salmon gene: enhancer elements from the sockeye salmon metallothionein-B gene activate gene expression throughout the year. Steps: combine gene fragments in vitro using recombinant DNA technology; DNA injected into coho salmon egg; DNA integrates into the nuclear genome; injected eggs develop into adult salmon. The wild-type and transgenic coho salmon shown are of the same age.
integrated as multicopy concatemers—that is, multiple tandem copies of the inserted DNA—often resulting in abnormal levels of gene expression. Second, the expression of the transgene can be abnormal because of the chromosomal environment in which it is located. For example, if the transgene is inserted into heterochromatin, gene expression may be altered as described for position effect variegation in Drosophila (see Section 13.2).
Note that the problem of transgene position effects is shared by all transgenic organisms in which the transgene is integrated into the genome by illegitimate recombination; but whereas position effects can pose problems in Drosophila and plants, they are exacerbated in vertebrates, like salmon and mice, due to the larger average size of vertebrate genes and the larger amount of heterochromatin in vertebrate genomes. The mouse (Mus musculus) is an important genetic model for human diseases and human physiology, so it was important to overcome the problems of variability in transgene expression by developing methods to more precisely insert transgenes into mice.
Two general approaches are available for creating transgenic mice, a targeted approach and a nontargeted approach. The nontargeted approach, in which the transgene is randomly inserted into the genome through illegitimate recombination, is similar to that illustrated for salmon in Figure 15.17. In contrast, targeted approaches insert the transgene into a specific locus in the genome, either through homologous recombination or CRISPR–Cas9–mediated genome editing. The targeted methods have transformed the study of mouse biology by allowing for the creation of mice with specific loss-of-function (or knockout) and gain-of-function alleles. In 2007, Mario Capecchi, Martin Evans, and Oliver Smithies shared the Nobel Prize in Physiology or Medicine for their work leading to the development of knockout mice via homologous recombination. Today most studies would employ CRISPR–Cas9–mediated genome editing, but for historical reasons we look at the technique of homologous recombination here. The CRISPR–Cas9–mediated genome editing approach is presented in the next section.
The overall strategy for producing knockout mice by homologous recombination is similar to that described for homologous recombination in yeast. The transformation vector contains two regions of DNA homologous to the target locus flanking a positive selectable marker—meaning a gene that enables its host to survive the screening process (Figure 15.18a). An example of a positive selectable marker is the neomycin resistance (Neo) gene, whose product inactivates the drug G418, which blocks translation and is lethal to mammalian cells. A vector containing the homologous regions is capable of integration into the genome by homologous recombination, but more than 99% of integrations will occur by illegitimate recombination. To select against nonhomologous recombination events, a negative selectable marker—a gene that by its presence suppresses growth or survival of the host—is added to the vector outside one of the regions of homology to the target gene (Figure 15.18a).
A commonly used negative selectable marker is a thymidine kinase (tk) gene derived from a herpes simplex virus. Thymidine kinase catalyzes the addition of a phosphate to deoxythymidine, forming deoxythymidine monophosphate, which is eventually converted to deoxythymidine triphosphate, one of the substrates for DNA synthesis. In contrast to mammalian thymidine kinase, thymidine kinase from herpes simplex virus can also catalyze the addition of phosphate to nucleoside analogs that cause chain termination when incorporated into DNA. Because the endogenous mammalian thymidine kinase does not recognize these analogs as substrates, only those cells expressing the herpes simplex virus tk gene are sensitive to them. Thus, cells harboring the viral tk gene will be selected against when plated on media containing the nucleoside analog ganciclovir. Such analogs are also used as potent antiviral medications, since only cells harboring the virus are sensitive to the analog.
15.2 Introducing Foreign Genes into Genomes Creates Transgenic Organisms 577
(a) Create knockout allele by homologous recombination in embryonic stem cells. Construct a clone containing the mouse CFTR gene and replace the central region with a positive selectable marker gene, disrupting the CFTR gene. The linearized targeting vector, injected into ES cells, carries regions of homology with the CFTR gene, neomycin resistance (positive selectable marker), and thymidine kinase (negative selectable marker). Three possible fates of injected DNA: 1 homologous recombination, 2 illegitimate recombination, 3 no recombination.
(b) Generate knockout mouse from ES cells. Isolate blastocysts. Inject heterozygous ES cells into host blastocysts, creating blastocysts that are genetic chimeras containing both wild-type cells and heterozygous mutant cells. When coat-color mutants are used, chimeric offspring are readily identified: B– = brown; bb = black. Inject blastocysts into uterus. Genotype labels in the figure: BB CFTR+/cftr–; bb CFTR+/CFTR+.
Figure 15.18 Creating a loss-of-function CFTR (cystic fibrosis transmembrane conductance regulator) allele in mice through
homologous recombination. Mutations in the human ortholog are the cause of cystic fibrosis.
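The three fates of injected DNA in Figure 15.18 and the logic of selecting with both markers can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a laboratory protocol: the `Cell` class, the fate names, and the rule that illegitimate integration always carries the tk gene along are simplifications introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Toy ES cell: records only which vector markers ended up in its genome."""
    fate: str
    has_neo: bool  # positive selectable marker (neomycin resistance)
    has_tk: bool   # negative selectable marker (viral thymidine kinase)


def cell_after(fate: str) -> Cell:
    """Assumed marker content for each of the three fates of injected DNA."""
    if fate == "homologous":
        # Exchange within the homology regions brings in neo; tk lies outside them.
        return Cell(fate, has_neo=True, has_tk=False)
    if fate == "illegitimate":
        # Simplification: the whole linear vector inserts, so both markers arrive.
        return Cell(fate, has_neo=True, has_tk=True)
    if fate == "none":
        return Cell(fate, has_neo=False, has_tk=False)
    raise ValueError(f"unknown fate: {fate}")


def survives(cell: Cell, g418: bool = True, ganciclovir: bool = True) -> bool:
    """A cell survives G418 only with neo, and ganciclovir only without tk."""
    if g418 and not cell.has_neo:
        return False
    if ganciclovir and cell.has_tk:
        return False
    return True


FATES = ("homologous", "illegitimate", "none")
survivors = [fate for fate in FATES if survives(cell_after(fate))]
print(survivors)
```

With both agents in the medium, only the homologous recombinants remain in `survivors`; calling `survives(cell, ganciclovir=False)` shows why positive selection alone cannot separate them from the far more numerous illegitimate integrants.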
For transformed mouse cells to survive, they must acquire the positive marker and must lose the negative marker. The occurrence of a homologous recombination event between the negative and positive markers is one possible way in which the introduced DNA can be integrated to produce a cell that possesses the positive and lacks the negative marker. Selection for this type of transformation is called positive–negative selection. A related protocol, negative–positive–negative selection, where negative selectable markers are positioned at each end of the introduced
DNA, has been successfully used to identify homologous recombination events in plants, such as rice, and should be generally applicable to any species.
What types of mammalian cells are typically targeted for gene transfer? The blastocyst-stage mammalian embryo consists of an outer sphere of cells and a small pool of cells inside the sphere. At the blastocyst stage, the internal cells, known as embryonic stem (ES) cells, are totipotent. The production of a transgenic mouse starts with the isolation of ES cells from the mouse strain to be transformed. The ES cells are grown in culture, and DNA is introduced into the cells, often by transient depolarization of their membranes to make the cells permeable to DNA. The cells are then transferred to media containing the agents for positive and negative selection, and transformed cells in which homologous recombination occurred are selected.
The selected transformed ES cells are reintroduced into a blastocyst from a mouse of a genotype different from that of the transformed cells, allowing the progeny derived from the transformed ES cells to be detected (Figure 15.18b). For example, alleles conferring differences in coat color are often used. The blastocyst, now carrying transformed ES cells, is implanted into a surrogate female mouse. Because only some of the ES cells in the host blastocyst are transgenic, the mouse that develops from the embryo in which the transformed cells were introduced is a genetic mosaic in which some tissues are derived from the transformed ES cells and other tissues are derived from host ES cells. Mosaic animals can be readily identified by their variegated coat color.
It is hoped that at least some of the gametes of the chimeric offspring of the host mouse will be derived from the transformed ES cells, so that some mice in the subsequent generation will be heterozygous for the mutation caused by the homologous recombination event. If two heterozygous offspring of this generation are interbred, mice homozygous for the mutation can be produced. Technologies for the construction of other transgenic mammals, including sheep, cats, cows, horses, monkeys, and rats, follow a similar protocol.
recombination within the bacteriophage genome or for intermolecular recombination into host genomes. These recombination systems can be harnessed for producing recombinant DNA molecules in vitro and for recombining DNA molecules in vivo. Bacteriophage site-specific recombination systems have two components: (1) DNA sequences in the bacteriophage genome that are identical to sequences in the target bacterial genome and (2) an enzyme, commonly called a recombinase or integrase, that binds to the identical DNA sequences and catalyzes their recombination. Two bacteriophage recombination systems, one in bacteriophage λ and the other in bacteriophage P1, have proven particularly valuable in the development of site-specific recombination for use in molecular biology experiments.
A site-specific recombination system derived from bacteriophage P1 utilizes Cre recombinase, a bacteriophage-encoded protein that acts to recombine DNA containing loxP sequences (Figure 15.19). The loxP sites are 34-bp sequences consisting of two 13-bp inverted repeats separated by an 8-bp spacer that provides asymmetry, and they are specifically recognized by Cre recombinase. The Cre recombinase binds to two loxP sites and catalyzes a recombination event between them. If the two loxP sites are direct repeats, the intervening DNA is deleted, whereas if the two loxP sites are inverted relative to one another, the intervening sequence is inverted.
The Cre–lox recombination system has been adapted to recombine DNA in vivo in transgenic organisms. For example, loxP sites are added to the ends of the DNA to be deleted or inverted and the construct is then introduced as a transgene into an organism. Later, a second transgene encoding the Cre recombinase is also introduced into the same organism. In cells where the Cre recombinase is expressed, the DNA flanked by the loxP sites will be deleted or inverted.
Figure 15.19 (panel labels only): When loxP sites are direct repeats; when loxP sites are inverted relative to each other.
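The two outcomes of Cre–lox recombination, deletion between direct repeats and inversion between inverted repeats, can be illustrated with a short string-based sketch. The loxP sequence used below is the commonly cited 34-bp site and is an assumption brought in for illustration, since the text gives only its layout (13-bp repeat, 8-bp spacer, 13-bp repeat). The function names are hypothetical, and the model ignores strand-exchange details within the spacer.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Assumed loxP layout: 13-bp arm, asymmetric 8-bp spacer, inverted copy of the arm.
ARM = "ATAACTTCGTATA"
SPACER = "GCATACAT"
LOXP = ARM + SPACER + revcomp(ARM)   # 34 bp in total
LOXP_RC = revcomp(LOXP)              # the same site read in the opposite orientation


def find_lox_sites(dna: str) -> list[tuple[int, str]]:
    """Return (start, orientation) for every loxP site; '+' forward, '-' reverse."""
    sites = []
    for i in range(len(dna) - len(LOXP) + 1):
        window = dna[i:i + len(LOXP)]
        if window == LOXP:
            sites.append((i, "+"))
        elif window == LOXP_RC:
            sites.append((i, "-"))
    return sites


def cre_recombine(dna: str) -> str:
    """Apply Cre to a molecule that carries exactly two loxP sites."""
    sites = find_lox_sites(dna)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    (s1, o1), (s2, o2) = sites
    n = len(LOXP)
    if o1 == o2:
        # Direct repeats: the intervening DNA and one site are excised.
        return dna[:s1 + n] + dna[s2 + n:]
    # Inverted repeats: the intervening DNA is flipped in place.
    return dna[:s1 + n] + revcomp(dna[s1 + n:s2]) + dna[s2:]


cargo = "CCGGTT"
direct = "AAAA" + LOXP + cargo + LOXP + "TTTT"
inverted = "AAAA" + LOXP + cargo + LOXP_RC + "TTTT"
deleted_product = cre_recombine(direct)
inverted_product = cre_recombine(inverted)
print(len(direct), len(deleted_product))
print(inverted_product != inverted)
```

Because the spacer is asymmetric, `LOXP` and `LOXP_RC` are different strings, which is what lets the sketch tell direct from inverted repeats. A single loxP site remains after deletion, and both sites remain after inversion.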
One reason a geneticist might want to delete a transgene after having introduced it into the genome is to assess the function of the gene at specific times and in specific tissues during development. For example, if a null loss-of-function allele results in embryonic lethality, the role of the gene at later developmental stages is difficult to assess. One approach to determining the postembryonic function of such genes is to complement a loss-of-function mutant with a functional copy of the gene flanked by loxP sites. Then, Cre recombinase can be supplied in specific cells or tissues of interest. In cells where the Cre recombinase is active, the transgene will be deleted, causing these cells and their descendants to have a mutant genotype. If the Cre recombinase is driven by a promoter that confers inducible expression or expression that is temporally or spatially restricted, a genetic chimera can be created, allowing an assessment of gene function in specific tissues.
A second application of site-specific recombination is the removal of selectable markers in transgenic organisms. An objection to the use of transgenic organisms in agriculture is that some transgenic strains contain a selectable marker providing resistance to antibiotics, which might spread into the natural population. The antibiotic-resistant marker genes were used to select the transgenic organism but are no longer needed once the transgenic organism has been identified. One strategy for eliminating the selectable marker is to flank the unwanted transgene with loxP sites in a direct repeat orientation. A plant containing this transgene is then crossed with another transgenic plant expressing the Cre recombinase, and the unwanted transgene is deleted in the F1. It is then possible, by selective breeding, to segregate the transgene encoding the Cre recombinase away from the desired transgene in subsequent generations.
Genetic Analysis 15.2 asks you to put some of these ideas to work by designing a mouse model of a human disease.
15.3 Gene Therapy Uses Recombinant DNA Technology
cells by mitosis, but the alteration will not be inherited by progeny of the individual undergoing somatic gene therapy. The specific somatic cells to be targeted depend on the disease in question. For example, in individuals with cystic fibrosis, the epithelial cells of the lungs represent a logical target, since lungs are severely affected in cystic fibrosis. On the other hand, for diseases of the blood, cells of the various hemopoietic lineages are the target cells; they can be removed from bone marrow, treated, and returned to the same individual. Somatic gene therapy turns the treated individual into a genetic chimera that has the transgene present in the target cells but not in other somatic cells or in germ cells. Somatic gene therapy can potentially be used to treat several genetic diseases whose phenotype becomes apparent early in childhood.
In essence, gene therapy involves extracting cells from an organism, correcting the genetic defect, and then reinserting the cells back into the body in a manner that permits them to function appropriately. Each gene therapy procedure is accompanied by technical difficulties that depend on the specific circumstances of the disease. In some cases—for example, in hemopoietic diseases—the cells to be treated can be extracted from the body, treated in vitro, and then injected back into the body. However, in other cases—for example, cystic fibrosis, in which lung epithelial cells are the target—the cells must be treated in situ because they cannot be removed from the patient.
The alternative strategy for gene therapy, germinal gene therapy, targets cells of the germ line, which give rise to gametes. Because germinal gene therapy alters germ-line cells, the therapeutic transgene is transmitted to the progeny of the treated individual. Both types of gene therapy have been successful in animal systems; but for ethical reasons, only somatic gene therapy has been attempted in humans. In the following paragraphs, we discuss somatic gene therapy using embryonic stem cells in humans and describe modifications of these protocols suggested by successful somatic gene therapy experiments in mice.
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem about recombinant DNA technology asks how to construct a specific strain of transgenic mouse.
2. Identify the critical information given in the problem.
   The desired disease model is of Huntington disease (HD), described as an autosomal dominant mutation that consists of an expanded sequence of trinucleotide repeats. The transgenic mouse is to be used to test therapies and drugs.
Deduce
3. Inheritance patterns are always a key consideration in genetic research designs. Identify the inheritance pattern of the HD phenotype.
   Since HD is dominant, a phenotype should be evident if a single mutant allele is introduced into the genome.
4. Evaluate the ways in which the HD allele can be transferred into mice.
   Transgenic mice can be generated by random integration of a transgene or, alternatively, by homologous recombination that replaces the endogenous gene with a mutant version.
5. Choose the method of generating a transgenic mouse that will come closest to modelling the disease of interest. PITFALL: Randomly integrated transgenes may exhibit variation in expression patterns.
   Because we want the transgene to be expressed in the same pattern as the wild-type mouse HD gene, homologous recombination is the best approach, because the mutant HD gene will then be in the same genomic context and will be expressed in the same pattern as the wild-type gene.
Solve
6. Design a strategy to replace the wild-type mouse HD gene with a mutant version of the human HD gene. PITFALL: Since a functional allele is desired, the positive selectable marker must not interfere with HD transgene function.
   The positive–negative selection approach outlined in Figure 15.18 to produce a transgenic mouse by homologous recombination results in a loss-of-function allele. This approach must be modified to create a gain-of-function allele.
   a. Construct a vector in which a human mutant HD gene is flanked by mouse HD regulatory sequences (5′ and 3′ of the HD gene).
   b. The positive selectable marker gene can be placed downstream of the HD gene, in a position not likely to interfere with HD gene expression, or could be removed using the Cre–lox approach outlined in Figure 15.19.
   c. A second type of transgenic mouse, expressing the wild-type human gene driven by the same regulatory sequences, would provide a useful control to compare with the specific phenotypic effects induced by the expression of the mutant allele.
For more practice, see Problems 7, 8, 11, 27, and 30.
induced to differentiate into the appropriate cell type to treat the genetic disease. As illustrated in the mouse experiment described below, the ability to create and manipulate ES cells provides a means of isolating cells from an individual, correcting mutations in the cells, and reintroducing the “corrected” cells into the body.
Creating ES Cells From Differentiated Cells of the Adult Body
In many cases, the diagnosis of a genetic disease is not made until early childhood, when the body no longer possesses any ES cells, because they form only during early embryogenesis. How can ES cells be obtained from a person who has none? The answer is to create ES cells from other
15.3 Gene Therapy Uses Recombinant DNA Technology 581
cells of the body. In 2006 and 2007, a series of experiments Gene Therapy Proof of Principle: Curing Sickle Cell
demonstrated that mouse or human fibroblasts, a type of cell Disease in Mice These advances in iPS cell biology have
occurring in connective tissue, could be reprogrammed in set the stage for the use of iPS cells in gene therapy. Proof of
vitro to behave like stem cells. These reprogrammed cells principle (a phrase used by scientists to mean proof that the
have been called induced pluripotent stem cells, or iPS cells. general idea is valid) was provided using a mouse model for
(The word pluripotent is used because scientists do not yet sickle cell disease (Figure 15.20). The basic strategy being
know if the iPS cells are totipotent.) tested consisted of harvesting adult cells 1 , reprogramming
The reprogramming of differentiated cells was accom- adult cells into iPS cells 2 , repairing the genetic defect 3 ,
plished by expressing a combination of three to four tran- differentiating the iPS cells into hemopoietic precursors in
scription factors (choices included Oct4, Sox2, c-Myc, and vitro 4 , and transplanting the corrected cells into bone mar-
Klf4). These transcription factors are normally expressed row of affected mice 5 .
in ES cells and appear to be sufficient to induce repro- The starting point for this test of somatic gene therapy
gramming of the transcriptional networks of differentiated was the creation of a “humanized” mouse model for sickle
somatic cells into networks characteristic of ES cells. These cell anemia by substituting human a@globin genes for the
four transcription factors act in combination as pioneer fac- endogenous mouse a@globin genes and substituting human
tors to activate embryonic gene expression and indirectly bS (sickle) globin genes for the mouse b@globin genes. Mice
repress the genetic program of the differentiated cell via homozygous for the bS@globin allele (bS/bS) exhibited typi-
reprogramming of the epigenetic marks on the chroma- cal disease symptoms, including severe anemia and eryth-
tin (see Section 13.2). Although it is not clear whether the rocyte sickling. Fibroblasts isolated from the tail of bS/bS
epigenetic marks in iPS cells and ES cells are identical, iPS mice were infected with retroviruses encoding the Oct4,
cells appear to be essentially equivalent to ES cells. The four factors are sometimes referred to as Yamanaka factors, after Shinya Yamanaka, who shared the 2012 Nobel Prize in Medicine with John B. Gurdon for their discovery that adult differentiated cells could be reprogrammed to be pluripotent.
One impediment to all strategies of gene therapy is the challenge of delivering genes or gene products to the cells of interest. For example, after you isolate fibroblast cells, how do you introduce the four transcription factors into the cell? Gene therapy methods often take advantage of viruses that have evolved mechanisms to enter specific cell types. Essentially, viruses are harnessed to transduce the transgene into the target cells the way that bacteriophages accomplish the transduction of DNA between bacteria (see Section 6.4). The viruses can be “disarmed” so that they no longer have the ability to cause the diseases associated with their wild-type relatives. Several types of viral vectors have been used, including gamma-retroviruses, lentiviruses, and adenoviruses.
Many viral vectors deliver transgenes by integrating into the genome of the target cell. Integration provides a mechanism for stable gene transfer and thus permanent correction of the defect. Integration of the vector into the genome is not without risks, however; the insertion may cause a detrimental mutation, a problem that has plagued most human gene therapy experiments to date.
Another problem associated with the use of iPS cells is that the continued expression of the Yamanaka factors predisposes cells to become cancerous. Thus, methods are needed to stop the expression of the Yamanaka factors once they have induced iPS cell formation. One such method is to flank the genes encoding the Yamanaka factors with lox sites so the genes can later be excised from the genome by providing the iPS cells with Cre recombinase.
Sox2, and Klf4 transcription factors and with a lentivirus encoding the c-Myc transcription factor. Expression of these four transcription factors resulted in the reprogramming of the fibroblast cells into iPS cells. On either side of the c-Myc gene on the lentivirus, lox sites had been placed, to allow the gene to be excised from the genome when the cells were infected with an adenovirus encoding Cre recombinase. Although the other three transgenes were not removed in this experiment, their removal by a similar mechanism is also recommended.
In the original experiment in 2007, homologous recombination–based gene replacement was used to correct the βS-globin allele (see Figure 15.18). However, in Figure 15.20 we illustrate how CRISPR–Cas9 genome editing would be employed for that purpose, since that is now the method of choice. The components of the CRISPR–Cas9 system (the guideRNAs and either mRNA encoding Cas9 or the Cas9 protein itself) can be injected directly into iPS cells. To correct the defect, a linear DNA template encoding the wild-type version of the gene is injected along with the CRISPR–Cas9 components. In some cases, two guideRNAs are used, one on each side of the site to be repaired, to create double-strand breaks flanking the site. Following the formation of double-strand breaks, homologous recombination of the wild-type template results in correction of the defect (i.e., genome editing). Unlike homologous recombination, where the resulting iPS cells are heterozygous, the approach illustrated here permits genome editing to occur on both chromosomes, resulting in homozygosity of the wild-type allele.
The βA/βA iPS cells can then be differentiated into hemopoietic progenitors (HPs, cells that have the potential to differentiate into any of the hemopoietic lineages) by infection with another retrovirus encoding the HoxB4 gene, which induces the differentiation of ES cells into HPs when incubated with cytokines secreted from bone marrow cells.
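The lox-flanking strategy described in this section, in which a transgene placed between two lox sites is later removed by Cre recombinase, can be illustrated with a toy string model. The sketch below is only an illustration under stated assumptions: DNA is treated as a plain Python string on one strand, the two sites are taken to be directly repeated copies of the 34-bp loxP sequence, and the function name `cre_excise` and the example sequences are hypothetical. Site orientation, recombination efficiency, and chromatin context are not modeled.

```python
# Toy model of Cre-mediated excision between two directly repeated loxP sites.
# Assumption: sequences are plain strings on one strand; site orientation and
# recombination efficiency are not modeled.

# 34-bp loxP site: two 13-bp inverted repeats around an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(genome: str) -> str:
    """Return the genome after excision of the first loxP-flanked segment.

    One loxP copy remains at the excision site. If fewer than two loxP
    sites are present, the genome is returned unchanged.
    """
    first = genome.find(LOXP)
    if first == -1:
        return genome
    second = genome.find(LOXP, first + len(LOXP))
    if second == -1:
        return genome
    # Keep everything through the first loxP, then resume after the second.
    return genome[: first + len(LOXP)] + genome[second + len(LOXP):]


# Hypothetical example: a short placeholder "transgene" flanked by loxP sites.
floxed = "GGCC" + LOXP + "ATGAAACCCGGGTAA" + LOXP + "TTAA"
excised = cre_excise(floxed)
print(excised == "GGCC" + LOXP + "TTAA")
```

In this model the flanked segment and one copy of the site are lost, and a single loxP site is left behind in the genome.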
582 CHAPTER 15 Recombinant DNA Technology and Its Applications
Figure 15.20 Genetic therapy for mice with sickle cell disease.
The βA/βA HPs are next transplanted back into βS/βS mice in which the endogenous βS/βS bone marrow cells have been eliminated by irradiation, so that now the βA/βA HPs constitute the primary source of hemopoietic cells. In the original 2007 experiment, the HoxB4 coding sequence was translationally fused with that of green fluorescent protein (GFP), so the activity of the HP cells could be monitored by the presence of GFP+ cells in the blood. Subsequently, by all physiological tests, the mice receiving the corrected HPs were cured of sickle cell disease.
These experiments in mice suggest there is promise in the use of ES or iPS cells for gene therapy, but at least two facets of gene therapy procedures continue to cause concern. Problems associated with using retroviruses and oncogenes for reprogramming need to be resolved before implementing such a protocol in humans. For example, because they insert into the genome, retroviruses can cause unintended mutations, and the introduction of an oncogene has the potential to cause cancer (see Application Chapter C). In addition, researchers have yet to ascertain whether iPS cells are truly totipotent or still contain an epigenetic memory of their origin. Because an individual’s own cells are used as the raw material for genetic modification, there is no problem of immune system incompatibility. However, this approach is limited to diseases, such as blood disorders, in which cells can be isolated, genetically corrected, and reintroduced into the body.
In recent years other approaches combining elements of the method described above have been investigated for treating genetic diseases unrelated to the blood. For example,
Duchenne muscular dystrophy (DMD) is a progressive muscle-wasting disease caused by loss-of-function alleles in the DYSTROPHIN gene. The gene has 79 exons, but even if some internal exons are skipped, the encoded protein can still function as long as the two ends are intact. A majority of DMD patients have mutations in middle exons and thus could benefit if the mutant exon were skipped (shortening the resulting encoded polypeptide) but the ends remained intact. Three groups used CRISPR–Cas9 genome editing to delete mutant exon 23 from a mouse model of DMD. In this model, the mutant exon harbors a nonsense mutation consisting of an in-frame stop codon (Figure 15.21). An adenovirus, injected intramuscularly, was used to carry the genome-editing components into the muscle cells. In some of the muscle cells, exon 23 was specifically deleted, and those cells began to produce functional dystrophin. Among other results of this research was the demonstration that editing could occur in muscle stem cells. The treated mice exhibited restored muscle structure and function. These studies hold promise for development of a somatic treatment for muscular dystrophy.
A second example is hemophilia B, caused by loss-of-function alleles of a gene encoding clotting factor IX, which is normally produced in the liver and then exported into the bloodstream. Two adenovirus vectors, one carrying the CRISPR–Cas9 components, with Cas9 being fused to enhancer modules driving expression only in the liver, and the second vector containing a human factor IX cDNA, were introduced into mice carrying mutations in their factor IX gene. The resulting genome editing led to a chimeric mouse–human factor IX gene driven by the endogenous mouse regulatory sequences, effectively “curing” the mice of hemophilia B.
[Figure 15.21, labels recovered from the figure: Cas9 and gRNAs are injected intramuscularly into a dmd mutant mouse, producing permanent exon skipping. The gene lacking exon 23 (exons 22 and 24 adjacent) is transcribed into mRNA lacking exon 23, which is then translated.]
15.4 Cloning of Plants and Animals Produces Genetically Identical Individuals
Many plants have the capacity for vegetative (asexual) propagation in addition to sexual propagation. Poplar and aspen (Populus sp.) groves often consist of vegetatively propagated clones, all genetically identical. Some of these clonal groves are estimated to be at least 10,000 years old. Humans, taking advantage of the ability of plants to reproduce vegetatively, have been clonally propagating plants for centuries in agricultural practices. The bananas that you eat are an example, all propagated via vegetative cuttings. In this case, the vegetative propagation is necessary because the cultivated bananas are triploid and therefore do not produce viable seed; the black specks you see embedded in the flesh of the fruit are the aborted seeds (see Section 10.3 for discussion of the effects of triploidy). With these techniques, heterozygous genotypes of agriculturally desirable specimens can be propagated intact, without the segregation of alleles that occurs during sexual reproduction; this maintains the consistency of desirable traits while promoting the hybrid vigor that can result in higher yields in comparison with inbred varieties (a topic also discussed in Section 10.3).
Perhaps the most conspicuous example of agricultural vegetative propagation is the cultivation of grapes (Vitis vinifera), which were domesticated 6000 to 7000 years ago. Most grape cultivars are highly heterozygous; that is, they have two different alleles at many genomic loci. Thus, when they are self-fertilized or crossed with another cultivar, extensive segregation of genotypes and phenotypes is observed in the progeny. Because this presents an obstacle
to controlling the properties of grape plants through breeding, cultivars that possess favorable phenotypes are propagated by cuttings (that is, additional plants are grown from pieces of source plants). In most vineyards, the vines are chimeric: The shoots are all genetically identical and chosen on the basis of their fruit phenotype, and the roots, also identical to one another, are of a different genotype that is chosen for being well adapted to local soil conditions.
Several wine grape cultivars can be traced back to the Middle Ages, and some are likely to be even older. For example, Pinot was first described in Roman times and is thought to be at least 2000 years old. Although clonal propagation has allowed maintenance of specific genotypes, somatic mutations (due, for example, to errors in DNA replication and transposable element activity) can accumulate over time and have led to phenotypic variation. For example, a mutation in a gene required for pigment synthesis led to the formation of Pinot blanc, a white-berry cultivar, from Pinot noir, the ancestral black-berry cultivar.
Unlike plants, most animals do not readily propagate clonally in nature, but there are exceptions. For example, some aphid species undergo multiple parthenogenetic (clonal) generations in the spring and summer, followed by sexual reproduction in the autumn. Since most animal cells are not totipotent (embryonic stem cells excepted), animals do not readily regenerate from single cells. Thus, techniques for cloning animals, and in particular mammals, from single differentiated cells are considerably more complicated than those for cloning plants.
Dolly, a sheep, born in July 1996, was the first cloned mammal. In the protocol used to produce Dolly, a diploid nucleus is isolated from a differentiated cell of the animal to be cloned (Figure 15.22). This nucleus, containing all the nuclear genetic information of the animal from which it was taken, is injected into an egg cell that has had its own nucleus removed. The egg cell can be derived from the animal to be cloned (if it possesses egg cells) or from a different individual. If the nuclear transplantation is successful, the genome of the donor nucleus will direct the development of the embryo derived from the egg cell. The use of a diploid donor nucleus means that fertilization with a sperm cell is not required to produce a diploid nucleus in the embryo; thus, the genetic constitution of the embryo will be identical to that of the donor. Bear in mind, however, that while the nuclear genome is genetically identical to that of the donor, the mitochondrial genome is derived from the surrogate egg cell. The diploid egg cell is then induced to begin embryogenesis and implanted into a surrogate mother. If all goes well, it will develop into a normal embryo, and birth of a normal offspring will follow.
Figure 15.22 Cloning animals by nuclear implantation.
[Figure labels: Sheep to be cloned (Finn Dorset): remove cells from mammary glands (2n); mammary cell in culture; extract nucleus. Surrogate mother (Scottish Blackface): remove egg (n); remove nucleus. Inject nucleus. Electroshock to induce cell division and allow to develop until blastocyst stage. Implant blastocyst in surrogate mother’s womb. “Dolly,” a Finn Dorset ewe, is born. Dolly with her surrogate mother.]
In most mammals, the frequency of success with this protocol has been quite low. Dolly’s was the only one out of 270 implanted egg cells that resulted in the birth of a sheep. Donor cells have been derived from adult animals (Dolly’s donor cell was a mammary gland cell) and are therefore highly differentiated somatic cells rather than totipotent
embryonic stem cells. In differentiated somatic cells, such as those of the mammary gland, the patterns of facultative heterochromatin (see Section 13.2) are vastly different from those of embryonic stem cells. In other words, although the sequences of nucleotides in the genomes of differentiated and embryonic stem cells are identical, the epigenetic modifications of the histones and DNA methylation patterns differ. The low frequency of success in the initial attempts to clone mammals was likely due to deficiencies in reprogramming the genetic material of the injected nucleus to mimic the epigenetic modifications characteristic of an embryonic stem cell. However, although Dolly’s life span was about half that of the average sheep in captivity, there is no evidence that a failure in epigenetic reprogramming contributed to her shortened life span. Rather, she died of lung cancer caused by a virus, a not uncommon cause of mortality in sheep kept indoors.
In the past decade, advances in knowledge of ES cell biology, in particular the discovery of the Yamanaka factors and their use to reprogram differentiated cells into iPS cells, suggest that the cloning of mammals will improve over time. Already, many different mammals besides sheep have been successfully cloned, including mice, cows, horses, donkeys, cats, and dogs.
CASE STUDY
Gene Drive Alleles Can Rapidly Spread Through Populations
In Chapter 2 we learned that during sexual reproduction in diploid organisms, each of the two alleles at any locus is inherited by 50% of the offspring. However, in some rare cases, genetic elements called gene drives circumvent this Mendelian pattern of segregation by increasing the frequency of inheritance of the gene drive allele over the wild-type allele (Figure 15.23a). Gene drive alleles induce biased inheritance patterns either by converting the wild-type allele to a gene drive allele or by reducing the fitness of the wild-type allele in some manner. The former mechanism entails the gene drive element copying and inserting itself into the wild-type locus. This mechanism is described in more detail below. Regardless of mechanism, if gene drive elements are highly efficient, they have the potential to spread through a population even if they impose a fitness cost on the organism. Although gene drive alleles exist in nature, they are usually inefficiently propagated such that they are not often rapidly spread throughout a population.
A GENE DRIVE ELEMENT CREATED WITH CRISPR–Cas
For a gene drive allele to function, it must first recognize the homologous wild-type allele and copy itself into that location. The insertion of a copy requires both the creation of a double-strand break in the target DNA and, if the target DNA is the homologous locus, the ability to recognize that sequence. These faculties, while rarely found in nature, can be engineered using the CRISPR–Cas9 genome editing complex; the Cas9 protein harbors the endonuclease activity and a guideRNA provides the required sequence specificity (Figure 15.23b).
To examine how this works in practice, envision a target locus in the genome for which you design a complementary guideRNA. A vector is constructed in which the Cas9 gene and your guideRNA gene are placed in tandem, and flanking the two genes is included sequence identical to the genome sequence flanking your target site (1). When this construct is introduced into a cell, the CRISPR–Cas9 complex will cut the genomic target, creating a double-strand break (2). The double-strand break can be repaired by homologous recombination using the DNA construct that was introduced, creating an allele in which the Cas9 and guideRNA genes are inserted into the genomic target site (3). This allele is a gene drive because it has the capacity to convert the homologous allele on the second chromosome into a drive allele in a similar manner (4). The Cas9 and guideRNA genes from the first allele will produce a CRISPR–Cas9 complex that can induce double-strand breaks in the second allele (5) that can be repaired via homologous recombination using the first allele as a template (6). The end result is a homozygous individual in which both alleles are now gene drive alleles.
If the Cas9 and guideRNA genes are driven constitutively, alleles in all somatic and germ-line cells can be converted into gene drive alleles. If an individual with a gene drive allele is crossed with a wild-type individual, the gene drive allele has the capacity to convert the allele inherited from the wild-type parent into a gene drive allele (Figure 15.23a) in the same manner as the homologous chromosome was in Figure 15.23b. Thus, the gene drive allele has the potential to spread throughout an interbreeding population. The speed and extent of spread is dictated by the efficacy of the gene drive allele at converting homologous alleles and by the nature of the breeding population.
Note that each time an allele is converted to a gene drive allele, it also encodes both the Cas9 and guideRNA genes, because they are located between the regions of genomic homology. The original construct could be modified to include additional genes as well, often referred to as cargo genes (Figure 15.23c), and as the gene drive allele spreads through a population, the cargo genes would also be disseminated. If either the gene drive allele itself or a cargo gene confers a phenotype, this will also be propagated throughout the population.
APPLICATIONS Among the potential applications of gene drives, the most commonly mentioned are to control the spread of vector-borne diseases, to suppress populations of agricultural pest species, and to reduce populations of environmentally destructive invasive species. Proof of principle has been obtained in two approaches to controlling the spread of mosquito-borne malarial parasites. Cargo genes encoding anti-Plasmodium falciparum (the Apicomplexan malarial parasite) effector proteins were disseminated in one approach. The other approach was aimed at spreading recessive loss-of-function alleles for three genes to produce female sterility. Both approaches led to rapid spread of the desired alleles in laboratory populations of the Anopheles mosquitos, the hosts for P. falciparum.
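The contrast between Mendelian inheritance and the biased inheritance of a converting gene drive can be made concrete with a small numerical sketch. The model below is a simplification and not taken from the text: it assumes an infinite, randomly mating population, no fitness cost, no resistant alleles, and a single hypothetical parameter `conversion`, the fraction of wild-type alleles in heterozygotes that are converted to drive alleles in the germ line. Under these assumptions a heterozygote transmits the drive allele in (1 + conversion)/2 of its gametes.

```python
# Deterministic sketch of drive-allele frequency under random mating.
# Assumptions (not from the text): infinite population, no fitness cost,
# no resistance alleles, conversion occurs only in heterozygotes.


def next_drive_frequency(q: float, conversion: float) -> float:
    """Return the drive-allele frequency after one generation.

    q          -- current frequency of the drive allele (0 to 1)
    conversion -- fraction of wild-type alleles converted in heterozygotes
                  (0 gives ordinary Mendelian inheritance)
    """
    homozygous_drive = q * q
    heterozygous = 2 * q * (1 - q)
    # Heterozygotes transmit the drive allele in (1 + conversion) / 2 of gametes.
    return homozygous_drive + heterozygous * (1 + conversion) / 2


def generations_to_reach(q0, conversion, threshold=0.99, max_generations=1000):
    """Return the first generation at which frequency >= threshold, else None."""
    q = q0
    for generation in range(max_generations + 1):
        if q >= threshold:
            return generation
        q = next_drive_frequency(q, conversion)
    return None


print(generations_to_reach(0.01, 1.0))  # perfect conversion
print(generations_to_reach(0.01, 0.0))  # Mendelian inheritance
```

With a starting frequency of 1% and perfect conversion, this model reaches 99% in 9 generations, whereas with no conversion the frequency never changes and the function returns `None`. Real populations depart from every one of the stated assumptions, so the numbers indicate only the qualitative behavior.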
[Figure 15.23 panel labels: genomic DNA with target site; numbered steps ending in homologous recombination; panel (c): construct carrying Cargo, Cas9, and GuideRNA genes.]
Figure 15.23 How gene drive alleles can spread through populations.
Q How would the efficiency of gene drive be affected if the Cas9 gene were located on a different chromosome than the guideRNA
target locus?
Gene drives for reducing populations of invasive species or agricultural pests could utilize various strategies. For example, the gene drive allele could be targeted to an essential gene, one in which phenotypic defects are minimal in heterozygotes but are severely deleterious in homozygotes. If the drive allele is only active (e.g., Cas9 is expressed) during meiosis, then heterozygous animals will be phenotypically normal but would pass the drive allele to a high proportion of their gametes. This pattern of inheritance would eventually lead to spread of the allele and a collapse of the population. Alternatively, in an organism with an XY sex chromosome system often found in animals, gene drives could selectively target destruction of a sex chromosome. For example, if a gene drive allele that targets sequences on the X chromosome leading to X-chromosome destruction is located on the Y chromosome, and its expression limited to spermatogenesis, it would target the destruction of the X chromosome in gametes. Thus, the only viable gametes produced would be ones harboring the Y chromosome, bringing about a reduction in viable females and eventually a population crash.
CONCERNS Given the potential of gene drive alleles to affect entire populations of organisms, with ripple effects spreading through entire ecosystems, there is great concern about how, or if, to deploy such systems as biological control agents. Because of this concern, the U.S. National Academy of Sciences convened a meeting to discuss both the potential applications of gene drive alleles and the containment protocols that must be in place when they are used, even in laboratory settings.
Containment would be needed at both the molecular and ecological levels. For example, a required molecular control would be to separate the Cas9 gene from the gene drive allele so that the gene containing the guideRNA and target site locus would be apart from (not linked to) and able to segregate from the Cas9 gene. Because the guideRNA would not be able to act as a gene drive allele without a supply of Cas9, the spread through a population would be greatly reduced and eventually extinguished. More sophisticated multicomponent systems are being tested to examine whether they would act through only a fixed number of generations and thereby perform as transient gene drive systems for local population control.
These new scientific possibilities raise some unprecedented ethical issues. From the earliest days in the development of recombinant DNA technologies, the potential ethical problems and possible environmental and other concerns have been the subject of intense debate. In 1975, following an initial self-imposed moratorium, scientists met at Asilomar Conference Grounds, in California, to draw up a set of guidelines addressing many of the safety concerns. Potential ethical problems raised by gene drive technology will need to be addressed by similar public debates.
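The Y-linked strategy described earlier in this case study, in which a drive allele on the Y chromosome destroys the X chromosome during spermatogenesis, can also be sketched numerically. The model below is a rough illustration under assumptions that are not in the text: every male fathers the same number of offspring, the loss of X-bearing sperm does not reduce a male's fertility, and a hypothetical parameter `y_bias` (between 0.5 and 1) gives the fraction of a drive male's functional sperm that carry the Y chromosome.

```python
# Sketch of a Y-linked drive that destroys X-bearing sperm.
# Assumptions (not from the text): equal mating success for all males,
# no fertility cost, y_bias between 0.5 (Mendelian) and 1.0 (all Y sperm).


def x_shredder_generation(p: float, y_bias: float):
    """Advance one generation.

    p      -- fraction of males carrying the driving Y chromosome
    y_bias -- fraction of a drive male's functional sperm that carry Y

    Returns (fraction of sons carrying the driving Y,
             fraction of all offspring that are female).
    """
    sons_from_drive = p * y_bias          # sons always inherit the father's Y
    sons_from_normal = (1 - p) * 0.5
    daughters = p * (1 - y_bias) + (1 - p) * 0.5
    sons = sons_from_drive + sons_from_normal
    return sons_from_drive / sons, daughters / (sons + daughters)


# Hypothetical starting point: 10% of males carry a drive with 95% Y sperm.
p = 0.10
for generation in range(1, 11):
    p, female_fraction = x_shredder_generation(p, y_bias=0.95)
    print(generation, round(p, 3), round(female_fraction, 3))
```

In this model the driving Y increases among males each generation while the proportion of female offspring falls, which is the qualitative behavior the text describes as leading to a population crash. The sketch tracks proportions only and does not model population size.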
SUMMARY
Mastering Genetics: For activities, animations, and review quizzes, go to the Study Area.
15.1 Specific DNA Sequences Are Identified and Manipulated Using Recombinant DNA Technology
❚❚ Restriction enzymes, which cut at specific DNA sequences, are used to fragment large DNA molecules into defined smaller pieces.
❚❚ A restriction map of a DNA molecule can be constructed by analyzing patterns of DNA fragments after restriction enzyme digestion.
❚❚ DNA fragments can be ligated to create recombinant DNA molecules, usually composed of a vector that can be amplified in a biological system and a target DNA insert to be amplified.
❚❚ Although cohesive compatible ends facilitate the creation of recombinant DNA molecules, any two DNA fragments can be ligated if their ends are made blunt.
❚❚ Amplification of recombinant DNA molecules in a biological system allows the production of DNA clones.
❚❚ Bacterial artificial chromosomes allow the cloning of large DNA molecules.
❚❚ Genomic libraries are collections of cloned DNA fragments that represent the entire genome of an organism.
❚❚ cDNA libraries are collections of cloned DNA fragments that represent the mRNA population of an organism or tissue.
15.2 Introducing Foreign Genes into Genomes Creates Transgenic Organisms
❚❚ Genes introduced into an organism are called transgenes. Genes introduced from another species are termed heterologous transgenes.
❚❚ Transgenes can be introduced into microbes by homologous recombination into the chromosome.
❚❚ Agrobacterium and its tumor-inducing plasmid can be harnessed to create transgenic plants in which the transfer DNA carries the desired transgene.
❚❚ Transgenic Drosophila are created by injection into embryos of a P element transposon carrying the transgene.
❚❚ Transgenes are introduced into mice by direct injection of DNA into isolated cells. The injected DNA can be integrated either by homologous recombination or using CRISPR–Cas9–induced DNA breaks.
❚❚ Differentiated mammalian cells can be converted into pluripotent iPS cells by the activity of the Yamanaka transcription factors.
❚❚ Bacteriophage recombination systems can be used to manipulate DNA sequences in vitro and transgenes in vivo.
15.3 Gene Therapy Uses Recombinant DNA Technology
❚❚ Gene therapy is the application of recombinant DNA technology and transgenesis to treat human diseases.
❚❚ In somatic gene therapy, transgenes are targeted to somatic cells and are not heritable. In germinal gene therapy, transgenes are targeted to germ cells and are thus heritable.
❚❚ Recent approaches to gene therapy involve genome editing using CRISPR–Cas9.
15.4 Cloning of Plants and Animals Produces Genetically Identical Individuals
❚❚ Many plants reproduce clonally in nature, whereas clonal reproduction in animals is rare.
❚❚ Clonal reproduction in mammals requires reprogramming of differentiated somatic cells into stem cells.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Be familiar with the basic techniques of recombinant DNA technology. Understand how DNA molecules are manipulated in vitro and how clones are propagated in bacterial hosts.
2. Know the similarities and differences between genomic and cDNA libraries.
3. Know the different techniques by which exogenous DNA (e.g., transgenes) are introduced into different organisms.
4. Know the approaches to somatic gene therapy using CRISPR–Cas9.
5. Recognize how the ways plants can be cloned differ from the ways animals can be cloned.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. What purpose do the bla and lacZ genes serve in the plasmid vector pUC18?
2. The human genome is 3 × 10^9 bp in length.
a. How many fragments would be predicted to result from the complete digestion of the human genome with the following enzymes: Sau3A (^GATC), BamHI (G^GATCC), EcoRI (G^AATTC), and NotI (GC^GGCCGC)?
b. How would your initial answer change if you knew that the average GC content of the human genome was 40%?
3. Ligase catalyzes a reaction between the 5′ phosphate and the 3′ hydroxyl groups at the ends of DNA molecules. The enzyme calf intestinal phosphatase catalyzes the removal of the 5′ phosphate from DNA molecules. What would be the consequence of treating a cloning vector, before ligation, with calf intestinal phosphatase?
4. You have constructed four different libraries: a genomic library made from DNA isolated from human brain tissue, a genomic library made from DNA isolated from human muscle tissue, a human brain cDNA library, and a human muscle cDNA library.
a. Which of these would have the greatest diversity of sequences?
b. Would the sequences contained in each library be expected to overlap completely, partially, or not at all with the sequences present in each of the other libraries?
5. Using the genomic libraries in Problem 4, you wish to clone the human gene encoding myostatin, which is expressed only in muscle cells.
a. Assuming the human genome is 3 × 10^9 bp and that the average insert size in the genomic libraries is 100 kb, how frequently will a clone representing myostatin be found in the genomic library made from muscle?
b. How frequently will a clone representing myostatin be found in the genomic library made from brain?
c. How frequently will a clone representing myostatin be found in the cDNA library made from muscle?
d. How frequently will a clone representing myostatin be found in the cDNA library made from brain?
6. The human genome is 3 × 10^9 bp. You wish to design a primer to amplify a specific gene in the genome. In general, what length of oligonucleotide would be sufficient to amplify a single unique sequence? To simplify your calculation, assume that all bases occur with an equal frequency.
7. Using animal models of human diseases can lead to insights into the cellular and genetic bases of the diseases. Duchenne muscular dystrophy (DMD) is the consequence of an X-linked recessive allele.
a. How would you make a mouse model of DMD?
b. How would you make a Drosophila model of DMD?
8. Compare methods for constructing homologous recombinant transgenic mice and yeast.
9. Chimeric gene-fusion products can be used for medical or industrial purposes. One idea is to produce biological therapeutics for human medical use in animals from which the products can be easily harvested, in the milk of sheep or cattle, for example. Outline how you would produce human insulin in the milk of sheep.
10. Why are diseases of the blood simpler targets for treatment by gene therapy than are many other genetic diseases?
11. Injection of double-stranded RNA can lead to gene silencing by degradation of RNA molecules complementary to either strand of the dsRNA. Could RNAi (see Sections 13.3 and 14.3) be used in gene therapy for a defect caused by a recessive allele? A dominant allele? If so, what might be the major obstacle to using RNAi as a therapeutic agent?
12. Compare and contrast methods for making transgenic plants and transgenic Drosophila.
13. It is often desirable to insert cDNAs into a cloning vector in such a way that all the cDNA clones will have the same orientation with respect to the sequences of the plasmid. This is referred to as directional cloning. Outline how you would directionally clone a cDNA library in the plasmid vector pUC18.
14. A major advance in the 1980s was the development of technology to synthesize short oligonucleotides. This work both facilitated DNA sequencing and led to the advent of the development of PCR. Recently, rapid advances have occurred in the technology to chemically synthesize DNA, and sequences up to 10 kb are now readily produced. As this process becomes more economical, how will it affect the gene-cloning approaches outlined in this chapter? In other words, what types of techniques does this new technology have potential to supplant, and what techniques will not be affected by it?
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
15. The bacteriophage lambda genome can exist in either a linear form (see Figures 15.1 and 15.8) or a circular form.
a. How many fragments will be formed by restriction enzyme digestion with XhoI alone, with XbaI alone, and with both XhoI and XbaI in the linear and circular forms of the lambda genome?
b. Diagram the resulting fragments as they would appear on an agarose gel after electrophoresis.
16. The restriction enzymes XhoI and SalI cut their specific sequences as shown below (the gap marks the cut position on each strand):
XhoI  5′-C TCGAG-3′
      3′-GAGCT C-5′
SalI  5′-G TCGAC-3′
      3′-CAGCT G-5′
Can the sticky ends created by XhoI and SalI sites be ligated? If yes, can the resulting sequences be cleaved by either XhoI or SalI?
17. The bacteriophage φX174 has a single-stranded DNA genome of 5386 bases. During DNA replication, double-stranded forms of the genome are generated. In an effort to create a restriction map of φX174, you digest the double-stranded form of the genome with several restriction enzymes and obtain the following results. Draw a map of the φX174 genome.
Enzyme(s)      Fragment sizes (bp)
PstI           5386
PsiI           5386
DraI           4307, 1079
PstI + PsiI    3078, 2308
PstI + DraI    331, 1079, 3976
PsiI + DraI    898, 1079, 3409
18. To further analyze the CRABS CLAW gene (see Problems 19 and 20), you create a map of the genomic clone. The 11-kb EcoRI fragment is ligated into the EcoRI site of the MCS of the vector shown in Problem 20. You digest the double-stranded form of the genome with several restriction enzymes and obtain the following results. Draw, as far as possible, a map of the genomic clone of CRABS CLAW.
Enzyme(s)          Fragment sizes (kb)
EcoRI              11.0, 3.0
XbaI               4.5, 9.5
XhoI               13.2, 0.8
SalI               6.0, 8.0
HindIII            12.0, 1.5, 0.5
EcoRI + XbaI       4.5, 6.5, 3.0
EcoRI + XhoI       10.2, 3.0, 0.8
EcoRI + SalI       6.0, 5.0, 3.0
EcoRI + HindIII    9.0, 3.0, 1.5, 0.5
What restriction digest would help resolve any ambiguity in the map?
19. You have isolated a genomic clone with an EcoRI fragment of 11 kb that encompasses the CRABS CLAW gene (see Problem 18). You digest the genomic clone with HindIII and note that the 11-kb EcoRI fragment is split into three fragments of 9 kb, 1.5 kb, and 0.5 kb.
a. Does this tell you anything about where the CRABS CLAW gene is located within the 11-kb genomic clone?
b. Restriction enzyme sites within a cDNA clone are often also found in the genomic sequence. Can you think of a reason why occasionally this is not the case? What about the converse: Are restriction enzyme sites in a genomic clone always in a cDNA clone of the same gene?
20. You have identified a 0.80-kb cDNA clone that contains the entire coding sequence of the Arabidopsis gene CRABS CLAW. In the construction of the cDNA library, linkers with EcoRI sites were added to each end of the cDNA, and the cDNA was inserted into the EcoRI site of the MCS of the vector shown in the accompanying figure. You perform digests on the CRABS CLAW cDNA clone with restriction enzymes and obtain the following results. Can you determine the orientation of the cDNA clone with respect to the restriction enzyme sites in the vector? The restriction enzyme sites listed in the dark blue region are found only in the MCS of the vector.
Enzyme(s)          Fragment sizes (kb)
EcoRI              0.8, 3.0
HindIII            0.3, 3.5
EcoRI + HindIII    0.3, 0.5, 3.0
ori
AmpR lacZ
T7
MCS
2961 bp
T3
T7 sequencing primer
5′ G TAA AAC GAC GGC CAG TGA ATT GTA ATA CGA CTC ACT ATA GGG CGA ATT
3′ C ATT TTG CTG CCG GTC ACT TAA CAT TAT GCT GAG TGA TAT CCC GCT TAA
GGA GCT CCA CCG CGG TGG CGG CCG CTC TAG AAC TAG TGG ATC CCC CGG GCT GCA GGA ATT CGA TAT CAA GCT TAT CGA TAC CGT CGA CCT CGA GGG GGG GCC CGG TAC CCA
CCT CGA GGT GGC GCC ACC GCC GGC GAG ATC TTG ATC ACC TAG GGG GCC CGA CGT CCT TAA GCT ATA GTT CGA ATA GCT ATG GCA GCT GGA GCT CCC CCC CGG GCC ATG GGT
GCT TTT GTT CCC TTT AGT GAG GGT TAA TTG CGC GCT TGG CGT AAT CAT GGT CAT AGC TGT TTC CTG 3′
CGA AAA CAA GGG AAA TCA CTC CCA ATT AAC GCG CGA ACC GCA TTA GTA CCA GTA TCG ACA AAG GAC 5′
T3 sequencing primer
differences in their retention by binding to gut proteins during digestion. The one retained at the highest level is α-tocopherol, whereas γ-tocopherol is retained at less than 10% of that efficiency. In Arabidopsis, α-tocopherol is the most abundant tocopherol in leaves, whereas γ-tocopherol is the most abundant in seeds. An enzyme encoded by the VTE4 gene can convert γ-tocopherol to α-tocopherol. How would you create an Arabidopsis plant that produces high levels of α-tocopherol in the seeds?
30. The RAS gene encodes a signaling protein that hydrolyzes GTP to GDP. When bound by GDP, the RAS protein is inactive, whereas when bound by GTP, RAS protein activates a target protein, resulting in stimulation of cells to actively grow and divide. As shown in the accompanying sequence, a single base-pair mutation results in a mutant protein that is constitutively active, leading to continual promotion of cell proliferation. Such mutations play a role in the formation of cancer. You have cloned the wild-type version of the mouse RAS gene and wish to create a mutant form to study its biological activity in vitro and in transgenic mice. Outline how you would proceed.
Gly Ala Gly Gly Val Gly
Wild-type RAS DNA: 5′. . . GGC GCC GGC GGT GTG GGC . . .3′
T
Mutant RAS DNA: GTC
Val
31. You have cloned a gene for an enzyme that degrades lipids in a bacterium that normally lives in cold temperatures. You wish to transfer this gene into E. coli to produce industrial amounts of enzyme for use in laundry detergent.
a. How would you accomplish this?
b. You have managed to produce transgenic E. coli expressing mRNA of your gene, but only a low level of protein is produced. Why might this be so? How could you overcome this problem?
Collaboration and Discussion For answers to selected even-numbered problems, see Appendix: Answers.
32. About 1% of occurrences of nonautoimmune type I diabetes are due to loss-of-function alleles in the insulin gene. Individuals heterozygous for such mutations develop diabetes as infants or in the first few years of their lives. Outline how you might approach gene therapy for such a disease and what difficulties you might encounter.
33. Describe how having the Cas9 gene at a genomic locus unlinked to the guideRNA and target site locus in an engineered gene drive system (see Figure 15.23) could slow the propagation of the gene drive allele in a population into which a small number of individuals carrying both the gene drive allele and the Cas9 locus are released.
34. Would a gene drive system spread rapidly through a population in a species that tends to self-mate (e.g., Arabidopsis, C. elegans)? In a species in which the breeding cycle is slow (e.g., humans)?
16 Genomics: Genetics from a Whole-Genome Perspective
CHAPTER OUTLINE
16.1 Structural Genomics Provides a Catalog of Genes in a Genome
16.2 Annotation Ascribes Biological Function to DNA Sequences
16.3 Evolutionary Genomics Traces the History of Genomes
16.4 Functional Genomics Aims to Elucidate Gene Function
The sequencing of entire genomes of many species from Charles Darwin’s “tangled bank” has clarified evolutionary relationships of life on Earth and provided the genetic blueprints that define organisms, although the precise functions of most genes are presently unknown.
ESSENTIAL IDEAS
❚❚ The goal of sequencing the human genome stimulated technological advances that enabled its realization. In addition to the human genome, researchers have now sequenced the genomes of hundreds of bacteria and archaea and scores of eukaryotes.
❚❚ The evolutionary history of a species is written in its genome and can be read both from its gene content and its chromosome architecture.
❚❚ Genome-wide analyses of gene expression, protein–protein interactions, protein–DNA interactions, and genetic interactions provide insights into the biological functions of the genes.
Genomics, the scientific study of biological processes from the perspective of the whole genome, originated in the Human Genome Project (HGP). This audacious project was initiated in the 1980s to sequence and analyze the human genome. At the time, neither the technologies for generating large amounts of DNA sequence nor the computing power to analyze such large amounts of data existed.
Although a primary goal of the HGP was to sequence the human genome, several model genetic organisms were also sequenced under its auspices, including those that have appeared most often in the pages of this book: Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila
melanogaster, Arabidopsis thaliana, and Mus musculus. The genome sequences of these model organisms have contributed to our understanding of the organisms themselves as well as to interpretations of human genome structure, function, and evolution. Since then, the genomes of thousands of other bacteria, hundreds of other eukaryotes, and many archaea have also been sequenced. Due to ever-decreasing costs and ever-improving technologies, genome sequencing is now so affordable and routine that it is becoming part of your medical record. In the future, species may be defined by characteristics of their genomic sequence.
In the initial analyses of the genomes of model organisms, two findings stand out. First, even in well-studied organisms, only a fraction of genes identified by genome sequencing had been previously identified by forward genetic analysis; this brings up the question of the function of all the previously unknown genes. Second, genomic analyses have also revealed the highly dynamic nature of genomes, providing insights into the extent of differences between individuals of a species and between species, and also into the rates at which DNA sequences evolve.
This chapter provides an overview of genomics by describing three of its major subdivisions. Structural genomics is concerned with the sequencing of whole genomes and the cataloging, or annotation, of sequences within a given genome. It provides a parts list of the genetic tool kit of an organism. Evolutionary genomics is the comparison of genomes, both within and between species. It illuminates the genetic bases of similarities and differences between individuals or species. Functional genomics uses genomic sequences to understand gene function in an organism. Together, these three approaches contribute to the ultimate goal of understanding the role of every gene a given genome contains.
16.1 Structural Genomics Provides a Catalog of Genes in a Genome
Genomes vary enormously in size, from several hundred kilobases in some bacterial species to several thousand megabases in some vertebrate and plant species (Table 16.1). Genomes may consist of a single DNA molecule, as in many bacterial and archaeal species, or of hundreds of chromosomes, as in some eukaryotic species. From a broad perspective, gene number generally increases with organismal complexity. However, genomes also vary in their proportions of coding versus noncoding DNA sequences, and in multicellular eukaryotes, genome size can increase much more than gene number due to a disproportionate increase in noncoding DNA.
Ideally, one would start sequencing a genome from one end of each chromosome and proceed to the other end. In reality, this ideal is not yet possible. Even the smallest bacterial genomes are thousands of times longer than the 600 to 900 bp that can be sequenced in a traditional single dideoxy sequencing reaction, and longer than the “sequence reads” (sequenced DNA fragments) that can be generated with third-generation sequencing (see Chapter 7). Clearly, to sequence any genome would require many iterations of these procedures.
There are two basic strategies for sequencing large DNA molecules. The first technique, primer walking (Figure 16.1a), relies on the successive synthesis of primers based on the progressive attainment of new sequence information. The DNA sequence information obtained in the first dideoxy sequencing reaction provides a foundation for the design of a second primer. If the second primer is 600 to 800 bases from the first primer, the second dideoxy sequencing reaction can extend the known sequence up to 1800 bases from the first primer. Reiterations of this process allow technicians to “walk” along a long DNA molecule, designing new primers every 600 to 800 bases. The speed with which a molecule is sequenced by this method is limited by its reiterative nature.
A second method for sequencing large molecules of DNA is shotgun sequencing, an approach that relies on redundant sequencing of fragmented target DNA in the hope that all regions will be sequenced at least a few times. In this technique, a large DNA molecule (e.g., a BAC clone of 100 kb or an entire genome) is fragmented into smaller pieces (Figure 16.1b). The fragments may be generated by partial restriction enzyme digestion or by shearing the DNA. The key here is that fragmentation is done in such a way as to produce random and hence overlapping pieces of the original molecule. The ends of these fragments can then be sequenced using a primer based on vector sequences if the fragments are ligated into cloning vectors, or based on the added linker sequence if a next-generation sequencing approach is being used (see Figure 7.31). The collection of fragments can be considered a library of sequences from the larger DNA molecule. The strategy is to sequence enough fragments to assemble a complete contiguous sequence on the basis of overlaps in the generated sequences. Computer algorithms are available to perform much of this task, allowing data from millions of sequencing reactions to be assembled quickly. Thus, in shotgun sequencing, the sequencing of the many different fragments proceeds simultaneously (“in parallel”), allowing long DNA molecules to be sequenced rapidly.
Clearly the more efficient way to sequence DNA molecules (i.e., chromosomes) millions of bases in length is to employ a shotgun sequencing strategy, breaking the long DNA fragments into smaller ones and sequencing the fragments in parallel. Computer algorithms are then used to assemble the sequences of the fragments into a single contiguous sequence (contig). Two basic approaches to this general mode of attack differ only in the starting DNA to be fragmented and sequenced. In one approach, called whole-genome shotgun (WGS) sequencing, DNA representing the entire genome is fragmented into smaller pieces, and a large number of fragments are chosen at random and sequenced.
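The idea of assembling overlapping shotgun reads into a contig can be illustrated with a small sketch. The Python below is only a toy, written under stated assumptions: the function names `overlap` and `greedy_assemble`, the 30-base example sequence, and the 12-base reads are all hypothetical and chosen for illustration. It uses a simple greedy merge of the best suffix-to-prefix overlap, which is one textbook-style strategy and not necessarily what any production assembler does. It also assumes error-free reads from a single strand and a target without repeats.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=5):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumptions: reads are error-free, all from the same strand,
    and the target contains no repeats longer than min_len.
    """
    # Remove duplicate reads (order preserved) and reads contained in others.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No remaining overlaps: what is left are separate contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for idx, c in enumerate(contigs)
                if idx not in (best_i, best_j) and c not in merged]
        contigs = rest + [merged]


# Hypothetical 30-base target and four overlapping 12-base "reads",
# supplied in scrambled order as they would come off a sequencer.
genome = "ATGGCGTACCTTAGACGGATTCAAGTCCGA"
reads = [genome[start:start + 12] for start in (12, 0, 18, 6)]
contigs = greedy_assemble(reads)
print(contigs)
```

In this toy case the reads tile the target with 6-base overlaps, so the merges collapse them into one contig. With real data, sequencing errors, both strands, and repeated sequences all complicate the picture, which is why redundant sampling and more elaborate algorithms are needed.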
Figure 16.1 Primer walking versus shotgun sequencing approaches. (a) Sequencing by primer walking (3-kb clone shown): 1 Primers (gray), initially based on vector sequences (orange), allow ends of clone to be sequenced from both sides. 2 New primers (gray) are designed based on newly obtained sequence (red). 3 Procedure is reiterated until sequence from both ends overlaps. (b) Shotgun sequencing (100-kb DNA shown): 1 Fragment into smaller lengths (~2–3 kb) and clone using plasmid vectors, producing a library of clones (purple) from the DNA. 2 Sequence ends of clones (red). Each portion of the DNA should be sequenced >20 independent times to facilitate assembly. 3 Assemble sequences into contiguous sequences (green) by computer. 4 Use PCR (with primers based on flanking sequences) to close remaining gaps.
In the second approach, often called clone-by-clone sequencing, each chromosome is first broken into overlapping clones that are then arranged in linear order to produce a physical map of the genome. Each clone in the map is then sequenced separately. The WGS approach is applicable to any genome and is the approach in widespread use today. The clone-by-clone approach, which has been supplanted by the WGS approach, relies on the availability of specific genetic resources (such as genetic and physical maps) and thus was applicable only to some model organisms.
Whole-Genome Shotgun Sequencing
The WGS approach sequences genomic DNA by the shotgun method without prior construction of a physical map. For this reason, WGS can be applied to any genome. Once the genomic DNA is broken into fragments and sequenced, the sequences are assembled into contigs based on sequence overlaps (Figure 16.1b). To ensure enough overlapping of sequences for this purpose, technicians commonly generate sequences totaling approximately 30 to 40 times the actual length of the genome (this degree of overlap is called 30–40× coverage); thus, any one sequence occurs in multiple reads, minimizing the chance of sequencing errors. The ease with which sequences are assembled into contigs depends on the lengths of the sequencing reads, and these vary between technologies (see Section 7.5). Prior to the development of third-generation sequencing technologies, sequence reads were limited to less than 1000 bp in length, but now reads many kilobases in length may be utilized in WGS approaches.
Repetitive DNA presents an obstacle in the assembly of WGS sequencing data. Dispersed repetitive DNA sequences (for example, transposons and retrotransposons) interfere with genome assembly, as explained in Figure 16.2, because they can map to multiple locations within the genome. Consequently, the assembled sequence often remains broken at repetitive sequences. One way of circumventing this problem is to use paired-end sequence data to bridge the gaps. In paired-end sequencing, sequence is generated from both ends of genomic DNA fragments of known size. The paired-end sequences, some of which are on the ends of fragments containing a repetitive element, can then be used to assemble the fragments into a scaffold, a set of contigs that are physically linked by paired-end sequences. The relative orientations of paired-end sequences and their distance from one another can be incorporated into assembly algorithms to construct the scaffold and ultimately show the locations of repetitive elements. Despite the high rate of errors with third-generation sequencing technologies, the use of their long reads to facilitate assembly of contigs into scaffolds is becoming commonplace.
Let’s examine how scaffold assembly works. Typically, several genomic libraries are generated, each containing cloned DNA fragments of a different size (Figure 16.3): for example, one library of 2- to 3-kb clones, a second of 6- to 8-kb clones, and a third of larger clones (20 to 30 or more kilobases). Paired-end sequence data generated from clones in the different libraries provide information on whether two particular sequences are physically linked and the approximate distance between the two sequences. Even if repetitive DNA occurs between the paired-end sequences, they can still be linked into a scaffold. Dispersed repetitive DNA in the genome often consists either of simple, short repeats
[Figure residue for the repeat-assembly, scaffold-assembly, and H. influenzae diagrams; images not reproduced. Recoverable labels follow.]
Repeats and assembly: sequences shown as unique regions A, B, and C separated by repeats. 1 Fragment DNA and shotgun sequence. 2 Generate paired-end sequence reads. 3 Assemble contigs.
Scaffold assembly (80-kb sequence): 1 Construct three libraries of different sizes (6–8 kb and 20–30 kb labeled; clones X and Y span repeats).
Shotgun sequencing of the H. influenzae genome (1.8 × 10^6 bp): 1 Construct three genomic libraries. 2 Generate 6× paired-end sequence and 1× paired-end sequence (11.6 × 10^6 bp of sequence). 3 Assemble into contigs (42 physical gaps, 98 sequence gaps). Overlap of λ clones used to check and confirm assembly. 5 Identify λ clone spanning physical gap using scaffold end sequences as probes. 5 Close sequence gap by
Circular genome map: restriction enzyme sites (SmaI, NotI, RsrII), base-pair coordinates, rRNA and tRNA genes, and origin of replication. Each line in the outer circle represents a gene, with the color indicating predicted biological function: amino acid biosynthesis; biosynthesis of cofactors, prosthetic groups, carriers; cell envelope; central intermediary metabolism; energy metabolism; purine, pyrimidines, nucleosides, nucleotides; regulatory functions.
approaches were combined to close the physical gaps. First, the lambda genomic libraries were probed with sequences derived from the ends of the scaffolds: If a single genomic clone hybridized with ends of two scaffolds, the clone should span the gap between the two scaffolds. Second, polymerase chain reaction (PCR) methodology, using combinations of primers specific to sequences at the ends of scaffolds, was employed to amplify spanning sequences. With this combination of approaches, the entire 1,830,137-bp sequence of the H. influenzae genome was assembled into a single contig (Figure 16.4b).
WGS Sequencing of a Eukaryotic Genome The genome of Drosophila was the first large eukaryotic genome containing a significant fraction of repetitive DNA to be sequenced using a WGS approach. The Drosophila genome is approximately 170 Mb, of which 120 Mb is considered to be euchromatic and the remaining 50 Mb heterochromatic. Because centromeric heterochromatic DNA is not efficiently cloned, owing to its highly repetitive nature, only the euchromatic portion of the genome was initially sequenced, using the Sanger sequencing method (see Section 7.5).
Paired-end sequencing was accomplished using three genomic libraries, of 2 kb, 10 kb, and 130 kb (Figure 16.5). The 10-kb clones were large enough to span most of the dispersed repetitive elements (such as transposons and retrotransposons) found in the Drosophila genome, whereas the 130-kb clones provided long-range linking information
sequence of the individual or individuals used to construct the initial complete genome sequence is called the reference genome sequence. Once a reference genome sequence is constructed, polymorphisms in the species can be identified by comparing the reference genome sequence with the genome sequences of different strains collected from different populations derived from the wild. This allows the reference genome sequence to be refined and enhanced to reflect genetic variation not displayed in the originally sequenced genome. The use of next-generation sequencing technologies, as well as the use of the reference genome sequence to expedite the assembly of WGS sequence data from each new subject, makes such “resequencing” of genomes relatively inexpensive. Thus, thousands of human genome sequences have now been produced and used to augment and improve understanding of the reference human genome sequence.
Genetic variation ranges from differences in the identity of a single nucleotide, that is, single nucleotide polymorphisms (SNPs), to larger-scale structural variations, such as insertions and deletions (collectively called indels) and inversions. These indels and inversions are in turn collectively called “structural variants,” and their prevalence was unknown until large-scale sequencing studies brought them to light, because they are too small to be detectable by karyotype analysis. A specific type of indel, called a copy-number variant (CNV), is a repeated section of genome where each repeat is greater than 1 kb in length (Figure 16.6a). Although many CNVs are small, some are hundreds of kilobases long, span several genes, and result in differences in gene dosage. The larger deletions that occur as structural variations are often in chromosomal regions that are present in more than one copy due to previous duplications, suggesting that genes in the deleted segments would have been redundant. A likely origin of indels is the occurrence of unequal crossing over after mispairing during meiosis via misalignment of repetitive sequences (Figure 16.6b). An unexpected observation from sequencing multiple genomes from a single species is that individuals can vary substantially in gene content, including genes present in some individuals but not in others, due to CNVs. The pangenome is the entire set of genes present in a species, with the core genome being genes present in all individuals and the variable genome composed of genes present in only some individuals.
Figure 16.6 Copy-number variants. (a) Relationship between size of DNA polymorphisms and their frequency (axes: size of sequence variant, from 1 bp through 1 kb and 1 Mb to 1 chromosome, versus frequency of variation; copy-number variants, trisomy, and monosomy are labeled). (b) CNVs can be formed during meiosis by unequal crossing over mediated by repetitive DNA: unequal crossing over between repeats 1 and 2 flanking segment b of a chromosome a b c yields a duplication of b (a b b c) and a deletion of b (a c).
Q Explain how CNVs can change in length.
Studies analyzing genome sequences of parents and their offspring indicate that 8–25 kb of CNV variation accumulates due to mutation in each individual’s germ cells in each generation. Likewise, studies analyzing genome sequences of parents and their offspring indicate that SNP variation accumulates due to mutation at the rate of about 30 to 50 new SNPs in each individual’s germ cells in each generation. This is a rate of about 1 change in every 10^8 bp, a figure remarkably similar to that observed in similar experiments in the flowering plant Arabidopsis, suggesting this error rate may be near the limit of DNA replication fidelity. We will explore human genetic variation further in Application Chapter D.
Metagenomics
In both the number of individual organisms and their total mass, microbial populations constitute the majority of life on Earth. However, unlike model genetic organisms, which are convenient for scientists to study, only a small fraction of microbes can be cultivated in the laboratory. How can we begin to understand microbial diversity without being able to grow the necessary range of microorganisms in the lab? One approach is to apply WGS sequencing to DNA isolated from entire natural communities consisting of a range of organisms. The genetic material or data derived from such a sequencing project is called a metagenome.
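Because shotgun reads from a mixed community are drawn from whatever DNA is present in the sample, abundant organisms contribute most of the reads. The short Python sketch below illustrates that arithmetic. It is only a toy under stated assumptions: the function name `expected_coverage`, the three species labels, their abundances, the 2-Mb genome sizes, and the read count and read length are all hypothetical values chosen for illustration, and reads are assumed to be sampled in proportion to each species' share of total DNA.

```python
def expected_coverage(abundance, genome_size_bp, total_reads, read_len_bp):
    """Expected sequencing coverage per species in a mixed sample.

    Assumption: each read is drawn at random from the pooled DNA, so a
    species' share of reads equals its share of the total DNA
    (cell abundance x genome size).
    """
    dna_share = {sp: abundance[sp] * genome_size_bp[sp] for sp in abundance}
    total_dna = sum(dna_share.values())
    coverage = {}
    for sp, dna in dna_share.items():
        reads_for_species = total_reads * dna / total_dna
        # Coverage = total sequenced bases for the species / its genome length.
        coverage[sp] = reads_for_species * read_len_bp / genome_size_bp[sp]
    return coverage


# Hypothetical community: relative cell abundances and genome sizes (bp).
abundance = {"common": 0.90, "intermediate": 0.09, "rare": 0.01}
genome_size_bp = {"common": 2_000_000, "intermediate": 2_000_000, "rare": 2_000_000}

cov = expected_coverage(abundance, genome_size_bp,
                        total_reads=100_000, read_len_bp=800)
for species, depth in cov.items():
    print(f"{species}: about {depth:.1f}x coverage")
```

Under these made-up numbers the most abundant organism is sequenced many times over while the rarest is sampled at less than onefold coverage, so only the former could plausibly be assembled into a complete genome.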
One of the first metagenomics projects provides an example. It was an environmental genomic shotgun sequencing of DNA isolated from microorganisms from the Sargasso Sea, a region of ocean bounded by the Gulf Stream off the southeast coast of the United States. In this study, approximately 265 Mb of sequence was generated and assembled into a large number of contigs, representing an estimated 1800 different genomes. However, none of the estimated 1800 genomes was complete, and many were represented by only one or a few contigs. This situation highlights a complication arising in metagenomic studies: Species in any given environmental sample are not equally represented, and so data from common species are over-weighted relative to those of scarcer ones. Consequently, any complete genome sequences that are produced are likely to belong to very common species, whereas genomes of rare species are represented by only a small number of contigs.
Despite such limitations, metagenomic analyses provide information on species diversity and relative population levels in an environmental setting and also contribute to the identification of gene sequences of organisms living in a particular environment. Such analyses have been applied, for example, to ecological communities living in acidic mine tailings, contaminated groundwater, and drinking-water systems and also to more “natural” (less human-influenced) ecosystems such as soils, oceans, and hot springs. The sequencing of ancient DNA (i.e., DNA from long-dead organisms) can also be considered a metagenomics task, given the inevitable contamination of the ancient sample with microorganisms over the years (often millennia) since the organism of interest was alive.
EXPERIMENTAL INSIGHT 16.1 presents the results of metagenomic analyses of several microbial biomes of humans, including the gut, mouth, and skin, revealing that, collectively, our microbial biomes possess a number of genes comparable to that of our own genome. The same analytical approaches can be applied to any biological system
Our Communities Within and Upon
When we look in the mirror, we like to think we are looking at just ourselves, but the number of microbes within and upon us, primarily bacteria, is about the same as the number of our own cells, though the microbes comprise only about 1 kg of our weight. Perhaps the first to recognize that we are host to our own microbiome was Antonie van Leeuwenhoek, who, scraping “gritty matter” from between his teeth, observed the “animalcules,” or bacteria, in his dental plaque in 1683. Subsequently, bacterial culturing techniques demonstrated that microbes inhabit many parts of our bodies; but as has since been revealed by the application of metagenomic shotgun sequencing, only a small fraction of the microbial diversity in and on our bodies was culturable. Metagenomics has revolutionized our thinking on this topic, leading to the present view that each of us has our own private ecosystems, complete with diverse habitats and ecology.
DIGESTIVE MICROBIOME
Our inner mucosal surfaces (gastrointestinal tract and mouth) and skin are dominated by four phyla of bacteria: Actinobacteria, Firmicutes, Bacteroidetes, and Proteobacteria. It is becoming apparent that the makeup of our gut microbial community, in particular, influences our health and well-being, and its composition is influenced by our diet. Metagenomic sequencing of the gut microbiomes from hundreds of individuals revealed that these microbiomes fall into three general types of gastrointestinal bacterial communities, or enterotypes, corresponding strongly to long-term dietary habits. For example, high protein and animal fat consumption is correlated with the Bacteroides enterotype, and a high carbohydrate diet is correlated with a Prevotella enterotype, suggesting there is feedback between diet and habitat favoring growth of specific bacterial groups.
A striking example of how diet can influence our resident microbes is the occurrence of a unique lateral gene transfer event in Japanese individuals who eat substantial amounts of red algae, the “wrapping” used in sushi. In this case, genes encoding enzymes that break down red algal polysaccharides have been transferred from bacteria that normally live on the red algae to Bacillus species resident in the human gut. Thus, the bacteria in people who consume quantities of red algae evolve to better utilize this food source.
We obtain our initial gut microbiome from our mother’s birth canal and subsequently from her milk. Those born by caesarean section miss out on these potentially important contributions. Short-term changes in diet do not appear to induce changes in gut microbial communities, but major perturbations, such as antibiotic usage, can alter them. Normally, the ecology of the gastrointestinal community is robust and rebounds to its former composition even after major insults. However, sometimes new communities, often detrimental to the health of their host, take over, and these may be resistant to removal by antibiotics. A seemingly radical method of displacing these unwanted microbes by doing a fecal transplant from a healthy individual appears to be highly effective, suggesting that similar transplant approaches may be capable of replacing “bad” microbiota with “good.” Several disease states, including Crohn’s disease, colorectal cancer, and irritable bowel syndrome, are associated with alterations of the gut microbiome, highlighting the critical relationship we share with our ecosystems.
SKIN MICROBIOME
Our skin offers about 1.8 m^2 of diverse habitats colonized by microbes. Despite our bathing and shedding of skin cells, our bacterial communities remain relatively constant and are dominated by the same four phyla as our guts, but with Actinobacteria more abundant.
Three distinct skin habitats (moist, dry, and sebaceous) are created by variations in skin thickness, folds, and density of glands and hairs. The three habitat types are colonized by distinct bacterial communities, with greater similarity arising from similar habitat type than from topographic proximity. In transplant experiments where forehead (sebaceous) and forearm (dry) habitats were populated with tongue bacteria, the tongue bacteria remained for some time at the forearm site but were quickly replaced by “native” bacteria on the forehead. This research and temporal monitoring of bacterial communities indicate that the moist and sebaceous habitats have more stable communities than the dry skin areas. In contrast, the dry skin areas, such as the forearm, heel, and buttock, having more exposure to the surrounding environment, may be colonized opportunistically by a broader range of bacteria. If we are born by the normal birth process, we acquire a coating of primarily Lactobacillus in our mother’s birth canal. This is replaced by habitat-characteristic communities in the first years of our life.
Although it is not yet clear how many of our microbes are commensal, symbiotic, or pathogenic in relation to us or each other, it is becoming clear that they exert a significant influence on our health and well-being. In particular, the proper development of our immune system, both when it is being established during infancy and later when protecting our internal mucosal system, is influenced by the composition of our microbiome. Experiments manipulating the gut microbiomes of mice suggest that intestinal microbiota can even influence brain chemistry and behavior. Thus, next time you look in the mirror, ponder the ecosystem you are cultivating and how its denizens are contributing to your life.
from which purified DNA belonging to a single species is difficult to obtain. In addition, an application of metagenomics is presented in the Case Study at the end of this chapter.

16.2 Annotation Ascribes Biological Function to DNA Sequences

The genome sequence can be considered the finest-scale physical map of the genome, and in it are encoded all the genes of the organism. Genome annotation identifies the location of genes and other functional sequences within the genome sequence.
Annotation is the process of attaching biological functions to DNA sequences, and gene annotation describes the biochemical, cellular, and biological function of the gene products the genome encodes. Until annotated, a genome sequence is nothing but a very long string of As, Ts, Cs, and Gs. Annotation describes both structural and functional features of a gene. Its goal, moreover, is not only to identify known genes, regulatory sequences, and so on, but also to identify sequences that are likely to be genes though their function, if they are genes, is as yet unknown. Annotations may be based on experimental evidence (the gold standard) or on computational analysis, which then must be confirmed experimentally.

Experimental Approaches to Structural Annotation

Structural annotation aims to identify genes and their structural components, including transcribed, coding, and regulatory sequences. Experimental approaches to identifying transcribed sequences in a genome make use of complementary DNA (cDNA). Comparison of cDNA sequences with genomic sequences identifies the parts of the genome that undergo transcription leading to production of RNA molecules (see Section 14.2 for a review of cDNA and genomic libraries).
In theory, a complete set of transcribed sequences representing all the genes from an organism would allow complete annotation of the transcribed regions of its genome. In practice, though, complete sets of such sequences are not available, due to both variability in expression levels and variation in structure and processing of different transcripts (see Section 8.4 for a discussion of mRNA splicing). Nevertheless, for many organisms, a large amount of cDNA sequence is available, allowing the partial or complete assembly of gene transcripts. Comparing these transcribed sequences with the genomic sequence allows accurate annotation of gene exons and introns, including alternative splicing and other mRNA variants (Figure 16.7).

Computational Approaches to Structural Annotation

The genomes of multicellular eukaryotes often contain tens of thousands of genes, for many of which little or no experimental data have been collected. In the absence of experimental data concerning the existence or function of a gene, computational approaches are used to identify possible genes within genome sequences. The use of computational approaches to decipher DNA-sequence information is termed bioinformatics.
Bioinformatic annotation algorithms predict gene structure by identifying open reading frames (ORFs), sequences that appear to possibly code for polypeptides. Most of these algorithms initially search for ORFs larger than a minimum size, such as 50 amino acids, since ORFs of at least that size are less likely to occur at random. Data derived from known cDNA sequences of the organism under analysis can be used to fine-tune the algorithms. Even so, predictions are not infallible, especially with large eukaryotic genomes, where
exons are often small relative to introns and are dispersed over large distances. Thus, bioinformatic algorithms are generally less successful than experimental data in correctly predicting exons, but they can provide enough information to assist in the design of experimental approaches for clarifying gene structures. Furthermore, because searching for ORFs is not helpful for recognizing genes that code for RNA molecules, experimental or comparative genomic approaches are usually required for annotating genes whose products are noncoding RNA. The process by which genes are predicted is explored further in Research Technique 16.1.
Another bioinformatic method of gene annotation is to compare genome sequences of related species. As we discuss in a later section, this and other forms of comparative genomic analysis are becoming increasingly powerful as the genome sequences of more species become available. Remember, though, that after genes are predicted computationally, either from algorithms or phylogenetic comparisons, they must then be confirmed experimentally.

Functional Gene Annotation

In addition to pinpointing genes and their structural components, gene annotation aims to describe biochemical and biological function. Let us consider the lacI gene, which encodes the Lac repressor protein of E. coli. The biochemical function of the encoded protein is to bind to DNA and allolactose, and its cellular function is to regulate transcription of the lac operon (see Section 12.2). The biological function of the lacI gene is regulation of gene expression in response to sugar availability in the environment. In this case, the annotation we make can be quite detailed, since we know a great deal about the lacI gene.

Figure 16.8 Genome annotation of predicted biological function. Genes are categorized with presumed functions based on similarity to known genes. When the Arabidopsis and Drosophila genomes were first annotated in 2000, many genes (blue) had no similarity to genes of known function. However, in the past decade significant progress has been made to functionally characterize these genes, either using functional or comparative approaches.
Figure 16.8 category labels: Enzyme; Enzyme activator; Enzyme inhibitor; Apoptosis inhibitor; Signal transduction; Storage protein; Cell adhesion; Structural protein; Tumor suppressor; Transporter; Ligand binding or carrier; Ubiquitin.

Genes that are similar to each other in sequence are assumed to encode gene products with similar biochemical functions. Genes similar in sequence to the lacI gene, for example, are likely to encode transcription factors that regulate gene expression. However, the nature of the genes they regulate may not be easy to predict. In other words, their biochemical function may be predicted by sequence comparison, but determination of their biological function requires experimental analysis, the most powerful tool being a loss-of-function allele (see Chapter 14 for descriptions of approaches to mutant analysis). Initial annotation of the eukaryote genomes represented in Figure 16.8 categorized many genes by their presumed biochemical or cellular function. At that time, only about half of the genes predicted for these species had either known biochemical and cellular functions (see Table 16.1), learned
604 CHAPTER 16 Genomics: Genetics from a Whole-Genome Perspective
5′ T T G C A G T A T G G G C T A G A C C A A A G A G A G A G T T G A T A A C T A G C C G A A A C G A A C C A T G T T C G T C A A T C A G C A C C T T T G T G G T T
CTCACCTCGTTGAAGCTTTGTACCTTGTTTGCGGTGAACGTGGTTTCTTCTACACTCCTAAGACTTAAGCTAGCTAAGTA
T A G A T G G C G A G G T G A C A C A C A C A C A C A C A G G T A G A T A T T A A 3′
1 Identify the three reading frames (rf) in the forward direction and in the complementary strand.
The three reading frames in the forward direction The three reading frames in the complementary strand
rf1 5′ TTG CAG TAT GGG CTA GAC CAA 3′ rf4 3′ AAC GTC ATA CCC GAT CTG GTT 5′
rf2 5′ T TGC AGT ATG GGC TAG ACC AA 3′ rf5 3′ AA CGT CAT ACC CGA TCT GGT T 5′
rf3 5′ TT GCA GTA TGG GCT AGA CCA A 3′ rf6 3′ A ACG TCA TAC CCG ATC TGG TT 5′
2 Highlight all potential start codons (ATG); note that these can occur in any of the six reading frames.
There are four potential start codons, highlighted under step 3 below: rf2-1 (reading frame 2, first potential start
codon), rf2-2, rf2-3, and rf4-1.
3 Highlight any stop codons (TAA, TAG, TGA) that are in the same reading frame as the four identified start codons.
Since all potential start codons were in either reading frame 2 or 4, we need only look for potential stop codons in
these reading frames. Six potential stop codons can be found in reading frame 2, and seven in reading frame 4.
The forward direction
rf2-1 rf2 rf2 rf2-2
5′ T T G C A G T A T G G G C T A G A C C A A A G A G A G A G T T G A T A A C T A G C C G A A A C G A A C C A T G T T C G T C A A T C A G C A C C T T T G T G G T T
CTCACCTCGTTGAAGCTTTGTACCTTGTTTGCGGTGAACGTGGTTTCTTCTACACTCCTAAGACTTAAGCTAGCTAAGTA
rf2 rf2
T A G A T G G C G A G G T G A C A C A C A C A C A C A C A G G T A G A T A T T A A 3′
rf2 rf2-3 rf2
We find that the rf2-1, rf2-3, and rf4-1 potential start codons are followed almost immediately
by in-frame stop codons, preventing the open reading frames from encoding more than 2, 3, or
5 amino acids. In contrast, the open reading frame commencing from rf2-2 is much longer.
The rf2-2 start codon is followed by an open reading frame of 93 nucleotides that could encode a protein of 31 amino acids:
5′ T T G C A G T A T G G G C T A G A C C A A A G A G A G A G T T G A T A A C T A G C C G A A A C G A A C C A T G T T C G T C A A T C A G C A C C T T T G T G G T T
M F V N Q H L C G S
CTCACCTCGTTGAAGCTTTGTACCTTGTTTGCGGTGAACGTGGTTTCTTCTACACTCCTAAGACTTAAGCTAGCTAAGTA
H L V E A L Y L V C G E R G F F Y T P K T *
T A G A T G G C G A G G T G A C A C A C A C A C A C A C A G G T A G A T A T T A A 3′
For more practice with bioinformatics concepts, see Problems 4, 5, and 6. Visit the Study Area to access study tools.
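The manual search in steps 1 through 3 can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular annotation program. The function names (`reverse_complement`, `find_orfs`) are hypothetical, ATG is assumed to be the only start codon, no minimum ORF length is applied, and the sequence literal is retyped from the example above, so it should be checked against the printed sequence before reuse.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Example sequence from the box, grouped in tens for readability (retyped; an assumption).
SEQ = (
    "TTGCAGTATG GGCTAGACCA AAGAGAGAGT TGATAACTAG "
    "CCGAAACGAA CCATGTTCGT CAATCAGCAC CTTTGTGGTT "
    "CTCACCTCGT TGAAGCTTTG TACCTTGTTT GCGGTGAACG "
    "TGGTTTCTTC TACACTCCTA AGACTTAAGC TAGCTAAGTA "
    "TAGATGGCGA GGTGACACAC ACACACACAG GTAGATATTA A"
).replace(" ", "")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq):
    """Return (strand, start, protein) for every ATG followed by an in-frame stop codon."""
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start in range(len(s) - 2):
            if s[start:start + 3] != "ATG":
                continue
            protein = []
            for i in range(start, len(s) - 2, 3):
                aa = CODON_TABLE[s[i:i + 3]]
                if aa == "*":  # in-frame stop codon closes the ORF
                    orfs.append((strand, start, "".join(protein)))
                    break
                protein.append(aa)
    return orfs


if __name__ == "__main__":
    for strand, start, protein in find_orfs(SEQ):
        print(strand, start, len(protein), protein)
```

Run on the example sequence, the sketch reports four candidate ORFs, three of them only a few codons long and one of 31 codons, in line with the manual analysis. A real gene finder would add a minimum-length filter (such as the 50-amino-acid cutoff mentioned in the text) and evidence from cDNA data.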
Related Genes and Protein Motifs

Examination and comparison of whole-genome sequences have allowed researchers to recognize gene families, groups of genes that are evolutionarily related and share conserved sequences and gene structures (Table 16.1) that can aid in the process of annotation. Some gene families may be prominent in certain species, whereas others may be entirely absent. The 20,000 to 21,000 protein-coding genes of the human genome represent about 10,000 gene families. Although most mammals largely share this set of 10,000 gene families, only 3000 to 4000 of these gene families are found in all eukaryotes. Other lineages, such as fungi and plants, have their own sets of lineage-specific gene families.
Expansion and retention of particular gene families depend on the importance of their biological functions to the organism. For example, in mammals, the gene family encoding olfactory receptors is often the largest in the genome, frequently consisting of more than 1000 members. However, the olfactory receptor gene family is much larger in organisms that rely heavily on this sense (a mouse has more than 900 of these genes) than in species in which the sense of smell is diminished (humans have only 339). In humans, the largest gene family encodes proteins functioning in the immune system, but this family of genes is absent in both the plant Arabidopsis and the fungus Saccharomyces, where the largest gene families encode protein kinases.
Annotation can also be assisted by recognition of genome segments coding for conserved protein domains. Many eukaryotic proteins are modular, consisting of distinct protein domains joined together (Figure 16.9). Because many protein domains correlate with exon structure in genes (that is, one or more exons specifically encode a particular protein domain), a hypothesis has been advanced that composite genes (genes that encode multiple conserved protein domains) are generated by exon shuffling, a process in which one or more exons become part of a new gene through duplication, translocation, or inversion of DNA sequence or a combination of such events. The modular structure of proteins means that the number of genes is much larger than the number of unique functional protein domains. Exon shuffling creates novel arrangements of protein domains that can be co-opted to fulfill new biological roles. The available data indicate that the protein repertoires of multicellular eukaryotes are generally more complex, averaging more different domains per protein, than those of single-celled eukaryotes. Knowledge of conserved protein domains often provides insight into potential biochemical activities of proteins, but, again, understanding the biological function requires mutant analysis.

Figure 16.9 Modularity of protein domains. EPC-like protein, a protein type found in all eukaryotes, is used as an example. (a) Proteins are often modular, composed of discrete domains (e.g., Ep1, Ep2, PHD, Br, BMB, Znf). Complex proteins can evolve by mixing and matching of protein domains, usually through a process known as exon shuffling. (b) Multicellular eukaryotes have more complex protein architectures than single-celled eukaryotes.
Figure 16.9 labels: (a) domain diagrams for Fly (peregrin), Human (peregrin), and Yeast (YPR031w), with the note "protein consisted of two Ep domains and two PHD domains." (b) Bar chart of protein architectures (0 to 2000) by protein type (Transmembrane, Extracellular, Intracellular) for Yeast, Fly, Worm, and Human, with the note "The number of different protein architectures is larger in animals than in yeast."

Variation in Genome Organization among Species

Having obtained and compared genome sequences of bacteria and archaea and of eukaryotes (see Table 16.1), biologists can draw several general conclusions about genome organization (Figure 16.10). First, bacteria and archaea have fewer genes and much higher gene density than eukaryotes. This high gene density is attributable to the lack of introns, the more compact size of regulatory sequences, and the generally less complex structures of most encoded proteins in bacteria and archaea. Second, eukaryotes differ widely in both gene number and gene density, and the genomes of single-celled eukaryotes tend to encode fewer genes than those of multicellular eukaryotes. At the same time, groups of related eukaryotes (for example, mammals) often have similar numbers of genes, suggesting that gene regulation rather than number or type largely determines differences between related species. Third, species that have evolved to be obligate parasites often experience genome contraction. As parasites become
Figure residue: 8.2, 3.2; scale bar 100 kb; Drosophila melanogaster (chromosome X); 0.67, 9; scale bar 1 Mb; Homo sapiens (chromosome 1).
dependent on their hosts for nutrients, they lose the genes they no longer need. This trait is reflected in the reduced genome size, compared with other bacteria, of Chlamydia trachomatis, the bacterium responsible for chlamydia in humans (see Table 16.1).
Just as gene number and density vary among eukaryotes, so does the proportion of repetitive DNA in the genome. The human genome consists of more than 50% repetitive DNA: Approximately 45% consists of transposable elements (transposons, retrotransposons, and retroelements; see Section 11.7); a further 3% consists of microsatellite sequence; and about 5% contains recent gene duplications. Additional repetitive DNA is present in the centromeric and telomeric sequences. The repetitive DNA that is not centromeric or telomeric is often called dispersed repetitive DNA because it is distributed throughout the genome. The proportion of dispersed repetitive DNA, largely transposons, retrotransposons, and retroelements, in a genome is a significant factor influencing gene density. Some features of genome organization can be seen in human chromosome 21, shown in Figure 16.11.
The annotated genome sequences of model genetic organisms can be found at the websites provided on the back endsheets of this book. The host site for the human genome (http://genome.ucsc.edu/) also acts as a portal to the annotated genomes of several additional species.

Three Insights from Genome Sequences

Analyses of genome sequences from a range of bacteria, archaea, and eukaryotes have produced many insights into the nature of genomes, of which three are particularly important. First, genomic comparisons demonstrate that the genomes of all organisms are highly dynamic in nature. Transposable elements (see Section 11.7) are just one of the factors driving genome evolution; large- and small-scale chromosomal duplications as well as deletions and other rearrangements also contribute. Substantial genetic variation is seen even within species, thus providing raw material for natural selection and the evolution of new species.
Second, genome sequencing of model organisms reveals the limitations of forward genetic screens. Even in intensely studied species, such as E. coli and S. cerevisiae, forward genetic screens (see Section 14.1) identified only a fraction (one-third to one-half as many) of the genes identified by genome sequencing. What are the functions of all these previously unknown genes?
The third insight obtained from the analysis of genomes is the discovery that the number of genes in the human genome is comparable with that of various other multicellular eukaryotes. Over the past 25 to 30 years, the estimates of gene number in the human genome have steadily decreased. Having once estimated our genome to contain as many as 80,000 to 120,000 genes, we may find it humbling to discover that we and other animals have fewer genes than many plants. The currently estimated number of about 20,000 protein-coding genes in the human genome is typical for vertebrates, and it is not much higher than the 14,000 or so estimated for Drosophila. If some of us have “gene number anxiety,” it should be assuaged by recognizing that gene number does not translate directly into protein number or organism complexity. Both exon shuffling and alternative splicing increase the complexity of proteins in eukaryotes, and these processes are much more prevalent in animals than in either fungi or plants. In the remaining pages of this chapter, we address these major insights in more detail.
16.3 Evolutionary Genomics Traces the History of Genomes 607
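Figure residue (apparently Figure 16.11, human chromosome 21): chromosome bands 21p11.1, 21q11.1, 21q11.2, 21q21.1, 21q21.2, 21q21.3, 21q22.11. Annotated genes at 21q22.11: LOC100131268 (Hypothetical LOC100131268) and HUNK (Hormonally upregulated Neu-associated kinase). Blue = exon; Red = intron.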
16.3 Evolutionary Genomics Traces the History of Genomes

Evolutionary genomics, sometimes called phylogenomics or comparative genomics, is the comparative study of genomes. Interspecific comparisons of genomes (comparisons between species) identify sequences conserved over evolutionary time and thus facilitate the annotation of genomes and provide insight into the evolution of genes and organismal diversity. In contrast, intraspecific comparisons identify sequence polymorphisms that are responsible for the genetic differences within populations of a single species. These differences are the raw material of evolution and form the basis of population genetics and the evolution of species.
The evolutionary history of each organism can be traced in its genome and in the composition of its chromosomes. Evolutionary genomics has revealed the striking fact that a large number of genes are shared by phylogenetically distant species, reaffirming that all life on Earth is related. Species that are more closely related to one another share
a larger number of genes than species that are more distantly related. In closely related species, the similarities in sequence go beyond shared genes to conserved chromosomal segments. Evolutionary genomics has also brought to light important information concerning the highly dynamic nature of the genome. Changes, in the form of mutations, can be observed even in the time scale of a single generation.

The Tree of Life

The large amount of DNA sequence information now available has revolutionized how biologists perceive the tree of life, the phylogenetic tree depicting the evolutionary relationships between organisms. Morphological and physiological traits were once the primary basis of species classification, but DNA sequence comparisons have provided new clarity concerning questions that the older methods of study were unable to resolve.
Comparisons of DNA sequences of the same gene from different species are particularly useful for assessing phylogenetic relationships. Due to their ubiquity and high degree of conservation, genes encoding the ribosomal RNAs provide a universal set of sequences for such comparisons. By comparing ribosomal RNA sequences, Carl Woese and colleagues revealed through pioneering studies in the late 1970s that all forms of life on Earth fall into one of three distinct domains: Bacteria, Archaea, and Eukarya. Since then, relationships within many eukaryotic groups have been clarified using DNA sequence comparisons, allowing the basic architecture of the tree of life to be determined (Figure 16.12). Some surprising relationships have emerged. For example, the Fungi and Metazoans, which had traditionally been considered two separate “kingdoms” of life, were discovered to be relatively closely related and are now grouped with Amoebozoa in a clade called the Unikonts. Since animals and plants are the most conspicuous life-forms from a human perspective, the tree presented in Figure 16.12 is biased toward a focus on the interrelationships in those two groups. If all its branches were to be presented in equal detail, the “tree” would more closely resemble a very dense bush.
The tree of life in Figure 16.12 was constructed using DNA sequence information (see Section 1.5) and comparison of the alignment of homologous nucleotides to ascertain phylogenetic relationships. Homologous nucleotides are those that are descended from the same nucleotide in the common ancestor of the two species being compared (Figure 1.18). Highly conserved protein-coding DNA sequences, some of which have been conserved over time scales of more than a billion years, are analyzed to identify ancient evolutionary branch points, or nodes. Conversely, rapidly evolving sequences are compared to clarify recent nodes in species evolution. Intron and intergenic sequences, on which there may be little selective pressure to maintain a specific sequence, can accumulate mutations and change rapidly over time. A strategy developed to search for homologous sequences, using a computer program called BLAST, for Basic Local Alignment Search Tool, is described in Research Technique 16.2.

Interspecific Genome Comparisons: Gene Content

Genome sequencing indicates that certain genes are found in all organisms, whether bacteria, archaea, or eukaryotes, and suggests that these genes must have arisen early in the evolution of life on Earth. Such highly conserved genes (for example, the genes encoding proteins needed for DNA synthesis) are involved in biological processes common to all species. Other genes have a more recent origin and define specific clades of species. For instance, genes encoding tubulin are found in all eukaryotes, implying that the tubulin gene evolved before the diversification of the eukaryotes. Still other genes are shared among more restricted clades of organisms, and some genes are confined to only closely related species. In this way, the phylogenetic distribution of gene families provides information on when specific genes evolved. Furthermore, the set of genes shared among any group of organisms can be considered to represent the minimum genomic content of the common ancestor of that group of organisms, thus providing information on the evolution of both genomes and organisms.
Because the first genomes to be sequenced were from phylogenetically diverse organisms, many genes appeared to be specific to particular taxa. However, as more genome sequences were determined, genes initially thought to be unique were found to have counterparts in the genomes of related species. Indeed, two closely related species may share almost their entire genome content, with the genomic differences between sister taxa defining the differences between the two species. For example, genome content is very similar in four closely related Saccharomyces species (S. cerevisiae, S. paradoxus, S. mikatae, and S. bayanus), all separated by 5 to 20 million years (Figure 16.13). Throughout the genomes of the four Saccharomyces species, just a handful of species-specific genes were detected, with an average of one unique gene for every 0.5 million years of evolutionary distance. Similarly, in Drosophila melanogaster and related species, the rate of the origin of new functional genes was estimated to be 5 to 11 genes per million years. It is not yet clear whether these rates are typical for other organisms. But it does bring up the question: How do new genes form?

The Births and Deaths of Genes In tracing the evolutionary history of genes by comparing genome sequences, geneticists obtain clues to the mechanisms through which new genes arise (Figure 16.14). These mechanisms include the following.
1. Gene duplication by duplication of genomic DNA. Duplication of genetic material can duplicate a portion of a gene, a single gene, a chromosome or chromosome segment, or the entire genome (see Chapter 10).
Figure 16.12 labels: Fish lineages (Danio rerio); Amphibians; Reptiles and birds; Mammals (Monotremes; Marsupials; Placental mammals, including Mus musculus); Primates (Lemurs; Lorises; New World monkeys; Old World monkeys; Apes; Homo sapiens); Plants (Algal lineages, including Chlamydomonas reinhardtii; Land plants; Seed plants; Gymnosperms; Flowering plants, including Arabidopsis thaliana and Zea mays).
Figure 16.12 The tree of life, highlighting the phylogenetic relationships of model organisms discussed
in this book.
2. Gene duplication by unequal crossover. In a special case of gene duplication, one or more genes can be duplicated by unequal crossover due to misalignment of homologous chromosomes at synapsis during prophase I of meiosis. Gene duplication by unequal crossover is indicated by the detection of tandem repeats, or back-to-back copies, of genetic material (see Section 5.5).
3. Exon shuffling. During an exon-shuffling event, exons from two or more genes are combined in a new genomic context (see Figure 16.9a). The rearranging could occur through illegitimate recombination events or, alternatively, through retrotransposition events.
4. Reverse transcription. Reverse transcription of cellular RNAs using a retrotransposon-encoded reverse
Basic Local Alignment Search Tool

PURPOSE Homologous genes are derived from a common ancestral gene and often have similar functions. A computer program called the Basic Local Alignment Search Tool (BLAST) was developed in 1990 by Stephen Altschul, David Lipman, and colleagues to search for homologous sequences. BLAST, perhaps the most widely used and most important tool employed in bioinformatic endeavors, allows scientists to search databases for sequences similar to any input sequence.
The BLAST program of the National Center for Biotechnology Information at the National Institutes of Health (http://blast.ncbi.nlm.nih.gov/Blast.cgi) enables searches of either DNA sequence similarity or protein similarity. Various types of searches can be performed. Here are three of the most common.
• nucleotide blast (blastn): A nucleotide query sequence is compared with nucleotide sequences in the database.
• tblastn: A protein query sequence is compared with the nucleotide databases, hypothetically translated into all six potential reading frames.
• tblastx: A nucleotide query sequence is translated into all six possible reading frames and compared against the nucleotide sequences in the database, also translated into all six possible reading frames.

PROCEDURE One of the first experiments researchers perform once they have determined the sequence of a gene is to “BLAST” their sequence against the GenBank database, where most DNA sequences determined anywhere in the world are deposited. To perform a search, the user enters an “input” nucleotide or protein sequence into a window, and the BLAST program then searches chosen databases for similar sequences. Sequences are given a score based on the extent of similarity and relative to the probability that the sequences could be similar by chance.

CONCLUSION What information can be derived from this experiment? First, the results of the BLAST search can provide clues to the biological and biochemical function of the gene used as a query. Since homologous genes are descended from a common ancestor, they likely share biochemical activity if not biological context. Second, knowledge of the phylogenetic distribution of homologous genes allows inferences to be made about when the gene evolved. For example, if the query is a human gene and if genes homologous to it are detected in all eukaryotes, the protein is likely to perform a function conserved in all eukaryotes. Conversely, if only mammals have homologous genes, the gene is likely to perform a function specific to mammals.
Since related species often have conserved amino acid sequences but, due to the redundancy of the genetic code, possess different nucleotide sequences, a tblastn (or tblastx) search is often more sensitive than a blastn in identifying homologous sequences from distantly related species. When a researcher has no prior knowledge of the DNA sequence being used as a query, tblastx searches are particularly useful because they identify DNA sequences with the potential to encode similar proteins.
What if a BLAST search fails to find any other sequences in the database similar to the query sequence? If the sequence is known to encode a protein, the result suggests that the gene for the protein is unlikely to be conserved in a broad phylogenetic sense. Alternatively, if the sequence is noncoding DNA, a lack of similarity to other DNA sequences is not unexpected.
For more practice with bioinformatics concepts, see Problems 14 and 15. Visit the Study Area to access study tools.
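The point about genetic-code redundancy can be illustrated with a toy comparison. The sketch below is not BLAST: it performs no database search, alignment, or statistical scoring. It only translates a nucleotide sequence in all six reading frames (the conceptual step behind tblastn and tblastx) and compares identity at the nucleotide and protein levels. The two short sequences and all function names are made up for illustration.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons from the first base; stop codons appear as '*'."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translations(seq):
    """Translate three forward frames, then three frames of the reverse complement."""
    rc = reverse_complement(seq)
    return [translate(s[offset:]) for s in (seq, rc) for offset in range(3)]


def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two hypothetical coding sequences that differ only at synonymous third positions.
gene_a = "ATGTTCGTCAATCAGCACCTTTGTGGT"
gene_b = "ATGTTTGTGAACCAACATCTGTGCGGA"

if __name__ == "__main__":
    print("nucleotide identity:", round(identity(gene_a, gene_b), 2))
    print("protein identity:", identity(translate(gene_a), translate(gene_b)))
    print(six_frame_translations(gene_a))
```

In this made-up pair, roughly 30% of nucleotide positions differ while the encoded peptides are identical, which is why a comparison made at the protein level can detect relationships that a nucleotide-level comparison scores poorly.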
Gene Duplication The high rate of gene duplication is one surprising discovery arising from evolutionary genomics. Most genomes contain a mosaic of gene families derived from both ancient and more recent duplication events, indicating that genomes are dynamic and continuously changing over time. A study in 2000 by Michael Lynch and John Conery counted the duplicated genes in nine eukaryotic species and estimated the duplication rate at approximately 0.01 duplications per gene per million years. Thus, for an average eukaryotic genome with 10,000 to 30,000 genes, this research suggests 100 to 300 duplications per million years; that is, one gene duplicates and is maintained in the genome roughly every 3000 to 10,000 years, a rate of gene formation higher than has been observed in the Saccharomyces species.
The fate of duplicated genes depends on the molecular basis of the duplication. If the entire gene including regulatory sequences is duplicated, both copies will be able to produce a functional protein product in the correct amount, time, and place. In this case, the duplicate genes are genetically redundant and are free to evolve new functions, as long as the composite functions of the two duplicate genes retain the function of the original gene. Fully redundant genes are not maintained over long time periods, usually because the duplicate genes undergo one of three likely fates (Figure 16.15). First, the vast majority of new genes degenerate into pseudogenes due to a lack of positive selection, without which mutations will slowly accumulate and render the genes nonfunctional. Pseudogenes form a significant fraction of the genomes of some organisms.
Second, mutations in each of the two copies (for example, mutations in two different tissue-specific enhancers, as in Figure 16.15) can result in the two genes having complementary activities such that their combined activity is the same as the activity of the gene before duplication, a process called subfunctionalization. Third, in a process called neofunctionalization, a mutation in one of the duplicates could provide a function not performed by the original gene. In rare cases where the new function provides a selective advantage, the gene can be maintained and become fixed in the population. In the latter two cases, both copies remain functional, whereas in the first case, only a single copy retains activity.
Repeated duplication events produce families of related genes. Through gene duplications, gene losses, and speciation events, the relationships among these genes often become complex. Three terms describe different relationships of evolutionarily related genes. The broadest term is homology, which is defined as descent from a common ancestor. Thus, homologous genes, or homologs, have descended from a common ancestral gene and are said to constitute a gene family (Figure 16.16). Two other terms define specific relationships between homologous genes. Paralogous genes, or paralogs, are genes whose origin lies in a gene duplication event. No indication of the age of the duplication event leading to the paralogs is implied. Generally, paralogs perform biologically distinct but biochemically related functions. Orthologous genes, or orthologs, are genes whose origin lies in a speciation event. They are genes in different species that are derived from a single ancestral gene in two species’ last common ancestor. Orthologs most often, but not always, have equivalent functions in the two organisms being compared. The globin genes in Figure 16.16 illustrate these evolutionary relationships. See Genetic Analysis 16.1 for practice in determining orthologous and paralogous relationships of evolutionarily related genes.
Gene duplication has been a key mechanism in generating new genes that over time have made possible the
[Figure labels (fates of duplicated genes Z1 and Z2 derived from gene Z, with functions A and B): Inactivating mutations. The composite functions of genes Z1 and Z2 are equivalent to those of gene Z. Gene Z1 retains the original function of gene Z, while gene Z2 acquires a new function.]
16.3 Evolutionary Genomics Traces the History of Genomes 613
[Figure 16.16 labels: globin gene clusters of human (Homo sapiens) and chimpanzee (Pan troglodytes), with orthologs marked between the two species; branch points dated 50 mya, 80 mya, 120 mya, 260 mya, and 450–500 mya; gene duplication 450–500 million years ago; ancestral hemoglobin gene; ancestral myoglobin gene; 600–800 mya; ancestral globin gene. Annotation: The term homology may apply to the relationship between genes derived via a speciation event (orthologs) or to the relationship between genes derived via a gene duplication event (paralogs).]
Figure 16.16 Orthology and paralogy, speciation events and gene duplications: Examples from the
globin gene family.
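The duplication interval quoted earlier in this section can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: it reads the quoted rate of 0.01 as applying per gene per million years (an assumption, though one consistent with the interval the text derives), and the function name is invented for this example.

```python
# Rough check of the "one duplication every 3000 to 10,000 years" estimate.
# Assumption: the rate of 0.01 applies per gene per million years.
RATE_PER_GENE_PER_MY = 0.01

def years_between_duplications(gene_count, rate=RATE_PER_GENE_PER_MY):
    """Average number of years between retained duplications in one genome."""
    duplications_per_my = gene_count * rate   # genome-wide events per million years
    return 1_000_000 / duplications_per_my    # years per event

for genes in (10_000, 30_000):
    interval = years_between_duplications(genes)
    print(f"{genes} genes: about one duplication every {interval:,.0f} years")
```

A genome of 10,000 genes gives an interval of about 10,000 years, and a genome of 30,000 genes gives roughly 3300 years, which matches the range stated in the text after rounding.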
evolution of complex organisms. During globin gene evolution, gene duplication has permitted specialization, which in turn has allowed greater physiological complexity. Both subfunctionalization and neofunctionalization can be seen within the globin gene family. Neofunctionalization can be seen in the gene duplication event that produced the hemoglobin and myoglobin genes, where hemoglobin functions to carry oxygen in the blood and myoglobin functions to bind oxygen in muscles. Subfunctionalization has also occurred in the globin genes, if an assumption is made that the ancestral β-globin was active throughout the life cycle of the organism. If so, subfunctionalization is now evident between the ε-globin and β-globin paralogs, where the ε-globin is active in the embryo and the β-globin is active in the adult. Other examples of gene duplication are seen in the duplications of an ancestral gene leading to the family of genes that allow trichromatic vision in some primate species, including humans (see Section 3.5), and in the creation of another gene family that specifies identity along the anterior–posterior axis of animals (see Section 18.2).

Lateral Gene Transfer Lateral gene transfer, also known as horizontal gene transfer, is the transfer of genetic material between two species. Lateral gene transfer may have been extensive early in the evolution of life, but as specialized genetic mechanisms evolved for control of gene expression, lateral gene transfer became less frequent within the eukaryotic lineage.

A common lateral gene transfer event occurs through the sharing of plasmids among bacterial species (see Chapter 6), but other lateral gene transfer events between bacterial species and between bacterial and archaeal species also have been documented. Based on comparison of the sequenced bacterial and archaeal genomes, an estimated 1.5 to 14.5% of genes in any genome are the result of lateral gene transfer. This is likely to be an underestimate, since ancient transfer events may not be detectable. In an extreme example of lateral gene transfer, hyperthermophilic bacterial species (bacteria able to live in extremely hot environments) have acquired genes from hyperthermophilic archaeal species. Nearly a quarter of the genes in the bacterium Thermotoga maritima are most similar to archaeal genes, indicating an archaeal origin. One acquired
GENETIC ANALYSIS 16.1

PROBLEM Consider the phylogenetic tree of seven homologous eukaryotic genes derived from three species. What is the relationship between the human genes and the Drosophila gene—are they paralogs or orthologs? What are the relationships between the mouse and human sonic hedgehog genes, between the human sonic hedgehog and human desert hedgehog genes, and between the human desert hedgehog and mouse indian hedgehog genes? In each case, are the genes paralogs or orthologs?

BREAK IT DOWN: Recall that homologous genes are genes that have descended from a common ancestral gene (p. 612).

BREAK IT DOWN: Recall that orthologs are homologous genes produced by a speciation event, and paralogs are homologous genes produced by a gene duplication event within a species.

[Phylogenetic tree with internal nodes numbered 1 to 6. Tips, top to bottom: Indian hedgehog (mouse), Indian hedgehog (human), Desert hedgehog (mouse), Desert hedgehog (human), Sonic hedgehog (mouse), Sonic hedgehog (human), Hedgehog (Drosophila).]

Evaluate

1. Identify the topic this problem addresses and the nature of the required answer.
   This problem is about determining orthology and paralogy of homologous genes.

2. Identify the critical information given in the problem.
   The phylogenetic tree provides information about how the genes are related to one another.

Deduce

3. Consider the topology of the phylogenetic tree. First examine the relationship between the Drosophila gene and the mammalian genes.
   TIP: How many genes were in their common ancestor?
   The node at the base of the tree represents the ancestral gene. Since all of the mammalian genes are more closely related to one another than they are to the Drosophila gene, the ancestral organism had only a single gene.

4. Examine the earliest node in the phylogenetic tree to see if it corresponds to a speciation event or a gene duplication event.
   At the earliest node in the tree (node 1), the divergence produced the Drosophila gene and a lineage of mammalian genes. Thus, this node is a speciation event, with the common ancestor of Drosophila and mammals speciating to produce a lineage leading to Drosophila and another leading to mammals.

5. Determine for each node in the tree whether it represents a speciation or gene duplication event.
   Following the lineage leading to the mammalian genes (node 2), the divergence produces two lineages, each containing both mouse and human genes. Thus, the divergence must have been a gene duplication and not a speciation. The divergence at node 3 is similar to that of node 2 and so must also be a gene duplication. In contrast, nodes 4, 5, and 6 all diverge to produce a mouse gene and a human gene and thus represent the speciation event leading to mice and humans.

Solve

6. What is the relationship between the Drosophila gene and the mammalian genes?
   TIP: Orthologs are produced by a speciation event and paralogs are produced by a gene duplication event.
   Since we concluded that the divergence at node 1 was a speciation event, the Drosophila gene is orthologous to all of the mammalian genes and vice versa.

7. What are the relationships between the human and mouse genes?
   Let’s consider three specific sets of genes. First, consider mouse sonic hedgehog and human sonic hedgehog—these two genes are related by a speciation event at node 6 and are thus orthologs. Next, consider human sonic hedgehog and human desert hedgehog—these two genes are related by a gene duplication event at node 2 and are thus paralogs. Finally, consider human desert hedgehog and mouse indian hedgehog—these two genes are related by a gene duplication event at node 3 and are thus paralogs.
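The reasoning in this worked problem reduces to a simple rule: find the most recent node shared by two genes and ask whether it is a speciation or a duplication. The Python sketch below encodes that rule. The node numbers and event types follow the solution above; the tip labels (such as Shh_mouse) are shorthand invented here, and the assignment of nodes 4 and 5 to the Indian and Desert pairs is read from the figure labels, so treat the tree encoding as an assumption.

```python
# Child -> parent links for the hedgehog tree (tip names are our own shorthand).
PARENT = {
    "Hh_Drosophila": 1, 2: 1,
    3: 2, 6: 2,
    4: 3, 5: 3,
    "Ihh_mouse": 4, "Ihh_human": 4,
    "Dhh_mouse": 5, "Dhh_human": 5,
    "Shh_mouse": 6, "Shh_human": 6,
}
# Event assigned to each internal node in the worked solution.
EVENT = {1: "speciation", 2: "duplication", 3: "duplication",
         4: "speciation", 5: "speciation", 6: "speciation"}

def ancestors(node):
    """Return the internal nodes from `node` up to the root, nearest first."""
    path = []
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def relationship(gene_a, gene_b):
    """Classify two genes by the event at their most recent shared node."""
    seen = set(ancestors(gene_a))
    for node in ancestors(gene_b):  # nearest first, so the first hit is the most recent
        if node in seen:
            return "orthologs" if EVENT[node] == "speciation" else "paralogs"
    raise ValueError("genes do not share a node in this tree")

print(relationship("Shh_mouse", "Shh_human"))      # shared node 6
print(relationship("Shh_human", "Dhh_human"))      # shared node 2
print(relationship("Dhh_human", "Ihh_mouse"))      # shared node 3
print(relationship("Hh_Drosophila", "Ihh_human"))  # shared node 1
```

The four calls reproduce the answers reached in steps 6 and 7. Note that the event labels are supplied by hand; inferring them from the species composition of each lineage, as in step 5, is a separate task.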
For more practice, see Problems 16 and 23.
archaeal gene encodes a reverse gyrase, a topoisomerase that induces positive supercoils in DNA and is required for adaptation to living at high temperatures.

Although genes encoding proteins with metabolic functions appear to have been donated in lateral gene transfer events, those that encode proteins for information processing (e.g., replication, transcription, and translation) are not commonly transferred. One possible explanation for this bias is that proteins with information processing functions often act in large complexes and are not easily incorporated into existing complexes in other species.

The rarity of lateral gene transfer between eukaryotes and also between eukaryotes and members of either of the other two domains, compared with its relative frequency among bacteria and archaea, is due in part to the differences between eukaryotic transcriptional and translational control mechanisms and those of bacteria and archaea. Even though the bacterium Agrobacterium tumefaciens transfers genes to plant cells (see Section 15.2), there is little evidence that those genes have entered the germ line of the transformed plants. Conversely, there is no evidence of transfer of genes from transgenic plants to soil bacteria. However, there is one prominent exception to this generalization: the transfer of genetic material from endosymbionts to their hosts. The most conspicuous examples are the large-scale transfers of genes from mitochondria and chloroplasts to the nucleus in eukaryotic cells (explored in greater detail in Section 17.5). Finally, although lateral gene transfer between two eukaryotes is not thought to be common, it has been documented—for example, between parasitic flowering plants and their flowering plant hosts as well as between fungi and aphids.

Interspecific Genome Comparisons: Genome Annotation

By comparing the genome sequences of related species, researchers are often able to refine their annotations of predicted genes whose existence has not been experimentally confirmed. If the predicted gene in fact functions as a gene, orthologous genes are likely to exist in related species.

Conserved Coding Sequences Comparative genomic analyses can facilitate the discovery of previously unannotated genes. Sequences that are conserved in the genomes of two or more species are more likely to be functional (e.g., encode genes) than sequences that are not conserved. Due to the redundancy of the genetic code, amino acid sequences of proteins are often more conserved than the nucleotide sequences that encode them. Thus, in searches for conserved coding sequences, the nucleotide sequences of each of the genomes are first translated into all six potential reading frames and the hypothetical amino acid sequences are compared (see tblastx in Research Technique 16.2). Conserved sequences can then be used to direct experimental examination of the predicted genes, leading to refinement of the genome annotation.

Gene annotation can be hampered by a lack of homology to known genes. This is especially the case with genes or exons of a small size (e.g., encoding proteins of less than 100 amino acids), as they are particularly difficult to predict. Consider that stop codons occur, on average, about once in 21 codons (3/64) in a random sequence. Thus, random ORFs of 63 amino acids occur frequently: the chance that 63 consecutive random codons contain no stop codon is (61/64)^63, or approximately 5%, for any random 189-bp sequence read in a given frame. Furthermore, in multicellular eukaryotes, the coding sequences of genes are typically broken into small exons (often encoding fewer than 100 amino acids) dispersed over large distances, thus making their unambiguous identification a challenge. Annotation of such genes is typically feasible only with either experimental evidence or evidence of similar sequences in other genomes.

In the case of the Saccharomyces species (see Figure 16.13), comparisons between the four genomes led to prediction of more than 40 previously unannotated genes encoding proteins between 50 and 100 amino acids in length. Likewise, comparisons of the human genome with the genomes of other vertebrates have aided in the identification of exons and significantly refined the annotation of the human genome. This is one respect in which the genome sequencing of model genetic organisms has greatly increased our knowledge of our own genome.

Conserved Noncoding Sequences Besides helping to identify open reading frames, genome comparisons have also detected the presence of conserved noncoding sequences (CNSs). Noncoding DNA was once called “junk” DNA (a term originally coined by Sydney Brenner) since junk, as opposed to garbage, is something we tend to keep even though it serves no identifiable purpose. Today, however, we know that at least some of this noncoding DNA is functional; it contains regulatory sequences and genes that produce functional noncoding RNAs, such as microRNA genes and lncRNAs (see Section 13.3 for discussion of these types of genes).

There are two methods for identifying conserved noncoding sequences, and they approach the task from opposite directions. In phylogenetic footprinting, conserved sequences are identified by searching for similar sequences in species separated by large evolutionary distances. Conversely, in phylogenetic shadowing, conserved sequences are identified by comparison of sequences in closely related species, after first eliminating sequences that are not conserved among them. Comparative sequence analyses of CNSs are now often the first step to predicting regulatory sequences, which are then tested by experiment (see Figure 14.18).

Regulatory sequences controlling expression of genes in most multicellular eukaryotes consist of enhancer modules spanning hundreds and potentially tens of thousands of base pairs (see Section 13.1). A large number of CNSs that correspond to regulatory sequences have been identified by phylogenetic footprinting using comparisons of
616 CHAPTER 16 Genomics: Genetics from a Whole-Genome Perspective
[Figure 16.17 labels. (a) Exon–intron–exon gene structures in species A and species B, plotted against percentage sequence identity (0 to 100) over evolutionary time, with a CNS marked. An evolutionary event causes the separation of species B from species A. Initially, genes in species A and B are identical. Over time, the percentage of sequence identity declines in regions not under strong selection, leaving peaks of conservation in exons and some conserved noncoding sequences (CNS). (b) Mouse; Human.]
Figure 16.17 Phylogenetic footprinting. (a) Evolution of a conserved noncoding sequence (CNS).
(b) A CNS associated with the SHH gene acts as an enhancer directing expression of the SHH gene in
the developing limb bud.
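The conservation profile described for phylogenetic footprinting can be approximated by scoring percent identity in windows along an alignment. The Python sketch below is a minimal illustration, not a description of any published pipeline: it assumes the two sequences have already been aligned to equal length, and the sequences, window size, and 80% threshold are invented for the example.

```python
def windowed_identity(seq_a, seq_b, window=10, step=10):
    """Percent identity of two pre-aligned, equal-length sequences per window."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # Count identical, non-gap columns in this window.
        matches = sum(x == y and x != "-" for x, y in zip(a, b))
        profile.append((start, 100.0 * matches / window))
    return profile

def conserved_windows(profile, threshold=80.0):
    """Start positions of windows whose identity meets the threshold."""
    return [start for start, pct in profile if pct >= threshold]

# Toy alignment: conserved flanks around a diverged middle segment.
species_a = "ACGTACGTAC" + "GGGGGGGGGG" + "TTGACCATGA"
species_b = "ACGTACGTAC" + "CCCCCCCCCC" + "TTGACCATGA"

profile = windowed_identity(species_a, species_b)
print(profile)
print(conserved_windows(profile))
```

Windows that stay above the threshold correspond to the peaks of conservation in the figure. Whether such a peak is an exon or a CNS has to be decided from annotation or experiment; the identity score alone does not distinguish them.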
mammalian and other vertebrate genomes (Figure 16.17a). Comparisons between mammals and fish have shown that enhancer modules can be conserved over large evolutionary distances (the lineages leading to fish and humans separated about 400 million years ago). Conserved noncoding sequences are often clustered in the genome, and they are often adjacent to evolutionarily conserved genes involved in basic developmental processes. For example, comparisons between the human, mouse, and fugu (pufferfish) genomes identified a CNS corresponding to an enhancer module approximately 1 megabase distant from the sonic hedgehog (SHH) gene (Figure 16.17b). When this CNS was tested for regulatory activity, it drove expression of a reporter gene in mice in a manner reminiscent of the endogenous SHH expression pattern in developing limb buds. This CNS is functionally important because mutations in this enhancer are associated with polydactyly in both mice and humans.

In contrast to phylogenetic footprinting, phylogenetic shadowing identifies conserved sequences via comparison of multiple closely related species. In this approach, sequences that are not conserved in at least one of the species are removed from consideration, whereas sequences that are conserved in all species are considered as potential functional sequences. Phylogenetic shadowing has identified functional sequences in the human genome by looking for sequences that have not changed in any of several primate species (Figure 16.18).

Interspecific Genome Comparisons: Gene Order

Just as the evolutionary history of organisms and genes can be traced by comparisons of genomes, so can the evolutionary histories of chromosomes. For example, humans have 2n = 46 chromosomes, but our closest relatives (chimpanzees,
[Figure 16.18 labels: aligned sequences from human, gorilla, orangutan, gibbon, baboon, spider monkey, and howler monkey; sequences conserved among all species are marked.]

Figure 16.18 Phylogenetic shadowing of primate species.

Q Contrast the approach of phylogenetic shadowing with that of phylogenetic footprinting.

gorillas, orangutans) have an additional pair of chromosomes, 2n = 48 (see Figure 10.30). Comparing the chromosomes of humans and these other primates for synteny—the conserved order of consecutive orthologous genes along the length of a chromosome or chromosomal segment—shows that a pair of chromosomes in our common ancestor fused to form a single chromosome, chromosome 2, in humans. Other minor differences among primate chromosomes can be accounted for by a small number of translocation and inversion events.

Synteny can also be observed in more distantly related mammals, such as between mouse and human lineages that diverged about 100 million years ago (Figure 16.19). Genome sequence information can provide detailed views of synteny between even more distantly related organisms. Even if chromosome synteny is not conserved, synteny at the level of only a few genes, referred to as microsynteny, can sometimes be detected. For example, such information has revealed relationships between the chromosomes of birds and mammals.

Even when synteny is conserved at a chromosomal level, comparative studies have revealed large numbers of small rearrangements between closely related species. In a sense, this can be considered a loss of microsynteny. The large amount of repetitive DNA in eukaryotic genomes coupled with unequal crossing over due to mispairing during meiosis provides a mechanism by which DNA rearrangements can

[Figure 16.19 labels: gene order along human chromosome 21 (STCH through HRMT1L1) shown beside the orthologous genes on mouse chromosomes 16, 17, and 10.]

Figure 16.19 Synteny between human and mouse chromosomes.
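Synteny, as defined in this section, can be detected by walking along one chromosome and asking whether each gene's ortholog sits next to the previous gene's ortholog in the other genome. The Python sketch below illustrates that idea under simplifying assumptions: every gene has exactly one ortholog, positions are given as ranks along a chromosome, and the gene names and chromosome labels (g1, B1, and so on) are hypothetical rather than taken from Figure 16.19.

```python
def syntenic_blocks(orthologs, min_genes=2):
    """Group consecutive genes whose orthologs are neighbours on one chromosome.

    `orthologs` lists genes in their order along a chromosome of species A,
    each as a (gene, chromosome_in_B, rank_in_B) tuple.
    """
    blocks, current = [], []
    for entry in orthologs:
        if current:
            _, prev_chrom, prev_rank = current[-1]
            _, chrom, rank = entry
            # Adjacent in either direction, so inverted segments still count.
            if chrom == prev_chrom and abs(rank - prev_rank) == 1:
                current.append(entry)
                continue
            if len(current) >= min_genes:
                blocks.append(current)
            current = []
        current.append(entry)
    if len(current) >= min_genes:
        blocks.append(current)
    return [[gene for gene, _, _ in block] for block in blocks]

# Hypothetical gene order in species A with ortholog positions in species B.
example = [
    ("g1", "B1", 1), ("g2", "B1", 2), ("g3", "B1", 3),  # colinear run
    ("g4", "B3", 9),                                    # isolated gene
    ("g5", "B2", 2), ("g6", "B2", 1),                   # inverted run
]
print(syntenic_blocks(example))
```

Requiring strict adjacency is deliberately conservative. Real comparisons usually tolerate a few intervening genes, which is one reason small local rearrangements appear as a loss of microsynteny rather than a loss of chromosomal synteny.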
[Figure 16.20 labels. (a) Ancestor; genome duplication; homeologs after genome duplication; gene loss (e.g., pseudogenes); conserved syntenic paralogs. (b) Chromosome number: 1, 2, 3, 4, 5.]
Figure 16.20 Evidence of past whole-genome duplications. (a) Following a whole-genome duplication,
gene loss via pseudogene formation results in a “diploid” species. (b) Evidence of past whole-genome
duplications in the Arabidopsis genome. Colored bands connect duplicated segments. Twisted bands con-
nect duplicated segments having reversed orientations.
occur. The presence of numerous small deletions, duplications, and inversions suggests that chromosome structure is dynamic on a local scale. An example of a loss of microsynteny can be seen in the loss of strict colinearity between the mouse and human chromosomes shown in Figure 16.19. As we discuss later in this chapter, small rearrangements are also found within individuals of a single species.

Another striking feature of most eukaryotic genomes examined to date is the evidence of past whole-genome duplications as well as smaller duplications involving only segments of chromosomes. Whole-genome duplications result in gene duplications on a massive scale and have contributed significantly to the evolution of many eukaryotic lineages. A whole-genome duplication instantly provides duplicate sets of genes, referred to as homeologs, that can subsequently undergo sub- and neofunctionalization, the latter a driver of evolution. Immediately following a whole-genome duplication, a previously diploid species is transformed into a tetraploid. However, over time, through duplicate genes evolving into pseudogenes or becoming subfunctionalized, the initially tetraploid species evolves into one whose chromosomes behave as a diploid. This process has been termed diploidization (Figure 16.20a).

Evidence for both past whole-genome and smaller segmental duplications can be seen in the Arabidopsis genome in Figure 16.20b. Although whole-genome duplications (e.g., polyploidy) are particularly abundant in plants (see Section 10.3), they are not limited to plants. Evidence of past genome duplications is seen in fungal (e.g., S. cerevisiae) as well as vertebrate (e.g., Danio rerio) genomes.

16.4 Functional Genomics Aims to Elucidate Gene Function

Although the genome sequence supplies a catalog of genes for an organism, it does not directly provide an understanding of how the genes direct the organism’s development and physiology. For this, we need to know when and
where genes are expressed, the phenotypes of loss- and gain-of-function alleles, which other genes act in the same or redundant pathways, and which proteins each gene product interacts with. Functional genomics is the study of gene function from a whole-genome perspective.

High-throughput technologies, in which a large number of genes are analyzed simultaneously, have enabled genome-wide examination of RNA- and protein-expression patterns, genetic interactions, and protein–DNA as well as protein–protein interactions. In addition, high-throughput technologies have facilitated the creation of mutant alleles of all genes in the genome of some model genetic species. In this section, we describe some high-throughput technologies of functional genomics and consider what we have learned by applying them to model organisms.

Transcriptomics

One important clue to the function of a gene is when and where the gene is expressed. The study of gene expression from a genomic perspective is called transcriptomics, and the set of transcripts present in a cell or organism is called the transcriptome. Two high-throughput techniques used to analyze the transcriptome are high-throughput sequencing of cDNA and hybridization on DNA microarrays. High-throughput sequencing is becoming the dominant method, but DNA microarrays are still in widespread use. Below we describe the two techniques and illustrate their use in transcriptomic analyses.

Transcriptome Analysis by Sequencing High-throughput DNA sequencing techniques (see Section 7.5) provide a direct way of assaying the transcriptome. In this approach, RNA isolated from the cells of interest and converted into cDNA is fragmented and sequenced by high-throughput technology. The resulting sequence, often referred to as “RNA-seq,” is then compared with the reference genome sequence to identify similar sequences that are present in the cDNA population as a whole (Figure 16.21a). The power to examine gene expression patterns through the use of sequencing is limited only by the degree to which mRNA can be extracted from specific cells or tissues and converted to cDNA, with the sequencing of mRNA from a single cell now possible.

The sequencing approach has two advantages over hybridization-based techniques used with microarrays. First, the sequencing approach has the potential to be more quantitative. Since millions of cDNA fragments can be sequenced, precise quantitative data on gene expression levels can be obtained. The number of times a sequence is detected in the cDNA pool reflects the relative expression level of that sequence in the cDNA sample. Second, sequencing approaches can more easily distinguish between transcripts with similar sequences, such as alternative splice variants and SNPs, which are sometimes difficult to distinguish with hybridization techniques.

The first application of high-throughput sequencing to transcriptome analysis of the yeast genome was published in 2008. It provided precise descriptions of the 5′ and 3′ ends of transcripts and clarified gene annotations. Subsequent similar studies on other species followed, revealing the extent and nature of alternative splicing, which is prevalent in most multicellular eukaryotes. Such experiments have also facilitated gene annotation by identifying novel transcripts. Genes that had not yet been annotated using computational approaches have often been identified by using expression data.

One surprising result from the application of next-generation sequencing of transcriptomes was the large number of previously unidentified transcripts, many of them noncoding, present in the cells of many multicellular eukaryotes. Some of these have been shown to encode microRNAs or lncRNAs (see Section 13.3), but many others do not have any as-yet-known functions. The numbers of such transcripts range from hundreds in some invertebrates to thousands in mammals, and an active area of research is to identify the functions, if any, for these RNA molecules.

DNA Microarrays DNA microarrays consist of collections of synthesized DNA fragments (oligonucleotides) attached to a solid support (Figure 16.21b). The DNA fragments are of a fixed length, usually 25 to 70 bases. The specific DNA sequences, representing sequences present in a genome, are chemically synthesized on a silicon substrate, called a chip, at high density—tens of thousands to millions of oligonucleotide sequences per array, each sequence located on a different spot in the array. Following hybridization with a fluorescent probe representing cDNA, the intensity of the signal from each of the spots reflects the concentration of the sequence complementary to the probe. One advantage of microarrays is that they can be custom designed, because the spots can be added independently. An expression array carries unique sequences from every annotated gene of the genome. Hybridization of an expression array with labeled cDNA probes produces quantitative information about the relative expression levels of the genes represented on the array.

Arrays can also be designed to identify binding sites and proteins bound to DNA, including transcription factor binding sites and histone modifications (see Section 13.2). This is accomplished by applying the technique of chromatin immunoprecipitation (ChIP) at a whole-genome level. As described in Chapter 13, DNA that is immunoprecipitated with antibodies to the protein of interest can be sequenced, revealing the genomic sequences to which the targeted protein was bound in the cell. This technique provides a genome-wide view of protein–DNA interactions and is known colloquially as “ChIP-seq” (bottom two lines in Figure 16.21a).
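The statement that read counts reflect relative expression can be made concrete with a small calculation. The Python sketch below converts per-read gene assignments into counts per million reads, a common way to compare samples of different sequencing depth. It is an illustration only: the gene names and read numbers are invented, reads are assumed to have been assigned to genes already, and no correction for transcript length is applied.

```python
from collections import Counter

def counts_per_million(read_assignments):
    """Convert a list of per-read gene assignments into counts per million reads."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}

# Toy sample: ten reads assigned to three hypothetical genes.
reads = ["geneA"] * 6 + ["geneB"] * 3 + ["geneC"] * 1
cpm = counts_per_million(reads)

for gene, value in sorted(cpm.items()):
    print(f"{gene}: {value:,.0f} counts per million")
```

In this toy sample geneA accounts for six reads in ten, so it is scored at twice the relative level of geneB. Comparing the same gene across two tissues, as in the RNA-seq tracks the text refers to, would apply the same normalization to each sample separately.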
[Figure 16.21 labels. (a) Position of sequence of interest on chromosome; 6749 bp of reference genome sequence; tracks: RNA-seq tissue 1, RNA-seq tissue 2, ChIP-seq H3K27me3 tissue 1, ChIP-seq tissue 2. (b) Experimental sample; oligonucleotide; RNA.]
Example of Transcriptome Analysis An example from manner. Reporter genes (see Section 14.4) provide
the budding yeast S. cerevisiae illustrates how microarray one approach to determining both temporal and spatial
data can provide insight into the function of genes not pre- gene expression patterns at high resolution. Compara-
viously identified by forward genetic approaches. Diploid tive genomic techniques to identify potential regula-
yeast cells of S. cerevisiae produce haploid cells through tory sequences (see Figures 16.17 and 16.18) can guide
the developmental process of sporulation, which con- the design of reporter gene constructs. Confirmation
sists of meiosis and spore morphogenesis. From forward of expression patterns revealed by reporter gene analy-
genetic studies, approximately 150 genes were known to sis can be obtained in a process analogous to a north-
be involved in sporulation, and these could be classified ern blot (see Section 1.4) but in which a labeled RNA
into four groups defined by expression patterns and mutant probe is applied directly to tissue in which the mRNA
phenotypes. is fixed in place rather than purified and separated by
To examine genome-wide expression patterns during electrophoresis.
sporulation, diploid yeast cells were induced to sporulate,
RNA samples were taken at seven time points spanning 11
Other “-omes” and “-omics”
hours, and their expression levels were compared to identify
genes whose expression was either induced or repressed at By the same logic that produced the terms genomics and
those different times (Figure 16.22). More than 1000 genes transcriptomics, proteomics is the study of all the pro-
exhibited significant changes in expression at some point teins—collectively known as the proteome—expressed in
during the sporulation process: In about 40% of these cases the genes became induced, and in the other 60% the genes became repressed. In other words, more than six times as many genes as had been identified previously were likely to play some role during sporulation.
The researchers categorized the induced genes by their expression patterns, expanding the four previously described patterns to at least seven. Genes with expression patterns similar to those of known genes could be hypothesized to have biological roles similar to those of the known genes. For example, some “Early I” genes (see Figure 16.22) are known to function in the synapsis of homologous chromosomes. By extrapolation, other Early I genes whose functions are unknown may also have roles in synapsis of chromosomes, suggesting areas for experimental study to support or refute the predicted roles. Similarly, comparisons of sequences upstream of coordinately regulated genes can provide information on gene regulation. For example, more than 40% of the Early I genes have a consensus upstream regulatory sequence (URS1) to which the transcription factor UME6 binds, suggesting that this set of genes is coordinately regulated by the same transcription factor. This research into temporal gene-expression patterns during sporulation has provided clues to the functions of hundreds of previously uncharacterized genes, some with homologs in humans.
Transcriptome analyses are routinely used in functional genomics studies, including many pertaining to the study of human cancers. In cancer studies, for example, they allow precise characterization of gene expression in morphologically similar but molecularly different cancers, facilitating the design of targeted treatments using drugs known to affect specific gene products.
Although transcriptomics can provide a broad overview of genome-wide gene expression, techniques in addition to transcriptomic approaches are often required for multicellular organisms, to provide details concerning genes that are expressed in a tissue- or cell-type–specific
a cell, tissue, or individual. Whereas the biochemistry of nucleic acids is predictable (any nucleic acid can base-pair with any other nucleic acid, given complementary sequences), the biochemistry of the proteome is complicated by the much greater range of protein structures and functions. The study of proteins thus requires techniques tailored to specific subsets of proteins.
Multiple high-throughput technologies have been developed for proteomic analyses, including techniques to study protein expression, protein modification, and protein–protein interactions. Examples of the latter (techniques that reveal whether and how different proteins interact) provide information on the functioning of biological systems by identifying, for instance, sets of proteins that form a complex. Here we discuss one technique for identifying interacting proteins.
The two-hybrid system is a high-throughput method for discovering whether two proteins interact. This system is based on the modular nature of the Gal4 transcription factor from yeast that binds to the GAL4 upstream activation sequence (or UASGAL4), which is an enhancer element, to activate the transcription of genes involved in galactose metabolism (see Section 13.1). One domain of the Gal4 protein, the DNA-binding domain, binds to the UASGAL4 sequence; a second domain, the activation domain, activates transcription by interacting with RNA polymerase II as well as other chromatin factors (Figure 16.23a). The two domains can be physically separated.
To test whether two proteins interact, one of the proteins to be tested is translationally fused (see Section 14.4) to the Gal4 DNA-binding domain (BD), and the other protein to be tested is translationally fused to the Gal4 activation domain (AD). Both of these chimeric genes are then transformed into a single yeast strain. If the two proteins interact, the Gal4-BD and Gal4-AD will be brought together, and Gal4-activated genes will be transcribed. Conversely, if the two proteins do not interact, no transcription of the Gal4-activated reporter gene will be observed. To facilitate the screening process,
622 CHAPTER 16 Genomics: Genetics from a Whole-Genome Perspective
Figure 16.22 Analysis of yeast transcription patterns using microarrays. Each column shows a different pattern of gene expression, correlating to a different point in time in the sporulation process (0, ½, 2, 5, 7, 9, and 11 hours after induction of sporulation). Genes can be placed into classes based on gene expression patterns; the classes shown are Metabolic, Early I, Early II, Early middle, Middle, Mid late, and Late. Red lines represent activation of gene expression relative to time 0. Green lines represent repression of gene expression relative to time 0. Columns labeled “Enhancer sequences” and “Genes” accompany the expression data.
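The idea that a gene of unknown function can be provisionally grouped with known genes that share its expression pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expression values and gene names below are hypothetical, not data from the sporulation study, and Pearson correlation is just one of several similarity measures that could be used for this kind of grouping.

```python
from math import sqrt

# Hours after induction of sporulation, as sampled in Figure 16.22.
HOURS = [0, 0.5, 2, 5, 7, 9, 11]

def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# HYPOTHETICAL log2(expression / expression at time 0) profiles for genes
# whose expression class is already known. Names and values are invented.
KNOWN = {
    "known_early": ("Early I", [0.0, 1.5, 3.0, 2.0, 1.0, 0.5, 0.2]),
    "known_middle": ("Middle", [0.0, 0.1, 0.3, 2.5, 3.0, 1.5, 0.8]),
    "known_late": ("Late", [0.0, 0.0, 0.1, 0.3, 1.0, 2.5, 3.2]),
}

def predict_class(profile):
    """Return (class, reference gene) for the best-correlated known profile."""
    if len(profile) != len(HOURS):
        raise ValueError("profile must have one value per time point")
    best = max(KNOWN, key=lambda gene: pearson(profile, KNOWN[gene][1]))
    return KNOWN[best][0], best

# A hypothetical uncharacterized gene induced soon after sporulation begins.
unknown_gene = [0.0, 1.2, 2.6, 2.2, 0.9, 0.4, 0.1]
print(predict_class(unknown_gene))
```

A prediction made this way is only a hypothesis about the gene's role; as the text notes, it identifies areas for experimental study that can support or refute the predicted function.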
Figure 16.23 Identifying protein–protein interaction networks. (a) The two-hybrid system identifies interacting proteins. (b) Application of the two-hybrid system identifies networks of interacting proteins.
(a) The Gal4 transcription factor is modular. The DNA-binding domain (BD) binds the UASGAL4 sequence. The activation domain (AD) interacts with RNA pol II to stimulate transcription. The diagram shows UASGAL4, a promoter, and a lacZ reporter gene. The Gal4-BD and Gal4-AD can be separated. Each can be fused to a different protein, the Gal4-BD to the bait protein and the Gal4-AD to the prey protein, to test whether the bait and prey proteins interact. If bait and prey do not interact, transcription cannot be activated, and no transcription occurs. If bait and prey interact, Gal4-BD and Gal4-AD are indirectly connected, and transcription occurs.
(b) Ten yeast proteins tested in a reciprocal two-hybrid experiment (listed as both bait and prey): VMA6, VMA22, YPR105C, YGR120C, YAL034W-A, LAP4, YOR353C, YOL082W, APG13, and APG1. No growth means the two proteins are not interacting. Growth means the two proteins are interacting. From the two-hybrid interaction data, a network of interacting proteins can be inferred. YGR120C acts as a “hub” connecting the other proteins.
Q Why might some proteins be incapable of being analyzed by the two-hybrid system?
an auxotrophic yeast strain is often used in which UASGAL4 drives expression of a gene that will complement the auxotrophic defect. For example, a histidine auxotroph with a UASGAL4:HIS transgene will not grow on media lacking histidine unless Gal4-mediated transcription is active. However, certain interactions cannot be detected with the standard two-hybrid system, including those in which the interacting proteins are not efficiently transported into the nucleus and those in which proteins require a third partner for interaction.
Two-hybrid approaches have been applied successfully to many model systems, providing information on their protein-interaction networks. In S. cerevisiae, all pairwise combinations of the more than 6000 proteins encoded in the genome have been tested, providing an overview of protein-interaction networks in the living yeast cell (see Figure 16.23b). The sum of all of the protein–protein interactions in an organism is known as the interactome.
Genomic Approaches to Reverse Genetics
One surprising result of genome sequencing was the large number of genes identified by sequence analysis but not previously identified by forward genetic screens. Even in an intensely studied organism such as S. cerevisiae, only about 1000 of the more than 6000 genes in the genome had been identified by forward genetic screens. Of the remaining 5000 or so genes, about half had some sequence similarity to genes with a known or probable function, whereas the other half did not exhibit homology to any other known genes in other model systems. Analyses of the genomes of multicellular eukaryotes had similar outcomes.
At the same time, the high-throughput techniques discussed above have limitations. They can provide information on gene expression patterns and protein–protein interactions, but to fully understand gene function, we must be able to analyze loss- and gain-of-function alleles.
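Returning briefly to the two-hybrid system described earlier in this section, the logic of a reciprocal screen, and of inferring a network and its hub from the growth results, can be sketched as follows. This is a toy model under stated assumptions: the proteins P1 to P6 and their interactions are hypothetical, and the model ignores false positives and false negatives such as proteins that fail to enter the nucleus or that need a third partner.

```python
from collections import defaultdict
from itertools import permutations

PROTEINS = ["P1", "P2", "P3", "P4", "P5", "P6"]  # hypothetical names

# HYPOTHETICAL set of true physical interactions (unordered pairs).
TRUE_INTERACTIONS = {
    frozenset(("P1", "P2")),
    frozenset(("P1", "P3")),
    frozenset(("P1", "P4")),
    frozenset(("P5", "P6")),
}

def reporter_transcribed(bait, prey):
    """Gal4-BD-bait and Gal4-AD-prey activate the reporter only if bait binds prey."""
    return frozenset((bait, prey)) in TRUE_INTERACTIONS

def grows_without_histidine(bait, prey):
    """A histidine auxotroph with a UASGAL4:HIS reporter grows only if the reporter is on."""
    return reporter_transcribed(bait, prey)

def reciprocal_screen(proteins):
    """Test every protein as bait against every other protein as prey."""
    network = defaultdict(set)
    for bait, prey in permutations(proteins, 2):
        if grows_without_histidine(bait, prey):
            network[bait].add(prey)
            network[prey].add(bait)
    return network

network = reciprocal_screen(PROTEINS)
# The hub is the protein with the most interaction partners.
hub = max(network, key=lambda protein: len(network[protein]))
# Each interaction is stored twice (once per partner), so halve the total.
interaction_count = sum(len(partners) for partners in network.values()) // 2
print(hub, sorted(network[hub]), interaction_count)
```

In this toy data set P1 plays the role that YGR120C plays in Figure 16.23b: it connects proteins that do not otherwise interact with one another.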
survival of the organism. In addition, 186 of the deletion mutants had a reduced-growth phenotype as heterozygotes before induction of meiosis, thus indicating haploinsufficiency of these genes. (Recall that haploinsufficiency is a dominant phenotype in diploid organisms that are heterozygous for a loss-of-function allele.) For the remaining 5000 genes, both haploid deletion mutants and homozygous diploid mutants were obtained. However, 891 of these mutant strains exhibited a slow-growth defect in rich media under optimal conditions, which indicates that the genes are required for vital biological processes in optimal growth conditions. This leaves about 4000 genes for which no obvious mutant phenotype is detected under optimal growth conditions. These genes are referred to as nonessential, but that classification is dependent on environment; in other words, the genes are nonessential under optimal laboratory growth conditions.
One possible explanation for the lack of conspicuous mutant phenotypes associated with 4000 nonessential S. cerevisiae genes is that the genes are required only under specific growth conditions. To test this hypothesis, each mutant strain was grown under a variety of environmental conditions, including variations in temperature, media composition, and the presence of antifungal compounds, salts, and other chemicals known to perturb specific biological processes. As a result, yeast geneticists discovered measurable growth defects under at least one environmental condition for 3800 of the 4000 genes previously identified as nonessential. Thus, these genes are required for efficient growth in at least one tested environmental condition; they are not really “nonessential” from an evolutionary perspective, because their presence is likely to provide a selective advantage. No growth defects were found for only about 200 deletions, suggesting that (1) these genes are authentically nonessential, (2) the conditions under which they confer an advantage were not among the ones tested, (3) other genes serve the same function, or (4) designating them as genes is incorrect.
To further analyze the essential genes, conditional alleles are required. Traditionally, temperature-sensitive alleles isolated in forward genetic screens have been used. Various libraries of engineered conditional alleles of S. cerevisiae essential genes have also been constructed for this purpose. In one approach, each essential gene is placed under the control of a tetracycline-repressible promoter. In the absence of tetracycline, the gene is expressed, but upon addition of tetracycline, gene expression is repressed, creating a loss-of-function phenotype. In another approach, a short peptide tag that confers heat-inducible protein degradation is added to the coding regions of essential genes. Under the normal growth temperature of 30°C, the protein is stable, but at 37°C, the tagged proteins are degraded and lose the ability to function.
Other types of libraries that have been constructed provide additional tools for identifying potential gene functions in S. cerevisiae. For example, a library in which every gene is a translational fusion with green fluorescent protein (GFP) permits visual determination of the subcellular location of proteins.
Genetic Networks
Identification of genetic interactions can provide clues to gene function by revealing that two genes act in the same pathway or redundant pathways (see Section 14.1). Data derived from double mutants identify sets of interacting genes that define genetic networks.
An extreme example of a genetic interaction is synthetic lethality, where the mutation of either gene alone is not lethal but mutation of both genes together results in lethality (see Figure 14.5). A genome-wide estimate of the number of synthetic lethal interactions in S. cerevisiae was obtained by using mutants representing 132 genes and analyzing their genetic interactions. For genes whose single-mutant phenotype is inviability, conditional alleles were used; for nonessential genes, null alleles were used. Each of the 132 mutants was crossed with 4700 viable deletion mutants, and the double-mutant phenotypes were examined. Approximately 4000 different synthetic lethal interactions were identified, involving about 1000 different genes. The number of interactions per gene ranged from 1 to 146, with an average of 34. One striking feature of this genetic interaction study is that essential genes exhibited about five times as many interactions as did “nonessential” genes. These results suggest that genetic networks consist of a small number of essential genes participating in many interactions and a larger number of nonessential genes participating in fewer interactions (Figure 16.26).
If the same level of synthetic lethality exists for the remaining genes in the yeast genome, it is estimated that 200,000 different synthetic lethal interactions will occur among all yeast genes and that 1% of all double mutants will result in synthetic lethality. Thus, although only 1000 genes are essential under optimal laboratory growth conditions as defined by single-mutant phenotypes, additional genes become essential when organisms are compromised by a mutation in another gene. One explanation for the observed levels of synthetic lethality is that where there are multiple genetic pathways, some of the pathways buffer one another, creating stable genetic systems that are better able to withstand environmental and genetic perturbations.
Genetic networks defined by genetic interactions often identify groups of genes having similar molecular functions, such as translation, lipid metabolism, or DNA repair (see Figure 16.26). If a gene of unknown function belongs to a genetic network in which many genes have known roles (say, in lipid metabolism), experiments to identify the molecular function of the unknown gene might begin by investigating whether the gene in question also plays a role in lipid metabolism.
Genetic networks constructed on the basis of genetic interactions can be examined in comparison with groupings based on other gene attributes, such as their annotations,
Figure 16.26 Genetic interactions identified through synthetic lethal analysis. Mutant alleles of eight genes (BNI1, RAD27, SGS1, BBC1, NBP2, BIM1, ARP2, and ARC40) were assayed for synthetic lethal interactions with the 5000 viable deletion mutants of yeast.
expression patterns, or interactomes (discerned from protein–protein interaction data). The prediction of biological functions of genes based on correlations between different data sets is referred to as systems biology.
Genetic interaction data often correlate well with gene expression data, since genes that compensate for one another in function often exhibit similar expression patterns. In contrast, genetic interactions and protein–protein interactions overlap less often. One reason is that physically interacting proteins are likely to act in the same protein complex, whereas in genetic interactions involving null alleles, the proteins the genes encode often act in compensating pathways that would normally be composed of different protein complexes with related functions. For the most part, this generalization holds true only when null alleles are used to test genetic interactions; however, when hypomorphic alleles are used, genetic interactions can reveal genes encoding proteins that act in the same complex or pathway (see Section 14.1).
The ultimate objective of functional genomics studies is to define the molecular function of every gene in an organism by compiling genomic data and searching for correlations that suggest hypotheses for further experimentation. The discussion here has focused on studies in S. cerevisiae, but similar approaches are being taken in other organisms. For example, enhancer–suppressor genetic screens described in Section 14.1 are a directed approach for uncovering genetic interaction networks and can be applied to most organisms regardless of the availability of knockout libraries.
CASE STUDY
Genomic Analysis of Insect Guts May Fuel the World
In metagenomic analysis, biologists study genomes collected from the multiple organisms that together inhabit a single environment. Two recent studies suggest that metagenomic analysis of insect digestive tracts could potentially have a significant impact on the production of biofuels.
Much of the current supply of ethanol for fuel is produced from cellulose that comes from the lignocellulose component of corn. Lignocellulose is a mixture of cellulose (a complex carbohydrate composed of glucose molecules) and lignin (the rigid structural material that protects cellulose). The production of corn ethanol requires high temperatures and the use of toxic chemicals to break down the lignin and hydrolyze the cellulose. This step is followed by microbial fermentation of the sugar and distillation of ethanol. Obtaining ethanol from corn in this way has adverse effects on the environment, consumes a great deal of energy, and may not be economically viable. These are principal reasons why the investigation of lignocellulose digestion in insects is attractive. Identification and characterization of the genes responsible for lignocellulose digestion may allow the development of new, biologically based methods of biofuel production.
In 2007, the microbiologist Falk Warnecke and colleagues conducted a metagenomic study of the microbes in the gut of the wood-eating termite species Nasutitermes. Termites are wood-digesting creatures whose ancestors have inhabited cellulose-rich environments for more than 100 million years. Nasutitermes has a bacteria-laden gut that acts like a tiny bioreactor for digesting the lignocellulose in wood. Lignocellulose provides energy for these microorganisms, which first break down lignin to liberate cellulose and then break down cellulose via hydrolysis driven by hydrolase enzymes.
Nasutitermes has a three-part stomach, the main part of which, designated P3, contains a rich microbial mixture of hundreds of bacterial species that are primarily responsible for wood digestion. Warnecke and his colleagues collected Nasutitermes in Costa Rica. Then, in the laboratory, they isolated and emptied P3 and found that its total volume in each insect is just 1 microliter (μL). They isolated and performed shotgun sequencing on the DNA from the P3 microbial mass.
Warnecke estimates that the DNA in this metagenomic analysis may come from as many as 300 bacterial species whose symbiotic relationship with the termite allows the termite to derive energy from wood. Gene-identification analyses indicate that many of the most frequently found genes in these bacteria produce glycoside hydrolases (GH) that hydrolyze lignocellulose. More than 700 different GH genes representing more than 45 different gene families were found in this study. A large group of previously unidentified genes was also found, and Warnecke speculates that these genes might be involved in various kinds of lignocellulose binding and digestion reactions.
Although Warnecke’s study detected numerous bacterial genes that may carry out cellulose digestion, it did not identify any genes responsible for lignin digestion. However, a second study, published in 2008 by Scott Geib and colleagues, examined lignin digestion in the Asian longhorn beetle (Anoplophora glabripennis) and the Pacific dampwood termite (Zootermopsis angusticollis). Biochemical analysis of the digestive tracts and digestive products of both insects shows significant evidence of lignin digestion, suggesting either that the genomes of these organisms encode lignin-digesting enzymes or that the organisms carry symbiotic microbes whose genomes encode the enzymes. The researchers did not perform metagenomic analyses of the insect genomes or digestive system contents, but they did identify a single fungal species in the gut of the Asian longhorn beetle whose genome is likely to encode lignin-digesting enzymes.
A great deal of additional “bioprospecting” research will be necessary to characterize the genes that encode the enzymes driving lignin and cellulose digestion in insect guts. In the process, further genomic and metagenomic analyses may suggest ways these genes can be cloned and used to replace the costly current methods of lignocellulose-based ethanol production.
SUMMARY
16.1 Structural Genomics Provides a Catalog of Genes in a Genome
❚❚ Genomes can be sequenced using a whole-genome shotgun approach.
❚❚ Paired-end sequencing facilitates assembly of scaffolds consisting of sequence fragments generated by shotgun sequencing.
❚❚ Metagenomics studies the genetic sequences of communities of organisms whose member species may be difficult to cultivate individually.
16.2 Annotation Ascribes Biological Function to DNA Sequences
❚❚ Genome annotation indicates the locations of genes and other functional sequences in a genomic sequence. It aims to ascribe biological function to sequence data.
❚❚ Biochemical functions of some annotated genes may be predicted based on sequence similarities with known genes as analyzed through computational approaches and bioinformatics, but experimental verification that includes analysis of mutant phenotypes is required to determine biological functions.
16.3 Evolutionary Genomics Traces the History of Genomes
❚❚ A phylogenetic tree of life can be constructed by comparing sequences of orthologous genes.
❚❚ New genes can be produced by gene duplication due to unequal crossing over or by larger-scale duplications of DNA, retrotransposition, and other mechanisms.
❚❚ Most new genes degenerate rapidly, but some are retained and may acquire new functions, driving the evolution of new species.
❚❚ Gene duplication has been a key feature in the evolution of complex organisms. Lateral gene transfer is a common mechanism of acquisition of new genes in bacteria and archaea, but it is less common in eukaryotes.
❚❚ By comparing genomes of related species, researchers can identify conserved genes and noncoding sequences and refine gene annotation. Conserved noncoding sequences often consist of gene regulatory sequences.
❚❚ Intraspecific genome comparisons identify genetic variation within a species and provide information about its evolutionary history and population dynamics. Both intra- and interspecific comparisons reveal that genomes are dynamic and can change rapidly on evolutionary timescales.
16.4 Functional Genomics Aims to Elucidate Gene Function
❚❚ High-throughput sequencing and DNA microarrays can reveal polymorphisms, global transcription patterns, and transcription-factor binding sites.
❚❚ Protein–protein interactions can be determined by using genetic tools developed from the study of yeast.
❚❚ Knockout libraries are used to perform genome-wide genetic screens that elucidate gene function. They have allowed classification of all yeast genes as essential or nonessential under specific growth conditions.
❚❚ Genes classified as essential under optimal growth conditions have on average more genetic interactions than those classified as nonessential.
❚❚ Genome-wide analyses of synthetic lethal interactions in yeast reveal large numbers of genes that are essential in genetically compromised organisms.
❚❚ Systems biology is a research approach that correlates data sets derived from functional genomics to define and elucidate gene function.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Familiarize yourself with the process of whole-genome shotgun sequencing and possible complications due to repetitive DNA.
2. Review how transcriptome sequences can be used to annotate genomic sequence.
3. Review how new genes can arise, and understand the possible fates of a duplicated gene.
4. Review how species become polyploid and then return to a state of diploidy.
5. Review how transcriptome data can be generated and used to examine gene function.
6. Review how knowledge of protein–protein interactions and genetic interactions provides insight into gene function.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. You have discovered a new species of archaea from a hot spring in Yellowstone National Park.
a. After growing a pure culture of this organism, what strategy might you employ to sequence its genome?
b. How would your strategy change if you were unable to grow the strain in culture?
2. Repetitive DNA poses problems for genome sequencing.
a. Why is this so?
b. What types of repetitive DNA are most problematic?
c. What strategies can be employed to overcome these problems?
3. When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs.
a. Why were there so many more contigs than scaffolds?
b. What is the difference between physical and sequence gaps?
c. How can physical gaps be closed?
d. How can sequence gaps be closed?
4. How do cDNA sequences facilitate gene annotation? Describe how the use of full-length cDNAs facilitates discovery of alternative splicing.
5. How do comparisons between genomes of related species help refine gene annotation?
6. You are designing algorithms for the bioinformatic prediction of gene sequences. How might algorithms differ for predicting genes in bacterial versus eukaryotic genomic sequence?
7. You have sequenced a 100-kb region of the Bacillus anthracis genome (the bacterium that causes anthrax) and a 100-kb region from the Gorilla gorilla genome. What differences and similarities might you expect to see in the annotation of the sequences, for example, in number of genes, gene structure, regulatory sequences, repetitive DNA?
8. You have just obtained 100 kb of genomic sequence from an as yet unsequenced mammalian genome. What are three methods you might use to identify potential genes in the 100 kb? What are the advantages and limitations of each method?
9. The human genome contains a large number of pseudogenes. How would you distinguish whether a particular sequence encodes a gene or a pseudogene? How do pseudogenes arise?
10. Based on the tree of life in Figure 16.12, would you expect human proteins to be more similar to fungal proteins or to plant proteins? Would you expect plant proteins to be more similar to fungal proteins or human proteins?
11. When comparing genes from two sequenced genomes, how does one determine whether two genes are orthologous? What pitfalls arise when one or both of the genomes are not sequenced?
12. What is a reference genome? How can it be used to survey genetic variation within a species?
13. The two-hybrid method facilitates the discovery of protein–protein interactions. How does this technique work? Can you think of reasons for obtaining a false-positive result, that is, where the proteins encoded by two clones interact in the two-hybrid system but do not interact in the organism in which they naturally occur? Can you think of reasons you might obtain a false-negative result, in which the two proteins interact in vivo but fail to interact in the two-hybrid system?
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
14. Go to http://blast.ncbi.nlm.nih.gov/Blast.cgi and follow the links to nucleotide blast. Type in the sequence below; it is broken up into codons to make it easier to copy.
5′ ATG TTC GTC AAT CAG CAC CTT TGT GGT TCT CAC CTC GTT GAA GCT TTG TAC CTT GTT TGC GGT GAA CGT GGT TTC TTC TAC ACT CCT AAG ACT TAA 3′
As you will note on the BLAST page, there are several options for tailoring your query to obtain the most relevant information. Some are related to which sequences to search in the database. For example, the search can be limited taxonomically (e.g., restricted to mammals) or by the type of sequences in the database (e.g., cDNA or genomic). For our search, we will use the broadest database, the “nucleotide collection (nr/nt).” This is the nonredundant (nr) database of all nucleotide data (nt) in GenBank and can be selected in the “database” dialogue box. Other parameters can also be adjusted to make the search more or less sensitive to mismatches or gaps. For our purposes, we will use the default setting, which is automatically presented. Press “search.” What can you say about the DNA sequence?
15. In the course of the Drosophila melanogaster genome project, the following genomic DNA sequences were obtained. Try to assemble the sequences into a single contig.
5′ TTCCAGAACCGGCGAATGAAGCTGAAGAAG 3′
5′ GAGCGGCAGATCAAGATCTGGTTCCAGAAC 3′
5′ TGATCTGCCGCTCCGTCAGGCATAGCGCGT 3′
5′ GGAGAATCGAGATGGCGCACGCGCTATGCC 3′
5′ GGAGAATCGAGATGGCGCACGCGCTATGCC 3′
5′ CCATCTCGATTCTCCGTCTGCGGGTCAGAT 3′
Go to the URL provided in Problem 14, and using the sequence you have just assembled, perform a blastn search of the “nucleotide collection (nr/nt)” database. Does the search produce sequences similar to your
assembled sequence, and if so, what are they? Can you tell if your sequence is transcribed, and if it represents protein-coding sequence? Perform a tblastx search, first choosing the “nucleotide collection (nr/nt)” database and then limiting the search to human sequences by typing Homo sapiens in the organism box. Are homologous sequences found in the human genome? Annotate the assembled sequence.
16. Consider the phylogenetic trees below pertaining to three related species (A, B, C) that share a common ancestor (last common ancestor, or LCA). The lineage leading to species A diverges before the divergence of species B and C.
Phylogenetic trees (figure): a species tree for species A, B, and C descending from the last common ancestor (LCA), and gene trees for Gene X (AX, BX, CX), Gene Y (AY1, AY2, BY, CY), and Gene Z (AZ1, AZ2, BZ1, BZ2, BZ3, CZ1, CZ2).
a. For gene X, no gene duplications have occurred in any lineage, and each gene X is derived from the ancestral gene X via speciation events. Are genes AX, BX, and CX orthologous, paralogous, or homologous?
b. For gene Y, a gene duplication occurred in the lineage leading to A after it diverged from that leading to B and C. Are genes AY1 and AY2 orthologous or paralogous? Are genes AY1 and BY orthologous or paralogous? Are genes BY and CY orthologous or paralogous?
c. For gene Z, gene duplications have occurred in all species. Define orthology and paralogy relationships for the different Z genes.
17. You have isolated a gene that is important for the production of milk and wish to study its regulation. You examine the genomes of human, mouse, dog, chicken, pufferfish, and yeast and note that all genomes except yeast have an orthologous gene.
a. How would you identify the regulatory elements important for the expression of your isolated gene in mammary glands?
b. What does the existence of orthologous genes in chicken and pufferfish tell you about the function of this gene?
18. When the human genome is examined, the chromosomes appear to have undergone only minimal rearrangement in the 100 million years since the last common ancestor of eutherian mammals. However, when individual humans are examined or when the human genome is compared with that of chimpanzees, a large number of small indels and SNPs can be detected. How are these observations reconciled?
19. Symbiodinium minutum is a dinoflagellate with a genome that contains more than 40,000 protein-coding genes. In contrast, the genome of Plasmodium falciparum has only a little more than 5000 protein-coding genes. Both Symbiodinium and Plasmodium are members of the Alveolate lineage of eukaryotes. What might be the cause of such a wide variation in their genome sizes?
20. Substantial fractions of the genomes of many plants consist of segmental duplications; for example, approximately 40% of genes in the Arabidopsis genome are duplicated. How might you approach the functional characterization of such genes using reverse genetics?
21. A modification of the two-hybrid system, called the one-hybrid system, is used for identifying proteins that can bind specific DNA sequences. In this method, the DNA sequence to be tested, the bait, is fused to a TATA box to drive expression of a reporter gene. The reporter gene is often chosen to complement a mutant phenotype; for example, a HIS gene may be used in a his⁻ mutant yeast strain. A cDNA library is constructed with the cDNA sequences translationally fused to the GAL4 activation domain and transformed into this yeast strain. Diagram how trans-acting proteins that bind to cis-acting regulatory sequences can be identified using a one-hybrid screen.
22. A substantial fraction of almost every genome sequenced consists of genes that have no known function and that do not have sequence similarity to any genes with known function.
a. Describe two approaches to ascertaining the biological role of these genes in S. cerevisiae.
b. How would your approach change if the genes of unknown function were in the human genome?
23. In the globin gene family shown in Figure 16.16, which pair of genes would exhibit a higher level of sequence similarity, the human δ-globin and human β-globin genes or the human β-globin and chimpanzee β-globin genes? Can you explain your answer in terms of timing of gene duplications?
24. You are studying similarities and differences in how organisms respond to high salt concentrations and high temperatures. You begin your investigation by using
Collaboration and Discussion For answers to selected even-numbered problems, see Appendix: Answers.
27. What is the difference between biochemical and biological function?
28. Using the two-hybrid system to detect interactions between proteins, you obtained the following results: A clone encoding gene A gave positive results with clones B and C; clone B gave positive results with clones A, D, and E but not C; and clone E gave positive results only with clone B. Another clone F gave positive results with clone G but not with any of A–E. Can you explain these results?
To follow up your two-hybrid results, you isolate null loss-of-function mutations in each of the genes A–G. Mutants of genes A, B, C, D, and E grow at only 80% of the rate of the wild type, whereas mutants of genes F and G are phenotypically indistinguishable from the wild type. You construct several double-mutant strains: The ab, ac, ad, and ae double mutants all grow at about 80% of the rate of the wild type, but af and ag double mutants exhibit lethality. Explain these results.
How do the two-hybrid system and genetic interaction results complement one another? Can you reconcile your two-hybrid system and genetic interaction results in a single model?
29. Describe at least two mechanisms by which duplicate genes arise. What are the possible fates of duplicate genes? Does the mode of duplication affect possible fates?
17 Organellar Inheritance and the Evolution of Organellar Genomes
CHAPTER OUTLINE
17.1 Organellar Inheritance Transmits Genes Carried on Organellar Chromosomes
17.2 Modes of Organellar Inheritance Depend on the Organism
17.3 Mitochondria Are the Energy Factories of Eukaryotic Cells
17.4 Chloroplasts Are the Sites of Photosynthesis
17.5 The Endosymbiosis Theory Explains Mitochondrial and Chloroplast Evolution
number of genes.
❚❚ Eukaryotic cells may have many copies of organelle DNA, and multiple genotypes may coexist in a single cell.
❚❚ The inheritance of organellar genomes can be uniparental, as in maternal inheritance in mammals, or biparental.
❚❚ The organization and expression of organellar genomes reflect their evolutionary origins as symbiotic bacteria.
❚❚ Genes have been, and continue to be, transferred from the organellar genomes to the genome of the host cell.
Soon after the rediscovery of Mendel’s laws in the early 1900s, Carl Correns and Erwin Baur, working independently, each noted a pattern of inheritance that was distinctly non-Mendelian. Both Correns and Baur were studying the inheritance in plants of a variegated phenotype in which individual branches had either white, green, or variegated leaves. Reciprocal crosses between flowers growing on white or green branches produced progeny that invariably exhibited the phenotype of the female parent in the cross.
The green coloration in land plants and green algae is due to the presence of the green pigment chlorophyll, which harvests light for photosynthesis. In plants, chlorophyll is found in chloroplasts, which are the organelles where photosynthetic reactions convert light energy and CO2 into fixed organic carbon. The variegated and white phenotypes studied by Correns and Baur are caused by a failure of chloroplast
development in some cells, which as a consequence remain colorless (white).
In the 1950s, studies demonstrated that chloroplasts contain their own genome. In combination with the observation that chloroplasts are strictly maternally inherited in many plants, this discovery suggested an explanation for the maternal inheritance seen by Correns and Baur: The mutations they were studying must reside on the chloroplast genome. As we will see, the cell’s energy-producing and energy-capturing organelles (mitochondria and chloroplasts, respectively) each possess their own genome and may be either uniparentally or biparentally inherited depending on the species. Furthermore, uniparental inheritance may be maternal, paternal, or genetically determined. In this chapter, we explore the transmission of organelle genomes, the remarkable evolutionary events that led to the development of organelles, and the surprisingly dynamic interactions between the organelle and nuclear genomes of eukaryotes.
17.1 Organellar Inheritance Transmits Genes Carried on Organellar Chromosomes
Organellar inheritance refers to the transmission of genes on mitochondrial and chloroplast chromosomes, that is, genes that are located in the cytoplasmic organelles as opposed to the nucleus. As with nuclear genes, expression of mitochondrial and chloroplast genes produces proteins and RNAs that perform specific functions in cells. However, genetic analysis of organellar inheritance differs from that of nuclear gene inheritance because within a fertilized egg the cytoplasm, in which the organelles are found, is not usually derived from equal contributions of both parental gametes.
In many eukaryotic species, the mitochondria and chloroplasts in fertilized eggs are uniparental in their origin. This means that just one parental gamete (often the maternal gamete) contributes all of the cytoplasm and cytoplasmic organelles. In some species, organelles are inherited in a uniparental manner even though equal amounts of cytoplasm are inherited from both parental gametes. In such cases, the organelles derived from one of the gametes are selectively destroyed. In still other species, both parental gametes make contributions of cytoplasm and cytoplasmic organelles to the zygote; this pattern is termed biparental. Biparental cytoplasmic contributions are often unequal because one gamete contributes more of the cytoplasm and the other gamete makes a smaller contribution. Additional reasons that the study of organellar inheritance differs from the study of nuclear inheritance may be summarized as follows:
1. Multiple organelles may be present in eukaryotic cells.
2. Each mitochondrion or chloroplast may contain multiple copies of its chromosome. The potential presence of tens to hundreds of copies of organellar chromosomes in each cell stands in contrast to the two copies of nuclear genes present in the cells of diploid organisms, in terms of both number and variability.
3. The genome sizes (six to hundreds of kilobases), numbers (few to hundreds), and identities of the genes contained in the organellar genomes are variable from one species to another.
4. Traits controlled by organellar inheritance can also be influenced by nuclear genes. Most biological functions ascribed to mitochondrial or chloroplast genes are produced through the joint action of nuclear genes and organellar genes.
The Discovery of Organellar Inheritance
Erwin Baur and Carl Correns were working independently of one another in 1908, Baur on Pelargonium (geraniums) and Correns on Mirabilis jalapa (the four o’clock plant), when each made his discovery of non-Mendelian inheritance. Baur was studying leaf-color inheritance in geraniums. He began his investigation by doing self-fertilization experiments and found that seeds derived from self-fertilization of flowers on green branches produced plants that contained only green leaves. Seeds derived from self-fertilization of flowers on white branches produced seedlings that had only white leaves. These latter seedlings grew poorly and never produced mature plants. The self-fertilization of flowers from branches with variegated leaves produced a mixture of progeny that were either variegated, had all white leaves, or had all green leaves.
These results led Baur to make reciprocal crosses between branches with different leaf colors. Using pollen from a flower located on a branch with one leaf color, he fertilized ovules from a flower located on a branch with a different leaf color. The results, as shown in Figure 17.1, were progeny that invariably exhibited the phenotype of the female parent in the cross. This is not the result predicted by Mendelian genetics (which predicts no difference in the results of reciprocal crosses), nor is it the result expected if leaf color were inherited on a sex chromosome. Instead, the outcome suggested that transmission of leaf color occurs through maternal inheritance, that is, through
[Figure 17.2 Homoplasmy and heteroplasmy in cells. (a) Homoplasmic and heteroplasmic cells: homoplasmic cells (green or white) have organelles with the same genotype; heteroplasmic cells (variegated) contain a mixture of alleles. Labels: nucleus, chloroplasts, mutant, wild type. (b) In maternal inheritance, phenotype of progeny depends only on the genotype of the maternal parent: green × any gives green progeny; white × any gives white progeny; variegated × any gives green, white, or variegated progeny.]
Q Describe the difference between homoplasmic or heteroplasmic organellar alleles and homozygous or heterozygous nuclear alleles.
Homoplasmic and heteroplasmic genotypes for chloroplast genes explain the maternal inheritance of variegation observed by Baur in geraniums (Figure 17.2b). Ovules derived from flowers on branches that contain green leaves are homoplasmic for wild-type chloroplast genes and transmit only wild-type chloroplasts to their progeny. In contrast, ovules derived from flowers on branches with white leaves are homoplasmic for a chloroplast mutation, and only mutant chloroplasts are passed to progeny.
The progeny phenotypes derived from flowers on variegated branches illustrate the complexity of organellar genetics. Consider an ovule produced on a variegated branch that consists of a mixture of cells. Some of them are heteroplasmic, inheriting a cytoplasm containing many chloroplasts, some that are wild type and others that harbor the mutant allele. During the mitoses and meiosis that produce egg cells, the chloroplasts are divided randomly among daughter cells. If an egg cell inherits both wild-type and mutant chloroplasts, a heteroplasmic plant with variegated leaves develops. However, if by chance the organelles inherited by an egg cell are all wild type, the branches of the plant produced by fertilization of the egg will be green. Alternatively, chance might result in an egg cell inheriting chloroplasts that are all mutant, in which case the plant will have white leaves.
Genome Replication in Organelles
Organellar DNA is packaged into protein–DNA complexes in an area within the organelle called a nucleoid. Each nucleoid usually contains multiple copies of the organellar genome. There may be several nucleoids per organelle and multiple organelles per cell, resulting in a copy number for organelle genomes that is in the range of hundreds to thousands per cell. To better understand the transmission of mutations in organellar genomes, and their phenotypic effects, let us examine how organellar DNA is replicated.
A major difference between replication of the nuclear genome and that of an organelle is in their relationship to the cell cycle. Each of the nuclear chromosomes is duplicated once each mitotic cycle, so that daughter cells have exactly the same chromosome constitution as the parent cell following cell division. In contrast, the replication of organellar genomes is not tightly coupled to the cell cycle. Rather, the replication of organellar genomes depends on three factors (Figure 17.3). First, organellar transmission genetics depends on the growth, division, and segregation of the organelles themselves (“organelle division” in Figure 17.3). There appears to be a mechanism to ensure that each daughter cell receives approximately equal amounts of the organelles present in the mother cell. Second, the segregation of genes encoded in the organellar genome is connected to the division and segregation of nucleoids within an organelle (“nucleoid division” in Figure 17.3). Details of this process are still being discovered, but differences in the replication rate of nucleoids have been observed both between cells and between organelles. Third, organellar transmission genetics depends on the replication of the individual organellar genomes (“DNA replication” in Figure 17.3). There is evidence that DNA molecules within a nucleoid are related to each other; they are sometimes physically linked, which would suggest that they are products of DNA replication.
Replicative Segregation of Organelle Genomes
The variation in the numbers of organelles and of their genomes in different somatic cells and tissues can significantly influence the phenotypic effects of mutations in organellar genes. Consider again the case of the variegated leaves. If a cell is homoplasmic with regard to this trait, cells
descended from this cell by division will also be homoplasmic. However, cells that are heteroplasmic can produce both heteroplasmic and homoplasmic descendants.
[Figure 17.3 Factors in replication of organelle genomes. DNA replication: DNA replication occurs within nucleoids, which contain several copies of the organelle genome. Nucleoid division: Nucleoids divide within an individual organelle. Each organelle contains several nucleoids. Organelle division: Nucleoids are distributed to daughter organelles during organelle replication. Organelles are subsequently distributed among daughter cells following cell division. Labels: organelle, nucleoid, organelle genome, mutant organelle genome, organelle with one mutant nucleoid, wild type.]
To see how this happens, imagine a plant cell in which a mutation occurs in a chloroplast genome. Through segregation of nucleoids during chloroplast division, chloroplasts in which all copies of the genome harbor the mutation can arise. Since chloroplasts within a cell do not fuse with one another, once a homoplasmic mutant chloroplast arises, it does not acquire wild-type genomes from other chloroplasts within the cell. During cell division, the chloroplasts are randomly distributed to the daughter cells. If by chance all the organelles inherited by a daughter cell are of a single genotype, homoplasmic cells can be generated from a heteroplasmic ancestral cell (see the cells at the bottom of the far-right columns in Figure 17.4). This random segregation of organelles during replication is termed replicative segregation. Replicative segregation is of great importance since it affects the proportion of mutant organellar genomes in a cell, thus influencing the severity (penetrance and expressivity) of phenotypes produced by mutations in organellar genomes. It can lead to genetically mosaic organisms with both “mutant” cells and “wild-type” cells; and, as we see with the variegated plants, it can influence transmission of mutant alleles to subsequent generations depending on the organellar genotype of the germ cells.
In heteroplasmic individuals, penetrance and expressivity will depend on the ratio between mutant and wild-type organelle alleles, which can vary among cells and tissues. In some cases, wild-type alleles can complement mutant alleles within an organelle, so a heteroplasmic individual can often tolerate a high frequency of mutant alleles without a mutant phenotype being evident or becoming severe. For organellar inheritance between generations, the number of chloroplast or mitochondrial genomes present in the germ cells is important. In heteroplasmic individuals, transmission will depend on what fraction of organellar genomes present in the gametes contain mutant versus wild-type alleles. Due to replicative segregation, gametes can be produced that are homoplasmic wild type, homoplasmic mutant, or heteroplasmic, and they can have varying ratios of mutant and wild-type alleles. Thus, replicative segregation can explain both variation in penetrance and expressivity between individuals and also variable transmission, where green, white, and variegated seedlings can all be derived from variegated plants.
The observation that mitochondria undergo frequent fusion and fission has implications for the segregation of mitochondrial DNA and creates the potential for genotypes within a cell’s mitochondria to become mixed and homogenized. Thus, replicative segregation in mitochondria is more complicated than that described for chloroplasts.
Now that we have described some of the complexities of transmission of the organellar genomes, for the remainder of the chapter we will assume that individuals are homoplasmic, unless there is evidence that heteroplasmy exists.
17.2 Modes of Organellar Inheritance Depend on the Organism
The inheritance of organellar genomes occurs through two basic mechanisms. In many organisms, the transmission is biased toward whichever gamete contributes the bulk of the cytoplasm to the zygote. In this case transmission can be either uniparental (maternal or paternal) or biparental. Alternatively, inheritance is genetically determined: one gamete’s organelles are destined to be transmitted to the progeny
[Figure 17.4 (caption not recovered): starting from a plant cell, a mutation in one chloroplast genome produces a heteroplasmic cell; the diagram then follows repeated rounds of DNA replication, organelle division, and cell division. Callouts: Nucleoids replicate at different rates. Once a mutation produces a heteroplasmic cell, random partitioning of genomes can produce a homoplasmic mutant cell. Cell types labeled: heteroplasmic cell, homoplasmic mutant cell, homoplasmic wild-type cell.]
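The random partitioning summarized in Figure 17.4 can be explored with a small simulation. The following Python sketch is illustrative only and is not part of the original figure. It assumes that every organellar genome is copied exactly once per cell cycle and that the copies are then split evenly and at random between two daughter cells, which ignores both the loose coupling to the cell cycle and the nucleoid and organelle levels described in the text. The names `divide`, `classify`, and `simulate`, and the default of 10 genomes per cell, are hypothetical choices made for the example.

```python
import random
from collections import Counter


def divide(cell, rng):
    """Copy every genome once, then split the copies at random into two daughters."""
    copies = cell * 2            # assumption: each genome replicates exactly once
    rng.shuffle(copies)          # random partitioning of the copies
    half = len(copies) // 2
    return copies[:half], copies[half:]


def classify(cell):
    """Label a cell by the mix of wild-type ('w') and mutant ('m') genomes it holds."""
    mutant = cell.count("m")
    if mutant == 0:
        return "homoplasmic wild type"
    if mutant == len(cell):
        return "homoplasmic mutant"
    return "heteroplasmic"


def simulate(n_genomes=10, n_mutant=1, generations=8, seed=0):
    """Follow all descendants of one cell and count the resulting cell types."""
    rng = random.Random(seed)
    cells = [["m"] * n_mutant + ["w"] * (n_genomes - n_mutant)]
    for _ in range(generations):
        daughters = []
        for cell in cells:
            daughters.extend(divide(cell, rng))
        cells = daughters
    return Counter(classify(cell) for cell in cells)


if __name__ == "__main__":
    for label, count in sorted(simulate().items()):
        print(f"{label}: {count}")
```

Changing `seed`, `n_genomes`, or `n_mutant` shows how the mix of cell types among the descendants varies from run to run under these simplifying assumptions. A cell that starts homoplasmic yields only homoplasmic descendants, as the text states.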
while the other gamete’s organellar contributions are selectively destroyed. Even in cases where one gamete contributes most of the cytoplasm, genetic mechanisms may exist to eliminate the residual organellar contribution from the other gamete. Thus, the two mechanisms are not mutually exclusive. In this section, we explore three cases illustrating three different inheritance patterns, those of mammals, of the alga Chlamydomonas reinhardtii, and of the yeast Saccharomyces cerevisiae.
Mitochondrial Inheritance in Mammals
Maternal inheritance of mitochondria is the norm in mammals because the egg contributes all of the cytoplasm and the sperm contributes primarily a nucleus during fertilization. Maternal inheritance of the mitochondrial genome in mammals has four important consequences that we examine in this section:
1. Predictions of inheritance of mitochondrial mutations can be made based solely on the genotype of the mother.
2. Maternal inheritance allows the maternal lineage of organisms to be examined specifically.
3. Since there is no paternal contribution, phylogenetic trees constructed using mitochondrial DNA sequences can be interpreted as maternal genealogies reflecting the maternal history of species.
4. Human genetic diseases due to mitochondrial mutations are maternally inherited.
Mother–Child Identity of Mitochondrial DNA
In mammals, mothers and their children of both sexes share identical mitochondrial DNA (mtDNA). These identical genetic matches are put to many practical uses. One of the most dramatic examples in humans is the use of mitochondrial DNA to find matches between grandmothers and grandchildren who were separated during political unrest in Argentina during the 1970s. An Argentinean military dictatorship undertook a campaign of kidnapping and murder of political dissidents in the early 1970s. Among those kidnapped were pregnant women, who were allowed to give birth before they were murdered. The children of these women were adopted by unrelated families, and their identities were hidden from their biological families.
As the political environment in Argentina became less repressive, a group known as Las Abuelas de la Plaza de Mayo
(Grandmothers of the Plaza de Mayo) demanded an accounting of the murder of the dissidents and the return of the adopted children to their biological families. Part of the process used to identify adopted children took advantage of the maternal inheritance of mitochondrial DNA, specifically of the fact that each grandmother had transmitted her mitochondria to her biological children, all of whom, as a result, inherited identical mitochondrial genes (Figure 17.5). Her daughters in turn passed the same mitochondrial DNA to their biological children. By this hereditary transmission mechanism, grandmothers and the children of their daughters carry identical mitochondrial DNA. Comparisons of mitochondrial DNA revealed exact matches between individual abuelas and specific children of the murdered women, allowing many abuelas to be reunited with their grandchildren, whose mothers had been “disappeared.”
Mitochondrial DNA Sequences and Species Evolution
Mitochondrial DNA sequences are used as a tool for deciphering the genealogical history and evolutionary relationships of mammalian species. Mitochondrial DNA is particularly well suited to such studies for two reasons. First, since mitochondria are strictly maternally inherited in mammals, there is no recombination of alleles, as there is with the nuclear genome. Second, some noncoding regions of mitochondrial genomes evolve quickly, with the result that many differences in mitochondrial DNA sequence are present even in closely related populations. This is particularly true for mammals, where the rate of mutation in the mitochondrial genome is about 10 times that of the nuclear genome, reflecting decreased levels of DNA mutation repair in mitochondria versus repair of nuclear DNA or higher rates of DNA damage in the mitochondria. Since there is little selective pressure to maintain a specific sequence in noncoding regions, mutations in these regions accumulate at a relatively steady rate.
Once a mitochondrial mutation becomes homoplasmic in the germ cells of an individual female, the mutation is transmitted to all her progeny. Therefore, maternal lineages can be traced by following the mutational changes back in time. The mitochondrial DNA sequences in the present population reflect the maternal genealogy of the population as a whole, and construction of a phylogenetic tree based on these sequences should allow the identification of the common ancestor(s) of the species. See Genetic Analysis 17.1 for practice interpreting data from a research project that analyzed mitochondrial DNA.
[Figure 17.5 Maternal inheritance of mitochondrial genes in mammals. Three-generation pedigree (generation I: individuals 1–2; generation II: 1–8; generation III: 1–14). All children in generation II receive their mother’s mtDNA. All children in generation III receive their maternal grandmother’s mtDNA.]
Q How would you distinguish maternal inheritance from sex-linked inheritance?
Mitochondrial Mutations and Human Genetic Disease
Human biology is highly dependent on the cellular energy derived from oxidative phosphorylation reactions in our mitochondria. It is therefore not surprising that mitochondrial mutations can result in human genetic diseases (Figure 17.6a). The phenotypes of mitochondrial diseases are often highly pleiotropic, a reflection of the ubiquitous dependency of cells on mitochondrial function. A hallmark of such diseases is their strictly maternal transmission. Since homoplasmic null alleles in mitochondrial genes would result in lethality, mitochondrial mutations in humans either are partial loss-of-function alleles (see Section 4.1) or, if null alleles, individuals are heteroplasmic.
Leber hereditary optic neuropathy (LHON) is a mitochondrial genetic disease in which degeneration of the central optic nerve results in blindness, usually in late adolescence to early adulthood (Figure 17.6b). Like most diseases caused by mitochondrial mutations, the LHON syndrome is accompanied by pleiotropic defects, primarily a range of heart abnormalities. LHON can be caused by mutations in a number of different genes that encode proteins of the NADH dehydrogenase subunit involved in electron transport. In the pedigree shown in Figure 17.6b, affected individuals have a single base-pair change, resulting in a missense (arginine to histidine) mutation in the subunit 4 gene, ND4.
Close inspection of the pedigree in Figure 17.6b reveals that, although all affected individuals have an affected mother, not all children of an affected mother exhibit LHON. If we assume strict maternal inheritance of the mitochondrial mutations, then the phenotype is not fully penetrant. There are three possible reasons for incomplete penetrance: the effects of heteroplasmy, the effects of genetic interactions with nuclear genes, and the effect of environmental factors interacting with mitochondrial gene mutations to produce a mutant phenotype. A discussion of mitochondrial gene–environment interactions appears in the Case Study at the end of this chapter, and an example of mitochondrial–nuclear interactions appears in Experimental Insight 17.1 on page 651. Here we consider heteroplasmy as a cause for incomplete penetrance.
Heteroplasmy can lead to incomplete penetrance of a human hereditary disease because, as discussed earlier, each cell contains multiple mitochondria and each mitochondrion contains multiple copies of the mitochondrial genome. There is no fixed number of copies of organelle genomes in a cell. The numbers of organelles within a cell can influence expressivity, penetrance, and transmission of
GENETIC ANALYSIS 17.1
PROBLEM Although North American bison (Bison bison) and domestic cattle (Bos taurus and Bos indicus) descended from a common ancestor, they do not readily interbreed. However, because they still share the same chromosome number and structure, the production of fertile interspecific hybrids is possible. Male bison have been known to breed with female cattle, but not the converse. Twelve North American bison herds (numbered 1 through 12 at right) were examined for evidence of such interbreeding by a comparison of their mtDNA sequences with those of several cattle breeds and related species. A phylogenetic tree constructed from the comparisons is presented here. The numbers in the left half of the diagram represent confidence values for the particular relationships (100 is the maximum).
a. Explain why mtDNA but not nuclear DNA is used to detect bison–domestic cattle interspecific hybrids.
b. Based on this phylogeny, identify which bison herds show evidence of interspecific breeding with domestic cattle.
BREAK IT DOWN: How is mitochondrial DNA inherited in mammals?
BREAK IT DOWN: Phylogenetic trees reveal relatedness and suggest common ancestry.
[Phylogenetic tree. Tips from top to bottom: cattle (B. indicus Danakil, B. indicus Ogaden, B. indicus Adwa, B. taurus Longhorn, B. taurus Algarvia, B. taurus Shorthorn, B. taurus Jersey, B. taurus Hereford, B. taurus Charolais, B. taurus Criollo Chiapas, B. taurus Cheju Black, B. taurus Jutland, B. taurus Angus, B. taurus Holstein); Bison bison 11, 12, 9, and 10; European bison; Bison bison 1, 6, 2, 7, 3, 4, 5, and 8; Yak. The confidence values printed on the tree (97, 100, 54, 93, 55, 100, 100, 100, 75) cannot be assigned to specific branches from the extracted text.]
Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
   This problem presents a phylogenetic analysis of an mtDNA sequence in domestic cattle and in bison. We must explain why mtDNA was used rather than nuclear DNA, and then we must examine the phylogeny to identify bison herds that do and do not have bison–cattle hybridization in their lineage.
2. Identify the critical information given in the problem.
   The phylogenetic tree depicts evolutionary relationships between cattle mtDNA and mtDNA samples from bison.
Deduce
3. Examine the pattern of major clades in the phylogenetic tree and the membership of each clade.
   The phylogeny has two major clades. The bottom clade contains eight North American bison herds (Bison bison 1 through 8) and two outside reference species, European bison and yak. The upper clade contains fourteen domestic cattle breeds (Bos taurus and Bos indicus) and four North American bison herds (Bison bison 9 through 12).
4. Identify the kind of phylogenetic evidence (based on mtDNA) that would be consistent with interspecific hybridization and also the kind that would be inconsistent with it.
   If a clade consists either only of domesticated breeds or only of bison, then the animals in the clade, being more closely related to one another than they are to animals in other clades, do not have interspecific hybridization in their lineage. If a clade contains bison and domesticated cattle breeds, then there is a close relationship between the bison and the cattle in that clade.
   TIP: In interspecies hybridization, bison mtDNA sequences would be more closely related to cattle sequences than they are to other bison sequences.
Solve
Answer a
5. Explain why mtDNA but not nuclear DNA sequences were used in this phylogenetic analysis.
   We are told that female cattle interbreed with male bison, but not the reverse. Since mtDNA is inherited maternally, the resulting hybrids would possess solely cattle mtDNA but would contain equal mixtures of cattle and bison nuclear DNA.
   TIP: In mammals, all mitochondrial DNA is maternally inherited.
Answer b
6. Determine which bison are interspecies hybrids.
   Bison herds 9 to 12 are in the same clade as a number of breeds of domestic cattle, signifying that their mtDNA sequences are more closely related to domesticated cattle than to the wild bison and yak species. Thus these four herds have cattle mtDNA from interspecific hybridization in previous generations.
   TIP: Bison of hybrid origin will harbor mtDNA more closely related to that of cattle than of bison.
For more practice, see Problem 24. Visit the Study Area to access study tools. Mastering Genetics
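The clade reasoning used in Answer b can be expressed as a short Python sketch. It is an illustration, not the method used in the study. The tree is written as nested tuples, only the membership of the two major clades is taken from the worked solution, the branching order inside each clade is a placeholder, and just three of the fourteen cattle breeds are listed for brevity. The function names are hypothetical.

```python
# Toy tree as nested tuples; each child of the root is one major clade.
# Clade membership follows the worked solution; internal branching is a placeholder.
UPPER_CLADE = (
    ("B. indicus Danakil", "B. taurus Jersey", "B. taurus Holstein"),  # 3 of 14 breeds
    ("Bison bison 9", "Bison bison 10", "Bison bison 11", "Bison bison 12"),
)
LOWER_CLADE = (
    "Yak",
    ("European bison", tuple(f"Bison bison {n}" for n in range(1, 9))),
)
TREE = (UPPER_CLADE, LOWER_CLADE)


def tips(node):
    """Return every tip name below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(tips(child))
    return names


def is_cattle(name):
    """True for domestic cattle tips (Bos taurus or Bos indicus)."""
    return name.startswith(("B. taurus", "B. indicus"))


def is_bison_herd(name):
    """True for numbered North American bison herds only."""
    return name.startswith("Bison bison")


def hybrid_candidates(tree):
    """List bison herds that share a major clade with domestic cattle."""
    flagged = []
    for clade in tree:
        members = tips(clade)
        if any(is_cattle(name) for name in members):
            flagged.extend(name for name in members if is_bison_herd(name))
    return sorted(flagged, key=lambda name: int(name.split()[-1]))


if __name__ == "__main__":
    print(hybrid_candidates(TREE))
```

With this toy tree the function flags herds 9 through 12, matching Answer b. A real analysis would build the tree from aligned mtDNA sequences and weigh the confidence values, which this sketch does not attempt.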
[Figure 17.6a: map of the circular Homo sapiens mtDNA (16,569 bp) showing the D-loop, the 12S and 16S rRNA genes, protein-coding genes (ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase, Cyt b), tRNA genes labeled by one-letter amino acid code, and a region typically deleted in KSS/PEO. Disease phenotypes marked around the map include aminoglycoside-induced deafness, deafness, myopathy, cardiomyopathy, cardiopathy, diabetes, respiratory deficiency, chorea, encephalopathy, encephalomyopathy, anemia, ataxia, myoclonus, myoglobinuria, LHON, LHON/dystonia, MELAS, MILS, PEO, MERRF, and NARP; their map positions are not recoverable from the extracted text.
Abbreviations: LHON, Leber hereditary optic neuropathy; MELAS, mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes; MILS, maternally inherited Leigh syndrome; PEO, progressive external ophthalmoplegia; MERRF, myoclonic epilepsy with ragged red fibers; NARP, neuropathy, ataxia, retinitis pigmentosa.
Figure 17.6b: three-generation pedigree (generations I to III).]
Figure 17.6 Mutations in human mitochondrial genes leading to disease syndromes. (a) Muscle functioning, hearing, and vision all require high levels of energy produced by mitochondria. (b) Pedigree showing maternal inheritance with incomplete penetrance of LHON.
Q Why do individuals III-2 and III-3 not exhibit disease symptoms? Will their offspring be affected?
mutant alleles in various ways. The numbers of copies of mitochondrial genomes in human cells vary from hundreds to hundreds of thousands, depending on the cell type and physiological state. In cells with both wild-type and mutant mitochondrial genotypes, the wild-type allele can complement the mutant allele.
In human pedigrees, heteroplasmic mothers can produce wild-type homoplasmic progeny, mutant homoplasmic offspring, or heteroplasmic offspring (Figure 17.7a). For mitochondrial transmission in mammals, the number of mitochondria present in the egg cell is what matters. Human oocytes typically have a small number (e.g., 10) of large mitochondria that are subsequently divided into many smaller mitochondria in the zygote. In humans, an egg cell contains up to 2000 mitochondrial genomes. In heteroplasmic individuals, replicative segregation can lead to variable penetrance, in which the ratio of mutant : wild-type mitochondrial genomes varies significantly between progeny (Figure 17.7b).
Furthermore, replicative segregation of mitochondrial mutations over the lifetime of an individual can lead to variable ratios of mutant : wild-type mitochondrial genomes in different cells and tissues of the same heteroplasmic individual; and this too results in variable phenotypic penetrance. Disease symptoms will develop only when vulnerable cells contain a high proportion of mutant mitochondria. For example, in the case of another mitochondrial disease, called MERRF (myoclonic epilepsy with ragged red fibers), an individual who displayed the mutant genotype in 85% of his mitochondrial DNA did not exhibit a phenotype defect, whereas a cousin with 96% mutant mitochondria displayed a severe phenotype. See Genetic Analysis 17.2 for practice in analyzing a pedigree for evidence of various forms of nuclear and mitochondrial inheritance.
Mating Type and Chloroplast Segregation in Chlamydomonas
Chlamydomonas reinhardii is a single-celled green alga with a haploid nuclear genome that harbors a single, large chloroplast containing 50 to 100 genomes divided among
17.2 Modes of Organellar Inheritance Depend on the Organism 641
[Figure: (a) Homoplasmic segregation of wild-type and mutant mitochondria (labels include "100% mutant"). (b) A primordial germ cell containing wild-type and mutant mitochondria, primary oocytes, and mature oocytes; a mature oocyte with a low proportion of mutant mitochondria is labeled "unaffected individual".]
5 to 15 nucleoids. Haploid cells of Chlamydomonas also typically have about 50 copies of the mitochondrial genome distributed among a small number of mitochondria in the germ cells and a larger number of mitochondria at other stages of the life cycle.
Matings between Chlamydomonas cells of different mating types produce diploid algae that undergo meiosis to produce haploid progeny. Mating compatibility is determined by the genotype at the mt locus, and mt+ individuals mate only with mt- individuals. Both mating types appear to contribute equally to the cytoplasmic content of the diploid zygote, but in approximately 95% of matings, the chloroplast genome is contributed by the mt+ mating type. In the remaining 5% of matings, chloroplast inheritance is biparental. The first mutation in a chloroplast gene discovered in Chlamydomonas was isolated by Ruth Sager in 1954 and confers resistance to the antibiotic streptomycin (strR). Analogous to reciprocal crosses between four o'clock flowers of different leaf types, reciprocal crosses between streptomycin-resistant and streptomycin-sensitive Chlamydomonas strains of different mating types give different results, with the chloroplast genotype being contributed primarily by the mt+ parent (Figure 17.8). Remarkably, though the chloroplast genome is preferentially transmitted by the mt+ mating type, mitochondria are preferentially transmitted by the mt- mating type. The genetic mechanisms by which the different mating types preferentially transmit the different organellar genomes are presently unknown.
During the mating process in Chlamydomonas, the two cells of opposite mating type fuse, after which the chloroplast genome from the mt+ parent is selectively maintained, while that from the mt- parent is degraded. As indicated above, the mechanism by which the mt- cell's chloroplast genome is eliminated is not known, but it is likely to involve degradation of that genome at some point in the mating process. A similar process leads to the loss of the mitochondrial genomes contributed by the mt+ gamete. Perhaps the degradation of organelles or their genomes provides a possible source of organellar DNA that may be transferred between genomes, into the nuclear genome, for example. (We will return to this topic later in the chapter, when we discuss the evolution of the organelles and their genomes.) For the cases in which biparental inheritance occurs, the presence of the two types of chloroplast genomes in the same organelle allows the genomes to undergo recombination that may result in the segregation of recombinant and parental chloroplast genomes.
Biparental Inheritance in Saccharomyces cerevisiae
Saccharomyces cerevisiae is a single-celled yeast that can grow either aerobically (with oxygen) or anaerobically
Evaluate
1. Identify the topic of this problem and the kind of information the answer should contain.
1. This problem concerns the mode of inheritance of a hereditary abnormality in a human pedigree. The answer requires proposing a mode of inheritance, identifying family members whose phenotypes are inconsistent with the proposed mode, and explaining those inconsistencies in a manner that justifies the proposed mode.
2. Identify the critical information given in the problem.
2. The pedigree gives the phenotype of each family member in three generations.
Deduce
3. Identify the possible modes of inheritance of the gene causing this abnormality.
TIP: Human cells contain maternally inherited mitochondria in addition to nuclear chromosomes.
3. The possibilities are that the trait might be caused by the mutation of either a nuclear gene or a mitochondrial gene. If the mutated gene is nuclear, it might be either recessive or dominant and either autosomal or X-linked. If the mutation is mitochondrial, the transmission pattern will be maternal inheritance.
4. Examine the pedigree to see whether the pattern is generally consistent with autosomal recessive or X-linked recessive inheritance.
4. The pattern is inconsistent with X-linked recessive inheritance, in which many more males than females have the recessive phenotype. Here, the ratio of six females to four males is close to 1:1, so X-linked recessive inheritance is highly unlikely. Autosomal inheritance is unlikely, since siblings in generation III are either all affected or none affected within families.
5. Examine the pedigree to see whether the pattern is generally consistent with X-linked dominant or autosomal dominant inheritance.
5. In X-linked dominant inheritance, all daughters of males with the dominant mutation are also expected to have the trait. II-5 does not transmit the trait to any of his three daughters, thus making X-linked dominant inheritance highly unlikely. Autosomal dominant inheritance is possible, where II-3 is nonpenetrant; but there is only a 1/32 chance ((1/2)^5) that II-5 would have five children who do not have the trait.
6. Examine the pedigree to see whether the pattern is consistent with maternal inheritance.
6. The pedigree pattern is consistent with maternal (mitochondrial) inheritance. Affected individuals are all offspring of affected mothers (I-2, II-2) or of female II-3 (who may harbor the mutant allele but does not exhibit the phenotype).
Answer c
8. Explain the presence of the anomalous individuals whose phenotypes are inconsistent with maternal inheritance.
TIP: Heteroplasmy may occur among the multiple copies of mitochondrial chromosomes present in each cell.
TIP: Proteins produced by mitochondrial genes interact with proteins produced by nuclear genes.
8. Lack of penetrance of the phenotype (as in II-3) may result from (1) variable penetrance owing to some individuals being heteroplasmic, since some could have a greater proportion of mutant mitochondria than others; (2) other genetic risk factors, such as alleles of nuclear genes (since females show variable penetrance, alleles of autosomal genes may be influencing the penetrance of the mitochondrial mutation, although common alleles of X chromosome genes cannot be ruled out); (3) environmental factors that influence the penetrance of the phenotype.
For more practice, see Problems 10, 12, 15, 16, 17, and 19. Visit the Study Area to access study tools. Mastering Genetics
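The first explanation in step 8 (heteroplasmic individuals differing in their proportion of mutant mitochondria) can be illustrated with a small simulation. This is a minimal sketch, not a model taken from the text: it assumes each oocyte receives a random sample of mitochondrial genomes from a heteroplasmic mother (simple binomial sampling), and the sample size, the 0.9 phenotype threshold, and the function names are all hypothetical values chosen for illustration.

```python
import random


def sample_oocyte_mutant_fraction(maternal_fraction, n_genomes, rng):
    """Return the mutant fraction among n_genomes drawn at random for one oocyte."""
    mutant = sum(1 for _ in range(n_genomes) if rng.random() < maternal_fraction)
    return mutant / n_genomes


def simulate_offspring(maternal_fraction, n_genomes, n_offspring, threshold, seed=0):
    """Simulate the offspring of one heteroplasmic mother.

    Returns a list of (mutant_fraction, affected) pairs, where affected is True
    when the mutant fraction reaches the assumed phenotype threshold.
    """
    rng = random.Random(seed)
    offspring = []
    for _ in range(n_offspring):
        fraction = sample_oocyte_mutant_fraction(maternal_fraction, n_genomes, rng)
        offspring.append((fraction, fraction >= threshold))
    return offspring


if __name__ == "__main__":
    # Hypothetical mother: 80% mutant genomes, 20 genomes sampled per oocyte.
    for fraction, affected in simulate_offspring(0.8, 20, 8, threshold=0.9):
        status = "affected" if affected else "unaffected"
        print(f"mutant fraction {fraction:.2f} -> {status}")
```

Under these assumptions, siblings of the same mother end up on either side of the threshold purely by sampling, which is one way a maternally inherited trait can appear to skip an individual such as II-3.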
[Figure: (a) Cross of a strain with wild-type nuclear genes (GAL4) to a strain with mutant nuclear genes (gal4), producing GAL4, GAL4, gal4, and gal4 spores. Mutations in nuclear genes exhibit 2:2 segregation. (b) Cross of wild type to a segregational petite, producing 2 wild-type and 2 petite spores. Progeny of petite and wild-type phenotypes are produced in a 2:2 ratio, indicating that segregational petite mutations are in nuclear genes.]
tetrads contain all petite spores (Figure 17.9d). Thus the suppressive petite phenotype suppresses the wild-type phenotype, resulting in progeny that are all deficient in respiration. Analysis of the mitochondrial genome reveals that initially, suppressive petites have small deletions of mitochondrial DNA; but upon further growth, all copies of the mitochondrial DNA tend to become rearranged and duplicated. These gross defects in mitochondrial DNA lead to losses and disruptions of mitochondrial genes and to deficiencies in aerobic respiration.
Why do the mitochondria inherited from the suppressive petite parent overwhelm those of the wild-type parent? Two nonmutually exclusive possibilities are that (1) suppressive petite mitochondria replicate faster than wild-type mitochondria, perhaps due to having additional copies of a replication origin, and (2) the suppressive petite and wild-type mitochondria fuse, and the genomic rearrangements present in the suppressive petite mitochondrial genome induce rearrangements in the mitochondrial genomes inherited from the wild-type parent. The latter hypothesis has gained support from the observation that mitochondria within a cell often interact and fuse into a continuous mitochondrial network.
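Possibility (1) above, a replication advantage for suppressive petite mitochondrial DNA, can be explored with a toy calculation. The sketch below is an assumption-laden illustration rather than a published model: it supposes that in each generation petite genomes are copied `relative_rate` times as often as wild-type genomes, and it tracks only the resulting fraction of petite genomes. The function name and the parameter values are hypothetical.

```python
def petite_fraction_over_time(initial_fraction, relative_rate, generations):
    """Track the fraction of petite mtDNA across generations.

    Each generation, petite genomes are weighted by relative_rate and
    wild-type genomes by 1, then the fractions are renormalized.
    """
    fractions = [initial_fraction]
    p = initial_fraction
    for _ in range(generations):
        petite = relative_rate * p
        wild_type = 1.0 - p
        p = petite / (petite + wild_type)
        fractions.append(p)
    return fractions


if __name__ == "__main__":
    # Hypothetical zygote with equal contributions and a twofold advantage.
    for generation, fraction in enumerate(petite_fraction_over_time(0.5, 2.0, 10)):
        print(f"generation {generation}: petite fraction {fraction:.4f}")
```

With any advantage greater than 1, the petite fraction rises toward 1 in this model, consistent with tetrads in which all spores are petite. The sketch says nothing about possibility (2), which involves fusion and induced rearrangement rather than differential replication.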
energy-transporting molecule. The protein complexes that produce ATP are composed of gene products encoded by both the mitochondrial and nuclear genomes. Thus, the synthesis and regulation of the protein complexes responsible for oxidative phosphorylation and other mitochondrial processes depend on coordination between the mitochondrial and nuclear genomes. In many species, mitochondrial genes also participate in other metabolic processes and biochemical reactions, including ion homeostasis and biosynthetic pathways.
The general structure of a mitochondrion can be described as two membranes surrounding a matrix (Figure 17.10). The enzyme complexes responsible for oxidative phosphorylation are found on the inner membrane. The mitochondrial matrix is the site of mitochondrial genome transcription, translation, and DNA replication. The mitochondrial genome is responsible for only a fraction of the genes needed to carry out these processes, however, and most of the proteins active in mitochondrial DNA replication, transcription, and translation are encoded in the nuclear genome. Following their translation, nucleus-encoded mitochondrial proteins are transported into mitochondria. Examination of the mitochondrial genomes of different species reveals enormous diversity as to whether specific proteins are mitochondrial- or nucleus-encoded; only a few proteins are consistently encoded by the mitochondrial genome. This suggests that genes have moved from the mitochondrial genome to the nuclear genome at different times during evolution.
Mitochondrial Genome Structure and Gene Content
Genetic mapping studies and direct observation of mitochondrial chromosomes by electron microscopy indicate the chromosomes often have a circular structure (Figure 17.11). There is evidence, however, that circular mitochondrial genomes can assume a linear form and that the mitochondrial genomes of certain species are primarily linear. In the vast majority of species, the mitochondrial genome is a single molecule; but in a few species, the genome consists of more than one molecule. Thus, in some species, the mitochondrial genome consists of one (Tetrahymena) or more (Amoebidium) linear molecules that have terminal repeat sequences, which are reminiscent of telomeres.
Unlike the DNA in the nucleus, mitochondrial DNA is not packaged in chromatin composed of histones. Rather, the genomes are anchored to the inner membrane of the mitochondria, in a manner similar to that of bacterial chromosomes. These and other features described below give
[Figure: diagram of a mitochondrion showing the outer membrane, intermembrane space, inner membrane, and matrix. Enzymes responsible for oxidative phosphorylation reside on the inner membrane. Reactions of the Krebs cycle occur in the matrix, as do several other biosynthetic pathways. Labeled components: Complexes I to V, cytochrome c, TIM, Sec, Tat, and Oxa1 translocases, heme lyase, RNA polymerase, RNAse P, EF-Tu, and ribosome.]
[Figure 17.11 diagram: (a) and (b) mitochondrial genome structures, with examples labeled Tetrahymena mtDNA, Human mtDNA, Spizellomyces mtDNA, and Amoebidium mtDNA; scale bar 10 kb.]
Figure 17.11 Genome structures of mitochondria.
clues to the evolutionary origin of mitochondria, as we discuss further in a later part of this chapter.
The gene content and size of mitochondrial genomes vary substantially among eukaryotes (Figure 17.12a). Known genome sizes range from a low of 6 kb in the malarial parasite Plasmodium to hundreds or thousands of kilobases in flowering plants. However, as with nuclear genomes, the size in kilobases does not necessarily correlate with the number of genes. For example, the Saccharomyces mitochondrial genome is approximately five times as large as the human mitochondrial genome, but it contains only a few more genes. This is because much of the extra DNA, including some introns, is noncoding. In contrast to their nuclear genomes, mammalian mitochondrial genomes are particularly compact and have no introns and little noncoding DNA. Known gene numbers in mitochondrial genomes vary from a low of 6 in Plasmodium to a high of nearly 100 genes in certain jakobid flagellates such as Reclinomonas americana (Figure 17.12b).
As we discuss in a later section, all mitochondrial genomes are descended from a common bacterial ancestral genome that likely possessed thousands of genes. The differences between mitochondrial genomes in living organisms reflect differential losses of genes from the ancestral genome in the different lineages. Gene losses in parasites such as Plasmodium, which obtains its energy from its hosts, are often extreme, owing to loss of the genes encoding proteins required for oxidative phosphorylation.
Mitochondrial Transcription and Translation
The mitochondrial genome is transcribed by an RNA polymerase similar to that found in bacteria (see Section 8.2). In some species, the mitochondrial RNA polymerase is encoded by a mitochondrial gene; in other species, it is encoded by a nuclear gene. Transcriptional regulation of mitochondrial gene expression also varies among species but in most cases has features reminiscent of bacterial operons. For example, transcription of the mammalian mitochondrial genome involves the production of just three polycistronic mRNA transcripts from only three promoters (Figure 17.13). All promoters are within the mitochondrial control region, and transcription is promoted in both directions, with the result that each strand of DNA is transcribed. Transcription of the two strands generates precursor RNA molecules encompassing the entire circumference of the mitochondrial genome that encode both RNAs and proteins. The rRNAs and mRNAs are flanked by tRNAs, which are cleaved from the precursor RNAs, thus releasing the rRNA and mRNA molecules.
Mitochondrial translation occurs on ribosomes that resemble bacterial ribosomes (see Section 9.2). The rRNAs utilized in mitochondria are always encoded by the mitochondrial genome, but the mitochondrial ribosomal proteins may be encoded by either the mitochondrial or nuclear genome. In Reclinomonas americana, Shine–Dalgarno sequences are present upstream of most protein-coding genes, but such sequences are not evident in the mitochondrial genes of most eukaryotes.
Most mitochondrial genomes encode many fewer than the 61 different tRNA genes that are theoretically required for translation of all codons. Recall that the genetic code contains 64 codons, of which 61 encode amino acids during translation. Each codon can be uniquely recognized by a complementary anticodon sequence in tRNA, but third-base wobble and the redundancy of the genetic code permit genomes to carry fewer than 61 unique tRNA genes. Consequently, only 32 different tRNA anticodon sequences (i.e., 32 different tRNA genes) are required to recognize the 61 codons.
The substantially lower number of unique tRNA genes in mitochondrial genomes compared with the number of codons is accommodated in different ways in the mitochondria of different species. In mammalian mitochondria, the rules of third-base wobble are more lenient than they are for nuclear genes. Certain mammalian tRNAs can read codons with any of the four bases in the third position, a system that reduces the number of different tRNA genes needed in mammalian mitochondria to 22.
In some mammalian species, not all mitochondrial tRNAs are encoded in the mitochondrial genome; instead, some nucleus-encoded tRNAs are imported into mitochondria. In extreme cases, such as Plasmodium, all tRNAs have
17.3 Mitochondria Are the Energy Factories of Eukaryotic Cells 647
[Figure: (a) Gene maps of the Homo sapiens mtDNA (16,569 bp) and the Reclinomonas americana mtDNA (69,034 bp). (b) Bar chart of the number of genes (axis from 0 to 100) in the mitochondrial genomes of various species.]
to be imported since none are encoded in the mitochondrial genome. In addition to mechanisms that reduce the total number of different tRNA genes encoded in mitochondria, there are differences between the mitochondrial genetic codes of certain animals, plants, and fungi (Table 17.1). Still, in many species the mitochondrial genetic code is the same as the universal code, thus supporting the hypothesis that most of the changes listed in Table 17.1 occurred relatively late in the evolution of the major branches of eukaryotes. Some of the same differences have apparently evolved
independently in multiple mitochondrial lineages, suggesting that certain changes may confer a selective advantage. It may be that the reduction in tRNA gene number in the mitochondrial genome is related to the relaxed evolution of the mitochondrial genetic code.
17.4 Chloroplasts Are the Sites of Photosynthesis
Chloroplasts, present in green plants, their algal relatives, and many other taxa that carry out photosynthesis, are only the most familiar of various organelles derived from a precursor organelle called a plastid. In the green tissues of plants, plastids differentiate into chloroplasts in response to light; but in nongreen tissues, plastids may differentiate into other types of specialized organelles. For example, tomatoes get their red color from pigments in a plastid derivative called a chromoplast. Regardless of type, all plastids and their derivatives possess a genome.
Chloroplasts resemble mitochondria in being enclosed by a double-membrane system (Figure 17.14). However, chloroplasts also possess a third membrane system, the thylakoid membranes. These membranes reside in the stroma, the region equivalent to the matrix of the mitochondrion. The protein complexes that carry out photosynthetic reactions are embedded in the thylakoid membranes. As with mitochondria, most chloroplast proteins are encoded in the nuclear genome but are produced and regulated through interactions between the two genomes (plastid and nuclear).
Chloroplast Genome Structure and Gene Content
Many structural features of chloroplast genomes are similar to those of bacterial and mitochondrial genomes. For example, the chloroplast genome is anchored to the inner chloroplast
[Figure: diagram of chloroplast structure; surviving labels include stroma, PSII, Cyt b-f, and PC.]
membrane, and chloroplast genomes are not packaged in chromatin composed of histones. Like mitochondrial genomes, chloroplast genomes are generally found to be circular, on the basis of genetic and molecular mapping as well as direct observation with the electron microscope. However, there is evidence that linear chloroplast genomes may also occur. The similarity of chloroplast genomes and bacterial genomes reflects the ancestral evolutionary relationship that we explore in Section 17.5.
Compared with mitochondrial genomes, chloroplast genomes are structurally less diverse. Chloroplast genomes range in size from 120 to 200 kb and usually encode 100 to 250 genes; the precise gene content varies between species. The chloroplast genome of Marchantia polymorpha is typical of many (Figure 17.15a). Whereas chloroplast ribosomal proteins may be encoded by either the chloroplast or nuclear genome, the rRNA is always encoded by the chloroplast genome, and the tRNA molecules are usually encoded by the chloroplast genome. Most of the remaining chloroplast genes with known functions encode proteins involved in photosynthesis.
One of the photosynthetic genes in the chloroplast genome encodes the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase, the enzyme responsible for the fixation of carbon from CO2. The enzyme, often abbreviated RuBisCO, represents up to 50% of the protein content of green plants and is thus possibly the most abundant protein on the planet. RuBisCO is composed of two protein subunits, abbreviated rbcL and rbcS, for the large and small subunit, respectively. Whereas rbcL is encoded in the chloroplast genome (Figure 17.15b), rbcS is encoded in the nuclear genome, providing another example of the extensive coordination between the two genomes, which in this case must cooperate to produce appropriate quantities of the two subunits.
Chloroplast Transcription and Translation
Transcription and translation of chloroplast genes are similar to those of bacteria. Many chloroplast genes are arranged in operons and as a result are coordinately transcribed. The RNA polymerase resembles that found in bacteria and, as in bacteria, recognizes consensus sequences (similar to those of bacterial promoters) at -10 and -35 of chloroplast gene promoters (see Section 8.2). Like bacterial mRNAs, chloroplast mRNAs are neither capped at their 5′ end nor polyadenylated at their 3′ end. However, some RNA processing occurs, such as the removal of introns from a few genes and RNA editing in most land plants (a process described in more detail below). The ribosomes of chloroplasts are also similar to those of bacteria. For example, ribosome function is disrupted by aminoglycoside antibiotics, which also inhibit bacterial ribosome function.
From 30 to 35 different tRNAs are usually encoded by the chloroplast genome, and as a result all codons can be
Figure 17.15 Chloroplast genome of (a) Lactuca sativa (lettuce) and (b) Marchantia paleacea (liverwort).
translated without the additional wobble found in mitochondria. The kinds of deviations from the universal genetic code that are seen in mitochondrial genes are not observed in chloroplasts.
Editing of Chloroplast mRNA
RNA editing is the process of altering the sequence of an RNA molecule after transcription from the DNA genome (see Section 8.4). RNA editing was first discovered in the mitochondria of trypanosomes, where insertion (or, less frequently, deletion) of U residues occurs in mitochondrial mRNAs. The mechanism by which this editing process occurs (described in Section 8.4) involves complementary guide RNAs that are encoded in the mitochondrial genome. The guide RNAs provide a template on which the changes to the target mRNA are made; there, enzymes either add or delete U residues from the mRNA.
RNA editing has also been noted in the mitochondria and chloroplasts of land plants, where the editing process results in C-to-U (or, less frequently, U-to-C) changes in organellar mRNAs. In contrast to the RNA editing to insert and delete bases, the RNA editing in the organelles of plants does not utilize a guide RNA. Rather, C-to-U editing is performed by an enzyme, C deaminase, which converts the C to a U, whereas U-to-C editing is presumably performed by the reverse reaction, the addition of an amine group to the U. Proper RNA editing in these cases requires the presence of specific sequences adjacent to the sites to be edited, suggesting that the adjacent sequences represent binding sites for trans-acting proteins.
Not surprisingly, given that the mRNAs of several genes encoding proteins involved in photosynthesis are edited, genetic screens designed to identify mutants in which photosynthesis is compromised have identified nuclear genes controlling chloroplast RNA editing. For example, mutations in the nuclear CCR4 gene of Arabidopsis result in a loss of C-to-U editing of one nucleotide in the ndhD mRNA within chloroplasts; this editing normally generates a start codon, AUG, from the ACG encoded in the chloroplast genome (Figure 17.16).
17.5 The Endosymbiosis Theory Explains Mitochondrial and Chloroplast Evolution 651
[Figure 17.16 diagram: The ndhD gene in the chloroplast genome and the primary mRNA transcript contain an ACG triplet in the position of the translational initiation site. CCR4 recognizes specific sequences flanking the site to be edited in the ndhD mRNA. A C deaminase recruited by CCR4 converts a specific C to a U, changing ACG to an AUG initiation codon in the edited mRNA.]
Figure 17.16 A model for C-to-U RNA editing.
CCR4 encodes a member of the pentatricopeptide repeat (PPR) family of proteins. These proteins are thought to play diverse roles in RNA processing, including cleavage of RNA precursor molecules. Surprisingly, the other four edited sites in ndhD RNA are edited correctly in ccr4 mutants. The nuclear genomes of land plants encode large numbers of PPR genes, and there is a strong correlation between the number of nucleus-encoded PPR proteins and the extent of organellar RNA editing. It appears that each edited site in organellar RNA is processed by a different trans-acting PPR protein! Studies in plant mitochondria have also identified PPR proteins as important components of RNA processing; in so doing, these studies have illuminated the mechanism of cytoplasmic male sterility, a phenotype used in plant breeding that is described in Experimental Insight 17.1.
17.5 The Endosymbiosis Theory Explains Mitochondrial and Chloroplast Evolution
Endosymbiosis is a symbiotic (interdependent, often mutually beneficial) relationship between organisms in which one organism inhabits the body of the other. Several lines of evidence indicate that the mitochondria and chloroplasts inhabiting modern animal and plant cells are the descendants of formerly free-living bacteria that took part in ancient infections of eukaryotic cells. These ancient invaders established endosymbiotic relationships with their hosts and have evolved along with their hosts to produce the diversity we observe in organelles today. The principal lines of evidence supporting the endosymbiosis theory of mitochondrial and chloroplast evolution, several of which are discussed below, include the following:
❚ The double-membrane system found in both organelles is derived from a similar membrane system found in bacteria.
❚ The organelles are similar in size to extant bacteria.
❚ Organellar DNA is packaged in a manner similar to the packaging of chromosomes in bacteria and dissimilar to that of DNA in the nuclear genome.
❚ The transcriptional and translational machinery of the organelles closely resembles that of bacteria.
❚ The protein-coding sequences of organellar genes are more like those of bacteria than like either the nuclear genes of eukaryotes or the sequences of archaea.
Separate Evolution of Mitochondria and Chloroplasts
The available genetic evidence indicates that mitochondria are monophyletic; that is, all mitochondria are descendants of a single common ancestor. Coupled with evidence that mitochondria bear strong similarities to bacteria, this finding suggests that the point of origin of all mitochondria was a single endosymbiotic event (Figure 17.17).
Based on the fossil record, the minimum age of the eukaryotes is approximately 1.5 to 2 billion years. One hypothesis concerning the origin of eukaryotes is that they evolved from an anaerobic ancestor that acquired an aerobic endosymbiont (the mitochondrial ancestor). This event was perhaps linked with the global rise in atmospheric oxygen that began about 2 billion years ago and that could have provided a selective environment for aerobic organisms. Based on similarity in gene sequences, the closest extant relatives of mitochondria are free-living α-proteobacteria. These living α-proteobacteria have genomes of 4 to 9 Mb of DNA encoding 4000 to 9000 genes, so it appears that extensive gene loss has characterized the evolution of mitochondrial genomes.
Chloroplasts are also monophyletic, having descended from a single endosymbiotic event that occurred, according to the fossil record, at least 1.2 billion years ago (see Figure 17.17). Based on similarity of gene sequences, the closest extant relatives of chloroplasts are free-living cyanobacteria. Existing cyanobacteria have genomes of 1.6 to 9.0 Mb of DNA encoding 1900 to 7400 genes, implying extensive gene loss in the evolution of the chloroplast genome as well. Phylogenetic evidence also suggests multiple secondary symbioses (discussed at the end of this section) in which some eukaryotes acquired a photosynthetic eukaryotic symbiont (see Figure 17.17). These events
[Figure 17.17 diagram: phylogenetic tree of eukaryotes showing a host cell acquiring the mitochondrion from an α-proteobacterium, a later acquisition of the plastid from a cyanobacterium, and secondary endosymbioses. Major groups shown: SAR (Rhizaria, Stramenopila, Alveolata), Plantae (glaucophytes, red algae, green algae), Excavata, Amoebozoa, and Opisthokonta (animals, choanoflagellates, Ichthyosporea, fungi, chytrids), along with smaller lineages such as haptophytes, Centroheliozoa, and cryptomonads.]
Figure 17.17 The evolutionary history of the mitochondrion and the chloroplast.
resulted in the horizontal transmission of chloroplasts among unrelated eukaryotic lineages.
Two fundamental questions arise when we consider the genomes of the organelles. First, given
that mitochondrial and chloroplast genomes contain from 6 to 100 and from 20 to 200 genes,
respectively, what happened to all the other genes of the ancestral symbiont? Second, given
that the organelles contain many more organellar proteins than genes, what is the origin of
the nuclear genes that encode so many organellar proteins? Are those nuclear genes derived
from the ancestral symbiont genome, or did they evolve in the host genome? A possible answer
was provided by the discovery that DNA is transferred from organellar genomes to nuclear
genomes; this led to the hypothesis that genes
have been relocated from the ancestral endosymbiont genome to the nuclear genome during
evolution.

Continual DNA Transfer from Organelles

The nuclear genomes of eukaryotes bear evidence of both ancient and recent DNA transfer
between the organellar and nuclear genomes (Figure 17.18). Ancient transfer events can be
detected by comparative genomics of mitochondrial genomes and by comparing eukaryotic nuclear
genomes with bacterial genomes. Sequencing of eukaryotic genomes has also revealed evidence
of recent transfers. Transferred sequences that remain highly similar to their organellar
counterparts must have been transferred recently, because they have had little time to
diverge.
Ancient gene transfers can be identified in comparisons between nuclear genomes of eukaryotes
and the genomes of extant α-proteobacteria and cyanobacteria. Nuclear genes that are most
similar to the genes of the living bacterial species are likely to have been derived from the
bacterial endosymbiont. Ancient transfers have been detected by comparing the Arabidopsis
nuclear genome and genomes of three cyanobacteria, leading to the identification of
approximately 4300 Arabidopsis nuclear genes with a cyanobacterial origin. Thus, more than
10% of the Arabidopsis nuclear genome represents an acquisition of genetic information
originally residing in the genome of the chloroplast (Figure 17.19). Similarly, comparisons
between several eukaryotic nuclear genomes and those of α-proteobacteria detected at least
630 nuclear genes derived from the α-proteobacterial endosymbiont that gave rise to the
mitochondrion. Thus, concomitant with the reduction in the organellar genomes is an increase
in gene content in the nuclear genome. The importance of this enormous amount of additional
genetic information in the evolution of the eukaryotic lineage is difficult to overestimate
(see Figures 17.18 and 17.19).
One surprise discovered through the analysis of eukaryotic genome sequences is that all
nuclear genomes seem to include recent transfers of mitochondrial and chloroplast sequences.
Mitochondrial DNA sequences of recent origin found in the nucleus have been termed nuclear
mitochondrial sequences (NUMTS), whereas nuclear sequences recently derived from plastid
genomes are called nuclear plastid sequences (NUPTS). Organellar DNA sequence has been found
in the nuclear genome of every organism examined. NUMTS and NUPTS are common in many plant
species; the Arabidopsis genome contains 17 NUPTS, totaling 11 kb, and 14 NUMTS, one of which
is 620 kb and represents almost two entire mitochondrial genomes. The human genome contains
hundreds of NUMTS, ranging from 106 to 14,654 bp long (the latter being 90% of the length of
the mitochondrial genome).
Three conclusions have been drawn from the study of NUMTS and NUPTS. First, given the level
of sequence similarity between NUMTS or NUPTS and the respective organelle genome sequences,
they are thought to represent evolutionarily recent transfers of organellar DNA to the
nuclear genome. Second, entire organellar genomes likely were transferred to the nuclear
genome multiple times in evolutionary history. Third, the process is ongoing; DNA continues
to move between the organelles and to the nucleus. Although the rate of transfer is not known
in most organisms, experiments to directly measure the rate of DNA transfer from chloroplast
to nuclear genome in plants revealed a new integration of chloroplast DNA in the
Figure 17.18 Transfer of endosymbiont genes to the nuclear genome and destinations of encoded
protein products.
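The reasoning in this section, that organellar sequences in the nucleus which remain highly similar to their organellar counterparts are recent arrivals, rests on simple comparisons of sequences and lengths. The Python sketch below is one rough way to picture it: it computes percent identity for two already aligned sequences and repeats two pieces of arithmetic using figures quoted in the text. The function name `percent_identity`, the toy sequences, the human mitochondrial genome length of 16,569 bp, and the figure of roughly 27,000 Arabidopsis protein-coding genes are assumptions brought in for illustration; none of them comes from this chapter.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical aligned fragments, invented for illustration only.
organelle_copy = "ATGCCGTTAGCTAGGCTA"
recent_nuclear_copy = "ATGCCGTTAGCTAGGCTA"  # no substitutions accumulated yet
older_nuclear_copy = "ATGACGTTCGCTTGGCAA"   # several substitutions accumulated

# Length of the longest human NUMT (from the text) relative to the
# mitochondrial genome length (assumed reference value, not from the text).
LONGEST_HUMAN_NUMT_BP = 14_654
HUMAN_MTDNA_BP = 16_569
numt_fraction = LONGEST_HUMAN_NUMT_BP / HUMAN_MTDNA_BP

# Arabidopsis nuclear genes of cyanobacterial origin (from the text) relative
# to an approximate total gene count (assumed, not from the text).
CYANOBACTERIAL_ORIGIN_GENES = 4_300
ARABIDOPSIS_GENES_APPROX = 27_000
cyano_fraction = CYANOBACTERIAL_ORIGIN_GENES / ARABIDOPSIS_GENES_APPROX

if __name__ == "__main__":
    print("recent copy identity (%):", percent_identity(organelle_copy, recent_nuclear_copy))
    print("older copy identity (%):", percent_identity(organelle_copy, older_nuclear_copy))
    print("longest NUMT / mtDNA length:", round(numt_fraction, 3))
    print("cyanobacterial-origin gene fraction:", round(cyano_fraction, 3))
```

Under these assumed totals, the longest NUMT comes to a little under nine-tenths of the mitochondrial genome and the cyanobacterial-origin genes to well over one-tenth of the gene set, in line with the "90%" and "more than 10%" stated in the text. Real analyses align sequences first and use substitution models rather than raw identity.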
17.5 The Endosymbiosis Theory Explains Mitochondrial and Chloroplast Evolution 655
early stages of endosymbioses should have led scientists to expect it. When an endosymbiotic
relationship was initially established, the genome of the ancestral mitochondrion would have
been similar in size to that of its bacterial ancestors. If the rate of DNA transfer was
similar to that measured today, the nuclear genome must have experienced a bombardment of DNA
from the endosymbiont. Before the evolution of the mitochondrial protein-import machinery,
proteins produced by genes transferred to the nuclear genome had to remain in the cytoplasm
or be transported to the plasma membrane. Reduction in the endosymbiont genome could occur
only after the evolution of systems able to import proteins into the endosymbiont. Such
systems are composed of proteins encoded by genes originally derived from both the nuclear
and endosymbiont genomes.

The Origin of the Eukaryotic Lineage

The tree of life is often depicted as having three major branches (the Bacteria, the Archaea,
and the Eukarya) based on comparison of sequences of the rRNA genes (see Section 1.1). The
extensive gene flow from bacterial endosymbionts to the nucleus, however, has resulted in the
presence of significant numbers of “bacterial” genes in the nuclear genomes of eukaryotes.
Given this situation, a simple tripartite view of life, in which three branches diverge from
a single common ancestor, is overly simplistic. A fraction of the nuclear genome of every
eukaryote is derived from bacterial endosymbionts, but where were all the remaining genes
derived from? In other words, what was the original host of the α-proteobacterium that gave
rise to the eukaryotes?
Two models have been proposed to answer this question. In one model, the original host is a
cell described as having a nucleus but no mitochondria and as subsequently acquiring an
α-proteobacterium as an endosymbiont. In this model, “eukaryotic” cells (cells having nuclei)
existed before the endosymbiotic event, suggesting that such organisms lacking mitochondria
might still exist. In the second model, the original host is a bacterial cell that acquires
an α-proteobacterium as an endosymbiont; and subsequently, this host–endosymbiont system
evolves other eukaryotic features, such as a nuclear membrane. If the latter model is
correct, no intermediate eukaryotes lacking mitochondria should be found.
Two recent discoveries have contributed new fuel to this discussion. First, eukaryotic
organisms that were originally thought to lack mitochondria, such as Giardia intestinalis
(which causes diarrhea when it infects the human intestine), are now known to have
mitochondria. In the case of Giardia, the mitochondria are reduced to double-membrane-bound
structures called mitosomes. Mitosomes lack a genome, but proteins requiring an anaerobic
environment to function are imported into them. Furthermore, the nuclear genome of Giardia
harbors genes of mitochondrial origin; this finding indicates that all portions of the
mitochondrial genome were either transferred to the nucleus or lost. The extreme reduction of
the mitochondrion to nothing but an anaerobic compartment allowing the cell to carry out
specific reactions is likely a consequence of Giardia’s parasitic lifestyle, where all of its
energy is derived from a host organism. This finding means that all known existing eukaryotes
harbor mitochondria or mitochondria-derived organelles.
The second discovery concerns the nature of the genes in the nuclear genomes of eukaryotic
organisms. Comparison of the complete genome sequences of the eukaryote Saccharomyces
cerevisiae with two bacteria (Escherichia coli and Synechocystis 6803) and an archaeon
(Methanococcus jannaschii) revealed two general functional and evolutionary categories into
which the yeast nuclear genes could be divided. One category of genes, called informational
genes, encodes protein products that perform informational processes in the cell such as DNA
replication, packaging of chromosomes, transcription, and translation. The informational
genes of yeast resemble those found in Methanococcus, and this resemblance includes a
similarity between the histones of the yeast and the histone-like chromatin proteins present
in Archaea (see Sections 8.3 and 9.2). The second category of genes, called operational
genes, encodes proteins involved in cellular metabolic processes, such as amino acid
biosynthesis, biosynthesis of cofactors, fatty acid and phospholipid biosynthesis,
intermediary metabolism, energy metabolism, nucleotide biosynthesis, and some regulatory
functions. In contrast to their informational genes, most yeast operational genes resemble
those of Bacteria.
One scenario consistent with the apparent origins of informational and operational genes in
yeast is that the original host cell of the α-proteobacterial endosymbiont was related to an
archaeal cell (Figure 17.20). The original host genome would have contained both
informational and operational genes, as would the α-proteobacterial endosymbiont. Over time,
while both genomes retained their own informational genes, many endosymbiont operational
genes were transferred to the nuclear genome and often replaced their host functional
equivalents. Unlike the cases of the mitochondria and chloroplasts, where the endosymbionts
can be traced to specific lineages of Bacteria, the putative archaeal host is unknown and may
have been unrelated to any specific lineage of extant Archaea.

Secondary and Tertiary Endosymbioses

The melding together of genomes did not happen only during the endosymbioses that formed
mitochondria and chloroplasts. Secondary and even tertiary endosymbiotic events have
occurred between different lineages of eukaryotes, resulting in the dispersal of plastids
into eukaryotic lineages that are distantly related (see Figure 17.17). In secondary and
tertiary endosymbioses, typically, a
[Figure labels, in extraction order: α-proteobacterium; Ancient; Eukaryotes; Proteobacteria;
Ancient cyanobacterium; Plants; Cyanobacteria. Annotations: A eukaryotic host acquires a
photosynthetic cyanobacterial endosymbiont, the origin of the plastid. Diversification of
plants and gene transfer from organelle to nucleus.]
nonphotosynthetic eukaryote envelops an algal cell and acquires a red or green algal
endosymbiont. What happens to the nuclear genome of the secondary endosymbiont when one
eukaryote envelops another eukaryote? Genes of the nuclear genome of the eukaryotic
endosymbiont (the alga), whose products were targeted to the plastid, are translocated to the
nucleus of the new, primary host in a process analogous to the movement of genes from the
chloroplast genome to the primary endosymbiont host nuclear genome. Thus the nuclear genome
of the algal endosymbiont, termed the nucleomorph, undergoes reduction to the extent that it
encodes only some genes for products targeted to the plastid as well as some genes required
for the maintenance of the nucleomorph genome. The plastid is serviced by three different
genomes (nuclear, nucleomorph, and plastid), and the nuclear genome of photosynthetic
secondary endosymbionts is a mixture of four genomes (mitochondrial, chloroplast, and two
nuclear genomes). Because secondary and tertiary endosymbioses have occurred many times
during the evolution of eukaryotes (see Figure 17.17), the mixing and coevolution of genomes
has been instrumental in shaping the evolution of several lineages of life.
The mixing and melding of genomes can sometimes result in biological anomalies. For example,
the discovery of a reduced chloroplast (or apicoplast) in Plasmodium falciparum, the malarial
parasite, came as quite a surprise because this is clearly not a photosynthetic organism.
Plasmodium resides within the phylum Apicomplexa, which would make it a descendant of an
ancient secondary endosymbiosis involving a host eukaryote and an endosymbiotic
chloroplast-containing red alga (see Figure 17.17). Is there a reason that Plasmodium, with
its parasitic lifestyle, might have retained the apicoplast and its accompanying genome,
albeit without any genes encoding proteins involved in photosynthesis?
One hypothesis explaining retention of the apicoplast in Plasmodium is based on differences
in translation of organellar-encoded compared with nucleus-encoded genes. The initiator tRNA
used in mitochondrial translation is a formylmethionyl-tRNA (tRNAfMet), the same as used in
bacteria. This special tRNA cannot be imported from the cytoplasm, since cytosolic
translation in eukaryotes uses an initiator methionyl-tRNA that is not formylated. During the
evolutionary history of Plasmodium, the gene encoding the enzyme that adds a formyl group to
the methionyl-tRNA has been lost from the mitochondrial genome. Since the only
methionyl-tRNA formyl transferase gene in Plasmodium is in the nuclear genome, it is thought
that the protein product of this gene is transported to the apicoplast, and that tRNAfMet is
produced in the apicoplast and then transported to the mitochondria. According to this
hypothesis, the apicoplast may be maintained for the sole purpose of synthesizing tRNAfMet to
be imported into the mitochondrion, a quirk of the evolutionary history of Plasmodium.
CASE STUDY
Ototoxic Deafness: A Mitochondrial Gene–Environment Interaction
Phenotypic penetrance can be affected by both genetic and environmental factors. In the case
of genetic interactions, the phenotypic effects of a mutation are influenced by alleles at
other loci. The gene products of other loci are thought either to exacerbate or compensate
for the mutational defect, thereby altering the expressivity or penetrance of the phenotype.
In the case of environmental interactions, certain conditions either mitigate or enhance the
phenotypic effects, in essence making the mutation a conditional allele. Some mutations, like
the one described here, are subject to both these kinds of interaction. In this particular
example, the locus of the key mutation is a mitochondrial gene.
A rare complication of the use of aminoglycoside antibiotics, such as streptomycin,
gentamicin, and kanamycin, is irreversible loss of hearing, termed ototoxic deafness. Several
observations point to a genetic susceptibility to ototoxic deafness. Due to pervasive use of
aminoglycosides in China, it was reported that in a district of Shanghai, nearly 25% of all
deaf individuals can trace their loss of hearing to the use of aminoglycosides. Nearly
one-fourth of these patients also had relatives suffering from ototoxic deafness, suggesting
a genetic susceptibility. In all 22 cases where genetic transmission of the susceptibility
could be traced, inheritance was maternal, a sign of a mitochondrially inherited trait. A
similar situation was observed for 26 families in Japan. Furthermore, a large Arab-Israeli
pedigree with maternally inherited congenital (not ototoxic) deafness can be traced back
through five generations to a common female ancestor (Figure 17.21a). In this case, the
mitochondrial mutation is thought to be homoplasmic, since family members are either severely
deaf or have normal hearing. However, the phenotype is not completely penetrant; this finding
suggests that another mutation, likely to be an autosomal recessive nuclear mutation,
contributes to the manifestation of the condition.
In studies on bacteria, aminoglycosides stabilize mismatched aminoacyl-tRNAs in the ribosome
during translation;
[Figure 17.21 panel labels. (a) Pedigree spanning generations I through V. (b), (c) Stem-loop
diagrams with the annotations: Mutations that disrupt base pairing at the foot of the stem
loop result in streptomycin resistance. Human stem loop has a more open foot than E. coli.
Mutations that extend the base pairing at the foot of the stem loop of human 12S rRNA result
in aminoglycoside sensitivity.]
this finding explains their antibiotic effects. The presence of aminoglycosides causes a
reduction in the fidelity of translation, leading to defective proteins. Aminoglycosides have
been shown to interact directly both with ribosomal proteins and with the 16S rRNA of the 70S
ribosome; and aminoglycoside-resistant bacteria have been shown to have point mutations in
their 16S rRNA gene. Since the normal target of aminoglycosides is the bacterial ribosome,
the likely target of aminoglycoside ototoxicity in humans is the evolutionarily related
mitochondrial ribosomes, and perhaps specifically the 12S rRNA that is homologous to the 16S
rRNA of bacteria.
Sequencing of the mitochondrial 12S rRNA gene in individuals with congenital deafness in the
Arab-Israeli family and in other unrelated individuals with ototoxic deafness revealed that
they shared a single A-to-G mutation in their 12S rRNA genes. The mutation lies at the foot
of a stem loop conserved in bacteria, plants, and mammals. Studies on bacterial ribosomes
have shown that this region of the 16S rRNA forms part of the aminoacyl site where mRNAs are
decoded. Furthermore, aminoglycosides bind to this domain of the 16S rRNA, and bacterial
mutants resistant to aminoglycosides map to this region of the 16S rRNA gene.
Thus, the cause of the aminoglycoside-induced deafness is a mutation in the mitochondrial 12S
rRNA gene, but three intriguing questions remain. First, why is deafness the primary, and
perhaps only, phenotypic defect? A characteristic of many mitochondrial diseases is
pleiotropy due to a general loss of oxidative phosphorylation activity. However, in these
cases of maternally inherited deafness or susceptibility to aminoglycosides, no obvious
pleiotropic phenotypes are associated with the deafness. Is the cochlea especially
susceptible to a loss of mitochondrial function? Are the cochlear mitochondria especially
sensitive to aminoglycosides? Second, what is the nature of the autosomal recessive mutation
that acts to enhance the effect of the 12S rRNA mutation in the Arab-Israeli family? Could it
be a nucleus-encoded ribosomal protein gene that interacts with the mitochondrial 12S rRNA?
And third, if our mitochondrial ribosomes are evolutionarily related to bacterial ribosomes,
why are humans able to utilize aminoglycosides as antibiotics in the first place?
Clues to the answer of the third question have come from comparative studies of mitochondrial
ribosome function. The mutation causing deafness creates an extension of base pairing by one
base in the stem loop of the mitochondrial 12S rRNA, in effect making its structure more
closely resemble the structure of the aminoglycoside-binding site of the bacterial 16S rRNA
(Figure 17.21b–c). Thus, in the 2 or so billion years since the separation of bacteria and
mitochondria, the structure of the mitochondrial ribosome has changed just enough so that
aminoglycosides do not normally interfere with the fidelity of translation in mitochondria;
but mutations that result in a more bacteria-like ribosome structure bring back the ancient
sensitivity to aminoglycosides. It is worth noting that, at least in this sense, translation
in chloroplasts, which have diverged from bacteria for about 1.2 billion years, remains
sensitive to aminoglycosides.
SUMMARY
Mastering Genetics: For activities, animations, and review quizzes, go to the Study Area.
17.1 Organellar Inheritance Transmits Genes Carried on Organellar Chromosomes

❚❚ Mitochondria and chloroplasts possess their own genomes, each encoding a small number of
genes. The products of these genomes function within the respective organelle.
❚❚ Because many copies of organellar DNA occur in each cell, multiple genotypes may coexist
in a single cell.
❚❚ Cells or organisms in which all genomic copies of an organellar gene have an identical
sequence are said to be homoplasmic for that gene, whereas cells or organisms possessing
multiple alleles for an organellar gene are called heteroplasmic.
❚❚ Replication of organellar genomes and organelle division are not directly coupled with the
nuclear cell cycle.
❚❚ Replicative segregation of organelles can result in homoplasmic cells being derived from
heteroplasmic cells.
❚❚ The proportion of mutant alleles in heteroplasmic cells influences the penetrance and
expressivity of phenotypes.

17.2 Modes of Organellar Inheritance Depend on the Organism

❚❚ The transmission genetics of organellar genomes is often determined by the relative
amounts of cytoplasm contributed by the parental gametes.
❚❚ Organelles are maternally inherited in mammals and many plant species, whereas in fungal
species, mitochondria are often biparentally inherited. In some species, organellar
inheritance is determined by alleles of a nuclear gene.

17.3 Mitochondria Are the Energy Factories of Eukaryotic Cells

❚❚ Mitochondria are the sites of energy production; the enzymes of oxidative phosphorylation
are on the inner membrane.
❚❚ Mitochondrial mutations often have pleiotropic effects that reflect the role of
mitochondria in energy production.

17.4 Chloroplasts Are the Sites of Photosynthesis

❚❚ Chloroplasts are the sites of photosynthesis, conducted by enzymatic reactions responsible
for carbon fixation in the stroma and by photosystem complexes that convert light to chemical
energy in the thylakoid membranes.
❚❚ Only a small fraction of the proteins present in a mitochondrion or chloroplast is encoded
in the genome of the respective organelle; instead, most of the proteins are encoded in the
nuclear genome and posttranslationally imported into the organelles.
17.5 The Endosymbiosis Theory Explains Mitochondrial and Chloroplast Evolution

❚❚ Both the mitochondrion and the chloroplast are evolutionarily derived from ancient
endosymbioses in which a bacterium (of the phyla α-proteobacteria and cyanobacteria,
respectively) was incorporated into a eukaryotic cell.
❚❚ The circular structure (in most organisms) and transcriptional and translational
expression of mitochondrial and chloroplast genomes reflect their evolutionary origins as
bacterial endosymbionts of eukaryotic cells.
❚❚ Many of the genes present in the ancestral endosymbiont have been transferred to the
nuclear genome of the host cell and have contributed extensively to eukaryotic nuclear genome
content.
❚❚ The process of DNA transfer from organellar genomes to the nuclear genome is ongoing, and
recent transfers of organellar DNA into the nucleus can be detected in most, if not all,
organisms.
❚❚ Genes transferred from the ancient endosymbiont genome to the host nuclear genome encode
proteins that may be targeted to any compartment of the eukaryotic cell.
❚❚ Eukaryotic informational genes are related to archaeal genes, thus suggesting that
eukaryotes might be descended from an archaea-like cell that acquired a bacterial
endosymbiont.
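
The summary point that replicative segregation can yield homoplasmic cells from heteroplasmic ones can be pictured with a toy simulation. The Python sketch below follows a single cell lineage under deliberately simple assumptions that are not taken from the chapter: a fixed number of organellar genome copies per cell, every copy replicating exactly once before each division, random partitioning of the copies between daughters, and no selection. The function name `simulate_lineage` and all parameter values are hypothetical.

```python
import random


def simulate_lineage(copies_per_cell, mutant_copies, generations, seed=0):
    """Follow one cell lineage; return the mutant fraction after each division."""
    if copies_per_cell <= 0:
        raise ValueError("copies_per_cell must be positive")
    if not 0 <= mutant_copies <= copies_per_cell:
        raise ValueError("mutant_copies must lie between 0 and copies_per_cell")
    rng = random.Random(seed)  # seeded so that a run can be repeated
    mutant = mutant_copies
    fractions = []
    for _ in range(generations):
        # Replication (assumed): every organellar genome is duplicated once.
        pool = [True] * (2 * mutant) + [False] * (2 * (copies_per_cell - mutant))
        # Segregation: the daughter we follow receives a random half of the pool.
        daughter = rng.sample(pool, copies_per_cell)
        mutant = sum(daughter)
        fractions.append(mutant / copies_per_cell)
    return fractions


if __name__ == "__main__":
    # A heteroplasmic starting cell: 10 mutant copies out of 20.
    trajectory = simulate_lineage(copies_per_cell=20, mutant_copies=10,
                                  generations=200, seed=1)
    print("mutant fraction after the final division:", trajectory[-1])
```

Because sampling is random and there is no mutation in this sketch, a lineage that reaches a mutant fraction of 0 or 1 stays there, which is the sense in which homoplasmy can arise from heteroplasmy by chance alone. Real organelles do not replicate in lockstep with cell division, so this is only a qualitative picture.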
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the
Study Guide and Solutions Manual that accompanies this book for help in solving problems.

1. Know the meanings of homoplasmy and heteroplasmy and how these properties impinge upon
expressivity and penetrance of organellar alleles.
2. Be familiar with how replicative segregation can result in homoplasmy from an initial
state of heteroplasmy.
3. Recognize that the modes of organellar inheritance differ among eukaryotes.
4. Understand that the inheritance of mitochondria in mammals is maternal.
5. Know the general structure and contents of the organelle genomes.
6. Understand that organelles contain some proteins encoded in organelle genomes and other
proteins encoded in nuclear genomes, and how this influences expressivity and penetrance of
alleles of organellar and nuclear genes.
7. Recognize the origin of the organelles from ancestral bacterial endosymbionts.
8. Be aware of the continuing transfer of DNA between organelle genomes and the nuclear
genome.
Chapter Concepts For answers to selected even-numbered problems, see Appendix: Answers.
1. Reciprocal crosses of experimental animals or plants sometimes give different results in
the F1. What are two possible genetic explanations? How would you distinguish between these
two possibilities (i.e., what crosses would you perform, and what would the results tell
you)?
2. How are some of the characteristics of the organelles (the mitochondria and chloroplasts)
explained by their origin as ancient bacterial endosymbionts?
3. The human mitochondrial genome encodes only 22 tRNAs, but at least 32 tRNAs are needed for
cytoplasmic translation. How are all codons in mitochondrial transcripts accommodated by only
22 tRNAs? The Plasmodium mitochondrial genome does not encode any tRNAs; how are genes of the
Plasmodium mitochondrial genome translated?
4. What is the evidence that transfer of DNA from the organelles to the nucleus continues to
occur?
5. Draw a graph depicting the relative amounts of nuclear DNA present in the different stages
of the cell cycle (G1, S, G2, M). On the same graph, plot the amount of mitochondrial DNA
present at each stage of the cell cycle.
6. What are the differences between the universal code and that found in the mitochondria of
some species? Given that some changes (UGA = stop → Trp) have occurred multiple independent
times in evolution, can you think of any selective advantage to the mitochondrial code?
7. What is the evidence that the ancient mitochondrial and chloroplast endosymbionts are
related to the α-proteobacteria and cyanobacteria, respectively?
8. Outline the steps required for a gene originally present in the endosymbiont genome to be
transferred to the nuclear genome and be expressed, and for its product to be targeted back
to the organelle of origin.
9. Consider the phylogenetic tree presented in Figure 17.17. How were the origins of
secondary endosymbiosis in the brown algae determined?
Problems 661
Application and Integration For answers to selected even-numbered problems, see Appendix: Answers.
10. You are a genetic counselor, and several members of the family whose pedigree for an
inherited disorder is depicted in Genetic Analysis 17.2 consult with you about the
probability that their progeny may be afflicted. What advice would you give individuals
III-1, III-2, III-4, III-6, III-8, and III-9?
11. A mutation in Arabidopsis immutans results in the necrosis (death) of tissues in a mosaic
configuration. Examination of the mitochondrial DNA detects deletions of various regions of
the mitochondrial genome in the tissues that are necrotic. When immutans plants are crossed
with wild-type plants, the F1 are wild type, and the F2 are wild type and immutans in a 3:1
ratio. Explain the inheritance of the immutans mutation and a possible origin of the
mitochondrial DNA deletions.
12. What type or types of inheritance are consistent with the following pedigree?
[Pedigree not reproduced: generation I, individuals 1–2; generation II, individuals 1–5.]
14. You have isolated two petite mutants, pet1 and pet2, in Saccharomyces cerevisiae. When
pet1 is mated with wild-type yeast, the haploid products following meiosis segregate 2:2
(wild type : petite). In contrast, when pet2 is mated with wild type, all haploid products
following meiosis are wild type. To what class of petite mutations does each of these petite
mutants belong? What types of progeny do you expect from a pet1 × pet2 mating?
15. Consider this human pedigree for a vision defect.
16. A 50-year-old man has been diagnosed with MELAS syndrome (see Figure 17.6). His wife is
phenotypically normal, and there is no history of MELAS syndrome in either of their families.
The couple is concerned about whether their children will develop the disease. As a genetic
counselor, what will you tell them? Would your answer change if it were the mother who
exhibited disease symptoms rather than the father?
17. The first person in a family to exhibit Leber hereditary optic neuropathy (LHON) was II-3
in the pedigree shown below, and all of her children also exhibited the disease. Provide two
possible explanations as to why II-3’s mother (I-1) did not exhibit symptoms of LHON.
[Pedigree not reproduced: generation I, individuals 1–2; generation II, individuals 1–4;
generation III, individuals 1–3.]
19. What is the most likely mode of inheritance for the trait depicted in the following human
pedigree?
What genotypes and phenotypes do you expect in the F1? If some of the F1 plants are male
fertile, what genotypes and phenotypes do you expect in the F2?
23. Wolves and coyotes can interbreed in captivity; and now, because of changes in their
habitat distribution, they may have the opportunity to interbreed in the wild. To examine
this possibility, mitochondrial DNA from wolf and coyote populations throughout North America
(including habitats where the two species both reside) was analyzed, and a phylogenetic tree
was constructed from the resulting data (see Section 1.4 for details on how this is
accomplished). Sequence from a jackal was used as an outgroup and a sequence from a domestic
dog was included, demonstrating wolves as the origin of domestic dogs.
What do you conclude about the possibility that interspecific hybridization occurred between
wolves and coyotes on the basis of this phylogenetic tree?
[Phylogenetic tree not reproduced; visible tip labels: Wolf 4, Dog, Wolf 5, Wolf 6, Wolf 7,
Wolf 8, Jackal.]
24. Considering the phylogenetic assignment of Plasmodium falciparum, the malarial parasite,
to the phylum Apicomplexa (see Figure 17.17), what might you speculate as to whether the
parasite is susceptible to aminoglycoside antibiotics?
Collaboration and Discussion For answers to selected even-numbered problems, see Appendix: Answers.
25. Elysia chlorotica is a sea slug that acquires chloroplasts by consuming an algal food
source, Vaucheria litorea. The ingested chloroplasts are sequestered in the sea slug’s
digestive epithelium, where they actively photosynthesize for months after ingestion. In the
algae, the algal nuclear genome encodes more than 90% but not all of the proteins required
for chloroplast metabolism. Thus it is suspected that the sea slug actively maintains
ingested chloroplasts, supplying them with photosynthetic proteins encoded in the sea slug
genome. How would you determine whether the sea slug has acquired photosynthetic genes by
horizontal gene transfer from its algal food source? Discuss the steps required, and their
plausibility, for heritable endosymbiosis to eventuate.
26. Most large protein complexes in mitochondria and chloroplasts are composed both of
proteins encoded in the organelle genome and proteins encoded in the nuclear genome. What
complexities does this introduce for gene regulation (i.e., for ensuring that the appropriate
relative numbers of the proteins in a complex are produced)?
27. As described in this chapter, mothers will pass on a mitochondrial defect to their
offspring. In a type of gene therapy, one approach to circumvent this problem is to have two
different maternal contributions, with the nucleus of the female with the defective
mitochondria being placed in an enucleated egg derived from a female with normal
mitochondria. After fertilization, the resulting offspring would have three parental sources
of DNA, with nuclear DNA derived from a mother and a father, and mitochondrial DNA derived
from another “mother.” Recently, children with this genetic makeup have been born, but the
elimination of defective mitochondria is not complete, with the amount of defective
mitochondria derived from the defective mother ranging from 0 to 9%. Discuss potential
complications resulting from such a mixture of genomes.
18 Developmental Genetics

CHAPTER OUTLINE
18.1 Development Is the Building of a Multicellular Organism
18.2 Drosophila Development Is a Paradigm for Animal Development
18.3 Cellular Interactions Specify Cell Fate
18.4 “Evolution Behaves Like a Tinkerer”
18.5 Plants Represent an Independent Experiment in Multicellular Evolution

Multicellularity has evolved multiple times within the eukaryotes, as exemplified by Volvox, a chlorophyte green alga and member of a multicellular lineage independent of land plants and animals. In Volvox, the outer cells are somatic while the germ cells will be derived from the inner cells.

ESSENTIAL IDEAS
❚❚ Genes encoding transcription factors or signaling molecules direct the formation of specialized cell types.
❚❚ Drosophila embryos are subdivided into segments with unique identities by the sequential action of batteries of transcription factors.
❚❚ Hox genes specify the identity of body segments of Drosophila and are largely conserved throughout metazoans.
❚❚ Cells signal to either induce or inhibit neighboring cells from adopting particular developmental pathways.
❚❚ Morphological evolution can be the result of changes in gene expression patterns of a common genetic toolkit.
❚❚ Plant developmental genetics shares similarities with that of animals despite multicellularity evolving independently.

The development of a multicellular organism from a single fertilized egg cell is one of the wonders of evolution. Typically, the fertilized egg undergoes an initial mitotic division to produce two genetically identical daughter cells. Those two cells divide to produce four identical cells, which divide to produce eight cells, and so on. Yet, while all cells in the growing embryo continue to carry the same genetic information, many of them acquire different identities as the embryo develops different body parts, organs, and tissues. This development is a genetically programmed process, occurring in the same way in all members of a species. Different species exhibit both similarities and differences in development,
the former because of shared evolutionary ancestry and the latter because of species-specific adaptations.

Geneticists rely on defects in development to reveal the mechanisms of normal development. As early as 1790, the German scientist and philosopher Johann Wolfgang von Goethe recognized the potential of this approach:

From our acquaintance with . . . abnormal metamorphosis, we are enabled to unveil the secrets that normal metamorphosis conceals from us, and to see distinctly what, from the regular course of development, we can only infer.

Even so, the connections between developmental abnormalities, gene mutations, and the mechanisms that control normal development could not be understood in any detail until scientists began to apply the basic principles of genetics to the study of development. This process began around 1900, when the young embryologist Thomas Hunt Morgan decided to shift his research to focus on the nascent field of genetics, using the fruit fly Drosophila as his experimental organism. Although Morgan never returned to the study of embryology, his students and his students’ students blazed new trails by exploiting Drosophila genetics to illuminate many of the secrets of development in all metazoans (multicellular animals) and in plants as well.

In this chapter, we discuss the genetic processes that control development in complex multicellular organisms and the experimental approaches that led to their discovery.

small hind wings, the halteres, developed into structures resembling the forewings (Figure 18.1a). Mutations in which an apparently normal organ or body part develops in the wrong place are called homeotic mutations (from the Greek homeos, meaning “the same” or “similar”), and they have been central to the progress geneticists have made in understanding how complex organisms develop and evolve. Ed Lewis (a student of Morgan’s student Alfred Sturtevant) later identified the bithorax complex of genes as being responsible for the homeotic mutation observed by Bridges. As we discuss in this chapter, mutations in bithorax genes change the developmental program of a portion of the fruit-fly body, resulting in the transformation of the halteres into a second set of forewings. Another example is the dominant Antennapedia mutation, in which relatively normal fly legs develop in the positions that should be occupied by the antennae (Figure 18.1b). To understand the cascades of events responsible for such developments, we must first examine the phenomenon of cell differentiation and pattern formation.

[Figure 18.1 panels: (a) In a bithorax mutation, halteres seen in wild-type Drosophila (left) develop instead into a second set of wings (right). A second set of wings develops in the position normally occupied by halteres. (b) In an Antennapedia mutation, antennae in wild-type Drosophila (left) develop instead into legs (right).]
[Figure 18.3 (partial). Axis label: Concentration of morphogen. Cell labels: Blue cells, White cells, Red cells. Example: Moving the organizer cells from one frog embryo to another induces the development of a second body axis. Example: Drosophila cells expressing achaete (brown) become ectoderm and inhibit neighboring cells from doing the same.]
all differentiate in the same manner, such as in the example of Drosophila shown in Figure 18.3c. Examples of tissues with regular spacing include many epidermal features, such as bristles, feathers, hairs, and scales.

The developmental histories of cells can affect how the cells respond to cues from their neighbors. For example, for a cell to be able to respond to an inductive or inhibitory signal from neighboring cells, it must express the appropriate receptor. In addition, cells able to respond to a signal may behave differently depending on what other factors are present in the cell. When a cell divides, the daughter cells usually inherit the same set of transcription factors and chromatin states that existed in the cell they were derived from (the importance of chromatin states is discussed in Section 18.2). However, occasional asymmetric cell divisions in which the two daughter cells inherit different cellular constituents and acquire different fates underlie developmental patterning events in some species.

Positional information, induction, inhibition, and asymmetric cell divisions are common processes directing cell differentiation and pattern formation in multicellular organisms. When employed sequentially and reiteratively during embryogenesis, these processes enable a single-celled zygote to develop into a complex organism having a multitude of cell types. Each cell division in the embryo brings about changes in the relative positional relationships between the cells, so new opportunities for cell–cell communication are constantly created. In keeping with the importance of positional information, induction, and inhibition in development, most genes identified as having prominent roles in developmental processes encode proteins that act as either transcription factors or signaling molecules.

18.2 Drosophila Development Is a Paradigm for Animal Development

Discoveries about the developmental processes of Drosophila have made it one of the best-understood animals on the planet. These insights have in turn profoundly influenced how geneticists perceive the development and evolution of all other animals, ourselves included. For their work in unraveling some of the mechanisms underlying pattern formation in Drosophila, Edward B. Lewis, Christiane Nüsslein-Volhard, and Eric Wieschaus were awarded the Nobel Prize in Physiology or Medicine in 1995.

One of the reasons that Drosophila is an ideal genetic experimental organism is its short, 9-day life cycle (Figure 18.4a). Embryogenesis spans the first 24 hours of Drosophila development, commencing with the deposition of a fertilized egg that immediately begins a rapid series of genetically controlled changes (Figure 18.4b). After embryogenesis, development progresses through three distinct larval stages, called instars. Each instar stage is marked
by progressive development of tissues and structures that will form the adult fly. Following the third instar stage, the larva forms a pupa in which metamorphosis will take place. At the conclusion of pupation a fully formed adult fruit fly emerges, ready to begin the cycle anew.

The Drosophila egg has conspicuous anterior–posterior and dorsal–ventral polarities that are acquired during its production in the female fly. In contrast to early development in many other species, early embryonic development in Drosophila proceeds by nuclear division without division of cytoplasm. Rather than forming blastomeres, as in mammalian development, this process forms a syncytium, a multinucleate cell in which the nuclei are not separated by cell membranes (see Figure 18.4b). The fertilized egg undergoes nine mitotic nuclear divisions, after which the nuclei migrate to the periphery of the embryo. At this time, about 10 pole cells, from which the germ line will be derived, are set aside at the posterior end of the embryo. The somatic cells undergo another four rounds of mitotic divisions at the periphery, forming a syncytial blastoderm containing about 6000 nuclei. By about 3 hours after egg laying, cellularization of the syncytium occurs by the assembly of cell membranes that separate nuclei into individual cells, thus forming a cellular blastoderm.

During the syncytial blastoderm and cellularization stages, cells become progressively restricted in their developmental potential. This can be demonstrated experimentally by transplanting cellular blastoderm cells from one embryo into another. Blastoderm cells implanted into an equivalent region of a host embryo are incorporated normally into host structures, but those transplanted into different regions will develop autonomously into tissues reflecting the original position of the cells in the donor embryo. Thus, at the cellular blastoderm stage, cells have already become committed to differentiate into particular tissues.

Drosophila is typical of insects in the segmentation pattern of its adult body. Eight abdominal and three thoracic segments are easily distinguished (Figure 18.4c). The head consists of at least three distinct developmental segments. The segments of the insect body are first visible during embryogenesis, where they are indicated by the pattern of denticles (small hooks for gripping during larval movement) on the ventral epidermis. The body plan established during embryogenesis determines the organization of tissues and organs in the adult fly.

The Developmental Toolkit of Drosophila

Large-scale genetic screens (see Section 14.1) were commenced by Christiane Nüsslein-Volhard, Eric Wieschaus, and others in the late 1970s and early 1980s to identify and describe the function of genes directing pattern formation in Drosophila embryos. It is estimated that mutations in about 5000 of the 14,000 genes in Drosophila
will result in a lethal phenotype. Most mutations resulting in lethality affect genes that have essential cellular functions, and these genes are sometimes described as housekeeping genes. However, several hundred genes producing lethal phenotypes are involved directly in developmental programs of pattern formation during embryogenesis.

Nüsslein-Volhard and Wieschaus faced a significant challenge when designing genetic screens for mutations in pattern formation because flies in which segmental pattern formation is severely disrupted rarely survive beyond the larval stage. Their solution was to focus on embryos and larvae. They reasoned that mutations affecting embryonic pattern formation would not be lethal until larval formation, leaving a short window of time for observation of the effects of such mutations. From the types of spatial defect exhibited by the mutant phenotypes, mutants were grouped into four gene classes, with a fifth class identified earlier by Ed Lewis:

1. Coordinate genes: Defects affect an entire pole of the larva (Figure 18.5a).
2. Gap genes: Mutants are missing large, contiguous groups of segments (Figure 18.5b).
3. Pair-rule genes: Mutants are missing parts of adjacent segment pairs, in alternating patterns (Figure 18.5c).
4. Segment polarity genes: Defects affect patterning within each of the 14 segments (Figure 18.5d).
5. Homeotic genes: Defects affect the identity of one or more segments.

[Figure 18.5 panels, each showing an expression pattern, a wild-type larva, and mutant phenotypes. (a) Coordinate gene: mutation results in the loss of segments and mirror-image duplications of other segments; mutant shown: bicoid. (b) Gap gene: mutation results in the loss of contiguous sets of segments (9 genes); mutants shown: Krüppel, hunchback, knirps. (c) Pair-rule gene: mutation results in the loss of alternate parasegments (8 genes); mutants shown: even-skipped, odd-skipped. (d) Segment polarity gene: mutation results in defects within anterior or posterior regions of each segment (>15 genes); mutants shown: gooseberry, hedgehog. Expression-pattern labels: highest concentration of bicoid, knirps, hunchback, odd-skipped. T = thoracic segment; A = abdominal segment.]

Figure 18.5 Mutations causing defects in pattern formation in Drosophila. A fifth class of mutations, homeotic gene mutations, is represented in Figure 18.10.
These five gene classes are expressed sequentially during embryogenesis: The coordinate genes act first, followed by gap genes, pair-rule genes, segment polarity genes, and finally homeotic genes. The cascade of gene expression subdivides the embryo in successive steps, first into broad regions and then into progressively smaller domains, and each of the 14 resulting segments acquires a specific identity. The patterns of mRNA and protein expression of each gene correspond, both in space and in time, to its mutant phenotype (see Figure 18.5). For example, expression of the gap gene knirps spans a contiguous embryonic domain that is destined to become abdominal segments. These abdominal segments are missing in knirps mutants, as is evident in the early larva (see Figure 18.5b).

Expression of the pair-rule genes follows that of gap genes and each is expressed in 7 stripes in the embryo. Curiously, the stripes of gene expression of some pair-rule genes do not correspond to the segments of the adult insect, but rather straddle the boundaries between segments, thus occupying the posterior part of one segment and the anterior part of its neighbor. The domains of gene expression controlled by these pair-rule genes are therefore called parasegments. In contrast, expression of the segment polarity genes occurs in 14 polar stripes (i.e., each stripe has anterior and posterior “poles”) that do correspond to the segments of the embryo. The homeotic genes are the last to be expressed and affect broad domains of contiguous parasegments along the anterior–posterior axis. The anterior expression boundaries of the homeotic genes correspond to parasegment boundaries defined by the pair-rule genes. Thus, the sequential activation of different classes of genes during early development is reflected in the sequential subdivision of the organism, from a single-celled zygote into a segmented embryo.

When the expression pattern of a gene in a wild-type embryo corresponds precisely to the cell fates that are disrupted when the gene is mutated, the activity of the gene is said to be cell autonomous. A gene whose action is cell autonomous affects only the cells in which the gene is transcribed and expressed. Four of the five classes of genes act largely cell autonomously, an observation consistent with the identity of these genes as transcription factors. The exception is the segment polarity class of genes, which often encode signaling molecules that can act non-autonomously, that is, in cells other than where the gene is expressed. In the following sections, we examine how the embryo is successively subdivided by the activity of these sets of genes.

Maternal Effects on Pattern Formation

In animals, the mother often supplies critical gene products to the egg that subsequently direct embryo development. These genes are called maternal effect genes. Note that maternal effects are different from maternal inheritance (introduced in Chapter 17), in that maternal effects entail the maternal deposition of protein or mRNA in the egg cell, whereas maternal inheritance refers to maternal transmission of genetic material (e.g., organelle genomes).

How can the maternal effect genes that influence development be identified in mutant screens, given that for these genes, the embryonic phenotype is determined by the genotype of the mother rather than that of the embryo? An answer becomes apparent when we compare the inheritance patterns observed with maternal effect genes against those observed with zygotic genes, genes that are active only in the zygote or embryo. For zygotic genes, the genotype of the embryo determines the phenotype. The following cross illustrates this principle for an autosomal recessive mutation (m):

Inheritance Pattern with Zygotic Genes
Parents        Offspring       Phenotype
m/+ × m/+      m/+, +/+        Normal (3)
               m/m             Mutant (1)

With maternal effect genes, where the genotype of the mother determines the phenotype of the zygote, the same cross as above, involving an autosomal recessive mutation (m), would give the following outcomes:

Inheritance Pattern with Maternal Effect Genes
Parents (female × male)        Offspring           Phenotype
m/+ × m/+                      m/m, m/+, +/+       All normal
m/+ × m/m                      m/m, m/+            All normal
m/m × +/+ or m/+ or m/m        m/m, m/+            All mutant

These divergent patterns allow discrimination between maternal effect genes and zygotic genes. Crosses can be performed to determine whether the genes are active maternally, zygotically, or both. When such crosses were performed to test the five classes of pattern formation mutants described above, the coordinate genes were found to be maternally active; their expression in the mother rather than in the embryo provides positional information to the egg. Most gap genes are active zygotically, but at least one, hunchback, also exhibits maternal activity. All pair-rule, segment polarity, and homeotic genes act strictly zygotically. These findings make sense given the developmental stage at which the different classes of gene are active and the observation that zygotic gene expression commences only in the syncytial blastoderm stage of embryogenesis.

Coordinate Gene Patterning of the Anterior–Posterior Axis

The genetic control of development is essentially a process of regulating gene expression in three-dimensional space over time. It is not surprising, then, that most of the early-acting genes establishing the anterior–posterior axis of Drosophila encode transcription factors. The interaction of transcription factors with cis-acting regulatory elements of target genes provides spatial control of gene expression.
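Returning briefly to the two inheritance tables given under Maternal Effects on Pattern Formation, the contrast between zygotic and maternal effect genes can be checked with a short sketch. This is only an illustration under simple assumptions: a single autosomal recessive allele m, equal segregation of alleles, and genotypes written as strings such as "m/+". The function names (cross, zygotic_phenotypes, maternal_effect_phenotypes) are hypothetical and chosen for this example.

```python
from itertools import product


def cross(mother, father):
    """Return the four equally likely offspring genotypes of a cross.

    Genotypes are strings like "m/+"; one allele is drawn from each parent.
    Alleles are ordered so that heterozygotes are always written "m/+".
    """
    offspring = []
    for a, b in product(mother.split("/"), father.split("/")):
        offspring.append("/".join(sorted((a, b), reverse=True)))
    return offspring


def is_homozygous_mutant(genotype):
    return genotype == "m/m"


def zygotic_phenotypes(mother, father):
    """Zygotic gene: each embryo's own genotype determines its phenotype."""
    return ["mutant" if is_homozygous_mutant(g) else "normal"
            for g in cross(mother, father)]


def maternal_effect_phenotypes(mother, father):
    """Maternal effect gene: the mother's genotype determines the phenotype
    of every embryo, whatever the embryo's own genotype is."""
    label = "mutant" if is_homozygous_mutant(mother) else "normal"
    return [label for _ in cross(mother, father)]


# Heterozygous cross: 3 normal to 1 mutant for a zygotic gene,
# but all normal for a maternal effect gene.
print(cross("m/+", "m/+"))
print(zygotic_phenotypes("m/+", "m/+"))
print(maternal_effect_phenotypes("m/+", "m/+"))

# Homozygous mutant mother: all embryos mutant, regardless of the father.
for father in ("+/+", "m/+", "m/m"):
    print(father, maternal_effect_phenotypes("m/m", father))
```

The only difference between the two phenotype functions is whose genotype is inspected, the embryo's or the mother's, which is exactly the distinction the crosses are designed to reveal.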
This spatial control is coordinated over time by continual inputs from neighboring cells. In this section, we describe examples of the spatial and temporal regulation of gene expression that results in subdivision of a developing Drosophila embryo into its characteristic segments.

The coordinate gene bicoid plays a major role in the establishment of the anterior–posterior axis in Drosophila. Loss-of-function bicoid alleles result in a loss of anterior portions of the embryo; the anterior portions are replaced instead by a mirror-image duplication of posterior regions (Figure 18.6a). Bicoid mRNA is anchored to the anterior region of the egg during oogenesis in the mother (Figure 18.6b). After translation, the resulting protein (Bicoid) diffuses from its site of synthesis at the anterior pole of the embryo throughout the syncytial embryo, owing to the absence of cell membranes to impede protein diffusion. The diffusion results in a gradient of Bicoid in which the highest concentration is at the anterior end and very little Bicoid is detected beyond the middle of the embryo.

Cytoplasmic transplantation experiments elegantly demonstrate that Bicoid specifies anterior identity. Anterior cytoplasm extracted from a wild-type embryo and then injected into a bicoid mutant embryo causes anterior structures to develop at the site of injection (see Figure 18.6a, bottom panel). When the bicoid gene was cloned, similar experiments were carried out with purified bicoid mRNA, which produced the same result. These findings indicate that the concentration gradient of Bicoid provides positional information along the anterior–posterior axis of the embryo, presumably by differentially regulating several genes that respond to different concentrations of Bicoid. Among the known zygotic genes whose transcription is directly regulated by Bicoid is the gap gene hunchback.

Surprisingly, examination of the distribution of hunchback mRNA revealed that hunchback is also maternally expressed and that its maternal (mRNA) expression is uniform throughout the egg (Figure 18.7a). The hunchback protein (Hunchback), on the other hand, is found only at
the anterior end of the early embryo, implying that posterior hunchback mRNA is not translated. This seeming contradiction was explained by the discovery of another maternally expressed coordinate gene, nanos. The posterior end of the embryo is patterned by nanos, whose protein forms a gradient with the highest concentration at the posterior end. Rather than encoding a transcription factor, nanos encodes a protein that represses translation of hunchback mRNA. Thus, Hunchback is restricted to the anterior end of the embryo by posterior translational repression of maternal hunchback mRNA. In addition, zygotic hunchback expression in the anterior end is transcriptionally activated by anteriorly localized Bicoid.

[Figure 18.6 panel labels: (a) Wild-type embryo, anterior to posterior. (b) bicoid mRNA (blue); translation, diffusion.]
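The restriction of Hunchback to the anterior end can be illustrated with a small numerical sketch: a uniformly distributed maternal mRNA combined with a posterior-to-anterior gradient of a translational repressor yields an anterior-to-posterior protein gradient. The linear Nanos profile, the linear repression rule, and the names nanos_level and translated are assumptions made purely for illustration; the text does not specify the actual shapes of these curves.

```python
def nanos_level(x):
    """Hypothetical Nanos profile along the embryo.

    x is the relative position, 0.0 at the anterior pole and 1.0 at the
    posterior pole. The linear shape is an assumption for illustration.
    """
    return x


def translated(mrna, repressor):
    """Protein produced from an mRNA in the presence of a translational
    repressor. Assumed rule: output falls linearly with repressor level
    and never drops below zero."""
    return mrna * max(0.0, 1.0 - repressor)


positions = [i / 10 for i in range(11)]   # anterior (0.0) to posterior (1.0)
maternal_hunchback_mrna = 1.0             # uniform throughout the egg

hunchback = [translated(maternal_hunchback_mrna, nanos_level(x))
             for x in positions]

for x, level in zip(positions, hunchback):
    print(f"position {x:.1f}: Hunchback {level:.2f}")
```

The same helper could be reused for any uniformly distributed maternal mRNA and any graded translational repressor; only the direction of the repressor gradient changes the direction of the resulting protein gradient.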
Patterning of the posterior end of the embryo is governed by similar interactions. In addition to acting as a transcription factor, Bicoid acts as a translational repressor of the maternally supplied caudal mRNA, which is uniformly distributed throughout the egg. Translational repression of caudal mRNA by the anterior gradient of Bicoid results in a posterior gradient of caudal protein (Caudal). The end result is an embryo with graded distributions of three transcription factors: Bicoid and Hunchback, in which the highest concentration is at the anterior end; and Caudal, in which the highest concentration is at the posterior end. The relative concentrations of these three proteins provide positional information along the length of the embryo, which is interpreted by the subsequently acting gap genes.

[Figure 18.6 annotations: (a) Loss of bicoid activity results in loss of anterior segments and duplication of posterior abdominal segments (A7, A8, anal plate [ap]). Injecting bicoid mRNA into an ectopic position (red) of a bicoid embryo results in a mirror-image duplication of anterior thoracic segments (T1) flanking the site of injection. (b) Bicoid protein (brown).]

Figure 18.6 Maternal bicoid patterning of the embryo along the anterior–posterior axis.
Q Nanos protein is localized to the posterior terminus similar to the way that Bicoid is localized to the anterior end. Nanos acts as a translational repressor. Compare the actions of Nanos and Bicoid with that of inhibitors and inducers (defined in Figure 18.3).

Domains of Gap Gene Expression

The broad gradients of maternally supplied coordinate gene products are transformed into domains of gap gene expression with discrete boundaries. This occurs through a combination of cooperative binding of transcription factors (similar to the activation of the lambda repressor described in Chapter 12) and cross-regulatory interactions among the gap genes themselves. To begin, let’s
consider further how the gradual concentration gradient of Bicoid is translated into the more discrete pattern of hunchback mRNA expression.

[Figure 18.7 labels: Protein expression in early embryo (Hunchback, Nanos, Bicoid). Successive deletions of Bicoid binding sites result in progressive loss of hunchback mRNA expression.]

Figure 18.7 Gap gene expression patterns are activated by coordinate genes.

As noted earlier, zygotic expression of the gap gene hunchback is confined to the anterior region of the embryo. Unlike Bicoid, which exhibits a gradual concentration gradient, the concentration of hunchback mRNA produced in the embryo declines precipitously at a particular point along the anterior–posterior axis. Transcription of hunchback is activated by the binding of Bicoid to cis-regulatory elements 5′ to the hunchback coding region (Figure 18.7b). In this location, there are multiple cis-acting sites to which Bicoid can bind, and these sites are bound in a cooperative manner, meaning that the binding of one Bicoid molecule to one site facilitates the binding of a second Bicoid molecule to a second nearby site, and so on. Mutation of the Bicoid binding sites alters the responsiveness of hunchback expression to Bicoid, and removal of all binding sites abolishes hunchback expression in the embryo (Figure 18.7c).

A threshold level of Bicoid must be present for hunchback expression to be activated. Consequently, hunchback expression occurs on one side of a threshold concentration with no expression on the other, and a sharp boundary is produced. In this manner, the gradual anterior concentration gradient of Bicoid is translated into a distinct anterior region of hunchback mRNA expression, which, after translation, produces a sharp gradient of Hunchback (see Figure 18.7a).

The gradient of hunchback protein is critical for the regulation of other gap genes, such as Krüppel (Figure 18.8), which is repressed by high levels of Hunchback but activated in the central region of the embryo where Bicoid levels are moderate. These interactions establish the anterior margin of Krüppel expression toward the posterior end of the Hunchback protein gradient. The posterior margin of Krüppel expression appears to be determined through negative regulation by other gap genes, knirps and giant. Similar regulatory interactions between other gap genes help establish the rest of the partially overlapping patterns of gap gene expression that subdivide the developing embryo into discrete domains.

Regulation of Pair-Rule Genes

From the domains of gap gene expression emerge narrower stripes of gene expression that represent the first manifestation of segmentation of the anterior–posterior body plan. Analysis of the regulation of the pair-rule gene even-skipped (eve) revealed that each stripe is established by independent enhancer modules of cis-acting regulatory sequences. Each enhancer module responds to specific combinations of gap genes (Figure 18.9a). Thus, the formation of stripes of gene expression is the result of combinatorial control of gene expression through multiple cis-acting regulatory elements of the pair-rule genes.

Stripe 2 of eve provides an example of modularity in gene regulation. Gene expression within stripe 2 is controlled
[Figure 18.9 (partial): (a) The pair-rule gene even-skipped (eve) and its enhancer modules, with adjacent genes. Other surviving labels: Hunchback protein, hunchback, Giant, bicoid.]
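The combinatorial rule for eve stripe 2 discussed in this passage, expression only where both positive regulators (Bicoid and Hunchback) are present and both negative regulators (Giant and Krüppel) are absent, can be written as a small logical sketch. The numerical cutoffs, the concentration values, and the region names below are hypothetical and in arbitrary units; they are chosen only to show how the integration of four inputs can pick out a single narrow zone, not to reproduce measured profiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Levels:
    """Regulator concentrations at one position (arbitrary units)."""
    bicoid: float
    hunchback: float
    giant: float
    kruppel: float


# Hypothetical cutoffs, assumed for illustration only.
ACTIVATOR_MIN = 0.3   # an activator counts as "present" at or above this
REPRESSOR_MAX = 0.2   # a repressor counts as "absent" at or below this


def eve_stripe2_on(c):
    """Stripe 2 enhancer logic: both activators present AND both
    repressors absent."""
    activators_present = (c.bicoid >= ACTIVATOR_MIN
                          and c.hunchback >= ACTIVATOR_MIN)
    repressors_absent = (c.giant <= REPRESSOR_MAX
                         and c.kruppel <= REPRESSOR_MAX)
    return activators_present and repressors_absent


# Made-up example positions, each described only by its regulator levels.
embryo = {
    "giant_high_region": Levels(bicoid=0.9, hunchback=0.9, giant=0.8, kruppel=0.0),
    "stripe2_zone": Levels(bicoid=0.5, hunchback=0.8, giant=0.1, kruppel=0.1),
    "kruppel_high_region": Levels(bicoid=0.3, hunchback=0.4, giant=0.0, kruppel=0.9),
    "no_activator_region": Levels(bicoid=0.0, hunchback=0.0, giant=0.0, kruppel=0.0),
}

stripe = [name for name, levels in embryo.items() if eve_stripe2_on(levels)]
print(stripe)
```

Because each eve stripe has its own enhancer module, a sketch of another stripe would be a separate function with a different combination of inputs, which is one way to picture the modularity described in the text.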
established gradient. Thus the position of eve stripe 2 along the anterior–posterior axis is a zone with a high concentration of Hunchback, low concentrations of Giant and Krüppel, and an intermediate concentration of Bicoid. Only in parasegment 3, which is the location of stripe 2, are both positive regulators present and both negative regulators absent (Figure 18.9c). This combination of gap and coordinate protein concentrations does not occur anywhere else along the axis of the embryo and uniquely defines the eve stripe 2 position. The integration of positive and negative regulators results in the precise limiting of even-skipped stripe 2 to a region only a few cells in width along the anterior–posterior axis. Similar combinatorial mechanisms are thought to control the expression patterns of all of the pair-rule and segment polarity genes.

The discovery that in multicellular organisms the control of gene expression is modular provided important insight into the evolution of organisms. Modularity of gene regulation allows changes in specific domains of expression without catastrophic disruption of global expression patterns.

Specification of Parasegments by Hox Genes

Having explored the mechanisms by which gap and pair-rule genes successively subdivide the Drosophila embryo into segments and parasegments, we can now consider how each segment acquires a unique identity through the action of the homeotic genes. Once again, the key discoveries were made through the study of mutations, pioneered by Edward B. Lewis starting in the 1950s.

As we saw at the beginning of the chapter, a remarkable aspect of homeotic mutant phenotypes is the development of relatively normal structures in inappropriate positions. Another general feature of homeotic mutations is that they cause identity transformations of serially repeated structures. Legs, for example, are appendages that are normally limited to the three thoracic segments in Drosophila, whereas antennae are appendages that normally develop only on the third cephalic (head) segment. In the case of Antennapedia mutants, however, a leg appears in a segment ordinarily reserved for an antenna (see Figure 18.1), suggesting that Antennapedia normally specifies the identity of one or more of the thoracic segments. Analyses of homeotic genes in Drosophila demonstrate that in fact they act in combination to specify the identity of each of the 14 body segments.

The homeotic genes of animals are also remarkable for being clustered in gene complexes. In Drosophila there are two homeotic clusters on the third chromosome: the Antennapedia complex, consisting of five genes, and

reflects the positions along the anterior–posterior axis that are influenced by each gene (Figure 18.10).

The cloning of the homeotic genes revealed another surprise: All eight genes encode closely related proteins, suggesting that all members of the complex were derived from a common ancestor through a series of gene duplications. All of the genes share a conserved sequence of DNA of 180 nucleotides that was dubbed the homeobox, which encodes a 60–amino acid protein domain, termed the homeodomain, with a helix-turn-helix motif. Such motifs had previously been recognized in bacterial and phage transcription factors, such as the Lac repressor and the lambda repressor proteins. They function to bind cis-regulatory DNA sequences of target genes. Since the homeobox genes of the Antennapedia and bithorax complexes share both molecular and functional similarity as well as having a common evolutionary origin, they are known collectively as Hox genes.

The patterns of Hox gene expression correlate with the regions affected in the corresponding mutants. Each of the Hox genes has a well-defined anterior boundary of expression but in most cases a more diffuse boundary on the posterior end, resulting in overlapping domains of Hox gene expression. The anterior boundaries of Hox gene expression do not correspond to segmental boundaries but rather to boundaries of segment polarity gene expression. Thus, Hox gene expression is out of register with the groups of cells that give rise to segments in the adult fly and instead marks the boundaries of parasegments.

Because of the parasegmental pattern of Hox gene expression, mutations of those genes affect cellular identity in a parasegmental manner. Each parasegment of the embryo expresses a unique combination of Hox gene products, giving each parasegment a specific identity. The activation of Hox genes is controlled by the earlier-acting gap and pair-rule genes in a combinatorial manner similar to that described for the activation of pair-rule genes by the gap and coordinate genes. In the absence of all Hox gene activity, segments are formed, but they all differentiate into a “default” state that resembles a head segment. This outcome indicates that Hox genes are not required for the formation of the segments but rather for the specification of their identity.

The Antennapedia Complex
The Antennapedia complex consists of five Hox genes (labial, Deformed, Sex combs reduced, proboscipedia [Pb], and Antennapedia) that act in combination to specify the cephalic and thoracic parasegments (see Figure 18.10c). The original Antennapedia mutant (see Figure 18.1) was dominant and was found to be the result of a gain-of-function allele (see Section 4.1). The Antennapedia gene is normally expressed only in para-
the bithorax complex, consisting of three genes. In other segments 4 and 5 (see Figure 18.10c), which give rise to
organisms, the homeotic genes are usually in a single clus- thoracic segments that each produce a pair of legs. In flies
ter. Amazingly, the order of the genes within the complexes carrying the dominant Antennapedia mutation, however,
[Figure (image not recovered): (a) Adult body segments T1–T3 and A1–A8; (b) In vivo Hox gene expression patterns, labeled Abd-B and abd-A, across parasegments 1–14.]
Antennapedia is expressed ectopically, meaning it is expressed at an inappropriate time or place or both. One of the normal roles of Antennapedia expression in the thoracic segments is to promote the differentiation of thoracic appendages into legs. When expressed ectopically in the third head segment, Antennapedia inappropriately promotes differentiation of head appendages (antennae) into legs instead.

The bithorax Complex In contrast to Antennapedia mutations, which affect anterior body segments, mutations in the three genes of the bithorax complex (Ultrabithorax, abdominal-A, and Abdominal-B) affect more-posterior segments (Figure 18.11a). The bithorax complex genes are expressed in overlapping sets of thoracic and abdominal parasegments and act in combination to specify the identity of those parasegments. How do only three genes specify the identity of nine segments, one thoracic and eight abdominal? The three genes vary not only in their spatial patterns of expression but also in expression levels between segments. Each has a sharp anterior border of expression and a more diffuse posterior boundary of expression. Thus, each segment exhibits a unique qualitative and quantitative pattern of Hox gene expression.

Loss of Ultrabithorax activity results in parasegments 5 and 6 having a combination of Hox gene products resembling that normally found in parasegment 4. This causes transformations of the identity of thoracic segment T3 and abdominal segment A1 into thoracic segment T2 (Figure 18.11b). Loss of the entire bithorax complex causes most abdominal segments to develop as T2, so each has legs as appendages (Figure 18.11c). This observation suggests that expression of Antennapedia, which promotes leg identity in appendages, extends posteriorly in such mutants and that genes of the bithorax complex normally repress posterior expression of Antennapedia. Such cross-regulatory interactions between Hox genes, whereby more posteriorly expressed Hox genes repress the expression of Hox genes normally expressed in more-anterior positions, are a common
although not universal feature in the regulation of Hox genes (Figure 18.11d–e).

Figure 18.11 Cross-regulatory interactions between bithorax complex genes, specifying thoracic and abdominal segment fates.
(a) Wild type
    Parasegments: 3 4 5 6 7 8 9 10 11 12 13 14
    Expression shown: Abdominal-B (Abd-B; blue), abdominal-A (abd-A; green), Ultrabithorax (Ubx; red)
    Segments: T1 T2 T3 A1 A2 A3 A4 A5 A6 A7 A8
    Both Ubx and abd-A have a diffuse posterior boundary of expression due to negative regulatory interactions between genes.
(b) Loss of Ubx
    Parasegments: 3 4 4 4 7 8 9 10 11 12 13 14
    Expression shown: Abdominal-B, abdominal-A
    Segments: T1 T2 T2 T2 A2 A3 A4 A5 A6 A7 A8
    T3 and A1 are incorrectly specified as T2 due to a failure to repress Antennapedia in these segments.
(c) Loss of all bithorax complex (Ubx, abd-A, and Abd-B)
    Parasegments: 3 4 4 4 4 4 4 4 4 4 4 14
    Segments: T1 T2 T2 T2 T2 T2 T2 T2 T2 T2 T2
    All segments posterior to T1 differentiate as T2 due to a failure to repress Antennapedia in all posterior segments.
(d) Loss of abd-A and Abd-B
    Parasegments: 3 4 5 6 6 6 6 6 6 6 6 14
    Expression shown: Ultrabithorax
    Segments: T1 T2 T3 A1 A1 A1 A1 A1 A1 A1 A1
    All abdominal segments differentiate as A1 due to failure of abd-A and Abd-B to repress Ubx expression in posterior segments.
(e) Loss of Abd-B
    Parasegments: 3 4 5 6 7 8 9 9 9 9 9 14
    Expression shown: abdominal-A, Ultrabithorax
    Segments: T1 T2 T3 A1 A2 A3 A4 A4 A4 A4 A4
    Ubx and abd-A are both expressed more posteriorly due to loss of repression by Abd-B, leading to most posterior abdominal segments differentiating as A4.

As you have probably noticed, there is no single Hox gene called bithorax; so what became of the original bithorax (bx) mutation that was isolated by Calvin Bridges? When Ed Lewis recognized that mutations such as bithorax could provide valuable insights into the genetic mechanisms of development, he began collecting mutations with similar but distinct phenotypic defects, some of which he called postbithorax (pbx), Contrabithorax, Ultrabithorax, and bithoraxoid (bxd). Each of these mutations mapped to a different position in the same chromosomal region, so that they were separable by recombination events, and double-mutant combinations could be constructed. At the time Lewis performed these studies, molecular cloning was unknown, and he assumed that each mutant he identified represented a different gene. When the bithorax complex was eventually cloned in 1983, however, many of the mutant phenotypes were found to result from mutations in different enhancer modules controlling the expression of a single coding region that is now called the Ultrabithorax gene (Figure 18.12a). Mutations of the regulatory elements can be either recessive, if in an enhancer module that acts to positively regulate gene expression, or dominant, if in a silencer module that acts to negatively regulate gene expression. Whereas null loss-of-function alleles of Ultrabithorax result in embryo lethality, disruption of single enhancer modules results in milder defects. For example, recessive Ultrabithorax^bithorax mutations (bx) result in the transformation of the anterior part of T3 into T2, causing the anterior portion of the haltere to develop as a wing (Figure 18.12b). Conversely, recessive Ultrabithorax^postbithorax mutations (pbx) result in the transformation of the posterior region of T3 into T2 identity, and the posterior portion of the haltere develops as a wing. Only in the Ultrabithorax^bithorax Ultrabithorax^postbithorax double mutant is the identity of the entire T3 segment transformed into a T2 identity, causing a four-winged fly to develop (see Figure 18.1).

The cis-regulatory elements of Ultrabithorax span over 120 kb (see Figure 18.12a), and their modularity allows the evolution of changes in gene expression without catastrophic disruption of Ultrabithorax function, such as those caused by nonsense mutations within the coding region. Thus, Ultrabithorax^bithorax Ultrabithorax^postbithorax double mutants survive to adulthood because the remainder of the cis-regulatory elements controlling Ultrabithorax expression are intact. Genetic Analysis 18.1 asks you to evaluate cross-regulatory interactions among Hox genes.

Downstream Targets of Hox Genes

Given that combinatorial action of the Hox genes specifies parasegment identity and that Hox genes encode transcription factors, it follows that the downstream target genes activated by the Hox genes must differ between segments. These Hox target genes have been called realizator genes, and their expression contributes to the characteristic
[Figure (image not recovered): labels Anterior, Posterior, T3, Haltere.]
morphology of each segment. As an example, let’s consider the formation of appendages on each segment.

Wild-type flies have antennae on the most-anterior head segment and have mandibles and maxillary and labial sense organs on other head segments. The three thoracic segments have legs; T2 and T3 also have wings and halteres, respectively. The eight abdominal segments lack appendages. Loss of all Hox activity is lethal to the embryo and causes all segments to resemble a head segment having antennae as appendages. This outcome indicates that all segments have the potential to form an appendage, and that expression of Hox genes can either specify the appendage identity or repress its formation.

The formation of an appendage is dependent on a gene called Distal-less. In wild-type Drosophila, Distal-less is expressed in the head and thoracic segments but not in any abdominal segments. This pattern suggests that the abdominal segment identity genes, Ultrabithorax, abdominal-A, and Abdominal-B, negatively regulate Distal-less expression in the abdominal segments. Loss of function of all bithorax complex genes results in ectopic Distal-less expression in all abdominal segments, along with a concomitant development of appendages (legs) on all abdominal segments. Conversely, if Ultrabithorax is ectopically expressed at high levels throughout the embryo, Distal-less is not activated in any segment and no appendages are formed. Thus, action of specific bithorax complex Hox proteins on Distal-less cis-regulatory sequences represses Distal-less gene expression in the abdominal segments. The identity of the appendages is determined by the combinatorial activity of the Hox genes in conjunction with Distal-less. For example, the identity of the T1 leg is specified by Distal-less and Sex combs reduced, whereas the identity of the T2 leg is specified by Distal-less and Antennapedia.

Hox Genes throughout Metazoans

Soon after the discovery of Hox gene clusters in Drosophila, researchers began to inquire whether Hox genes are a peculiarity of Drosophila development, or whether they are found in a broader range of species. Many developmental biologists did not expect to find Hox genes in other animals, since there was no reason to expect that other animals would use the same genes to direct very different developmental programs. However, cross-hybridization studies using Drosophila Hox sequences as molecular probes revealed Hox gene sequences in the genomes of all animals, including insects, spiders, molluscs, and vertebrates (such as humans). This revelation suggested a common developmental mechanism among animals.

Subsequent experiments showed not only that most animals have clusters of Hox genes but also that they are arranged in a manner similar to that in Drosophila (Figure 18.13). Each cluster consists of genes corresponding to those in the bithorax and Antennapedia clusters of Drosophila, with some minor deletions and duplications. For example, as in Drosophila, the mouse Hox genes are expressed in an anterior-to-posterior pattern that corresponds to the chromosomal position of the genes within
GENETIC ANALYSIS 18.1

PROBLEM Why do loss-of-function mutations in bithorax complex genes result in homeotic transformations of parasegments into identities that correspond to more-anterior parasegments, whereas gain-of-function mutations (see Section 4.1) tend to result in identities corresponding to more-posterior parasegments?

BREAK IT DOWN: The bithorax complex genes specify identity along the anterior–posterior axis of Drosophila (see p. 675).

BREAK IT DOWN: In a homeotic transformation, a normal body part is replaced by another body part normally found in another region of the body.

Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   The subject of this question is the effect of mutations in the bithorax complex on segment pattern formation. The answer requires descriptions of why loss-of-function mutations lead to segments that resemble more-anterior segments, whereas gain-of-function mutations lead to the formation of segments that resemble more-posterior segments.
2. Identify the critical information given in the problem.
   The question suggests there is a key difference between the effects of loss-of-function mutations and gain-of-function mutations of the bithorax complex.

Deduce
3. Review the general patterns of expression and segmental pattern formation resulting from the normal expression of homeotic genes.
   TIP: Use Hox genes as an example of a set of developmental genes.
   Homeotic genes, such as the Hox genes, specify segment identity in a combinatorial manner through overlapping expression domains in parasegments. Each gene has a well-defined anterior boundary but a more diffuse posterior boundary. Cross-regulatory interactions refine Hox gene expression domains, with more-posterior genes repressing more anteriorly expressed genes.
4. Review the general pattern of expression and the normal segmental pattern formation of bithorax genes.
   The bithorax complex consists of three genes, Ubx, abd-A, and Abd-B. Ubx is expressed in the anterior abdominal segments and posterior thoracic segments, abd-A is expressed in the middle abdominal segments, and Abd-B is expressed in the posterior abdominal segments. Segment identity is specified by the combination of Hox gene products and their levels of expression.

Solve
5. Explain why loss-of-function mutations of bithorax genes lead parasegments to take on a more-anterior identity.
   TIP: Consider the cross-regulatory interactions of the Hox genes.
   The loss of function of a posterior gene leads to both the absence of expression of the mutant gene and posterior expansion in the expression domains of more-anterior genes. For example, the posterior gene Abd-B acts to repress abd-A in the most-posterior segments. Loss-of-function mutations in Abd-B result in a posterior expansion of abd-A expression into more-posterior abdominal segments. The result is that both middle and posterior abdominal segments acquire an identity that is similar to that of the middle abdominal segments, a homeotic transformation to more-anterior identity.
6. Explain why gain-of-function mutations of bithorax genes lead parasegments to take on a more-posterior identity.
   TIP: Gain-of-function Antennapedia mutations cause legs (a posterior structure) to develop in the position normally occupied by antennae (an anterior structure).
   Gain-of-function mutations cause gene expression at inappropriate times and locations. Gain-of-function alleles often, but not always, result in Hox gene expression in a more-anterior domain than in wild-type animals, thus resulting in homeotic transformations to a more-posterior identity.

For more practice, see Problems 6, 7, 21, and 24. Visit the Study Area to access study tools. Mastering Genetics
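The reasoning in steps 3 through 5 can be made concrete with a small Python sketch. It is a toy model rather than anything given in the text. It assumes that each Hox gene determines the identity of one contiguous block of parasegments (boundaries read off the panels of Figure 18.11), and that when a gene is lost its block adopts the most-posterior identity of the nearest functional gene lying anterior to it. The names `DOMAINS`, `parasegment_identities`, and `as_row` are hypothetical.

```python
# Toy model (not from the text): posterior dominance among Hox genes.
# ASSUMPTION: each gene is assigned the block of parasegments whose identity
# it determines, with boundaries read off the panels of Figure 18.11.
DOMAINS = [               # listed anterior -> posterior
    ("Antp", 4, 4),
    ("Ubx", 5, 6),
    ("abd-A", 7, 9),
    ("Abd-B", 10, 13),
]


def parasegment_identities(lost=()):
    """Map parasegments 4-13 to the parasegment identity they adopt.

    `lost` names genes carrying loss-of-function mutations. A parasegment
    whose own gene is lost adopts the most-posterior identity specified by
    the nearest functional gene anterior to it (None if there is none).
    """
    identities = {}
    for index, (gene, first, last) in enumerate(DOMAINS):
        fallback = None
        if gene in lost:
            # Walk anteriorly until a functional gene is found.
            for prev_gene, _, prev_last in reversed(DOMAINS[:index]):
                if prev_gene not in lost:
                    fallback = prev_last
                    break
        for ps in range(first, last + 1):
            identities[ps] = fallback if gene in lost else ps
    return identities


def as_row(identities):
    """Identities of parasegments 4..13 as a list, for comparison with the figure."""
    return [identities[ps] for ps in range(4, 14)]


print(as_row(parasegment_identities()))                # wild type
print(as_row(parasegment_identities(lost={"Ubx"})))    # compare panel (b)
print(as_row(parasegment_identities(lost={"Abd-B"})))  # compare panel (e)
```

Under these assumptions the rows for loss of Ubx, loss of abd-A and Abd-B, loss of Abd-B, and loss of all three genes match the parasegment rows of panels (b) through (e). The model ignores expression levels and the diffuse posterior boundaries, so it should be read only as a bookkeeping aid for the direction of the transformations.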
the Hox clusters. This pattern suggests that Hox genes also specify identity along the anterior–posterior axis of the mouse and, by extension, of mammals in general.

The conservation of Hox gene clusters among animals indicates that a common ancestor possessed a Hox gene cluster specifying pattern formation along its anterior–posterior axis. This cluster was duplicated during the evolution of the vertebrate genome, which has four copies. The conservation of the Hox complexes for more than 500 million years suggests that the spatial colinearity of Hox genes along the chromosome with their expression along the body axis is essential for optimal functionality.

Mouse embryos with loss-of-function alleles of Hox genes, constructed using gene-targeting techniques described in Section 15.2, exhibit defects in the identity of serially
[Figure 18.13 labels (image not recovered). Taxa: sponges, cnidarians, fruit fly, onychophoran, nematode, priapulid (?), polychaete, leeches, nemertean, flatworms, gastropod, brachiopod, mouse, amphioxus, sea urchin, with a group label "Bilaterians". Fruit fly genes: lab, pb, zen, bcd & z2, Dfd, Scr, ftz, Antp, Ubx, abd-A, Abd-B. Mouse gene positions: 1–13.]

Figure 18.13 Occurrence and arrangement of Hox complexes in metazoans. Hox genes have not been detected in choanoflagellates, single-celled organisms that represent the sister clade to metazoans, but they are present in all metazoans. In the vertebrate lineage (exemplified by the mouse), the entire complex has been duplicated twice, resulting in four Hox complexes. Such events have produced duplicated genes that were later co-opted to new developmental functions.
repeated structures. For example, loss of Hox function results in a homeotic transformation of the lumbar and sacral vertebrae, which do not normally bear ribs, into structures resembling more-anterior thoracic vertebrae that do carry ribs (see Figure 14.1). These and additional Hox gene mutations suggest Hox genes direct the development of body plans in chordates as well as in annelids, arthropods, molluscs, nematodes, and other animals.

Studies of Hox complexes in other metazoans reveal that gene duplication took place before the divergence of bilaterian animals (animals that have bilateral symmetry). Thus, all bilaterian animals have essentially the same homeotic gene toolkit to pattern their anterior–posterior axis. This homology indicates that the differences between animals reflect how the toolkit is employed rather than differences in the component parts. Indeed, large-scale sequencing of cnidarian (jellyfish, sea anemone) genomes suggests that other components of the genetic toolkit are also largely shared by all metazoans. Given that all animals share fundamental developmental patterning processes and genes, much of what we learn from the study of model animals such as Drosophila, Caenorhabditis elegans, and mice can be extended to other members of the animal kingdom, including ourselves.

Stabilization of Cellular Memory by Chromatin Architecture

The preceding sections describe how the basic body plan of Drosophila is established in early embryogenesis by the action of coordinate, gap, and segmentation genes and through spatially restricted patterns of Hox gene expression that specify segmental identity. The patterns of Hox gene expression are then faithfully propagated throughout the remainder of embryonic development. The proteins that activate Hox gene expression have an ephemeral pattern of expression; it disappears soon after Hox expression patterns are initiated. Thus, one challenge cells face during embryonic development is for specific lineages to maintain their identity as they proliferate.

Genetic screens for homeotic genes revealed that mutations at loci other than those encoding the Hox genes can also produce homeotic mutant phenotypes. In general, mutations at these other loci fall into two classes. The first class, exemplified by trithorax mutations, produces phenotypes
18.3 Cellular Interactions Specify Cell Fate 679
reminiscent of multiple Hox loss-of-function mutations. In contrast, phenotypes of mutants of the second class, exemplified by Polycomb mutations, often resemble multiple gain-of-function alleles of Hox genes. At the molecular level, expression of multiple Hox genes is found to be ectopic in Polycomb mutants and reduced in trithorax mutants. Although Hox gene expression is established normally in both Polycomb and trithorax mutants, the expression either fails to be maintained (trithorax mutants) or is later activated in inappropriate locations (Polycomb mutants). Thus, rather than “remembering” what type of tissue they are destined to form, mutant trithorax and Polycomb cell lineages appear to “forget” their identity.

Recall the discussion in Section 13.2 of how Trithorax group (TrxG) and Polycomb group (PcG) protein complexes activate or repress, respectively, gene expression via chromatin modification. These proteins provide a type of epigenetic cellular memory that is propagated through cell divisions occurring long after the initial activators of Hox gene expression patterns have disappeared.

Study of trithorax and Polycomb mutants has helped clarify that the establishment of euchromatic or heterochromatic chromatin at specific developmental genes is a primary mechanism by which the potential fates of cells become restricted as development proceeds from totipotent zygote to differentiated cell types. The relative rigidity or plasticity of these different chromatin states is directly responsible for a cell’s ability to express some genes and not express others, thus influencing the developmental potential of particular cell types.

18.3 Cellular Interactions Specify Cell Fate

[Figure 18.14 panel text (image not recovered):
(a) Six cells, P3.p to P8.p, have potential to develop into vulva. Labels: anchor cell (AC); vulval precursor cells (VPCs) P3.p, P4.p, P5.p, P6.p, P7.p, P8.p; lin-3 expression in anchor cell.
(b) The three cells closest to the anchor cell (P5.p to P7.p) form the vulva; the other cells develop into hypodermis. Fates of P3.p to P8.p: 3°, 3°, 2°, 1°, 2°, 3°. One cell has 1° identity and forms the central part; two flanking cells adopt 2° fate and form peripheral parts.
(c) Loss of the anchor cell results in loss of vulval development; all cells adopt hypodermal fate.]
rise to structures of the vulva itself: One is called the primary (1°) cell, and the other two are called secondary (2°) cells of the vulva. The other three cells differentiate into hypodermis and are called tertiary (3°) cells. The VPC closest to a specific gonadal cell called the anchor cell differentiates as the 1° cell and forms the central part of the vulva. The two cells flanking the 1° cell differentiate as the 2° cells and form the peripheral regions of the vulva. The 1° and 2° fates can be easily distinguished by their distinct cell-division patterns.

Initially, each of the six VPCs has the potential to differentiate along any of the pathways: 1°, 2°, or 3°. This flexible cell-fate potential is demonstrated by laser-ablation experiments that destroy the anchor cell or one or more VPCs (Figure 18.14c). If the anchor cell is destroyed, no vulva will form, because all six VPCs differentiate with a 3° fate and become hypodermis. This suggests that the anchor cell must be present to induce VPCs to differentiate with 1° or 2° fates and thus form the vulva. Alternatively, if the VPC closest to the anchor cell is ablated, one of the cells that would normally differentiate with a 2° fate instead develops with a 1° fate and the two cells flanking this new 1° cell differentiate as 2° cells, suggesting that any of the VPCs can differentiate with a 1° or 2° fate.

What limits the number of VPCs destined to form the vulva to three? Given the loss of both the 1° and 2° fates when the anchor cell is removed, researchers hypothesized that the anchor cell might provide an inductive signal to induce vulval cell differentiation (Figure 18.14d). If this inductive signal is disseminated in a gradient, the cell closest to the anchor cell could acquire a different fate than cells that are more distant.

As predicted by the inductive interaction model, mutations that eliminate either the inductive signal or the ability of cells to respond to the inductive signal result in a loss of vulval development, and all VPCs differentiate as hypodermis (Figure 18.15a). This mutant phenotype is called the vulva-less phenotype. In contrast, mutations that disseminate the inductive signal to all VPCs cause all VPCs to differentiate into vulval cells, producing a multi-vulva phenotype. Multi-vulva mutants lay eggs similarly to normal worms; however, the fertilized eggs of vulva-less worms cannot be laid and instead develop and hatch inside the mother’s uterus. Progeny developing in the uterus eventually consume their mother from the inside and then hatch out of the carcass.

Recessive loss-of-function alleles at several loci produce a vulva-less phenotype. These genes encode proteins that either act in the production of the inductive signal from the anchor cell or facilitate cell response to the inductive signal (Figure 18.15b). For example, the lin-3 gene encodes a small, secreted protein expressed only in the anchor cell and acting as the inductive signaling molecule (see Figure 18.14a and d). Mutations that result in a loss of active LIN-3 protein result in the loss of the inductive signal from the anchor cell. In contrast, the let-23 and let-60 genes are expressed in the VPCs and act as the receptor (LET-23) for the lin-3–encoded signal and as a signal transduction molecule (LET-60) that communicates the signal from the plasma membrane to the nucleus, where changes in gene expression are induced. The absence of a receptor for LIN-3, or the inability to transmit receipt of the signal, blocks the normal developmental fate of VPCs.

[Figure 18.15 panel labels (image not recovered): (a) Wild type (vulva); mutagenize and screen for mutants; vulva-less (recessive lin-3, let-23, let-60 alleles); multi-vulva (dominant let-23 and let-60 alleles). (b) Anchor cell; LIN-3; LET-23; vulval precursor cell; LET-60; nucleus; vulval fate; epidermal fate.]
Figure 18.15 Genetic analysis of vulval development in C. elegans.

Epistatic analysis of developmental pathways, conducted by studying multiple mutant combinations, is used to identify groups of genes that interact to control a particular cellular process or pathway and to establish an order-of-function map for the genes in the pathway (see Section 4.3). Genetic analysis of developmental pathways can be more complicated than analysis of biochemical pathways because often there is no way of assaying intermediate steps in the developmental pathway. The analysis of double mutants and the availability of gain-of-function alleles can be crucial in these endeavors, as the studies of vulva-less and multi-vulva mutants in C. elegans show (Figure 18.16).

In the case of recessive loss-of-function alleles of lin-3, let-23, and let-60, all single mutants have the same phenotype, suggesting all these genes might act in the same pathway (Figure 18.16b). However, all double-mutant loss-of-function combinations also exhibit a vulva-less
phenotype, which complicates the effort to discover the order of genes in the pathway.

As shown in Figure 18.15, genetic screens of C. elegans identified dominant multi-vulva mutations in which all VPCs differentiated as 1° or 2° cells. Two of the dominant mutations mapped to the same positions as let-23 and let-60, suggesting that they might be gain-of-function alleles of these genes, and both dominant mutant alleles proved to be epistatic to (that is, to suppress or repress expression of) recessive loss-of-function alleles of lin-3 (i.e., the double mutants have a multi-vulva phenotype like the let-23 and let-60 gain-of-function single mutants), as outlined in Figure 18.16c–f. The double-mutant phenotype indicates that the gain-of-function alleles of either let-23 or let-60 do not require the function of lin-3 to exert their phenotypic effects, thus placing both let-23 and let-60 downstream of lin-3. Similar analysis enables the ordering of the let-23 and let-60 genes in the pathway (see Figure 18.16g–h).

[Figure 18.16 panel titles (image not recovered); each panel is labeled AC (anchor cell), LIN-3, and LET-23:
(a) Wild type
(b) lin-3 loss-of-function (or let-23 or let-60 loss-of-function)
(c) let-23 gain-of-function
(d) let-60 gain-of-function
(e) lin-3 loss-of-function + let-23 gain-of-function
(f) lin-3 loss-of-function + let-60 gain-of-function
(g) let-23 loss-of-function + let-60 gain-of-function
(h) let-60 loss-of-function + let-23 gain-of-function]

Figure 18.16 Analysis of double-mutant phenotypes to find order of genes in developmental pathways. (a) In wild-type worms, the vulva developmental pathway is active only in the presence of the signal (LIN-3). (b) In lin-3 mutants, no signal is present, and worms develop with a vulva-less phenotype. (c) and (d) In either let-23 or let-60 gain-of-function alleles, the pathway is constitutively active, and worms develop with a multi-vulva phenotype. (e) and (f) Gain-of-function alleles of let-23 and let-60 are epistatic to loss-of-function lin-3 alleles. The pathway is constitutively active regardless of whether the lin-3 signal is present. (g) and (h) Gain-of-function alleles of let-60 are epistatic to loss-of-function alleles of let-23. Conversely, loss-of-function alleles of let-60 are epistatic to gain-of-function alleles of let-23. This places let-60 downstream of let-23.
Q Explain why gain-of-function alleles of either let-23 or let-60 are epistatic to loss-of-function alleles of lin-3.
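The double-mutant logic summarized in Figure 18.16 follows from treating the pathway as a linear relay, which the short Python sketch below illustrates. It is a simplification under stated assumptions: a loss-of-function allele (`"lof"`) switches the relay off at that step, a gain-of-function allele (`"gof"`) makes that step constitutively active, and a wild-type allele passes along whatever state arrives from upstream. The function names and the allele labels are hypothetical, not notation from the text.

```python
# Sketch of epistasis in a linear signaling pathway (illustrative assumptions).
PATHWAY = ["lin-3", "let-23", "let-60"]   # upstream -> downstream

PHENOTYPE = {
    "signal-dependent": "wild type",    # vulva forms only near the anchor cell
    "off": "vulva-less",
    "constitutive": "multi-vulva",
}


def pathway_state(genotype):
    """Walk the pathway from signal to transducer and return its output state.

    `genotype` maps gene name -> "wt", "lof", or "gof"; missing genes are "wt".
    """
    state = "signal-dependent"          # a wild-type pathway relays the LIN-3 signal
    for gene in PATHWAY:
        allele = genotype.get(gene, "wt")
        if allele == "lof":
            state = "off"               # nothing is passed on from this step
        elif allele == "gof":
            state = "constitutive"      # active regardless of upstream input
        elif allele != "wt":
            raise ValueError(f"unknown allele {allele!r} for {gene}")
    return state


def phenotype(genotype):
    return PHENOTYPE[pathway_state(genotype)]


# Panels (e) to (h) of Figure 18.16:
print(phenotype({"lin-3": "lof", "let-23": "gof"}))   # multi-vulva
print(phenotype({"lin-3": "lof", "let-60": "gof"}))   # multi-vulva
print(phenotype({"let-23": "lof", "let-60": "gof"}))  # multi-vulva
print(phenotype({"let-60": "lof", "let-23": "gof"}))  # vulva-less
```

In this relay the most-downstream mutant allele always determines the phenotype, which is the rule used to order genes by epistasis when a loss-of-function and a gain-of-function allele with opposite phenotypes are combined.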
Dominant let-60 alleles are epistatic to recessive let-23 alleles, indicating that let-60 can function in the absence of functional let-23, a finding that places let-60 downstream of let-23. This conclusion is supported by the converse experiment, where recessive let-60 alleles are epistatic to dominant let-23 alleles, which indicates that let-23 requires the function of let-60 to exert a phenotypic effect.

The genetic pathway was determined before the nature of the proteins had been analyzed. Now that we know the molecular identities of LIN-3 (signal), LET-23 (receptor), and LET-60 (signal transduction molecule), these epistatic relationships make sense. For example, dominant gain-of-function mutations of let-60 result in constitutive activity of this protein, allowing it to transduce a signal independent of the state of the LET-23 receptor. Likewise, gain-of-function alleles of let-23 act as if they are receiving a signal all the time, whether or not lin-3 is functional, and thus activate the downstream signal-transduction cascade, which in turn depends on having a functional allele of let-60. This pathway, called the epidermal growth factor signaling pathway, is conserved throughout animals, with inappropriate activation of the pathway leading to cancer.

Lateral Inhibition

Given that they are both induced by the lin-3–encoded signal, how are the 1° and 2° fates specified? One possibility is a differential response of the VPCs to a graded lin-3 signal, where the highest concentration of signal produces a 1° fate and a lower concentration of signal produces 2° cells. However, when the cell that would normally be a 1° cell is ablated, a cell that would normally have been a 2° cell differentiates into a 1° cell instead. It is thus unlikely that the absolute concentration of signal perceived is solely responsible for directing cell fate.

A possible explanation is that after reception of the lin-3 signal, a second signal is sent from the 1° cell that inhibits the neighboring cells from becoming 1° cells (Figure 18.17a). This process is termed lateral inhibition, where an initial asymmetry is reinforced by signalling between adjacent cells (Figure 18.17b). All VPCs initially have the potential to express a lateral signal, encoded by the lag-2 gene, and to express the receptor for the LAG-2 signal, encoded by the lin-12 gene. The lag-2 gene is activated in response to the LIN-3 signal, so it is expressed at higher levels in the 1° cell. Reception of LAG-2 results in down-regulation of the lag-2 gene in the receiving cells and up-regulation of the gene for its receptor, LIN-12 (Figure 18.17c). This creates a feedback loop that reinforces the initial asymmetry between the 1° and 2° cells. Continued feedback between the signal and its perception amplifies the differences between the two cells, causing them to acquire distinct developmental fates.

Figure 18.17 Lateral inhibition in C. elegans vulval differentiation.
(a) Labels: AC (anchor cell); inductive signal; P3.p, P4.p, P5.p, P6.p, P7.p, P8.p with fates 3°, 3°, 2°, 1°, 2°, 3°; inhibitory signal. Cell closest to anchor cell differentiates with 1° fate and then inhibits neighboring cells from 1° fate.
(b) Labels: LIN-3; P5.p, P6.p, P7.p with fates 2°, 1°, 2°. Center cell (P6.p) detects more LIN-3 signal (green), up-regulates lateral signal (blue) and down-regulates receptor (yellow). Left and right cells (P5.p and P7.p) receive more signal (blue) from center cell and up-regulate receptor (yellow) while down-regulating lateral signal (blue).
(c) Labels: LIN-3; LET-23; LAG-2; LIN-12; P6.p (1°); P7.p (2°). Strong activation of lin-3/let-23 pathway promotes 1° cell fate, in turn activating the lag-2/lin-12 pathway, which promotes a 2° cell fate in neighboring cells.
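
The feedback loop between LAG-2 and LIN-12 can be illustrated with a minimal numerical sketch. It is not taken from the text: the update rules, the parameter values (`rate`, `repression`), the LIN-3 input levels, and the function names are all assumptions, chosen only to show how a modest difference in LIN-3 reception can be amplified by signaling between adjacent cells.

```python
def simulate_lateral_inhibition(lin3, steps=200, rate=0.2, repression=4.0):
    """Relax LAG-2 (lateral signal) and LIN-12 (receptor) levels in a row of cells.

    lin3 holds one hypothetical LIN-3 input level per cell (at least two cells).
    """
    n = len(lin3)
    lag2 = [0.5] * n   # every VPC starts able to express the lateral signal
    lin12 = [0.5] * n  # ... and the receptor for it
    for _ in range(steps):
        # LAG-2 perceived by each cell: its own receptor level times the
        # mean LAG-2 level of the adjacent cells.
        received = []
        for i in range(n):
            nbrs = [lag2[j] for j in (i - 1, i + 1) if 0 <= j < n]
            received.append(lin12[i] * sum(nbrs) / len(nbrs))
        for i in range(n):
            # Assumed rules: LIN-3 activates lag-2; perceived LAG-2 represses
            # lag-2 and up-regulates lin-12 above a basal level.
            lag2_target = lin3[i] / (1.0 + repression * received[i])
            lin12_target = 0.2 + 0.8 * received[i] / (received[i] + 0.5)
            lag2[i] += rate * (lag2_target - lag2[i])
            lin12[i] += rate * (lin12_target - lin12[i])
    return lag2, lin12


def assign_fates(lag2):
    """Label the cell with the most LAG-2 'primary' (1°), the rest 'secondary' (2°)."""
    top = max(range(len(lag2)), key=lambda i: lag2[i])
    return ["primary" if i == top else "secondary" for i in range(len(lag2))]


# Hypothetical LIN-3 inputs for P5.p, P6.p, P7.p (P6.p is closest to the anchor cell).
lag2, lin12 = simulate_lateral_inhibition([0.4, 1.0, 0.4])
print(assign_fates(lag2))
```

With these assumed inputs, the center cell ends with the highest LAG-2 level and the flanking cells end with the higher LIN-12 levels, which corresponds to the 2°, 1°, 2° arrangement of P5.p, P6.p, and P7.p.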
such mutants have elucidated a genetic pathway that leads to cell death in response to a signaling molecule. This pathway is largely conserved across the animal kingdom (in humans, as well) and is a natural and important process that helps sculpt the development of tissues as well as maintain tissues in adult organisms. Indeed, it is estimated that 10¹¹ cells are programmed to die every day in an adult human, many of them in epithelial tissues such as skin and intestine. Whereas loss-of-function mutants for genes in the apoptosis pathway are viable in C. elegans, loss-of-function mutations in homologous genes in mice result in embryo death, indicating that cell death is an essential part of life in mammals.

18.4 “Evolution Behaves Like a Tinkerer”

One of the major surprises emerging from genome sequence analysis of animals is that, within a factor of about 2, most animal genomes have very similar numbers of genes. The range is from about 12,000 to about 25,000. Thus relatively simple animals such as Drosophila have a genome containing about 14,000 genes, whereas the human genome contains about 25,000 genes. Even organisms such as jellyfish and sea anemones possess genomes with gene numbers largely similar to those of vertebrates.

Given this consistency of gene number, what is the biological explanation of how the presumed “complexity” of vertebrates is produced from a genetic toolkit that is similar to the one possessed by comparatively “simple” animals? The answer seems to lie in the relative complexity of gene regulation rather than the invention of new genes for additional developmental processes. This proposal suggests that existing genes are recruited for new roles by means of changes in their regulation, both in space and time. Biologist Francois Jacob summed up this view of evolution when he said, “Evolution behaves like a tinkerer. . . . [It] does not produce novelties from scratch. It works on what already exists, either transforming a system to give it new functions or combining several systems to produce a more elaborate one.”

A common theme in the evolutionary history of all genes, and particularly those influencing development, is the co-option of genes and genetic modules to direct the patterning or growth of novel organs. In this section, we consider an example of the co-option of genes by evolutionary “tinkering” to form newly evolved structures: digits (fingers and toes) on tetrapod limb appendages such as hands and feet. The study of the evolution of development is often referred to as evo-devo.

Evolution through Co-option

Limb positioning in tetrapods (four-legged vertebrates) results in large measure from the expression of Hox genes that direct the anterior–posterior organization of the body.

Work on chickens and mice demonstrates that expression of Hox genes along the anterior–posterior body axis defines the position at which a limb will develop. The anterior limit of the expression domains of two Hox genes, Hoxc8 and Hoxc6, demarcates the position of the forelimb, and the posterior limit of expression marks the position of the hindlimb (Figure 18.18a). The expression of these two genes specifies the thoracic region of vertebrates, which is characterized by the formation of ribs from the vertebral column.

Once limb positions are specified, cells of the mesenchyme (loosely connected sub-ectodermal cells) send a signal to the overlying ectodermal cells. This signal promotes changes within a narrow band of cells that then forms the apical ectodermal ridge (AER), whose primary function is to direct limb-bud outgrowth by responding to signals produced in a group of mesenchymal cells toward the posterior side of the limb bud called the zone of polarizing activity (ZPA; Figure 18.18b). The ZPA acts as an organizer that promotes digit formation at the distal ends of limb buds (that is, the ends farther from the center of the body) through the production of a morphogen, a small secreted signaling protein called Sonic hedgehog (Shh). The Sonic hedgehog (Shh) gene is orthologous to the Drosophila segment polarity gene hedgehog. Sonic hedgehog is expressed principally in the neural tube, where it helps organize the brain, eyes, and other structures through patterning of a group of cells known as the floor plate, and in developing limbs, where it directs the development of digits. The Case Study in this chapter discusses the consequences of different Shh mutations on mammalian development and morphology.

All extant tetrapods are characterized by five or fewer digits in each set, and each digit in the set has a unique identity. Tetrapod digits arise along the anterior–posterior axis of the limb bud. If you allow your arms to hang straight down, you will see that your thumb (digit 1) is in the anterior position on your hand, while your pinky (digit 5) is in the posterior position. Sonic hedgehog expressed in the ZPA plays an important role in initiating digit formation, and loss-of-function alleles of Shh result in a loss of digits 2–5; only digit 1 forms independently of Shh function. A second role of Shh in limb patterning is in the specification of digit identity. Experiments where a second ZPA is transplanted to an anterior position result in a mirror-image duplication of digits, suggesting that the ZPA instructs those digits closer to the ZPA to differentiate with posterior identity (see Figure 18.18b).

The Hox genes that play a conserved role in patterning the anterior–posterior axis in animals were considered candidates to be the genes acting downstream of Shh to specify the patterning events in digits. In mice (and by inference humans), five Hox genes are expressed in the limb bud at the time and place where the digits are developing: Hoxd9, Hoxd10, Hoxd11, Hoxd12, and Hoxd13 (Figure 18.18c). These genes are also expressed in the posteriormost regions of the mouse embryo, where they contribute to patterning along the anterior–posterior body axis, and later in the
[Figure 18.18 (caption not recovered). Panel labels: (a) flank; forelimb (anterior); hindlimb (posterior); Hoxc6; Hoxc8. (b) Limb development; mesenchyme; ectoderm; AER; ZPA; anterior; posterior; digits 2, 3, 4. (c) Human Hoxa and Hoxb gene clusters; limb-bud zones from anterior to posterior: Hoxd9 only; Hoxd9 + 10; Hoxd9, 10 + 11; Hoxd9, 10, 11 + 12; Hoxd9, 10, 11, 12 + 13.]
developing nervous system. Despite the difference in position of hindlimb and forelimb along the body axis, the same five Hox genes are expressed in the developing digits of each limb. Their expression in the limb bud follows a precise temporal and spatial pattern and is dependent on Shh activity. The first gene to be expressed is Hoxd9, followed by Hoxd10, then Hoxd11, and so on through Hoxd13. Spatially, all genes share the same posterior boundary, but the anterior boundary of expression is different for each gene. Consequently, the five Hoxd genes subdivide the limb bud into five zones, each specified by a different combination of Hoxd gene expression. Analogous to patterning along the anterior–posterior axis, ectopic expression of different Hoxd genes within the developing limb bud results in transformations of digit identity. A similar combinatorial code of Hox gene expression also appears to specify the proximal–distal patterning of the limb buds themselves (e.g., upper arm, forearm, hand, digits).

Mutations that expand or increase Shh expression result in extra digits and have been documented in mice, chickens, dogs, cats, and humans. However, because identity is controlled by only five Hox genes, the extra digits always have a morphology closely resembling that of an adjacent digit, rather than having a unique identity (see Figure 4.13). Finally, it is worth noting that the separation of the human limb bud into individual digits requires programmed cell death (see Section 18.3) of the intervening cells, a process that has been lost in duck and bat limbs and has led to webbing in those animals.

These programs have been further modified during evolution in the secondary loss of legs in snakes and cetaceans. The loss of the front legs of snakes is due to an anterior shift in both Hoxc6 and Hoxc8 gene expression all the way to the base of the head. All vertebrae behind the snake head, except the first one, develop as thoracic vertebrae with ribs. In contrast, the convergent evolution of loss of hind legs in snakes and cetaceans is due to independent alterations in Shh activity in the developing hind limb bud.

Constraints on Co-option

The ancestral roles of Hoxd genes pertained to patterning along the anterior–posterior axis of the body. Therefore, the role of Hoxd genes in specifying digit identity represents a co-option of function of already existing genes. These same ancestral genes also acquired roles in the differentiation of the nervous system floor plate, whose presence in all vertebrates is an indication that it evolved before limbs during vertebrate evolution. Limbs developed later within the tetrapod lineage, and in the course of limb evolution, Shh was co-opted to pattern digits, structures that did not previously exist. By what process are genes co-opted for new functions during evolution?

In the case of limb evolution, genes of the Hoxd cluster could have come under control of limb-specific enhancer modules leading to expression of the Hoxd genes in developing limbs. As long as changes in regulation did not disrupt Hoxd expression during anterior–posterior patterning of the body axis, the changes would not result in defects of this earlier process. The acquisition of gene expression in the developing limb could be thought of as a gain-of-function mutation. The modularity of enhancers and silencers facilitates evolution by co-option because individual enhancer modules are free to evolve independently. Thus the patterning of a novel tetrapod organ, the limb, involved the co-option of, or tinkering with, preexisting genetic programs that already had developmental roles elsewhere. As noted above, a major constraint on this type of evolutionary change is that the more ancestral functions of the gene must not be disrupted.

18.5 Plants Represent an Independent Experiment in Multicellular Evolution

Multicellularity has evolved independently many times in the history of life on Earth. The two lineages of multicellular organisms you are likely to be most familiar with are animals and land plants. Since the common ancestor of plants and animals was a single-celled organism, multicellularity evolved independently in each lineage.

Due to their independent origins, animals and plants differ in certain crucial aspects of their development. One difference is that germ-line cells in animals separate from somatic (body) cells much earlier in development than do the germ-line cells in land plants. Another difference is that animal cells are often motile during development, whereas plant cells are encased in a cell wall that essentially fixes them in the location at which they arise. Animals and land plants also differ with respect to when the basic form of the body plan takes shape. The animal body plan is established during embryogenesis, and subsequent development consists primarily of growth in size but without the addition of new organs. In contrast, throughout their lifetimes plants add new organs that are produced from pluripotent stem-cell populations. Finally, because plants often grow in a fixed location and are unable to migrate as many animals can, a plant must be able to alter its developmental program in response to changing environmental conditions throughout its lifetime. Thus, although identical twins in animals are nearly indistinguishable, genotypically identical plants may develop to look very different depending upon their growth environment. Despite these differences, developmental processes occurring in plants are remarkably similar to those in animals, especially in their reliance on the coordinated action of transcription factors and signaling molecules.

Development at Meristems

Plant development occurs at organized groups of pluripotent cells called meristems. The two functions of meristems are generation of organs and self-maintenance (to ensure that
a pool of stem cells is always present). The aboveground parts of a plant are produced by shoot meristems and the belowground parts by root meristems. The shoot meristem is divided into three functional domains: a peripheral zone from which leaves are formed, a rib zone from which part of the stem is derived, and a central zone that acts as a stem-cell reservoir to replenish cells lost to the developing leaves and stem (Figure 18.19). Meristems are generally indeterminate; that is, they can remain active for years, or in some cases the entire life of the plant. For example, the shoot meristem at the top of a pine tree can be active for centuries, continually producing leaves and side branches. Over time, the sizes of the central and peripheral domains remain remarkably constant. It is the continual production of new organs from meristems throughout the life of a plant that allows plants to adjust and adapt to changing local environmental conditions.

The identity of the meristem determines what types of organs are produced from its periphery. Early in the life of a flowering plant, leaves are produced from the flanks of the shoot meristem, and roots are produced from the root meristem. At the upper side of the attachment point of the leaf to the stem an axillary meristem is formed, from which a branch can arise. This reiterative formation of meristems that produce leaves that produce branches containing meristems forms the basis of most aboveground development of flowering plants.

In response to appropriate environmental conditions, the identity of meristems can change. For example, shoot meristems, which have been producing leaves, are converted in response to seasonal changes into reproductive meristems. A reproductive meristem may either develop directly into a flower meristem, or alternatively into an inflorescence meristem that produces flower meristems, an inflorescence being a group of flowers. In turn, flower meristems produce floral organs from their peripheral zones. Unlike the other meristems, flower meristems are determinate: no more stem cells are available after the flower meristem has produced a fixed number of organs.

Because each type of meristem is characterized by a specific pattern of gene expression, mutations in key genes can result in homeotic transformations of meristem types. We have all eaten one such mutant, cauliflower, in which meristems that would normally be specified as flowers behave instead as inflorescence meristems (see Figure 18.19, lower right). The genetic basis of this phenotype has been identified in Arabidopsis as loss-of-function alleles of two closely related paralogs, APETALA1 and CAULIFLOWER, encoding transcription factors.

Combinatorial Homeotic Activity in Floral-Organ Identity

Several flowering plant species have been adopted as models for the study of genetics. For example, peas (Pisum sativum), with which Mendel performed his experiments, and maize (Zea mays), in which transposons were discovered, were introduced in earlier chapters. Due to its small size, short generation time, and fully sequenced genome,
[Figure 18.19 (caption not recovered). Labels: shoot meristem; central zone (stem-cell reservoir); fm; im.]
the most widely used model plant is Arabidopsis thaliana. Since the 1980s, study of homeotic mutants in Arabidopsis and another plant species, Antirrhinum (snapdragon), has led to insights into the genetic basis of flower development and revealed developmental parallels with animals.

Arabidopsis flowers are composed of four concentric whorls of organs (Figure 18.20). The outermost whorl is occupied by sepals, organs that protect the flower bud during development. The second whorl is occupied by petals, which in many species attract pollinators. Stamens, the male organs that produce pollen, are located in the third whorl, and the female organs (carpels, containing the ovules) occupy the central whorl.

Homeotic Floral Mutants of Arabidopsis

Recessive floral homeotic mutants of Arabidopsis fall into three classes, each having defects in two adjacent whorls (see Figure 18.20). One class, named the A class, exhibits homeotic transformations in the outer two whorls, where carpels develop in the positions normally occupied by sepals and stamens replace petals, so that the four floral whorls consist of carpels, stamens, stamens, and carpels (see Figure 18.20). A second class, the B-class mutants, exhibits homeotic transformations in the middle two whorls, where sepals replace petals and carpels replace stamens, so that the four whorls consist of sepals, sepals, carpels, and carpels. In C-class mutants, homeotic transformations in the third and fourth whorls result in flowers where petals develop in the positions normally occupied by stamens, and the cells that would normally give rise to the carpels behave as if they were another flower meristem that reiterates the developmental cycle. Similar mutants can be found in a number of ornamental plant species and are often referred to as “double flowers.”

In Arabidopsis, A-class activity is promoted by two genes, APETALA2 and APETALA1, B-class activity by the APETALA3 and PISTILLATA genes, and C-class activity by the AGAMOUS gene. Double mutants either display an additive phenotype (e.g., apetala3 agamous flowers consisting of only sepals) or exhibit novel phenotypes (e.g., apetala2 agamous flowers with novel floral organs that do not exist in wild-type flowers). Additive double-mutant phenotypes suggest that the two genes do not interact, whereas nonadditive double-mutant phenotypes suggest that the two genes interact to influence a common developmental pathway. For example, in apetala2 agamous flowers, the first and fourth whorls have leaf-like carpels whereas the second and third whorls are occupied by organs with features of both petals and stamens. The agamous mutation has a phenotypic effect in the first and second whorls in an apetala2 background (compare the identities of these whorls in an apetala2 single mutant with those in an apetala2 agamous double mutant), an effect not observed in a wild-type background, where phenotypic defects of agamous are limited to the third and fourth whorls. This indicates that AGAMOUS is ectopically active in the first and second whorls in apetala2 mutants. Likewise, based on the double-mutant phenotype, APETALA2 is active in the inner whorls of agamous mutants.

Figure 18.20 Floral homeotic mutations in Arabidopsis.
Wild-type Arabidopsis: whorl 1 sepals; whorl 2 petals; whorl 3 stamens; whorl 4 carpels.
A-class mutant (apetala2): whorl 1 carpels; whorl 2 stamens; whorl 3 stamens; whorl 4 carpels.
B-class mutant (apetala3 or pistillata): whorl 1 sepals; whorl 2 sepals; whorl 3 carpels; whorl 4 carpels.
C-class mutant (agamous): whorl 1 sepals; whorl 2 petals; whorl 3 petals; whorl 4 sepals.
BC double mutant (apetala3 agamous): whorl 1 sepals; whorl 2 sepals; whorl 3 sepals; whorl 4 sepals.
AC double mutant (apetala2 agamous): whorl 1 leaf-like carpels; whorl 2 petal-like stamens; whorl 3 petal-like stamens; whorl 4 leaf-like carpels.
ABC triple mutant (apetala2 pistillata agamous): whorl 1 leaf-like carpels; whorl 2 leaf-like carpels; whorl 3 leaf-like carpels; whorl 4 leaf-like carpels.
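
The double-mutant comparison used above can be written out as a small sketch. The whorl phenotypes are those listed in Figure 18.20; the dictionary `PHENOTYPES` and the function `whorls_affected` are hypothetical names introduced here for illustration, and the comparison is a plain whorl-by-whorl check rather than a published analysis method.

```python
# Whorl-by-whorl phenotypes (whorls 1-4) as listed in Figure 18.20.
PHENOTYPES = {
    "wild type": ["sepals", "petals", "stamens", "carpels"],
    "apetala2": ["carpels", "stamens", "stamens", "carpels"],
    "agamous": ["sepals", "petals", "petals", "sepals"],
    "apetala2 agamous": [
        "leaf-like carpels", "petal-like stamens",
        "petal-like stamens", "leaf-like carpels",
    ],
}


def whorls_affected(background, with_mutation):
    """Return the whorl numbers whose organ identity changes when a
    mutation is added to the given genetic background."""
    before = PHENOTYPES[background]
    after = PHENOTYPES[with_mutation]
    return [w for w, (b, a) in enumerate(zip(before, after), start=1) if b != a]


# Effect of agamous in a wild-type background versus an apetala2 background.
print(whorls_affected("wild type", "agamous"))          # [3, 4]
print(whorls_affected("apetala2", "apetala2 agamous"))  # [1, 2, 3, 4]

# Effect of apetala2 in a wild-type background versus an agamous background.
print(whorls_affected("wild type", "apetala2"))         # [1, 2]
print(whorls_affected("agamous", "apetala2 agamous"))   # [1, 2, 3, 4]
```

In this sketch, agamous alters whorls 1 and 2 only when apetala2 is also mutant, and apetala2 alters whorls 3 and 4 only when agamous is also mutant, which is the observation that supports ectopic activity of each gene in the other's mutant background.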
On the basis of single and multiple mutant phenotypes, a model was formulated in which the identity of organs developing in any whorl is determined by the combination of homeotic genes active in that whorl (Figure 18.21). It was presumed that each class of gene is active in the whorls seen to be affected in the respective mutants: APETALA2 and APETALA1 in the outer two whorls, APETALA3 and PISTILLATA in the middle two whorls, and AGAMOUS in the inner two whorls. Thus, each whorl is characterized by a different combination of homeotic gene activity that specifies floral organ identity. The A-class activity by itself in the first whorl specifies sepals, A-class + B-class in the second whorl specifies petals, B-class + C-class in the third whorl specifies stamens, and C-class by itself in the fourth whorl specifies carpels. To account for the mutant phenotypes (specifically the apetala2 agamous mutant described above), a second postulate of the model is that the A-class and C-class activities are mutually antagonistic, so that in an A-class mutant background, C-class activity is found in all four whorls; and conversely, in a C-class mutant background, A-class activity is in all four whorls. The specification of identity by combinations of homeotic gene activities and cross-regulatory interactions between the floral homeotic genes is reminiscent of specification of segmental identity in Drosophila by Hox genes.

The model successfully predicts the phenotypes of multiple mutants. For example, in a double mutant in which both B-class and C-class activities are absent, only A-class genes are expressed in all four whorls, and a flower with only sepals develops (see Figure 18.20). In ABC triple mutants, in which all floral-organ–identity gene activity is compromised, leaf-like organs are found in all whorls. These observations suggest that since floral organs are evolutionarily derived from leaves, one role of the floral homeotic genes is to modify a leaf into a specialized floral organ.

Homeotic MADS Box Transcription Factors

Many floral homeotic genes encode closely related transcription factors, similar to the situation with animal homeotic genes. However, rather than being homeobox genes, the floral homeotic genes are MADS box genes, named after the DNA-binding domain of the transcription factors they encode. The name MADS box is derived from four members of the gene family: MCM1 of Saccharomyces cerevisiae, AGAMOUS of Arabidopsis, DEFICIENS of Antirrhinum, and SRF of humans. All of the B- and C-class genes, as well as APETALA1, encode MADS box proteins. Consistent with the model described above, the B-class genes are expressed in whorls two and three, and the C-class gene, AGAMOUS, is expressed in the third and fourth whorls (see Figure 18.21). Subsequent studies have shown that the ABC classes of MADS box proteins interact with another class of MADS

[Figure 18.21 (caption not recovered). Axis label: floral whorl 1 2 3 4 3 2 1.]
Evaluate
1. Strategy: Identify the topic this problem addresses and the nature of the required answer.
   Solution step: This problem concerns the investigation of genes determining development of kelp. Devising an answer requires evaluating the relative potential of reverse genetic analysis versus forward genetic analysis (see Sections 14.1 and 14.3).
2. Strategy: Identify the critical information given in the problem.
   Solution step: Kelp is identified as brown algae, a form of life distinct from land plants and animals.

Deduce
3. Strategy: Determine if looking for gene homology (a reverse genetic approach) has a high probability of successfully identifying developmental genes in kelp.
   TIP: Was the common ancestor of animals, plants, and kelp unicellular or multicellular? Review Figure 16.12.
   Solution step: Examination of Figure 16.12 indicates that kelp is only distantly related to either land plants or animals. Therefore, searching for brown algal genes based on the sequences of plant or animal developmental genes is something of a fishing expedition (i.e., holds little promise of success).
   PITFALL: Distantly related organisms are likely to have evolved substantially since they last shared a common ancestor, and the extent of gene homology decreases as evolutionary distance between species increases.

Solve
4. Strategy: Determine whether the use of mutagenesis (a forward genetics approach) is likely to help identify kelp developmental genes.
   TIP: How were genes that regulate development in Drosophila originally identified?
   Solution step: A good approach to finding developmental genes is to perform a mutagenesis experiment that will identify mutants in which pattern formation is perturbed. Mutagenesis can potentially affect any gene; thus, the forward genetics approach is not biased or restricted to genes that share homology with genes in other species. Mutants displaying abnormalities of wild-type pattern formation are likely to carry mutations of pattern-forming genes.
For more practice, see Problems 17, 19, 22, 23, and 26. Visit the Study Area to access study tools. Mastering Genetics
box protein encoded by the SEPALLATA (SEP) genes (see Chapter 14 Case Study). The SEP proteins together with the A-, B-, and C-class proteins form higher-order complexes that regulate transcription (see Figure 18.21). The SEP proteins provide a transcriptional activation activity to the complexes, an activity that the B and C proteins lack. Conversely, the A, B, and C proteins provide specificity to the complexes, an activity the SEP proteins lack. When A-, B-, or C-class genes are ectopically expressed throughout the flower meristem, they cause homeotic transformations of floral organ identity. For example, if B-class genes are ectopically expressed throughout the flower, the result is a flower with organ identities of petal, petal, stamen, stamen, from the first to the fourth whorls. In contrast, ectopic expression of the A-, B-, and C-class genes alone is not sufficient to convert the leaves of the Arabidopsis plant into floral organs. However, if the SEP genes are ectopically expressed in addition to, for example, the A and B genes, the combination is sufficient to convert leaves into petals. In this manner, the identities of leaves and floral organs are interconvertible by the absence or presence of the expression of the floral homeotic genes, consistent with floral organs evolving by modification of an ancestral leaf.

Studies of B- and C-class genes from flowering plants and gymnosperms (e.g., conifers) suggest that for all seed plants, C-class genes alone promote female reproductive development and that B + C gene activity promotes male reproductive development. However, unlike the Hox genes, which appear to have evolved at the base of the animal lineage and which control patterning in all known animals, the B- and C-class genes are unknown in earlier-diverging lineages of land plants, such as ferns, lycophytes, and bryophytes, whose reproductive structures differ substantially in morphology and development and whose leaf-like organs evolved independently.

We have seen that the specification of serially repeated structures in both Drosophila and Arabidopsis is controlled in a similar manner via the combinatorial action of closely related transcription factors. Although the mechanism of developmental patterning in plants and animals is similar, the genes involved in development in the two kingdoms are not related; this is consistent with the independent evolution of multicellularity in plants and animals.

Genetic Analysis 18.2 asks you to design an experimental strategy to genetically dissect development in another group of multicellular eukaryotes.
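
The combinatorial logic of the ABC model discussed in this section can be sketched in a few lines of code. This is an illustrative simplification, not an implementation from the text: the names `WILD_TYPE`, `ORGAN`, and `predict_flower` are hypothetical, A–C antagonism is reduced to the single rule that loss of one activity lets the other act in all four whorls, and the requirement for SEP proteins is not modeled.

```python
# Activity classes present in each whorl of a wild-type flower.
WILD_TYPE = {1: {"A"}, 2: {"A", "B"}, 3: {"B", "C"}, 4: {"C"}}

# Organ identity specified by each combination of activities.
ORGAN = {
    frozenset("A"): "sepal",
    frozenset("AB"): "petal",
    frozenset("BC"): "stamen",
    frozenset("C"): "carpel",
    frozenset(): "leaf-like",
    frozenset("B"): "petal/stamen intermediate",  # B alone, as in apetala2 agamous
}


def predict_flower(lost=(), ectopic=()):
    """Predict organ identity in whorls 1-4.

    lost:    activity classes removed by loss-of-function mutation, e.g. "BC"
    ectopic: activity classes expressed in every whorl, e.g. "B"
    """
    lost, ectopic = set(lost), set(ectopic)
    organs = []
    for whorl in (1, 2, 3, 4):
        active = set(WILD_TYPE[whorl])
        # Mutual antagonism: without A, C acts in all whorls, and vice versa.
        if "A" in lost:
            active.add("C")
        if "C" in lost:
            active.add("A")
        active |= ectopic
        active -= lost
        # Combinations outside the table are left unassigned by this sketch.
        organs.append(ORGAN.get(frozenset(active), "not specified in this sketch"))
    return organs


print(predict_flower())             # ['sepal', 'petal', 'stamen', 'carpel']
print(predict_flower(lost="A"))     # ['carpel', 'stamen', 'stamen', 'carpel']
print(predict_flower(lost="BC"))    # ['sepal', 'sepal', 'sepal', 'sepal']
print(predict_flower(ectopic="B"))  # ['petal', 'petal', 'stamen', 'stamen']
```

The sketch reproduces the single-mutant and BC double-mutant phenotypes described for Arabidopsis, as well as the petal, petal, stamen, stamen flower expected when B-class genes are expressed throughout the flower. Cases that combine ectopic A or C expression with antagonism are deliberately left unassigned.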
CASE STUDY
Cyclopia and Polydactyly—Different Shh Mutations with Distinctive Phenotypes
Sonic hedgehog (Shh), introduced in Section 18.4, is an evolutionarily conserved gene that performs multiple related but distinct roles in developing tissues of animals. The gene’s best-understood developmental roles, stemming from its expression in limb buds and in the neural tube, pertain to digit formation and to the development of the floor plate. The floor plate divides the brain into hemispheres and is required for midline separation of other anatomical features, including separating developing eye tissue into right and left eyes. Given the central role of Shh in development, it stands to reason that Shh mutations profoundly affect normal development and morphology. Here we briefly examine two abnormal conditions that are caused by changes in Shh activity: holoprosencephaly/cyclopia and polydactyly.

HOLOPROSENCEPHALY/CYCLOPIA Holoprosencephaly (HPE) is a genetically heterogeneous abnormality, meaning that mutations in different genes can cause the disorder. One form of holoprosencephaly, HPE3, is caused by Shh mutations. HPE3 is a clinically variable disorder that produces many different morphological abnormalities in patients. The most subtle phenotypic defect is a slight loss of midline separation, resulting in a single central incisor. More severe defects include characteristic brain abnormalities; abnormalities of the mid-face, such as the formation of a proboscis-like nose; or possibly, in the most extreme cases, cyclopia, the presence of a single large mass of eye tissue rather than two separate eyes.

Numerous Shh mutations that cause HPE3 affect the coding region of the gene and result in the production of
Figure 18.22 Effects of alterations in Shh morphogen activity in the floor plate and the limb bud. Pedigree key labels: carrier; mild phenotype; strong phenotype (deceased). Panel text: Loss-of-function mutant alleles in Shh exons are haploinsufficient and inherited in a dominant manner. Gain-of-function mutant alleles in limb-bud enhancer prolong Shh expression and are inherited in a dominant manner. Loss of Shh activity in floor plate causes cyclopia. Shh expression in developing mouse embryo. Prolonged Shh activity in limb bud causes extra digit development.
a severely defective or nonfunctional protein product, leading to a failure to form the floor plate and thus to form brain hemispheres. To date, there are no specific genotype–phenotype correlations that tie specific Shh mutations to more severe or less severe manifestations of HPE3 or cyclopia. Although the HPE3 mutations in Shh are missense, nonsense, and frameshift loss-of-function alleles, familial cases of HPE3 are inherited in an autosomal dominant manner, indicating that the Shh mutations are haploinsufficient: The presence of a single copy of a wild-type allele is not sufficient for normal activity. Nevertheless, pedigrees exhibit variation in both penetrance and expressivity, most likely because other genes involved in brain and mid-face formation (i.e., the other genes that cause the HPE phenotype) influence the extent of morphological abnormality (Figure 18.22a–b). Thus, as with most genetic disorders that have been characterized in humans, both penetrance and expressivity of abnormal phenotypes are modified significantly by genetic background.
During the 1950s, an epidemic of cyclopia was reported among sheep in the western United States (Figure 18.22c). The compound cyclopamine, found in the plant Veratrum californicum, was implicated as an environmental cause of the abnormalities. Evidence indicated that ingestion of V. californicum during gestation caused the production of lambs with cyclopia. In 2002, Philip Beachy and colleagues looked at the mechanism by which cyclopamine caused cyclopia and discovered that the compound binds to the Shh receptor expressed in cells in the floor plate and blocks their response to Shh protein. This study illustrates that the action of normal proteins can be inhibited under certain environmental circumstances to produce effects similar to those seen with gene mutation. When an environmental condition induces a phenotype similar to that caused by mutation, the environmental condition is said to induce a phenocopy of the mutant phenotype.
POLYDACTYLY If Shh expression is eliminated from the developing limb bud by loss-of-function mutations inactivating the Shh protein, limb patterning is perturbed and digits do not form. However, if Shh expression is altered by mutation in the cis-regulatory region of the gene, changes in the Shh protein concentration gradient can result in polydactyly, the presence of extra digits (see Figure 18.22c). The extra digits develop because Shh protein is present in high concentration in parts of the limb bud where it is not normally found. Polydactyly in humans (discussed in Section 4.2) is an autosomal dominant disorder. Its inheritance is dominant because the ectopic expression resulting from the mutation is a gain of function. The enhancer element responsible for appropriate Shh expression in the developing limb buds was identified using a phylogenetic footprinting approach (see Figure 16.17).
SUMMARY
Mastering Genetics: For activities, animations, and review quizzes, go to the Study Area.
18.1 Development Is the Building of a Multicellular Organism
❚❚ Multicellularity has evolved independently multiple times.
❚❚ The development of a multicellular organism from a fertilized egg cell entails the formation of specialized cell types, driven by differential expression of genes.
❚❚ As animal development proceeds, cells become progressively restricted in their potential developmental fates, changing from totipotent to pluripotent to differentiated.
❚❚ Morphogens can provide positional information that is converted into differential gene expression.
❚❚ Signaling between neighboring cells can induce or inhibit developmental pathways. Genes controlling developmental processes often encode transcription factors or molecules involved in signaling between cells.
18.2 Drosophila Development Is a Paradigm for Animal Development
❚❚ Genetic screens of Drosophila identified sets of successively acting genes directing pattern formation during embryonic development.
❚❚ The Drosophila embryo is successively subdivided into segments, each with a unique identity, by the sequential action of batteries of transcription factors.
❚❚ Genes whose products are supplied to the egg by the mother and act to guide the development of the embryo are called maternal effect genes. The genotype of the mother, rather than that of the embryo, dictates the embryonic phenotype for the traits these genes determine.
❚❚ Gap genes are regulated by maternal effect genes and subdivide the Drosophila embryo into several broad regions. Pair-rule genes are regulated by both maternal effect and gap genes, and they subdivide the embryo into parasegments.
❚❚ Homeotic genes known as the Hox genes act in combination to specify the parasegments of Drosophila. Hox genes are largely conserved throughout the metazoan kingdom.
❚❚ Downstream targets of the Hox genes contribute to the morphogenesis of body segments.
❚❚ Hox gene expression patterns are maintained by regulation at the level of chromatin, providing a cellular memory of gene expression propagated through mitoses.
18.3 Cellular Interactions Specify Cell Fate
❚❚ In C. elegans, an inductive signal from the anchor cell determines vulval cell fates, and lateral inhibition refines cell specification in the developing vulva.
❚❚ Programmed cell death, or apoptosis, is a normal aspect of development in animals. It is required for sculpting the body plan during embryogenesis and maintaining tissues postembryonically.
18.4 “Evolution Behaves Like a Tinkerer”
❚❚ Most animals possess the same types of genes; therefore, the differences between animals are largely due to differences in how genes are deployed during development.
❚❚ Genes can be co-opted to direct the development of new organs and tissues, often through changes in gene expression patterns. For example, the evolution of limbs and digits in tetrapods occurred through changes in Hox and Sonic hedgehog gene expression.
18.5 Plants Represent an Independent Experiment in Multicellular Evolution
❚❚ Despite differences in cellular behavior between plants and animals, the genetic control of development in plants has many similarities to that of animals.
❚❚ Plants continue to add organs throughout their life span due to the action of meristems, which are groups of pluripotent stem cells.
❚❚ Combinatorial action of homeotic genes specifies the identity of floral organs in flowering plants; the homeotic genes in plants encode MADS box transcription factors, analogous to the transcription factors encoded by the homeobox in animals.
PREPARING FOR PROBLEM SOLVING
In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help at solving problems.
1. Understand how cell identity can be specified in multicellular organisms via the processes of pattern formation, induction, and inhibition.
2. Be familiar with the hierarchy of gene activity that leads to the Drosophila embryo being more and more finely subdivided.
3. Understand the different inheritance patterns for maternal effect versus zygotic genes.
4. Be familiar with how the homeotic genes, acting alone and in combination, specify segmental identity.
5. Review the mechanisms (induction and lateral inhibition) by which cell fate is specified during vulval development in C. elegans.
6. Understand that old genes may be co-opted to perform new functions in the course of evolution.
7. Understand that plants, animals, fungi, and brown algae have all evolved multicellularity independently from single-celled ancestors.
8. Be prepared to compare and contrast the specification of segmental identity in Drosophila with the specification of floral organ identity in Arabidopsis.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.
1. Explain why many developmental genes encode either transcription factors or signaling molecules.
2. Bird beaks develop from an embryonic group of cells called neural crest cells that are part of the neural tube that gives rise to the spinal column and related structures. Amazingly, neural crest cells can be surgically transplanted from one embryo to another, even between embryos of different species. When quail neural crest cells were transplanted into duck embryos, the beak of the host embryo developed into a shape similar to that found in quails, creating the “quck.” Duck cells were recruited in addition to the quail cells to form part of the quck beak. Conversely, when duck neural crest cells were transplanted into quail embryos, the beak of the embryo resembled that of a duck, creating a “duail,” and quail cells were recruited to form part of the beak. What do these experiments tell you about the autonomy or nonautonomy of the transplanted and host cells during beak development?
3. How is positional information provided along the anterior–posterior axis in Drosophila? What are the functions of bicoid and nanos?
4. Early development in Drosophila is atypical in that pattern formation takes place in a syncytial blastoderm, allowing free diffusion of transcription factors between nuclei. In many other animal species, the fertilized egg is divided by cellular cleavages into a larger and larger number of smaller and smaller cells.
   a. What constraints does the formation of a syncytial blastoderm impose on the mechanisms of pattern formation?
   b. How must the model that describes Drosophila development be modified for describing other animal species whose early development is not syncytial?
5. Consider the even-skipped regulatory sequences in Figure 18.9.
   a. How are the sharp boundaries of expression of eve stripe 2 formed?
   b. Consider the binding sites for gap proteins and Bicoid in the stripe 2 enhancer module. What sites are occupied in parasegments 2, 3, and 4, and how does this result in expression or no expression?
   c. Explain what you expect to see happen to even-skipped stripe 2 if it is expressed in a Krüppel mutant background. A hunchback mutant background? A giant mutant background? A bicoid mutant background?
6. What is the difference between a parasegment and segment in Drosophila development? Why do developmental biologists think of parasegments as the subdivisions that are produced during development of flies?
7. Why do loss-of-function mutations in Hox genes usually result in embryo lethality, whereas gain-of-function mutants can be viable? Why are flies homozygous for the recessive loss-of-function alleles Ultrabithorax bithorax and Ultrabithorax postbithorax viable?
8. Compare and contrast the specification of segmental identity in Drosophila with that of floral organ specification in Arabidopsis. What is the same in this process, and what is different?
9. Actinomycin D is a drug that inhibits the activity of RNA polymerase II. In the presence of actinomycin D, early development in many vertebrate species, such as frogs, can proceed past the formation of a blastula, a hollow ball of cells that forms after early cleavage divisions; but development ceases before gastrulation (the stage at which cell layers are established). What does this tell you about maternal versus zygotic gene activity in early frog development?
10. Ablation of the anchor cell in wild-type C. elegans results in a vulva-less phenotype.
   a. What phenotype is to be expected if the anchor cell is ablated in a let-23 loss-of-function mutant?
   b. What about if the anchor cell is ablated in a let-23 gain-of-function mutant?
11. In gain-of-function let-23 and let-60 C. elegans mutants, all of the vulval precursor cells differentiate with 1° or 2° fates. Do you expect adjacent cells to differentiate with 1° fates or with 2° fates? Explain.
12. In mammals, identical twins arise when an embryo derived from a single fertilized egg splits into two independent embryos, producing two genetically identical individuals.
   a. What limits might there be, from a developmental genetic viewpoint, as to when this can occur?
   b. The converse phenotype, fusion of two genetically distinct embryos into a single individual, is also known. What are the genetic implications of such an event?
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
13. The bicoid gene is a coordinate, maternal effect gene.
   a. A female Drosophila heterozygous for a loss-of-function bicoid allele is mated to a male that is heterozygous for the same allele. What are the phenotypes of their progeny?
   b. A female that is homozygous for a loss-of-function bicoid allele is mated to a wild-type male. What are the phenotypes of their progeny?
   c. If loss of bicoid function in the egg leads to lethality during embryogenesis, how are females homozygous for bicoid produced? What is the phenotype of a male homozygous for bicoid loss-of-function alleles?
14. Given that maternal Bicoid activates the expression of hunchback (see Figure 18.7), what would be the consequence of adding extra copies of the bicoid gene by transgenic means to a wild-type female with two copies, thus creating a female fly with three or four copies of the bicoid gene? How would hunchback expression be altered? What about the expression of other gap genes and pair-rule genes?
15. What phenotypes do you expect in flies homozygous for loss-of-function mutations in the following genes: Krüppel, odd-skipped, hedgehog, and Ultrabithorax?
16. The pair-rule gene fushi tarazu is expressed in the seven even-numbered parasegments during Drosophila embryogenesis. In contrast, the segment polarity gene engrailed is expressed in the anterior part of each of the 14 parasegments. Since both genes are active at similar times and places during development, it is possible that the expression of one gene is required for the expression of the other. This can be tested by examining expression of the genes in a mutant background, for example, looking at fushi tarazu expression in an engrailed mutant background, and vice versa.
   a. Given the hierarchy of gene action during Drosophila embryogenesis, what might you predict to be the result of these experiments?
   b. Based on your prediction, can you predict the phenotype of the fushi tarazu and engrailed double mutant?
17. In contrast to Drosophila, some arthropods (e.g., centipedes) have legs on almost every segment posterior to the head. Based on your knowledge of Drosophila, propose a genetic explanation for this phenotype, and describe the expected expression patterns of genes of the Antennapedia and bithorax complexes.
18. The bristles that develop from the epidermis in Drosophila are evenly spaced, so that two bristles never occur immediately adjacent to each other. How might this pattern be established during development?
19. You are traveling in the Netherlands and overhear a tulip breeder describe a puzzling event. Tulips normally have two outer whorls of brightly colored petal-like organs, a third whorl of stamens, and an inner (fourth) whorl of carpels. However, the breeder found a recessive mutant in his field in which the outer two whorls were green and sepal-like, whereas the third and fourth whorls both contained carpels. What can you speculate about the nature of the gene that was mutated?
20. A powerful approach to identifying genes of a developmental pathway is to screen for mutations that suppress or enhance the phenotype of interest. This approach was undertaken to elucidate the genetic pathway controlling C. elegans vulval development.
   a. A lin-3 loss-of-function mutant with a vulva-less phenotype was mutagenized. Based on your knowledge of the genetic pathway, what types of mutations will suppress the vulva-less phenotype?
   b. In a complementary experiment, a gain-of-function let-23 mutant with a multi-vulva phenotype was also mutagenized. What types of mutations will suppress the multi-vulva phenotype?
21. The Hoxd9–13 genes are thought to specify digit identity (see Figure 18.18).
   a. What would be the consequence of ectopically expressing Hoxd10 throughout the developing mouse limb bud? What about Hoxd11? What about both Hoxd10 and Hoxd11?
   b. You wish to examine the effect of loss-of-function alleles in developing limbs. How would you construct a mouse in which the function of Hoxd9–13 is retained during anterior–posterior embryonic patterning but is absent from developing limbs?
22. Three-spined stickleback fish live in lakes formed when the last ice age ended 10,000 to 15,000 years ago. In lakes where the sticklebacks are prey for larger fish, they develop 35 bony plates along their body as armor. In contrast, sticklebacks in lakes where there are no predators develop only a few or no bony plates.
   a. In crosses between fish of the two different morphologies, the lack of bony armor segregates as a recessive trait that maps to the ectodermal dysplasin (Eda) gene. Comparisons between the Eda-coding regions of the armored and nonarmored fish revealed no differences. How can you explain this result?
   b. Loss-of-function mutations in the coding region of the homologous gene in humans result in loss of hair, teeth, and sweat glands, as in the toothless men of Sind (India). What does this suggest about hair, teeth, and sweat glands in humans?
23. The flowering jungle plant Lacandonia schismatica, discovered in southern Mexico, has a unique floral structure. Petal-like organs are in the outer whorls surrounding a number of carpels, and stamens are in the center of the flower. Closely related species are dioecious; female plants bear flowers that resemble those of Lacandonia, but without the central stamens. What type of mutation could have resulted in the evolution of Lacandonia flowers?
24. Homeotic genes are thought to regulate each other.
   a. What aspect of the phenotype of apetala2 agamous double mutants indicates that these two genes act antagonistically?
   b. Are similar interactions observed between Hox genes?
25. Dipterans (two-winged insects) are thought to have evolved from a four-winged ancestor that had wings on both T2 and T3 thoracic segments, as in extant butterflies and dragonflies. Describe an evolutionary scenario for the evolution of dipterans from four-winged ancestors. What types of mutations could lead to a butterfly developing with only two wings?
26. Basidiomycota is a monophyletic group of fungi that includes most of the common mushrooms. You are interested in the development of the body plan of mushrooms. How would you identify the genes required for patterning during mushroom development?
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
27. Zea mays (maize, or corn) was originally domesticated in central Mexico at least 7000 years ago from an endemic grass called teosinte. Teosinte is generally unbranched, has male and female flowers on the same branch, and has few kernels per “cob,” each encased in a hard, leaf-like organ called a glume. In contrast, maize is highly branched, with a male inflorescence (tassel) on its central branch and female inflorescences (cobs) on axillary branches. In addition, maize cobs have many rows of kernels and soft glumes. George Beadle crossed cultivated maize and wild teosinte, which resulted in fully fertile F1 plants. When the F1 plants were self-fertilized, about 1 plant in every 1000 of the F2 progeny resembled either a modern maize plant or a wild teosinte plant. What did Beadle conclude about whether the different architectures of maize and teosinte were caused by changes with a small effect in many genes or changes with a large effect in just a few genes?
28. In C. elegans there are two sexes: hermaphrodite and male. Sex is determined by the ratio of X chromosomes to haploid sets of autosomes (X/A). An X/A ratio of 1.0 produces a hermaphrodite (XX), and an X/A ratio of 0.5 results in a male (XO). In the 1970s, Jonathan Hodgkin and Sydney Brenner carried out genetic screens to identify mutations in three genes that result in either XX males (tra-1, tra-2) or XO hermaphrodites (her-1). Double-mutant strains were constructed to assess for epistatic interactions between the genes (see table). Propose a genetic model of how the her and tra genes control sex determination.
   Genotype^a               XX Phenotype     XO Phenotype
   Wild-type                Hermaphrodite    Male
   tra-1^rec                Male             Male
   tra-2^rec                Male             Male
   her-1^rec                Hermaphrodite    Hermaphrodite
   tra-1^dom/+              Hermaphrodite    Hermaphrodite
   tra-1^rec tra-2^rec      Male             Male
   tra-1^rec her-1^rec      Male             Male
   tra-2^rec her-1^rec      Male             Male
   tra-2^rec tra-1^dom/+    Hermaphrodite    Hermaphrodite
   ^a rec = recessive mutation; dom = dominant mutation.
29. In Drosophila, recessive mutations in the fruitless gene (fru) result in males courting other males; and recessive mutations in the Antennapedia gene (Ant^-) lead to defects in the body plan, specifically in the thoracic region of the body, where mutants fail to develop legs. The two genes map 15 cM apart on chromosome 3. You have isolated a new dominant Ant^d mutant allele that you induced by treating your flies with X-rays. Your new mutant has legs developing instead of antennae on the head of the fly. You cross your newly induced dominant Ant^d mutant (a pure-breeding line) with a homozygous recessive fru mutant (which is homozygous wild type at the Ant^+ locus), as diagrammed below:
   Ant^d fru^+ / Ant^d fru^+  ×  Ant^+ fru / Ant^+ fru  →  F1: Ant^d fru^+ / Ant^+ fru
   a. What phenotypes, and in what proportions, do you expect in the F2 obtained by interbreeding F1 animals?
   b. Your cross results in the following phenotypic proportions:
      Legs on head, normal courting behavior       75
      Normal head, abnormal courting behavior      25
      Legs on head, abnormal courting behavior      0
      Normal head, normal courting behavior         0
      Provide a genetic explanation for these results and describe a test for your hypothesis.
   c. Provide a molecular explanation for the reason your new Ant^d mutant is dominant and for its novel phenotype.
19 Genetic Analysis of Quantitative Traits
CHAPTER OUTLINE
19.1 Quantitative Traits Display Continuous Phenotype Variation
19.2 Quantitative Trait Analysis Is Statistical
19.3 Heritability Measures the Genetic Component of Phenotypic Variation
19.4 Quantitative Trait Loci Are the Genes That Contribute to Quantitative Traits
[Chapter-opening photo] A human histogram depicting the distribution of heights of 138 faculty members and students of the University of Connecticut. The women are in white shirts and the men are in blue shirts.
ESSENTIAL IDEAS
❚❚ Quantitative traits are influenced by multiple genes and may also be influenced by the environment. They are continuously distributed along a phenotypic scale. Some quantitative traits are separated into distinct phenotypes by a threshold.
❚❚ The phenotypic distributions of quantitative traits are described by statistical measures that also estimate the genetic and environmental contributions to phenotype.
❚❚ The extent to which genetic variation contributes to phenotype variability can be estimated for quantitative traits and provides an indication of how traits may respond to artificial selection.
❚❚ The genes that influence quantitative traits are identified and mapped using genetic crosses and molecular and statistical techniques.
Explaining the connection between phenotypes and genotypes is simplest when the phenotypic variation in a trait is decided by variation in a single gene. The segregation of alleles of a single gene determining whether peas are round or wrinkled, as in Mendel’s studies, is a classic example. Other genes are not involved, and there is no evidence of gene interaction (i.e., epistasis) or of interaction between the gene and specific environmental factors.
In reality, however, such direct correlations between phenotypes and genotypes are not common. Many traits display variation resulting from epistatic gene interactions (see Section 4.3). In addition, numerous traits, known as polygenic traits, result from the influence of multiple genes. The genes
contributing to polygenic traits generally assort independently to produce a large number of genotypes and multiple phenotypes. The inheritance of polygenic traits is identified as polygenic inheritance. Further complicating the imperfect correlations between genotypes and phenotypes in polygenic inheritance is the possibility that environmental factors or circumstances during development can interact with one or more of the genes to shape the phenotype. Thus both genetic variation and nongenetic variation can contribute to the phenotypic variation of certain traits, which are therefore identified as multifactorial traits.
Many multifactorial traits have phenotypes that are best described in quantitative rather than qualitative terms, that is, with the aid of numbers rather than descriptive adjectives. Qualitative phenotypes often fall into discrete categories that may correspond to specific genotypes and that are distinctly different from one another. “Round seeds” versus “wrinkled seeds” or “blood type A” versus “blood type B” are examples of qualitative phenotypic differences. In contrast, quantitative phenotypic variation usually takes the form of continuous variation along a phenotypic scale, and the traits are frequently described using units of measure. For example, one might use kilograms to measure quantitative variation in the weight of cattle or centimeters to measure quantitative variation in the length of ears of corn. Traits of this kind are often identified as quantitative traits. Note, however, that this term is also used for traits that are nonnumeric but vary over a phenotypic range, as with a range of color phenotypes (e.g., from black through shades of gray to white).
The genetic study and analysis of quantitative traits is the focus of the field of inquiry known as quantitative genetics. In this chapter, we explore how quantitative genetics examines the hereditary variation of polygenic and multifactorial traits. In the process, we address some of the ways geneticists attempt to disentangle the genetic and environmental influences on trait variation and discuss genetic approaches to interpreting the relative effects of those factors on quantitative trait phenotypes.
19.1 Quantitative Traits Display Continuous Phenotype Variation
For most of the traits we discuss in earlier chapters, phenotypic variation is controlled by allelic variation at single genes. The phenotypes of these single-gene traits commonly display discontinuous variation, meaning differences that allow organisms to be assigned to discrete, sharply distinguishable phenotypic categories. The discontinuous patterns of variation lead to the specification of consistent phenotype ratios, such as a 3:1 ratio among the F2 progeny of self-fertilized F1 organisms. Even when two genes take part in epistatic interactions that affect phenotypic expression, the phenotypes are discrete and occur in predictable ratios (see Section 4.3).
In contrast, polygenic and multifactorial traits usually display continuous variation, which is phenotypic variation distributed across a range of values in an uninterrupted continuum. This section explores the genetic factors contributing to traits displaying continuous variation.
Genetic Potential
Human adult height is an example of a multifactorial trait that varies continuously along a scale of measurement usually marked off in centimeters or inches. This continuous variation is demonstrated in the chapter-opening photo, in which 138 University of Connecticut students and faculty are arranged according to height. The height distribution of this sample, divided into 1-inch increments, ranges from 60 inches (5 feet) to 77 inches (6 feet, 5 inches). The length of each line of individuals behind the height markers represents the frequency of each incremental category, and the sweatshirt and hat color identifies the wearer’s sex (white for women and blue for men). Examining the overall distribution, you can see that it is actually composed of two different distributions, one for each sex, and you can also see that the distribution is uneven.
Adult height is influenced by multiple genes. For example, in a 2011 study by Matthew Lanktree and many colleagues, analysis of human genomic variation combined with statistical methods suggested that more than 60 genes may influence adult height. Although the actual number of genes influencing human height continues to be investigated, your own personal experiences most likely agree with the data from population studies, telling you that taller parents tend to have taller children and shorter parents tend to have shorter children.
In addition to this genetic influence, however, environmental and developmental factors can have a significant effect. If your genetics class is typical of most, a survey of your classmates would likely find that many of the men are taller than their fathers and grandfathers and that many of the women are taller than their mothers and grandmothers.
These differences are due almost exclusively to improved prenatal and childhood health and nutrition and only minimally to changes in the population genetic makeup influencing adult height. Longitudinal studies confirm that much of the world’s population is getting taller. During the 20th century, the height of the average American woman increased from approximately 5 feet, 2 inches in 1900 to almost 5 feet, 5 inches in 2000. An even more dramatic increase in average adult height can be observed by walking through the doors of houses and other structures built a few centuries ago. Most modern-day visitors have to stoop to enter! Such observations lead to the clear conclusion that adult height is a multifactorial trait.
To understand the role of genetics in a trait like adult height, you might think of parents as transmitting to their children a “genetic potential” for reaching a certain maximum adult height; the genetic potential will be attained if the child grows and develops under ideal conditions. Not all of the children of a particular pair of parents will have the same genetic potential, since segregation and independent assortment of the contributing genes can produce many different genotypes. These processes produce offspring with different genotypes conveying genetic potential for a range of heights, including heights that are greater or lesser than those of their parents. On average, however, progeny genetic potential for height will be at approximately the midpoint of the two parents’ genetic potential. The phenotypic outcome (actual adult height) is subject to various influences on the height potential of the genotype, including prenatal and maternal health and childhood health and nutrition, as the following discussions illustrate.
these two genes are identified as major genes. OCA2 has several alleles that greatly influence eye color and skin tone. One variant of the gene produces an autosomal recessive form of albinism called oculocutaneous albinism type 2. This form of albinism features very lightly pigmented eyes and skin. The gene derives its name from this condition. Other alleles of OCA2 reduce the amount of melanin pigment production to a lesser extent. These alleles are strongly associated with blue and green eye colors. Along with light eye colors, the joint effects of OCA2 alleles that produce small amounts of melanin include freckling, moles, and light hair color. The HERC2 gene regulates the expression of OCA2; thus, alleles of HERC2 that down-regulate OCA2 expression are associated with blue and green eye colors. A dozen or more additional genes have minor effects on eye color. These are classified as modifier genes.
Additive Gene Effects
Polygenic traits for which no individual gene or genes exert major gene effects have a continuous phenotypic distribution that results from incremental contributions by multiple genes. Genes contributing to phenotypic variation in this way are known as additive genes. The alleles of each additive gene can be assigned their own quantitative values that indicate the contribution to the trait. In the absence of environmental influence, phenotypes can sometimes be predicted by adding the values of the alleles together. For certain traits, each of the additive genes has an approximately equal effect on the phenotype, or at least a level of effect that does not differ substantially from that of the other genes. For other traits the influence of each gene is distinguishable.
Grasping the notion of additive genes requires a differ-
Major Gene Effects ent way of thinking about genotypes and phenotypes than
The continuous phenotypic variation of polygenic traits we have discussed previously. Since traits controlled by
results from the effects of multiple genes that may each additive genes have a phenotype that is the sum of allelic
exert about the same amount of influence or may exert dif- contributions across multiple genes, it is possible for more
ferent amounts of influence on the phenotype. One example than one genotype to correspond to certain phenotypes.
of a polygenic trait with a wide range of phenotypes deter- Segregation and independent assortment of alleles of addi-
mined by genes with different levels of influence on the tive genes produces the various genotypes, but the pheno-
phenotype is eye color in humans. Contrary to popular mis- type corresponding to each is based on the sum of the values
conception, eye color is a polygenic trait that is influenced of the alleles at all the contributing loci.
by up to 15 genes. Two principal factors affect eye color: In the early 1900s, coinciding with the verification and
(1) the amount of the pigment called melanin deposited in expansion of the then recently rediscovered hereditary prin-
the iris and (2) the turbidity of the viscous stroma of the ciples of Mendel, geneticists began to explore the hypoth-
iris, which can also contain melanin. Individuals with the esis that the segregation of alleles of multiple genes played
darkest eye colors (black and dark brown) have irises and a role in phenotypic variation of particular traits. Known as
stroma containing the most melanin, whereas those with the multiple-gene hypothesis, the proposal was that alleles
the lightest eye colors (blue and light green) have irises and of each of the contributing genes obeyed the principles of
stroma containing the least amount of melanin. Melanin is segregation and independent assortment and had an additive
also responsible for skin pigmentation. Populations with effect in the production of phenotypic variation.
darker eye colors tend to have darker skin tones as well, and The multiple-gene hypothesis was the foundation of
conversely, populations with higher rates of light eye colors quantitative genetics, and the plant geneticist Hermann
tend to have lighter skin tones. Nilsson-Ehle was one of the first to use the hypothesis in his
Two genes having strong effects on human eye color are 1909 description of genetic control of kernel color in wheat.
OCA2 and HERC2. Because of their predominating effects, Figure 19.1 illustrates one of Nilsson-Ehle’s genetic models,
19.1 Quantitative Traits Display Continuous Phenotype Variation 699
Figure 19.2 A three-gene additive model for wheat kernel color. Color is determined by the total number of 1 alleles (A1, B1, and C1) in the genotype. The parental cross A1A1B1B1C1C1 × A2A2B2B2C2C2 produces F1 of genotype A1A2B1B2C1C2. The F2 have seven phenotypic classes in the proportions 1/64, 6/64, 15/64, 20/64, 15/64, 6/64, and 1/64, generated by independent assortment at three loci.
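The F2 proportions in a model like this can be checked with a short calculation. The sketch below assumes that every F1 parent is heterozygous at each locus, so that each of the 2n alleles a progeny plant receives is a 1 allele with probability 1/2, independently of the others; the function name is illustrative and is not taken from the text.

```python
from fractions import Fraction
from math import comb


def f2_class_proportions(n_genes):
    """Return expected F2 proportions for 0, 1, ..., 2n color-producing alleles.

    Assumes an F1 x F1 cross in which both parents are heterozygous at all
    n_genes loci, so each of the 2n transmitted alleles is a "1" allele with
    probability 1/2, independently of the others.
    """
    total_alleles = 2 * n_genes
    outcomes = 2 ** total_alleles  # 4**n equally likely allele combinations
    return [Fraction(comb(total_alleles, k), outcomes)
            for k in range(total_alleles + 1)]


if __name__ == "__main__":
    for k, p in enumerate(f2_class_proportions(3)):
        # Show each proportion over the common denominator 64.
        print(f"{k} color-producing alleles: {p * 64}/64")
```

For three genes the list has 2(3) + 1 = 7 entries, and the two extreme classes each have proportion 1/64.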
Figure 19.3 Phenotype distributions with additive genes. The parents producing progeny in each example are heterozygous for each gene. The color-contributing alleles are designated as 1 for each gene. The number of F2 phenotype categories increases with the number of additive genes. Each panel gives the progeny proportion for each number of color-producing alleles.
(a) One locus, A1A2 × A1A2: 2, 1, or 0 color-producing alleles in proportions 1/4, 1/2, 1/4.
(b) Two loci, A1A2B1B2 × A1A2B1B2: 4 to 0 color-producing alleles in proportions 1/16, 4/16, 6/16, 4/16, 1/16.
(c) Three loci, A1A2B1B2C1C2 × A1A2B1B2C1C2: 6 to 0 color-producing alleles in proportions 1/64, 6/64, 15/64, 20/64, 15/64, 6/64, 1/64.
(d) Four loci, A1A2B1B2C1C2D1D2 × A1A2B1B2C1C2D1D2: 8 to 0 color-producing alleles in proportions 1/256, 8/256, 28/256, 56/256, 70/256, 56/256, 28/256, 8/256, 1/256.
(e) Five loci, A1A2B1B2C1C2D1D2E1E2 × A1A2B1B2C1C2D1D2E1E2: 10 to 0 color-producing alleles in proportions 1/1024, 10/1024, 45/1024, 120/1024, 210/1024, 252/1024, 210/1024, 120/1024, 45/1024, 10/1024, 1/1024.
Q If a trait was produced by the action of six additive genes, how many phenotype categories would there be?

genes contributing to a polygenic trait, n = 3, and the number of distinct phenotypic categories is 2(3) + 1 = 7. The expected frequencies of the most extreme phenotypes are $(1/4)^n$. Table 19.1 lists the numbers of phenotypic categories for different numbers of contributing genes

Table 19.1 The Effect of Multiple Contributing Genes on Phenotypic Variation
Number of Genes    Number of Phenotype Categories    Frequency of Most Extreme Phenotypes
1                  3                                 1/4
2                  5                                 1/16
3                  7                                 1/64
4                  9                                 1/256
5                  11                                1/1024
6                  13                                1/4096
7                  15                                1/16,384
8                  17                                1/65,536
9                  19                                1/262,144
10                 21                                1/1,048,576

Allele Segregation in Quantitative Trait Production
In 1916, plant geneticist Edward East undertook a comprehensive examination of the multiple-gene hypothesis by testing its ability to explain patterns of inherited variation that he produced in the length of the corolla (the petal-producing part of the flower) in Nicotiana longiflora. In this long-flower species of tobacco, the corolla is a tube-shaped structure whose length can be measured and compared with corollas in other plants.

East began his experiments with pure-breeding parental lines, one having a short corolla approximately 40 millimeters (mm) long and the other producing a long corolla of approximately 90 mm (Figure 19.4). Note that there is a small amount of variation in corolla length in each pure-breeding line, suggesting that despite attempts to produce pure-breeding lines, gene–gene interaction,
702 CHAPTER 19 Genetic Analysis of Quantitative Traits
[Figure 19.4 not reproduced. Surviving labels: vertical axis “Percent”; “F5”; “Selection for different lengths”; “Three generations of selection for short”. Surviving caption fragment: “length is genetic and environmental.”]

Disentangling the genetic and nongenetic factors that determine phenotypic variation is a difficult but important task in genetics. In humans, for example, common diseases such as heart disease, cancer, and diabetes are influenced by heredity, but nonhereditary factors are also critically important in disease development. Identifying the particu-
Evaluate
1. Strategy: Identify the topic this problem addresses and the nature of the required answer.
   Solution: This problem concerns assessment of a three-gene additive model for plant height. The model is to be applied to crosses of pure-breeding parental plants of different heights to predict the frequencies of genotypes and heights in the F1 and F2 progeny.
2. Strategy: Identify the critical information given in the problem.
   Solution: The genotypes of the pure-breeding parents are given. In applying the polygenic additive model, we are to assume that genotype alone determines variation in plant height.

Deduce
3. Strategy: Deduce the contribution of each allele of the additive genes to height in line A.
   TIP: Assume that each allele makes an equal contribution in this additive genetic model.
   Solution: The 48-cm height of line A plants is determined by six alleles of additive genes. Each 1 allele in the line A genotype contributes 48 cm/6 = 8 cm to plant height.
4. Strategy: Deduce the contribution of each allele of the additive genes to height in line B.
   Solution: Six alleles also contribute equally to the 12-cm height of line B plants. Each 2 allele in the line B genotype contributes 12 cm/6 = 2 cm to plant height.
5. Strategy: Deduce the gametes produced by each pure-breeding line.
   TIP: The laws of segregation and independent assortment apply to genes controlling polygenic traits.
   Solution: Line A has the genotype A1A1B1B1C1C1 and produces gametes with the genotype A1B1C1. Line B has the genotype A2A2B2B2C2C2 and produces gametes with the genotype A2B2C2.

Solve
Answer a
6. Strategy: Determine the genotype and height of F1 plants.
   Solution: F1 progeny of these pure-breeding parental plants will have the genotype A1A2B1B2C1C2. Based on the contribution of each 1 and 2 allele, the predicted F1 plant height is [(3)(8 cm)] + [(3)(2 cm)] = 30 cm.
Answer b
7. Strategy: Determine the frequency and height of each category of F2 plants.
   TIP: Either use Pascal’s triangle (Figure 2.14) or determine the probability of genotypes containing different numbers of 1 and 2 alleles.
   PITFALL: Remember that for most categories there are multiple genotypes with the same total number of 1 and 2 alleles.
   Solution: The expected F2 progeny are

   Number of 1 Alleles    Number of 2 Alleles    Frequency    Height (cm)
   0                      6                      1/64         12
   1                      5                      6/64         18
   2                      4                      15/64        24
   3                      3                      20/64        30
   4                      2                      15/64        36
   5                      1                      6/64         42
   6                      0                      1/64         48
For more practice, see Problems 10 and 22.
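The additive bookkeeping in this worked example can be sketched in a few lines. The code below is only an illustration under the example's own assumptions (three genes, 8 cm per 1 allele, 2 cm per 2 allele, no environmental effect); the constant and function names are hypothetical.

```python
from fractions import Fraction
from math import comb

CM_PER_1_ALLELE = 8  # 48 cm / 6 alleles in line A
CM_PER_2_ALLELE = 2  # 12 cm / 6 alleles in line B
N_GENES = 3          # three-gene additive model


def height(ones, twos):
    """Predicted height (cm) as the sum of the allelic contributions."""
    return ones * CM_PER_1_ALLELE + twos * CM_PER_2_ALLELE


def f2_height_table():
    """Rows of (number of 1 alleles, number of 2 alleles, frequency, height)."""
    total = 2 * N_GENES
    rows = []
    for ones in range(total + 1):
        twos = total - ones
        # Each of the six alleles is a 1 allele with probability 1/2.
        freq = Fraction(comb(total, ones), 2 ** total)
        rows.append((ones, twos, freq, height(ones, twos)))
    return rows


if __name__ == "__main__":
    print("F1 height:", height(3, 3), "cm")
    for ones, twos, freq, cm in f2_height_table():
        print(ones, twos, f"{freq * 64}/64", cm)
```

Several genotypes share each row of the table, which is why the frequencies follow the binomial coefficients rather than being equal.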
Figure 19.5 The effect of gene–environment interaction. The phenotype determined by a single gene with codominant alleles can be modified by the action of environmental factors. Each panel shows the phenotype distributions of the genotypes A1A1, A1A2, and A2A2 in the P, F1, and F2 generations: (a) no gene–environment interaction; (b) moderate gene–environment interaction; (c) substantial gene–environment interaction.
term environmental variance. In that section, we employ a quantitative approach to determining how much of the variance in phenotype is due to environmental factors.

Threshold Traits
Most polygenic and multifactorial traits exhibit a continuous phenotypic distribution, but certain of these traits, while having an underlying continuous distribution, can nevertheless be divided into distinct categories. Such traits are often called threshold traits. Traits of this kind are especially important in medical contexts, where individuals are often classified (not always with great clarity) as falling into one of two clinical categories, “unaffected” (or “normal”) and “affected” (or “abnormal”), to distinguish individuals who have an abnormality from those that do not. Traits such as cleft lip (the failure of the upper lip to fully close), cleft palate (the failure of the hard palate in the mouth to fully close), and congenital hip dysplasia (the misalignment of the upper leg bone ball with its socket on the hip) are examples of human traits in this category.

For human threshold traits, the vast majority of the population will have phenotypes on the unaffected side of the threshold and will display the normal phenotype. A small proportion of the population, however, will be situated on the affected side of the threshold and have the abnormal phenotype. Cases that lie at the borderline between the two categories can be problematic to diagnose.

Figure 19.6 Threshold traits. A theoretical continuous phenotypic distribution and a threshold of genetic liability for a threshold trait. The horizontal axis is genetic liability (low, average, high) in the general population; individuals below the threshold of genetic liability are unaffected, and those above it are affected.

The genetic hypothesis explaining threshold traits proposes that such traits are polygenic or multifactorial, so that underlying the categorization of “affected” and “unaffected” phenotypes is a continuous phenotypic distribution. Some of the alleles contributing to the continuous distribution each carry a certain level of genetic liability, a term conveying the idea that certain alleles can push the phenotype toward the “affected” end of the continuous distribution. Each person’s risk of having the affected phenotype is the result of the individual’s genotype, or of the individual’s genotype along with nongenetic influences in cases of multifactorial phenotypes. Different genotypes may confer different amounts of genetic liability, making some individuals more likely to cross the threshold and display an affected phenotype. Figure 19.6 shows a continuous distribution of genetic
liability for a population and the designation of a threshold that separates unaffected from affected individuals in the population. The portion of the population to the left of the threshold of genetic liability, by far the majority, are identified as unaffected or normal, and the small group to the right of the threshold are classified as affected or abnormal.

Models used to simulate these concepts generally assume that alleles of the genes affecting the trait are distributed as described by Mendel’s law of independent assortment and that the threshold of liability that marks the transition from unaffected to affected is crossed when a sufficient number of “liability alleles” are present in the genotype. For example, Figure 19.7 depicts a hypothetical three-gene model in which alleles are designated as either 1 or 2 at each locus. The genetic liability increases with a greater number of 1 alleles. In this model, the threshold of liability is passed when at least five 1 alleles are present. Because independent assortment drives the distribution of alleles from parents to offspring, a greater number of 1 alleles in parental genotypes increases the proportion of progeny that will cross the threshold of liability and display an affected phenotype. The model can compare the risks of having a child affected by a threshold trait for parents carrying different numbers of liability alleles. Notice that the overall shape of the phenotypic distribution is reminiscent of the kind of continuous distribution expected for polygenic traits. The difference here is that one end of the continuous distribution crosses the phenotypic threshold into the affected category.

Cross 1, in Figure 19.7a, is between a parent with two 1 alleles and a parent with three 1 alleles. Both parents have the unaffected (normal) phenotype. Among the progeny of this cross, 1/32 (3%) are expected to carry five 1 alleles, but none can carry six 1 alleles. Thus, 1/32 is the chance that a child of this cross will have the affected phenotype. Figure 19.7b shows Cross 2, with parents that each carry

[Figure 19.7, panel (a) Cross 1: A1A2B1B2C2C2 (two liability alleles) × A1A2B1B2C1C2 (three liability alleles). Horizontal axis: number of liability alleles (0 to 6), with unaffected and affected regions marked; vertical axis: progeny frequency.]
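The 1/32 risk quoted for Cross 1 can be reproduced with a small calculation. The sketch below assumes, as the model does, that the loci assort independently and that a parent carrying k copies of the 1 allele at a locus (k = 0, 1, or 2) transmits a 1 allele with probability k/2. The list encoding of genotypes and the function names are hypothetical conveniences, not notation from the text.

```python
from fractions import Fraction


def progeny_distribution(parent_a, parent_b):
    """Distribution of the number of liability ("1") alleles in progeny.

    Each parent is a list giving the count of 1 alleles (0, 1, or 2) at each
    locus, e.g. A1A2 B1B2 C2C2 -> [1, 1, 0].
    """
    dist = {0: Fraction(1)}
    for count in list(parent_a) + list(parent_b):
        p_one = Fraction(count, 2)  # chance this parent passes a 1 allele
        updated = {}
        for total, p_total in dist.items():
            for passed, p_pass in ((0, 1 - p_one), (1, p_one)):
                if p_pass:
                    key = total + passed
                    updated[key] = updated.get(key, 0) + p_total * p_pass
        dist = updated
    return dist


def affected_risk(parent_a, parent_b, threshold=5):
    """Probability that a child carries at least `threshold` liability alleles."""
    dist = progeny_distribution(parent_a, parent_b)
    return sum(p for n, p in dist.items() if n >= threshold)


if __name__ == "__main__":
    # Cross 1: A1A2 B1B2 C2C2 x A1A2 B1B2 C1C2
    print(affected_risk([1, 1, 0], [1, 1, 1]))
```

Changing the parental lists shows how carrying more liability alleles raises the share of progeny beyond the threshold.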
expressing a particular trait. These factors can play a role in determining whether individuals whose genetic liability places them near the threshold of liability express the affected phenotype or not. Gene–gene interactions such as epistasis (see Section 4.3) can also contribute to phenotype outcomes of threshold traits.

19.2 Quantitative Trait Analysis Is Statistical
The statistical methods applied today to the study of quantitative traits are a direct extension of contributions made nearly a century ago by statistician and evolutionary biologist Sir Ronald Fisher. In 1918, Fisher used statistical analysis to show that quantitative traits result from the segregation of alleles of multiple genes displaying an additive effect. Fisher also showed that interactions between genes (i.e., epistasis) can be detected by these methods. In addition, he explored the role of gene–environment interaction and concluded that environmental factors contribute to continuous variation by blurring the lines between phenotypic classes. The tools and approaches described here and pioneered by Fisher allow scientists to identify genetic influences on phenotypes in terms of quantitative measurement rather than qualitative appearance.

Statistical Description of Phenotypic Variation
The first step in quantifying the phenotypic variation of a trait in a population is to construct a frequency distribution of values of the trait on a quantitative scale.

A frequency distribution shows what proportion of the population exhibits each measured value of the trait or falls into each category defined for the trait. Figure 19.8a provides an example, showing the number and frequency of each designated height category in a sample of 1000 college-aged males.

Figure 19.8 Adult height. (a) The frequency distribution of height in 1000 college-aged males is shown in tabular form. (b) Height data for 138 male and female college students. Data from W.E. Castle (1916)

(a) Number and frequency of heights in 3-cm intervals
Height (cm)    Number    Frequency (%)
155–157        4         0.4
158–160        8         0.8
161–163        26        2.6
164–166        53        5.3
167–169        89        8.9
170–172        146       14.6
173–175        188       18.8
176–178        181       18.1
179–181        125       12.5
182–184        92        9.2
185–187        60        6.0
188–190        22        2.2
191–193        4         0.4
194–196        1         0.1
197–199        1         0.1
Total          1000      100

(b) Number of females and males of each height
Female                        Male
Height (in)    Number         Height (in)    Number
60             5              64             2
61             5              65             5
62             7              66             2
63             7              67             6
64             9              68             7
65             9              69             7
66             12             70             9
67             6              71             6
68             3              72             10
69             2              73             7
70             1              74             2
71             1              75             3
72             1              76             1
                              77             3
Total          68                            70
Average (x̄)               64.5 inches     70.2 inches
Standard deviation (s)    2.7 inches      3.2 inches
Variance (s²)             7.29 inches²    10.24 inches²

Since the individuals in this study were not selected for any attribute related to height, they are considered a random sample of college-aged males. Random samples are used in quantitative trait analysis for two reasons. First, it is often impossible or impractical to collect data on every individual in a population; and second, random samples can be just as accurate in the statistical sense as “samples” consisting of whole populations. As an analogy, about 10 milliliters of blood (approximately two-tenths of 1% of a person’s total blood volume) is drawn for most routine blood tests. The amount taken is not large enough to cause physiological problems, but it is representative enough to provide dependable information concerning a person’s health status.

After the frequency distribution is constructed, the first piece of information to be calculated is the average, or mean, value (x̄) of the distribution. This is calculated by summing all the values in the sample and dividing by the total number of individuals in the sample. For the height of the 1000 men in this sample, the mean height value is determined to be 175.33 cm (about 69.0 inches). In contrast, the height averages for the 138 University of Connecticut students shown in the chapter-opening photo and summarized in Figure 19.8b are 64.5 inches for the women and 70.2 inches for the men. Both of these values are very close to the current U.S. population averages. The increase of about 1.2 inches in average male height is likely an example of the influence of improved nutrition and child and maternal health on adult height.

Frequency distributions vary depending on several factors, including the sample size and the number of classification categories for the trait. When graphed, the distinct frequency distributions dictated by different data sets can
have different shapes, as is seen for the three distributions depicted in Figure 19.9. As a consequence of such differences in distributions, it is necessary to provide a statistical description of the shape of the frequency distribution when comparing trait values. For example, it is important to report the mode, or modal value, that is, the most common value in a distribution. For the height data shown in Figure 19.8a, the mode is the 173–175 cm category, containing 188 individual values. Each distribution also possesses a middle value, known as the median, or median value. In the height distribution, you can think of the median value as entry number 500 (in order of increasing height) of the 1000 entries in the distribution. This median value also resides in the 173–175 cm category.

[Figure 19.9 not reproduced. Surviving labels: distributions around the mean (x̄) with “Large variance with relatively few organisms in each category” and “Intermediate variance with larger numbers of organisms in each category”; vertical axis: number of organisms in each phenotypic category.]

Data in the real world are usually skewed, that is, unevenly distributed on either side of the mean, as Figure 19.8a and the chapter-opening photo both illustrate. Therefore, to describe the frequency distribution, we must also have ways of measuring (and thus describing) the nature of the distribution around the mean. Two forms of measurement are commonly used.

The first, called the variance (s²), is a numerical measure of the spread of the distribution around the mean. This measure interprets how much variation exists among individuals in the sample. The variance value depends on the relationship between the width of the distribution and the number of observations in the sample. It will be small if all the observations are close to the mean, and it will be large if the observations are widely spread around the mean (see Figure 19.9). The variance is determined by summing the squares of the difference between each individual value and the sample mean and dividing that sum by the number of degrees of freedom (df) in the sample. The number of degrees of freedom is equal to the number of independent variables (for a sample of n observations, df = n − 1). Squaring the differences between individual values and the sample mean prevents positive and negative differences from canceling each other out. This is why the variance is expressed as squared units:

$$s^2 = \frac{\sum_i (x_i - \bar{x})^2}{df}$$

In our example of variation in a quantitative phenotype, the variance is described as phenotypic variance (VP). Figure 19.8a reports the measured values for height in centimeters, so the variance would be expressed in centimeters squared.

The second measure that describes the distribution of data is the standard deviation (s), a value expressing deviation from the mean in the same units as the scale of measurement for the sample. The standard deviation is calculated as the square root of the variance, $s = \sqrt{s^2}$. In our sample of the heights of 1000 college-aged males, VP = s² = 43.30 cm², and the standard deviation is s = 6.58 cm. In a recent sample of 138 college students enrolled in a genetics course, the means and standard deviations for height of the 68 females and 70 males are reported as 64.5 ± 2.7 inches for females and 70.2 ± 3.2 inches for males (see Figure 19.8b).
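These descriptive statistics can be approximated directly from the grouped counts in Figure 19.8a. The sketch below makes one assumption that is not in the text: every individual in a 3-cm class is treated as having the class midpoint as its height. Under that approximation the mean agrees with the reported 175.33 cm, while the variance and standard deviation come out close to, but not exactly equal to, the reported 43.30 cm² and 6.58 cm. The names used are illustrative.

```python
from math import sqrt

# (lower bound cm, upper bound cm, number of individuals) from Figure 19.8a
HEIGHT_CLASSES = [
    (155, 157, 4), (158, 160, 8), (161, 163, 26), (164, 166, 53),
    (167, 169, 89), (170, 172, 146), (173, 175, 188), (176, 178, 181),
    (179, 181, 125), (182, 184, 92), (185, 187, 60), (188, 190, 22),
    (191, 193, 4), (194, 196, 1), (197, 199, 1),
]


def describe(classes):
    """Mean, variance, standard deviation, mode, and median class."""
    n = sum(count for _, _, count in classes)
    # Assumption: each individual sits at the midpoint of its class.
    mids = [((low + high) / 2, count) for low, high, count in classes]
    mean = sum(mid * count for mid, count in mids) / n
    # Sum of squared deviations divided by the degrees of freedom (n - 1).
    variance = sum(count * (mid - mean) ** 2 for mid, count in mids) / (n - 1)
    mode = max(classes, key=lambda c: c[2])[:2]
    median = None
    running = 0
    for low, high, count in classes:
        running += count
        if running >= n / 2:  # class that contains the middle entry
            median = (low, high)
            break
    return {"n": n, "mean": mean, "variance": variance,
            "sd": sqrt(variance), "mode": mode, "median": median}


if __name__ == "__main__":
    for name, value in describe(HEIGHT_CLASSES).items():
        print(name, value)
```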
after strictly controlled laboratory inbreeding, however; they are rarely found in nature, due to the ubiquitous presence of genetic variation in natural populations. Genetic variation in natural populations generates individuals with different genotypes for quantitative traits and leads to phenotypic variability that is directly attributable to the genetic variability.

Environmental variance (VE) is the portion of phenotypic variance that is due to variability of the environments inhabited by individual members of a population. Differences in sun exposure, in water and nutrient content of the soil, and in exposure to pests are examples of environmental variables that influence VE in plants. Carefully controlled laboratory experiments can sometimes control all of the environmental variables and produce a situation in which VE approximates zero. In nature, however, such circumstances rarely occur. Individual members of natural populations are almost certain to experience variability in the environmental conditions they encounter.

Some differences may be systematic and predictable. For example, members of a plant population growing below a natural spring will experience wetter growth conditions than plants living above the spring. Other environmental variables are sporadic or unpredictable. For example, a dry year might reduce the flow of water from a natural spring and affect the plants living below the spring more severely than those living above it.

Let’s use an example to illustrate the dissection of VG and VE as components of VP. Suppose that two different pure-breeding parental lines are established. Each line is genetically uniform, with VG = 0; therefore, VP = VE (Figure 19.10a). The pure-breeding lines are crossed to produce F1 progeny that are genetically uniform. In the F1, VG = 0 because there is no genetic variation among the individuals, and VP = VE (Figure 19.10b). Production of F2, however, leads to genotypic variation and thus to the production of phenotypic variation that results from a combination of genetic variance and environmental variance (Figure 19.10c). Among the F2, VP = VE + VG. Since VE has been determined for the parents and the F1, genetic variance can be calculated by subtracting environmental variance from the phenotypic variance among the F2. In other words, VG = VP − VE. Genetic Analysis 19.2 provides practice in determining environmental and genetic variance.

Figure 19.10 Sources of phenotypic variance. (a) Both parental lines are genetically uniform (VG = 0), so VP = VE. (b) The F1 progeny are genetically uniform (VG = 0), so VP = VE. (c) In the F2 progeny, phenotypic variance results from genetic and environmental variance: VP = VE + VG, or VG = VP − VE.

phenotypes. Dominance variance (VD) is variance resulting from dominance relationships in which alleles of a heterozygote produce a phenotype that is not exactly intermediate between those of homozygotes (i.e., the nonadditive effects of alleles of contributing genes). Lastly, interactive variance (VI) derives from epistatic interactions between the alleles of different genes that influence a quantitative phenotype. Collectively these three components unite to produce the genetic variance in a model summarized by VG = VA + VD + VI. We use these values in the following section to discuss heritability.
Evaluate
1. Strategy: Identify the topic this problem addresses and the nature of the required answer.
   Solution: This problem concerns the determination of environmental variance and genetic variance for the tomato plant data given.
2. Strategy: Identify the critical information given in the problem.
   Solution: Fruit weight and phenotypic variance are given for the two pure-breeding parental lines and for the F1 and F2 progeny.

Deduce
3. Strategy: Describe the relationship between VP, VG, and VE.
   Solution: VP = VG + VE
4. Strategy: Identify the variance values that contribute to VP in each line and generation.
   TIP: For organisms that are genetically identical, VP = VE.
   Solution: Each of the pure-breeding parental lines (P1 and P2) and the F1 progeny are genetically uniform. As a consequence, all phenotypic variance is due to environmental variance, and genetic variance makes no contribution. The F2 contains genotypic variety, so both VG and VE contribute to VP.

Solve
Answer a
5. Strategy: Determine VE for this trait.
   Solution: In the genetically uniform P1, P2, and F1, VG = 0, and in each line VP = VE. The average environmental variance among these three lines is calculated as (1.6 + 3.5 + 2.2)/3 = 2.43 grams squared.
Answer b
6. Strategy: Determine VG for this trait.
   Solution: VG is calculated by rearranging the expression in step 3 to VG = VP − VE. The genetic variance for these data is VG = 4.0 − 2.43 = 1.57 grams squared.
For more practice, see Problems 4, 12, and 14.
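The variance partition used in this worked example is simple enough to sketch in code. The function below is a hypothetical helper: it averages the phenotypic variances of the genetically uniform groups to estimate VE, subtracts that from the F2 phenotypic variance to obtain VG, and also returns the ratio VG/VP, which this chapter defines as broad sense heritability. The second call uses the variance values quoted in the chapter's cave fish example.

```python
def partition_variance(vp_f2, ve_estimates):
    """Return (VE, VG, VG/VP) from F2 phenotypic variance and VE estimates.

    ve_estimates holds the phenotypic variances of genetically uniform
    groups (pure-breeding parents, F1), where VP = VE because VG = 0.
    """
    ve = sum(ve_estimates) / len(ve_estimates)  # average environmental variance
    vg = vp_f2 - ve                             # VG = VP - VE
    return ve, vg, vg / vp_f2                   # last value is VG/VP


if __name__ == "__main__":
    # Tomato fruit weight: P1, P2, and F1 variances, then the F2 variance.
    ve, vg, ratio = partition_variance(4.0, [1.6, 3.5, 2.2])
    print(f"tomato: VE = {ve:.2f}, VG = {vg:.2f}, VG/VP = {ratio:.2f}")

    # Cave fish eye tissue: F1 variance as VE, then the F2 variance.
    ve, vg, ratio = partition_variance(0.563, [0.057])
    print(f"cave fish: VE = {ve:.3f}, VG = {vg:.3f}, VG/VP = {ratio:.3f}")
```

The ratio for the tomato data is not given in the worked example; it is computed here only to show how the same partition feeds a heritability estimate.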
heritability was developed to help measure the proportion of phenotypic variation that is due to genetic variation. Heritability differs from trait to trait, and it can change for the same trait measured in different environments or under different conditions. Heritability is an important measure of the potential responsiveness of a trait to natural selection or artificial selection. It is of special interest to evolutionary biologists and to plant and animal breeders, who use it to assess the potential impact of selection on traits of agricultural or economic importance.

A high heritability value indicates that most of the observed phenotypic variation is due to genetic variation. Such a finding implies that the trait can be strongly influenced by natural selection or by artificial selection programs focused on changing the frequency of a phenotype in a population. Conversely, a low heritability value indicates that little of the observed phenotypic variation is due to inherited genetic variation, but that most of it is due to influences of the environment, so the expression of the trait in a population is not effectively changed by selection processes. Two widely used measures of heritability assess different components of the contribution of genetic variation to phenotypic variation. Broad sense heritability (H²) estimates the proportion of phenotypic variation that is due to total genetic variation. This form of heritability is defined by the equality H² = VG/VP. Narrow sense heritability (h²) estimates the proportion of phenotypic variation that is due to additive genetic variation. Narrow sense heritability is defined by the equality h² = VA/VP. Both measures of heritability are expressed as proportions that range in magnitude from 0.0 to 1.0. In all cases, greater heritability values indicate a larger role for genetic variation in phenotypic variation.

Heritability is easily misunderstood. An erroneous understanding can lead to the mistaken idea that genetic
variation makes a much larger contribution to phenotypic variation than the data actually support. Heritability is difficult to apply to humans except under limited circumstances (described later in the discussion of twin studies), but it can be used for other organisms. The following attributes of heritability are central to its meaning:

1. Heritability is a measure of the degree to which genetic differences contribute to phenotypic variation of a trait. In other words, heritability is high when much of the phenotypic variation is produced by genetic variation and little is contributed by environmental variation. Heritability is not an indication of the mechanism by which genes control a trait, nor is it a measure of how much of a trait is produced by gene action.
2. Heritability values are accurate only for the environment and population in which they are measured. Heritability values measured in one population cannot be transferred to another population, because both genetic and environmental factors may differ between populations.
3. Heritability for a given trait in a population can change if environmental factors change, and changes in the proportions of genotypes in a population can alter the effect of environmental factors on phenotypic variation, thus changing heritability.
4. High heritability does not mean that a trait is not influenced by environmental factors. Traits with high heritability can be very responsive to environmental changes.

Broad Sense Heritability
We have seen that genetic variance (VG) is a composite value that derives its magnitude from additive, dominance, and interaction variance. Unfortunately, genetic variance is not always easy to partition into these separate components. Fortunately, broad sense heritability (H² = VG/VP) can be used as a general measure of the magnitude of genetic influence over phenotypic variation of a trait when VG cannot be partitioned.

In a 1988 study of the genetics and evolution of cave fish (Astyanax fasciatus), Horst Wilkens used broad sense heritability analysis to describe the genetic contribution to the evolution of the organism’s eye tissue. Some populations of this species live in completely dark underground cave streams in eastern Mexico and have a dramatically reduced amount of eye tissue in comparison with closely related

and then produced F2 fish and measured their eye tissue as well. Since the F1 fish were nearly genetically uniform, the variance in the amount of eye tissue was due entirely to the environment. In these F1, VE was 0.057 cm². Among the F2, phenotypic variance (VP) was 0.563 cm² and was the result of both genetic and environmental variance (VG + VE). Broad sense heritability is derived by determining VG and dividing it by phenotypic variation. In this case,

VG = VP − VE = 0.563 − 0.057 = 0.506

H² = VG/VP = 0.506/0.563 = 0.899

This broad sense heritability of approximately 0.90 means that approximately 90% of the phenotypic variation in eye size between these populations of cave fish is due to genetic variation.

Twin Studies
Heritability can be quantified when both mating and environmental factors can be controlled. However, when mating and environmental variation are not among the controlled experimental parameters, heritability is far more difficult (some would say impossible) to measure accurately. This limitation applies to attempts to measure the heritability of traits in humans. Fortunately, studies of phenotypic variation in human twins can offer insights into broad sense heritability of human traits.

Identical twins, also known as monozygotic (MZ) twins, are produced by a single fertilization event that is followed by the splitting of the early embryo into two separate embryos. MZ twins share all of their alleles. Theoretically, broad sense heritability can be determined by assuming that phenotypic variance between them is fully attributable to environmental variance. Under this assumption, in MZ twin pairs, VP = VE.

Fraternal twins, on the other hand, are dizygotic (DZ), produced by two independent fertilization events that take place at the same time. DZ twins are siblings that are born at the same time, but they are no more closely related than siblings born at different times. Like all full siblings, DZ twins have an average of 50% of their alleles in common. To control for differences between the sexes, only DZ twins of the same sex are used in twin studies. Phenotypic variance between DZ twins is the sum of environmental variance plus one-half of the genetic variance (the 50% of alleles not
fish living aboveground. In these populations, the eye tissue shared by the average DZ twin pair): In DZ twin pairs,
appears to be undergoing rapid evolutionary change. The VP = VE + 1/2VG. On the basis of these general formu-
eyes in sighted fish of this species are approximately 0.7 cm las for calculation of H 2, broad sense heritability can be
in diameter. In comparison, blind cave fish have less than estimated for human traits by methods we do not discuss
0.2 cm of eye tissue diameter. here (Table 19.2).
Wilkens crossed sighted cave fish with blind cave Studies of traits in human twins usually compare MZ
fish, measured eye tissue mean and variance in the F1, twins with same-sex DZ twins to make heritability estimates
19.3 Heritability Measures the Genetic Component of Phenotypic Variation 711
Table 19.3 Concordance Values for Common Medical and Behavioral Conditions in Humans

                                        Percent Concordance
Trait                                   MZ Twins    DZ Twins
Medical Conditions with Likely Genetic Influence
  Cleft lip                             40          4
  Club foot                             30          2
  Congenital hip dislocation            35          3
  Epilepsy                              60          20
  Multiple sclerosis                    30          6
  Pyloric stenosis                      25          3
  Rheumatoid arthritis                  35          6
Medical Conditions Unlikely to Have Strong Genetic Influence
  Handedness (left and right)           79          77

Table 19.4 Selected Narrow Sense Heritability (h²) Values for Animals and Plants

Organism    Trait                       Heritability (h²)
Cattle      Body weight                 0.65
            Milk production             0.40
Corn        Plant height                0.70
            Ear length                  0.55
            Ear diameter                0.14
Horse       Racing speed                0.60
            Trotting speed              0.40
Pig         Back-fat thickness          0.70
            Weight gain                 0.40
            Litter size                 0.05
Poultry     Body weight (8 weeks)       0.50
            Egg production              0.20
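
The broad sense heritability arithmetic for the cave fish cross described in this section (VG = VP - VE, then H² = VG/VP) can be checked with a few lines of Python. This is only a minimal sketch: the function name and the input check are illustrative assumptions, and the two variances are the values used in the text's calculation.

```python
def broad_sense_heritability(v_p, v_e):
    """Return (V_G, H^2) from phenotypic variance V_P and environmental variance V_E.

    Hypothetical helper for illustration; it assumes V_P = V_G + V_E.
    """
    if v_p <= 0 or not 0 <= v_e <= v_p:
        raise ValueError("expected V_P > 0 and 0 <= V_E <= V_P")
    v_g = v_p - v_e           # V_G = V_P - V_E
    return v_g, v_g / v_p     # H^2 = V_G / V_P


# Values from the cave fish example (variances in cm^2).
v_g, h2 = broad_sense_heritability(v_p=0.563, v_e=0.057)
print(f"V_G = {v_g:.3f}, H^2 = {h2:.3f}")
```

Rounded to three decimal places, the result matches the values in the text (VG = 0.506, H² = 0.899).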
by breeders or to natural selection, the extent to which the mean value of a trait changes in a population depends on its heritability. Breeders and evolutionary biologists predict substantial change in trait mean values (i.e., large values for R) when heritability is high, but little or no change in trait mean values when heritability is low. In other words, traits evolve when a substantial proportion of the phenotypic variation is due to genetic variation.

Figure 19.11a shows three examples in which the selection differentials are the same but the response to selection differs as a result of different degrees of heritability. This comparison illustrates that selection response is expected to be maximal when heritability is h² = 1.0. Selection response is substantially less when heritability is h² = 0.2, and there is no selection response when heritability is h² = 0. Figure 19.11b shows selection operating over many generations in three different modes that have different effects on phenotypic means and variances. In the mode known as directional selection, the mean phenotypic value is shifted in one direction because one extreme of the phenotype distribution is favored. This narrows the phenotypic range and reduces phenotypic variance. In contrast, selection favoring an intermediate phenotype over extreme phenotypes results in stabilizing selection that reduces the phenotypic variance without shifting the mean value. Disruptive selection occurs when both extreme phenotypes are favored over intermediate phenotypes. The result is an increase in the phenotypic variance and, potentially, a phenotypic split within the population. These modes can operate in both artificial selection and natural selection.

[Figure 19.11a: three panels, h² = 0.0, h² = 0.2, and h² = 1.0, each marking M, S, and Ms in the parent generation and the response R.]
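
The comparison in Figure 19.11a can be sketched numerically with the standard breeder's equation, R = h²S, where S is the selection differential. The equation is stated here as an assumption because it does not appear in this excerpt. The function name and the value S = 10 are also hypothetical, chosen only to show the same selection differential applied at the three heritabilities named in the text.

```python
def selection_response(h2, selection_differential):
    """Breeder's equation: R = h^2 * S (hypothetical helper for illustration)."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("narrow sense heritability must lie between 0 and 1")
    return h2 * selection_differential


S = 10.0  # hypothetical selection differential, in trait units
responses = {h2: selection_response(h2, S) for h2 in (0.0, 0.2, 1.0)}

for h2, r in responses.items():
    print(f"h2 = {h2:.1f}: R = {r:.1f}")
```

With the same S, the response is zero at h² = 0, one-fifth of S at h² = 0.2, and equal to S at h² = 1.0, which is the pattern the figure describes.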
The inherited DNA sequence variation of a SNP is usually not the molecular basis of the QTL. Instead, the SNP is usually genetically linked to the QTL. The connection between the genetic marker and the phenotype implies that a QTL exists near the genome location encoding the genetic marker.

by controlled crosses as the sources of DNA for genetic marker identification and as the source of data for the quantitative trait of interest. If, for example, a researcher wants to identify QTLs that influence large fruit size in tomatoes, he or she will cross two parental lines of tomatoes that differ in fruit size. The F1 progeny of this cross could then be used to produce F2 progeny or, as we illustrate here, the F1 could be used in a backcross to one of the parental lines. Genetic markers will be determined in the original parental lines and in the backcross progeny. Tomato sizes produced by backcross progeny will be weighed and the results compared with genetic markers in the individual plants.

FIGURE 19.12 Quantitative trait locus (QTL) detection and mapping. (a) Parental tomato plants producing large fruit (and homozygous for L marker alleles) or small fruit (and homozygous for S marker alleles) are crossed to produce F1 (LS). The F1 are then backcrossed to the large-fruit line to yield backcross progeny that are either LL or LS. (b) The significance of linkage between potential QTLs and genetic markers is tested among backcross progeny by lod score analysis. A lod score profile assessing fruit-weight QTLs reveals significant scores exceeding the threshold value on tomato chromosome 2. [Panel (a) labels: parental cross of large fruit (100 g) × small fruit (10 g). Chromosome 2 markers shown in panel (b): TG396, TG14, TG353, TG469, TG93, TG140.]

Figure 19.12a illustrates the structure of a backcross experiment designed to collect genetic marker and tomato-weight data for QTL mapping analysis. One parental tomato strain producing large fruit that averages 100 grams (g) contains an allele of a genetic marker that is identified by the letter L. There are actually many markers linked to QTLs in the line, and for each marker gene tested, the large-tomato strain will have two copies of the large-strain marker allele, designated LL. Similarly, a small-tomato-producing strain, with an average tomato weight of 10 g, is characterized for the same genetic marker, and the locus tested in the small-strain genotype is designated SS. The F1 progeny of the large × small cross is heterozygous for the marker locus and is designated LS. These plants in this example are shown to produce tomatoes that weigh 60 g. The backcross is made to the large-tomato strain, and the marker genotype will be either LL, if the F1 transmits the large-strain allele, or LS, if the F1 transmits the small-strain allele. The backcross progeny in this example produce tomatoes that vary in weight from 80 to 88 g. Tomato weight from the backcross plants is greater than from the F1 plants because the backcross plants are the result of a cross between the F1 and the large-tomato strain.

Table 19.5 displays tomato-weight data for 10 backcross plants (1–10) and genetic marker data for two genes, marker A (MA) and marker B (MB), that are not linked to one another and are located in different parts of the genome. In an actual QTL backcross experiment, several hundred backcross plants might be examined, and each plant might be genotyped for dozens of genetic markers that ideally would be spaced about every 5 to 10 centimorgans (cM) in the genome. This number of genetic markers and their close
19.4 Quantitative Trait Loci Are the Genes That Contribute to Quantitative Traits 715
proximity maximize the chance of identifying the location of QTLs detected by the analysis.

Table 19.5 QTL Analysis of Tomato Weight in Backcross Progeny

                        Average Fruit     Genotype
Backcross Plant         Weight (g)        Marker MA    Marker MB
1                       86                LS           LL
2                       82                LL           LS
3                       85                LL           LL
4                       88                LL           LL
5                       81                LS           LS
6                       83                LS           LS
7                       84                LL           LL
8                       80                LL           LS
9                       84                LS           LS
10                      87                LS           LL
Total average weight    84
LL average weight                         83.8         86.0
LS average weight                         84.2         82.0

In Table 19.5, the average weight of tomatoes from backcross plants is 84 g. Average tomato weight is compared for LL plants versus LS plants for each marker. There is almost no difference in average weight for MA (LL = 83.8 g versus LS = 84.2 g), but for MB, LL plants produce tomatoes that are 4 g heavier on average than are the tomatoes from LS plants (LL = 86.0 g versus LS = 82.0 g). These data may indicate that a QTL influencing tomato weight is located near MB. Conversely, there is no evidence to indicate that a QTL is located near MA.

To determine the statistical significance of this kind of information provided for genetic markers and tomato weight, a lod score is calculated. In this case, the lod score is based on odds ratios dividing the probability of the data if a QTL is linked to the marker by the probability of the data if there is no QTL linked to the marker. The odds ratios for the backcross plants are added together, and the log (the log of the odds) is taken to yield the lod score. Like the analysis of lod scores for genetic linkage, there is a threshold value for significance of the score (see Section 5.5). If the lod score for a genetic marker is greater than the threshold value, the lod score indicates a statistically significant probability that a QTL is linked to the marker.

In Figure 19.12b, a lod score profile for several genetic markers located on chromosome 2 of tomato reveals significant evidence indicating genetic linkage to a QTL. Beginning at the marker designated TG353 and spanning to the right through marker TG140, the lod score values are greater than the threshold value and give statistically significant evidence favoring linkage between these genetic markers and a QTL. On the other hand, the lod scores falling below the threshold value in the figure give no statistical evidence of linkage to a QTL. For chromosome 2 in tomato, lod scores for genetic markers to the left of TG353 are less than the threshold lod score value.

Andrew Paterson and his colleagues published a 1988 study mapping 15 QTLs in the tomato genome that influence fruit weight, fruit acidity, and the amount of soluble solids in the fruit. Each trait has agricultural importance, and together they determine the quality and yield of tomato paste from the fruit. Paterson’s study used 70 DNA markers spaced an average of 20 cM apart throughout the tomato genome. Collectively, these markers span about 95% of the 12 chromosomes that constitute the tomato genome.

The parental plants were two closely related and interfertile species: a domestic tomato (Solanum esculentum) and a wild South American green-fruited tomato (Solanum chmielewskii). The F1 hybrids were backcrossed to S. esculentum, producing 237 backcross progeny plants for analysis. All backcross plants were grown under identical conditions to minimize the influence of environmental factors on the traits of interest. Individual fruits from backcross plants were assayed for fruit weight (grams), soluble solids content (percentage), and acidity (pH). Lod score analysis was used to test whether genes influencing any of the three traits exhibited genetic linkage to genome markers. Significant lod score values traced six genes influencing fruit weight, five influencing acidity, and four influencing soluble solids content to regions of nine chromosomes in the tomato genome. The regions of tomato chromosomes 6 and 7 containing QTLs influencing the three traits are shown in Figure 19.13.

FIGURE 19.13 QTL mapping in domestic tomato. Multiple QTLs influencing fruit weight, fruit acidity, and percentage of soluble solids of tomatoes are shown on chromosome 6 and chromosome 7. Many other QTLs populate the rest of the genome. Distances between genes are in cM (centimorgans). [Chromosome 6 markers: CD67, SOD3, TG54, CD42, SP, PC5, with intervals of 15, 19, 15, 15, and 20 cM. Chromosome 7 markers: CD61, TG23, GOT2, TG61, TG113, TG113A, with intervals of 19, 18, 25, 3, and 8 cM. Legend: fruit weight, acidity, soluble solids.]
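
The marker-class comparison made with Table 19.5 (average fruit weight of LL plants versus LS plants at each marker) can be reproduced with a short Python sketch. The records are transcribed from the table. The function and variable names are illustrative assumptions, and the sketch stops at the class means; it does not attempt the lod score calculation.

```python
from statistics import mean

# (plant, average fruit weight in g, genotype at MA, genotype at MB), from Table 19.5
backcross = [
    (1, 86, "LS", "LL"), (2, 82, "LL", "LS"), (3, 85, "LL", "LL"),
    (4, 88, "LL", "LL"), (5, 81, "LS", "LS"), (6, 83, "LS", "LS"),
    (7, 84, "LL", "LL"), (8, 80, "LL", "LS"), (9, 84, "LS", "LS"),
    (10, 87, "LS", "LL"),
]


def class_means(records, marker_index):
    """Average fruit weight for LL and LS plants at one marker column."""
    groups = {"LL": [], "LS": []}
    for record in records:
        groups[record[marker_index]].append(record[1])
    return {genotype: mean(weights) for genotype, weights in groups.items()}


overall = mean(weight for _, weight, _, _ in backcross)
ma = class_means(backcross, marker_index=2)
mb = class_means(backcross, marker_index=3)

print(f"overall mean: {overall:.1f} g")
for name, means in (("MA", ma), ("MB", mb)):
    diff = means["LL"] - means["LS"]
    print(f"{name}: LL = {means['LL']:.1f} g, LS = {means['LS']:.1f} g, LL - LS = {diff:+.1f} g")
```

The difference between classes is small at MA and 4 g at MB, the contrast the text takes as a possible sign of a QTL near MB.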
716 CHAPTER 19 Genetic Analysis of Quantitative Traits
Identification of QTL Genes

Since QTL mapping identifies the location of genes influencing quantitative traits but not the genes themselves, additional genetic analysis is required to identify the genes. To acquire information leading to gene identity, researchers use near isogenic lines (NILs), also called introgression lines (ILs). These are lines of organisms derived from backcross progeny produced as described earlier. Different backcross progeny are self-fertilized over many generations to form highly inbred lines that are nearly isogenic, meaning they are genetically identical at almost all genes. The lines differ from one another, however, as a result of different crossovers that occurred during the backcrossing and that introduced different alleles near the site of a QTL. The introduced differences are called introgressions, thus giving these lines their name.

[Figure 19.14: (a) introgression lines with bars showing percentage phenotype difference (IL6: +2); (b) SNPs at positions 2799, 2859, 2878, 3263, and 3283 in Brix 9-2-5, with CW invertase activity (%) on a scale of 100 to 300.]

Figure 19.14a illustrates six introgression lines (IL1 to IL6) descended from a cross between two original parental lines, one a domesticated species and the other a wild species. The chromosome colors illustrate crossovers that produce differences between the introgression lines. Crossover locations are identified by analysis of genetic markers, and each introgression line is characterized for a trait phenotype. In the figure, the bars to the right of each line indicate the percentage difference between the phenotype of the IL and the domesticated parental species. Two potential QTL regions, QTL-A and QTL-B, contain variations of the crossover segments. The greatest positive percentage difference relative to the domesticated species phenotype occurs in IL2 and IL3
that carry crossover chromosomes containing domesticated DNA in the vicinity of QTL-A and wild-species DNA near QTL-B.

To identify the genes responsible for QTL variation, “candidate genes,” genes that are potentially responsible for the observed variation, must be identified and investigated. Genes in the QTL-A and QTL-B regions are located by examining DNA sequences, and sequence variants in candidate genes among introgression lines are identified. The sequence differences detected are studied to determine if they correlate with phenotypic variation.

Figure 19.14b illustrates a portion of the results of an experimental analysis of tomato introgression lines, published by Eyal Fridman and colleagues in 2004, that was designed to identify genes contributing to Brix value in tomato. The Brix value of fruit refers to the total soluble solids content, of which sugars and acids are the primary constituents. Fridman and colleagues created a large number of ILs from an initial cross between the domesticated tomato species (Solanum lycopersicum) and a wild relative (Solanum pennellii).

The parental species and each of the ILs were assessed for Brix value, and a QTL found to have a high Brix value, Brix 9-2-5, was intensively studied. DNA sequencing of the 484 nucleotides (positions 2799 to 3283) in Brix 9-2-5 revealed the five SNP variants shown in the figure. The Brix 9-2-5 QTL corresponds to a segment of the tomato LIN5 gene that produces the cell wall enzyme invertase (CW invertase). In the figure, the positions of SNPs are shown relative to 13 ILs that carry recombination in or near Brix 9-2-5. The bar to the right of each IL indicates its percentage difference in CW invertase activity relative to S. lycopersicum. The results show that when the S. pennellii sequence is present, CW invertase activity is significantly greater than in S. lycopersicum. The data shown, along with additional data not shown, indicated that the SNP at position 2878 (boxed) was strongly correlated with increased CW invertase activity. DNA and protein sequence analysis revealed that this SNP produced an amino acid difference that altered CW invertase activity.

Genome-Wide Association Studies

The widespread availability of genome sequencing information has opened a new avenue to the identification of QTLs in numerous species, including humans. As described in Section 5.5, the method known as genome-wide association studies (GWAS) seeks to tie the presence of a DNA marker to a QTL influencing a specific phenotype. Recall that in GWAS the inherited genetic marker variant and the phenotype are related by “association,” which means organisms that carry a particular variant are more likely to have a certain phenotype than are organisms that carry a different variant. The assessment of association is quantitative; that is, it expresses the percentage of organisms with a genetic marker that also display a certain phenotype versus the percentage that have the phenotype but not the genetic marker.

One advantage of GWAS over other QTL mapping approaches is that GWAS can scan the entire genome for QTLs by statistically testing for marker variants that are associated with phenotypic variation. Positive statistical results indicating association identify chromosome regions that can be more closely inspected for genes that influence the trait. A second advantage of GWAS is that organisms in random mating populations can be analyzed. Rather than requiring controlled crosses and the formation of introgression lines, GWAS uses “cases,” or organisms with a particular phenotype, and compares them with “controls” that lack the particular phenotype to assess the association between QTL markers and a phenotype.

This case–control approach identifies the SNP genotypes in all the individuals with, for example, a genetic disease (the cases) as well as in healthy controls. The frequency of each SNP allele in the cases is compared with the allele frequency in the controls. When the allele frequency in the case group is greater than the frequency in the control group, the odds ratio is greater than 1.0. Statistics applied to the odds ratio determine the P value of each odds ratio. Significant association between a SNP and a disease is found when the P value is less than the cutoff value. The results of each SNP examination are plotted as in the following description of a GWAS analysis of Crohn’s disease.

The discussion on pages 173–174 associated with Figure 5.17 describes the use of GWAS analysis to identify genome regions that may contain genes influencing the development of selected human disorders. An example of the human hereditary condition known as Crohn’s disease (CD) investigated by GWAS is displayed in Figure 5.17. CD is an intestinal disorder that is influenced by inherited variation. No single gene with a major effect is known for CD, but GWAS analysis identified nine genome regions having significant associations with CD. Any or all of them may contain genes whose variants play a functional role in the development of CD.

To verify the possibility of genetic influence identified by GWAS analysis, it is necessary to find the gene or genes involved. This requires close examination of each identified region. For CD, the chromosome region 16q12.1 revealed a highly significant association, and Yasunori Ogura and colleagues dissected this region, ultimately identifying a gene known as CARD15 (caspase recruitment domain, member 15) as a candidate for a gene influencing susceptibility to CD.

CARD15 contains 12 exons that direct the production of a 1040–amino acid protein. The protein is involved in recognizing bacterial proteins and stimulating an immune response. Ogura and colleagues sequenced the exons and introns of CARD15 in 12 CD patients from different families having multiple cases of CD. They performed the
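
The case–control comparison described above for GWAS (allele frequency in cases versus controls, summarized as an odds ratio) can be illustrated with a minimal Python sketch. The allele counts are hypothetical, the function names are assumptions made for illustration, and the sketch does not compute the P value that the text says is derived from the odds ratio.

```python
def allele_frequency(allele_count, other_count):
    """Frequency of one SNP allele among all sampled alleles in a group."""
    return allele_count / (allele_count + other_count)


def allelic_odds_ratio(case_allele, case_other, control_allele, control_other):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    return (case_allele / case_other) / (control_allele / control_other)


# Hypothetical allele counts for one SNP (not data from any study).
case_allele, case_other = 300, 700
control_allele, control_other = 200, 800

case_freq = allele_frequency(case_allele, case_other)
control_freq = allele_frequency(control_allele, control_other)
odds_ratio = allelic_odds_ratio(case_allele, case_other, control_allele, control_other)

print(f"case frequency = {case_freq:.2f}, control frequency = {control_freq:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
```

Because the allele is more frequent in the hypothetical cases than in the controls, the odds ratio is greater than 1.0, as the text describes.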
CASE STUDY

The Genetics of Autism Spectrum Disorders

Autism spectrum disorders (ASD) are a large group of neurodevelopmental impairments affecting language, social cognition, and mental flexibility in humans. ASD generally has its onset by the age of 3, and most cases are diagnosed in young children. When autism was first described, in the early 1940s, it was thought to be a single, severe condition of social and language dysfunction that primarily affects boys. This unitary definition has expanded in the ensuing decades, and now neurobiologists and psychiatric specialists recognize ASD to be a large collection of conditions rather than only one. In addition, ASD is now known to have a biological basis, and it is no longer classified as a psychiatric condition. A number of hypotheses have been proposed regarding its causation, and over recent decades, evidence of significant genetic influence in ASD has mounted.

The first strong evidence of a genetic basis for ASD came in 1977 when Susan Folstein and Michael Rutter published research that examined concordance for ASD in MZ and DZ twin pairs. Their finding of 82% concordance for cognitive disabilities associated with ASD in MZ twin pairs versus 10% in DZ twin pairs led to the conclusion that genetic variation plays a major role in ASD.

These findings have been supported in numerous follow-on studies. In addition to providing support for the overall concordance results identified by Folstein and Rutter, the follow-on studies have identified two additional features of ASD that point to a role for genetic influence. First, studies have found that the first-degree relatives of children with ASD are much more likely to develop ASD than the population average. (Recall that first-degree relatives, e.g., full siblings, share 50% of their DNA and have the closest genetic relationships in families; see the discussion in Section 19.1 and Figure 19.7.) Second, the studies have identified several genetic syndromes in which ASD can be one component of the syndrome. One example is fragile X syndrome. Recall from Section 11.2 and Table 11.2 that fragile X syndrome is caused by a large DNA triplet repeat expansion mutation affecting the FMR1 gene. The gene normally has between 6 and about 50 repeats of a CGG triplet repeat. Fragile X syndrome occurs in males who have a large expansion containing more than 200 of the CGG repeats in FMR1.

Fragile X syndrome symptoms include physical abnormalities and mental impairment. The link to ASD is detected in males and females who carry abnormal X chromosomes with CGG triplet expansions between 50 and 200 copies. These are not as large as the expansions of full-mutation X chromosomes that cause fragile X syndrome. There is evidence that the function of FMR1 is altered but not inactivated by these smaller expansions. The role they play in generating ASD-like symptoms is a subject of active investigation.

The search for gene mutations and variants that may be responsible for large numbers of ASD cases has not produced results. However, it has led to the identification of dozens of genes whose mutations play a role in a small percentage (at most 2% or so) of cases of ASD. Frequently, these are mutations classified as copy number variants (CNVs), submicroscopic chromosome duplication or deletion mutations that usually affect just a few kilobases of DNA. Collectively, the identified CNVs are associated with approximately 10% of all ASD cases. The CNVs that are implicated are scattered throughout the genome. They and other variants associated with ASD appear to affect many different molecular processes, including cell adhesion, synaptic structure and function, pre-mRNA processing and splicing, and protein production.

The picture of ASD that has emerged over the past two decades is one of disruption to complex brain circuitry in which many different developmental and communication pathways must be functional for optimal performance. In other words, the complexities of brain function that lead to “normal” language, social skills, and mental flexibility can potentially be disrupted by the mutation of any one of hundreds of different genes. Sections 4.2 and 4.3 describe the concept of genes operating in pathways to produce certain phenotypic features. Those sections also describe pleiotropy between genes (situations in which mutation of one gene can affect multiple, usually distinct, attributes of the individual; see Figure 4.16) and epistasis (situations in which mutation of one gene modifies or prevents the expression of another gene or genes; see Foundation Figure 4.21). We can think of the many gene variants and mutations associated with ASD as being part of large, complex pathways, and of these genes often having pleiotropic or epistatic effects. Many distinct pathways and features must, and usually do, develop normally to generate language and social skills classified as falling within the “normal” range. Mutations can alter these pathways in manners that disable social ability, language function, or mental flexibility to the extent that a child suffers an identifiable abnormality.

Research into the genetic and biological foundations of ASD is active and ongoing. Examples of the directions in which the study of ASD is moving include genome sequencing to identify any genetic similarities among groups of ASD patients with similar manifestations of the condition, examination of how mutations affect synaptic and cellular circuits that are disrupted in ASD, studies of potential epigenetic contributions to ASD, and the search for the mutational basis and categorization of different subtypes of ASD. ASD is a complex and diverse set of conditions with many distinct causes. It is a goal of neuroscience and neurogenetics that the next decade or two provide a much clearer picture of the causes and development of ASD, along with effective methods of treatment.
SUMMARY

19.1 Quantitative Traits Display Continuous Phenotype Variation

❚❚ Quantitative phenotypic traits are polygenic and are described by scales of measure that can be assigned values having a quantitative basis.
❚❚ The phenotypes of multifactorial traits result from polygenic inheritance and the influence of environmental factors.
❚❚ Most quantitative traits have a continuous phenotypic distribution. Those influenced by larger numbers of genes are more likely to display continuous variation. Discontinuous variation in phenotype is a frequent feature of threshold traits.
❚❚ Threshold traits are explained by additive alleles and have a threshold of liability that separates one phenotypic category (unaffected) from another (affected). The threshold of liability is crossed when a sufficient number of additive alleles accumulate in the genotype.

19.2 Quantitative Trait Analysis Is Statistical

❚❚ Quantitative traits are analyzed using statistical methods that evaluate the mean, median, mode, and variance of quantitative trait phenotype distribution.
❚❚ The frequency distribution for the phenotype range is described by the variance or the standard deviation in sample values. In the case of quantitative trait phenotypes, the phenotypic variance (VP) is a useful measure of the sample distribution.
❚❚ The phenotypic variance of a trait is the sum of genetic variance (VG) and environmental variance (VE).
❚❚ Genetic variance is partitioned into additive variance (VA), dominance variance (VD), and interactive variance (VI), the latter resulting from the epistatic interaction of genes determining a phenotype.

19.3 Heritability Measures the Genetic Component of Phenotypic Variation

❚❚ Heritability is a measure of the extent to which genetic variation contributes to total phenotypic variation.
❚❚ Broad sense heritability (H²) measures the ratio of genetic variance to phenotypic variance (VG/VP). One method of applying broad sense heritability analysis to humans is through twin studies that give a general estimate of heritability.
❚❚ Narrow sense heritability (h²) measures the contribution of additive genetic variance to phenotypic variance (VA/VP).
❚❚ Narrow sense heritability is used to predict the selection response (R) of a trait to artificial selection or to natural selection.

19.4 Quantitative Trait Loci Are the Genes That Contribute to Quantitative Traits

❚❚ QTL mapping is used to determine the location of potential QTLs in genomes.
❚❚ QTL mapping uses methods that closely resemble recombination mapping, such as controlled crosses and analysis of recombinant chromosomes.
❚❚ Specific genes influencing quantitative trait phenotypes are identified and their variation characterized through QTL candidate locus analysis.
❚❚ Genome-wide association studies (GWAS) scan the entire genome of organisms in randomly mating populations for statistical evidence of QTLs.
PREPARING FOR PROBLEM SOLVING

In addition to the list of problem-solving tips and suggestions given here, you can go to the Study Guide and Solutions Manual that accompanies this book for help with solving problems.

1. Be able to analyze the results of crosses involving polygenic traits and to predict the possible outcomes of crosses.
2. Understand the concepts pertaining to multifactorial traits, their inheritance, and their expression.
3. Be prepared to define and explain threshold traits and to describe their relationship to polygenic or multifactorial traits.
4. Understand the definitions and concepts pertaining to broad sense heritability and narrow sense heritability.
5. Be prepared to describe the concept of heritability and the use of concordance in twin studies for assessing it in humans.
6. Be prepared to calculate the mean, standard deviation, variance, and heritability of quantitative traits.
7. Be prepared to assess the results of artificial selection experiments.
Chapter Concepts
For answers to selected even-numbered problems, see Appendix: Answers.

1. Which of the following traits would you expect to be inherited as quantitative traits?
   a. body weight in chickens
   b. growth rate in sheep
   c. milk production in cattle
   d. fruit weight in tomatoes
   e. coat color in dogs
2. For the traits listed in the previous problem, which do you think are likely to be multifactorial traits, with phenotypes that are influenced by genes and environment? Identify two environmental factors that might play a role in phenotypic variation of the traits you identified.
3. Compare and contrast broad sense heritability and narrow sense heritability, giving an example of each measurement and identifying how the measurement is used.
4. In a cross of two pure-breeding lines of tomatoes producing different fruit sizes, the variance of fruit weight in the F1 is 2.25 g², and the variance among the F2 is 5.40 g². Determine the genetic and environmental variance (VG and VE) for the trait and the broad sense heritability of the trait.
5. Describe the difference between continuous phenotypic variation and discontinuous variation. Explain how polygenic inheritance could be the basis of a trait showing continuous phenotypic variation. Explain how polygenic inheritance can be the basis of a threshold trait.
6. Calculate the mean, variance, and standard deviation for a sample of turkeys weighed at 8 weeks of age that have the following weights in ounces: 161, 172, 155, 173, 149, 177, 156, 174, 158, 162, 171, 181.
7. Provide a definition and an example for each of the following terms:
   a. additive genes
   b. concordance of twin pairs
   c. multifactorial inheritance
   d. polygenic inheritance
   e. quantitative trait locus
   f. threshold trait
8. What is a random sample, and why can a random sample be used to represent a population?
9. Why is heritability an important phenomenon in plant and animal agriculture?
Application and Integration
For answers to selected even-numbered problems, see Appendix: Answers.
10. Three pairs of genes with two alleles each (A1 and A2, B1 and B2, and C1 and C2) control the height of a plant. The alleles of these genes have an additive relationship: each copy of alleles A1, B1, and C1 contributes 6 cm to plant height, and each copy of alleles A2, B2, and C2 contributes 3 cm.
a. What are the expected heights of plants with each of the homozygous genotypes A1A1B1B1C1C1 and A2A2B2B2C2C2?
b. What height is expected in the F1 progeny of a cross between A1A1B1B1C1C1 and A2A2B2B2C2C2?
c. What is the expected height of a plant with the genotype A1A2B2B2C1C2?
d. Identify all possible genotypes for plants with an expected height of 33 cm.
e. Identify the number of different genotypes that are possible with these three genes.
Problems 721
f. Identify the number of different phenotypes (expected plant heights) that are possible with these three genes.
11. In selective breeding experiments, it is frequently observed that the strains respond to artificial selection for many generations, with the selected phenotype changing in the desired direction. Often, however, the response to artificial selection reaches a plateau after many generations, and the phenotype no longer changes as it did in past generations.
a. What is the genetic explanation for the plateau phenomenon?
b. Once a plateau has been reached, is the heritability of the trait very high or is it very low? Explain.
12. Two inbred lines of sunflowers (P1 and P2) produce different total weights of seeds per flower head. The mean weight of seeds (grams) and the variance of seed weights in different generations are as follows.
Generation | Mean Weight/Head (g) | Variance
P1 | 105 | 3.0
P2 | 135 | 3.8
F1 | 122 | 3.5
F2 | 125 | 7.4
a. Use the information above to determine VG, VE, and VP for this trait.
b. Determine $H^2$ for this trait.
13. What is a quantitative trait locus (QTL)? Suppose you wanted to search for QTLs influencing fruit size in tomatoes. Describe the general structure of a QTL experiment, including the kind of tomato strains you would use, how molecular markers should be distributed in the genome, how the genetic marker alleles should differ between the two strains, and how you would use the F1 progeny in a subsequent cross to obtain information about the possible location(s) of QTLs of interest.
14. In Nicotiana, two inbred strains produce long (PL) and short (PS) corollas. These lines are crossed to produce F1, and the F1 are crossed to produce F2 plants in which corolla length and variance are measured. The following table summarizes mean and variance of corolla length in each generation. Calculate $H^2$ for corolla length in Nicotiana.
Generation | Mean Corolla Length (mm) | Variance
PL | 85.75 | 4.21
PS | 43.15 | 2.89
F1 | 62.26 | 3.62
F2 | 67.37 | 38.10
15. Suppose the length of maize ears has narrow sense heritability ($h^2$) of 0.70. A population produces ears that have an average length of 28 cm, and from this population a breeder selects a plant producing 34-cm ears to cross by self-fertilization. Predict the selection differential (S) and the response to selection (R) for this cross.
16. In a line of cherry tomatoes, the average fruit weight is 16 g. A plant producing tomatoes with an average weight of 12 g is used in one self-fertilization cross to produce a line of smaller tomatoes, and a plant producing tomatoes of 24 g is used in a second cross to produce larger tomatoes.
a. What is the selection differential (S) for fruit weight in each cross?
b. If narrow sense heritability ($h^2$) for this trait is 0.80, what are the expected responses to selection (R) for fruit weight in the crosses?
17. Two pure-breeding wheat strains, one producing dark red kernels and the other producing white kernels, are crossed to produce F1 with pink kernel color. When an F1 plant is self-fertilized and its seed collected and planted, the resulting F2 consist of 160 plants with kernel colors as shown in the following table.
Kernel Color | Number
White | 9
Dark red | 12
Red | 39
Light pink | 41
Pink | 59
a. Based on the F2 progeny, how many genes are involved in kernel color determination?
b. How many additive alleles are required to explain the five phenotypes seen in the F2?
c. Using clearly defined allele symbols of your choice, give genotypes for the parental strains and the F1. Describe the genotypes that produce the different phenotypes in the F2.
d. If an F1 plant is crossed to a dark red plant, what are the expected progeny phenotypes and what is the expected proportion of each phenotype?
18. In studies of human MZ and DZ twin pairs of the same sex who are reared together, the following concordance values are identified for various traits. Based on the values shown, describe the relative importance of genes versus the influence of environmental factors for each trait.
Trait | Concordance MZ | Concordance DZ
Blood type | 100 | 65
Chicken pox | 89 | 87
Manic depression | 67 | 13
Schizophrenia | 72 | 12
Diabetes | 62 | 15
Cleft lip | 51 | 6
Club foot | 40 | 4
19. During a visit with your grandparents, they comment on how tall you are compared with them. You tell them that in your genetics class, you learned that height in humans
has high heritability, although environmental factors also influence adult height. You correctly explain the meaning of heritability, and your grandfather asks, “How can height be highly heritable and still be influenced by the environment?” What explanation do you give your grandfather?
20. An association of racehorse owners is seeking a new genetic strategy to improve the running speed of their horses. Traditional breeding of fast male and female horses has proven expensive and time-consuming, and the breeders are interested in an approach using quantitative trait loci as a basis for selecting breeding pairs of horses. Write a brief synopsis (~50 words) of QTL mapping to explain how genes influencing running speed might be identified in horses.
21. Applied to the study of the human genome, a goal of GWAS is to locate chromosome regions that are likely to contain genes influencing the risk of disease. Specific genes can be identified in these regions, and particular mutant alleles that increase disease risk can be sequenced. To date, the identification of alleles that increase disease risk has occasionally led to a new therapeutic strategy, but more often the identification of disease alleles is the only outcome.
a. From a physician’s point of view, what is the value of being able to identify alleles that increase the risk of a particular disease?
b. What is the value of being able to identify alleles that increase disease risk for a person who is currently free of the disease but who is at risk of developing the disease due to its presence in the family?
c. What personal or ethical issues arising from GWAS might be of concern to physicians or to those who might carry an allele that increases disease risk?
22. Suppose a polygenic system for producing color in kernels of a grain is controlled by three additive genes, G, M, and T. There are two alleles of each gene, G1 and G2, M1 and M2, and T1 and T2. The phenotypic effects of the three genotypes of the G gene are G1G1 = 6 units of color, G1G2 = 3 units of color, and G2G2 = 1 unit of color. The phenotypic effects for genes M and T are similar, giving the phenotype of a plant with the genotype G1G1M1M1T1T1 a total of 18 units of color and a plant with the genotype G2G2M2M2T2T2 a total of 3 units of color.
a. How many units of color are found in trihybrid plants?
b. Two trihybrid plants are mated. What is the expected proportion of progeny plants displaying 9 units of color? Explain your answer.
c. Suppose that instead of an additive genetic system, kernel-color determination in this organism is a threshold system. The appearance of color in kernels requires nine or more units of color; otherwise, kernels have no color and appear white. In other words, plants whose phenotypes contain eight or fewer units of color are white. Based on the threshold model, what proportion of the F2 progeny produced by the trihybrid cross in part (b) will be white? Explain your answer.
d. Assuming the threshold model applies to this kernel-color system, what proportion of the progeny of the cross G1G2M1M2T2T2 × G1G2M1M2T1T2 do you expect to display colored kernels?
23. New Zealand lamb breeders measure the following variance values for their herd.
Trait | VP | VG | VA
Body mass (kg) | 42.4 | 20.5 | 7.4
Body fat (%) | 38.9 | 16.2 | 5.7
Body length (cm) | 51.6 | 26.4 | 8.1
a. Calculate the broad sense heritability ($H^2$) and the narrow sense heritability ($h^2$) for each trait in this lamb herd.
b. How would you characterize the potential response to selection (R) for each trait?
24. Cattle breeders would like to improve the protein content and butterfat content of milk produced by a herd of cows. Narrow sense heritability values are 0.60 for protein content and 0.80 for butterfat content. The average percentages of these traits in the herd and the percentages of the traits in cows selected for breeding are as follows.
Trait | Herd Average | Selected Cows
Protein content | 20.2% | 22.7%
Butterfat content | 6.5% | 7.4%
a. Determine the selection differential (S) for each trait in this herd.
b. Which trait is likely to be the most responsive to artificial selection applied by the cattle breeders through selection of cows for mating?
25. In human gestational development, abnormalities of the closure of the lower part of the midface can result in cleft lip, if the lip alone is affected by the closure defect, or in cleft lip and palate (the roof of the mouth), if the closure defect is more extensive. Cleft lip and cleft lip with cleft palate are multifactorial disorders that are threshold traits. A family with a history of either condition has a significantly increased chance of a recurrence of midface cleft disorder in comparison with families without such a history. However, the recurrence risk of a midface cleft disorder is higher in families with a history of cleft lip with cleft palate than in families with a history of cleft lip alone.
a. Suppose a friend of yours who has not taken genetics asks you to explain these observations. Construct a genetic explanation for the increased recurrence risk of midface clefting in families that have a history of cleft disorders versus families without a history of such disorders.
b. Construct a similar explanation of why the recurrence risk of a cleft disorder is higher in families with a history of cleft lip with cleft palate than in families with a history of cleft lip alone.
26. The children of couples in which one partner has blood type O (genotype ii) and the other partner has blood type AB (genotype $I^AI^B$) are studied.
a. What is the expected concordance rate for blood type of MZ twins in this study? Explain your answer.
b. What is the expected concordance rate for blood type of DZ twins in this study? Explain why this answer is different from the answer to part (a).
27. Answer the following in regard to multifactorial traits in human twins.
a. If the trait is substantially influenced by genes, would you expect the concordance rate to be higher in MZ twins or higher in DZ twins? Explain your reasoning.
b. If the trait is produced with little contribution from genetic variation, what would you expect to see if you compared the concordance rates of MZ twins versus DZ twins? Explain your reasoning.
Collaboration and Discussion
For answers to selected even-numbered problems, see Appendix: Answers.
28. Suppose the mature height of a plant is a multifactorial trait under the control of five independently assorting genes, designated A, B, C, D, and E, and five environmental factors. There are two alleles of each gene (A1, A2, etc.). Each allele with a subscript 1 (i.e., A1, etc.) contributes 5 cm to potential plant height, and each allele with a 2 subscript (i.e., A2, etc.) contributes 10 cm to potential plant height. In other words, a genotype containing only 1 alleles (A1A1B1B1C1C1D1D1E1E1) would have a potential height of [(10)(5)] = 50 cm, and a genotype with only 2 alleles (A2A2B2B2C2C2D2D2E2E2) would have a potential height of [(10)(10)] = 100 cm.
The five environmental factors are (1) amount of water, (2) amount of sunlight, (3) soil drainage, (4) nutrient content of soil, and (5) temperature. Each environmental factor can vary from optimal to poor. If all factors are optimal, assume that full potential height is attained. However, if one or more of the environmental factors is less than optimal, then height is reduced. The state of each environmental factor has an effect on growth. In this exercise, we’ll assume that the growth is affected according to the following scale:
Environmental Factor State | Height Lost
Optimal (O) | 0 cm lost
Good (G) | 4 cm lost
Fair (F) | 8 cm lost
Marginal (M) | 12 cm lost
Poor (P) | 16 cm lost
Thus, for example, if one environmental factor is optimal, two are good, one is fair, and one is marginal, the loss of potential height is (0 + 4 + 4 + 8 + 12) = 28 cm. If the loss of height potential is greater than the height potential of the plant, the plant does not survive.
a. Calculate the potential height, based on inherited alleles, and the attained height, based on growth in the environmental circumstances given, for the three plants (a, b, and c) in the accompanying table.
b. How many 1 and 2 alleles must be present to give a height potential of 80 cm?
c. List two genotypes that have a height potential of 80 cm.
d. If two plants that each have a height potential of 75 cm are crossed, what proportion of the progeny will have a height potential of 80 cm? (Hint: See Figure 19.3e for assistance making this determination.)
Environmental Factor States
Genotype | 1 | 2 | 3 | 4 | 5
a. A1A2B1B2C2C2D1D2E1E2 | G | F | O | G | M
b. A1A2B1B2C2C2D1D2E1E2 | F | M | G | G | F
c. A1A1B1B2C1C2D1D2E1E2 | O | G | G | G | G
29. A three-gene system of additive genes (A, B, and C) controls plant height. Each gene has two alleles (A and a, B and b, and C and c). There is dominance among the alleles of each gene, with alleles A, B, and C dominant over a, b and c. Under this scheme, the dominant genotype for a gene contributes 10 cm to height potential, and the recessive genotype contributes 4 cm.
a. What is the height potential of a plant that is homozygous for all three dominant alleles?
b. What is the height potential of a plant that is homozygous for all three recessive alleles?
c. What is the height potential of the F1 progeny of the homozygous plants identified in (a) and (b) of this problem?
d. What are the phenotypes and proportions of each phenotype among the F2?
30. Congenital dislocation of the hip is a threshold condition in which the head of the femur (the femoral head) is out of its normal position relative to the bones that will form the socket of the hip (the acetabulum). This misplacement can lead to potentially serious orthopedic problems later in life if the condition is not treated in infancy. Numerous studies have shown that (a) brothers and sisters of infants born with congenital hip dislocation are more likely to develop the condition
than are the siblings of those without the condition. These studies also find that (b) more female infants than male infants have the trait, and (c) if the affected child is a girl, the risk to her siblings is lower than if the affected infant is a boy. Explain the meaning of the three observations (a, b, and c) in the context of proposing a threshold model that explains these observations.
Subject | Men: Height (in) | Men: Weight (lb) | Women: Height (in) | Women: Weight (lb)
1 | 65 | 136 | 60 | 95
2 | 66 | 146 | 61 | 103
3 | 67 | 141 | 62 | 110
4 | 67 | 148 | 62 | 109
Charles Darwin (1809–1882) studied the morphology and adaptation of finches on islands of the Galápagos and Cocos chains in formulating his theory of evolution by natural selection. The molecular genetics underlying the evolution of the two predominant beak shapes in finches (pointed for insect eating and blunt for seed crushing) have recently been described. The two shapes are shown along with an image of Darwin on this United Kingdom stamp commemorating the 100th anniversary of his death.
ESSENTIAL IDEAS
❚❚ The Hardy–Weinberg equilibrium predicts frequencies of genotypes in populations.
❚❚ The impact of natural selection on allele frequencies can be estimated.
❚❚ The effect of mutations on allele frequencies can be quantified.
just as it shaped life in the past, and will continue to shape it into the future. All four of the evolutionary processes described in Section 1.5, namely natural selection, mutation, migration (gene flow), and genetic drift, play a role in shaping the evolution of genes, proteins, populations, and species.
The modern synthesis focused on uniting two elements of evolutionary biology. One was the large-scale evolutionary change linked to speciation and to the divergence of taxonomic groups above the species level. The second element consisted of what was known about Mendelian inheritance and the connection between inherited molecular variation (i.e., variation of DNA and protein sequences) and evolutionary change. Through this unification, the modern synthesis has given rise to a simple definition of evolution: the change in allele frequencies in populations over time. Much of the discussion in this chapter centers on that definition of evolution.
The impact of the evolutionary processes on populations has been a focus of population biologists, evolutionary biologists, and mathematicians since the beginning of the 20th century, several decades before DNA was identified as the hereditary molecule and its structure became known. The central predictions that were made more than a century ago about populations on the basis of evolutionary principles have been proven correct time and again in countless experiments and observations in natural and experimental populations. In this chapter, we focus on the connection between the evolution of populations and evolution at the molecular level, that is, the evolution of genes, genomes, and proteins. We begin our discussion with the application of evolutionary principles to populations that forms the foundation of the field of population genetics. We then discuss the operation of each of the evolutionary processes, using examples that largely focus on humans. The causes of speciation are then explored, and we conclude the chapter with a discussion of molecular evolution.
20.1 The Hardy–Weinberg Equilibrium Describes the Relationship of Allele and Genotype Frequencies in Populations
The origin of population genetics can be traced to the earliest years of the 1900s, shortly after the rediscovery of Mendel’s laws of heredity, and to a time when George Udny Yule, William Castle, Karl Pearson, Godfrey Hardy, Wilhelm Weinberg, and others first debated the fate of genes in populations. In 1902, the inheritance of brachydactyly (OMIM 112500), an autosomal dominant condition characterized by shortening of fingers and toes, was described in humans as a trait paralleling a Mendelian pattern of heredity. In contemplating this observation, Yule proposed that since three-quarters of the progeny of a cross of heterozygous parents with brachydactyly will also display shortened digits, the frequency of the dominant allele might be expected to increase over time. William Castle thought Yule was wrong, and in 1903 he offered, as a partial refutation of Yule’s contention, a mathematical demonstration that in the absence of natural selection, genotype frequencies remain stable in populations. Karl Pearson supported Castle’s position by showing that if two alleles of a gene had equal frequency in a population, there would be a single, stable equilibrium frequency for their genotypes. Reginald Punnett (of Punnett square fame) also thought Yule was wrong, but unable to formulate a mathematical argument to refute Yule, he took the problem to his friend and regular cricket partner Godfrey Hardy.
Hardy, a mathematician rather than a biologist, quickly identified a “very simple” solution to the question of the fate of alleles in populations. He showed that with random mating and in the absence of evolutionary change in a population, the allele frequencies result in a stable equilibrium frequency. Hardy also showed that, at equilibrium, allele frequencies are stable and that genotypes occur in predictable frequencies derived directly from allele frequencies. In 1908, Hardy penned a letter to the editors of Science magazine that began with these self-effacing words:
I am reluctant to intrude in a discussion concerning matters of which I have no expert knowledge, and I should have expected the very simple point which I wish to make to have been familiar to biologists. However, some remarks of Mr. Udny Yule, to which Mr. R. C. Punnett has called my attention, suggest it may be worth making.
In his letter, Hardy demonstrated that Yule was wrong. Dominant alleles do not increase in frequency over time. His letter laid out the concept that has become known as the Hardy–Weinberg (H-W) equilibrium. The name recognizes Hardy’s explanation of allele and genotype
[Figure 20.1 graphic: Punnett square of male gametes (p, q) by female gametes (p, q), with cells $p^2$, $pq$, $pq$, and $q^2$.]
Binomial expansion: $(p + q)(p + q) = p^2 + pq + pq + q^2 = p^2 + 2pq + q^2 = 1$
Figure 20.1 The Hardy–Weinberg equilibrium for autosomal genes. The Punnett square method and the binomial expansion of alleles with frequencies p and q predict genotype frequencies under assumptions of the Hardy–Weinberg equilibrium.
[Plot residue from an adjacent figure: genotype frequencies (vertical axis, 0 to 1) for A1A1 = $p^2$, A1A2 = $2pq$, and A2A2 = $q^2$.]
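The binomial expansion in Figure 20.1 can be checked numerically. The following is a minimal sketch, not part of the original text; the function name `hw_genotype_frequencies` is hypothetical, and the allele frequency p = 0.8 is chosen only for illustration.

```python
def hw_genotype_frequencies(p):
    """Expected genotype frequencies for two alleles, A1 (frequency p)
    and A2 (frequency q = 1 - p), under Hardy-Weinberg assumptions."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    # Terms of the binomial expansion (p + q)^2 = p^2 + 2pq + q^2
    return {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}


# Illustrative value only: p = 0.8, so q = 0.2
freqs = hw_genotype_frequencies(0.8)
total = sum(freqs.values())  # the three terms always sum to 1
```

Whatever value of p is supplied, the three genotype frequencies sum to 1, which is the content of the expansion.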
Table 20.2 Hardy–Weinberg Mating Table for Two Alleles of an Autosomal Gene
Mating | Mating frequency | Offspring A1A1 | Offspring A1A2 | Offspring A2A2
A1A1 × A1A1 | $(p^2)(p^2) = p^4$ | $p^4$ | - | -
A1A1 × A1A2 | $2[(p^2)(2pq)] = 4p^3q$ | $2p^3q$ | $2p^3q$ | -
reciprocal mating to account for, but if different genotypes occur in the parents, the reciprocal matings must be taken into account. The progeny of each mating are predicted according to Mendelian principles. The frequency or fraction of offspring with each genotype is summed once the table is filled. The term that is the sum of each genotype frequency can be simplified to show that offspring are produced in the genotype proportions $p^2$, $2pq$, and $q^2$, just as they occur in the parents. This analysis is compelling evidence that in the presence of random mating and the absence of evolutionary change, the allele frequencies in populations are stable over time.
In populations that meet the assumptions of the H-W equilibrium, a single generation of random mating will “reset” the genotype frequencies in the population into the predicted proportions $p^2$, $2pq$, and $q^2$. Moreover, if a population meets the assumptions but is not initially in H-W equilibrium, we can predict what the consequence of one generation of random mating will be. As an example, Figure 20.4 illustrates the effect of uniting two previously separate populations with different frequencies of A1 and A2 to form a new population. Each of the contributing populations originally contained 500 individuals, and the new population contains 1000 individuals. Immediately after forming the new population, the genotypes are not in Hardy–Weinberg proportions. One generation of mating in the new population under Hardy–Weinberg assumptions, however, produces genotype frequencies in the next generation that are in H-W equilibrium. The new population has new allele frequencies as a result of the mixing of the two populations.
Determining Autosomal Allele Frequencies in Populations
Allele frequencies and genotype frequencies are commonly used measures of the genetic structure of populations. Comparison of these frequencies between populations can identify relationships and diversification of populations, and documentation of allele frequency change over time is evidence of population evolution.
Allele frequencies in populations can be estimated by two methods, the gene-counting method and the square root method. The gene-counting method does not require any assumptions about the population; it only requires that all genotypes can be identified. It can be used whether or not one knows or can assume the population is in H-W equilibrium. For the square root method, on the other hand, one must know or must assume that the population is in H-W equilibrium. The square root method is often used when the trait of interest is the result of a recessive homozygous genotype and where the heterozygous and homozygous dominant genotypes result in identical phenotypes.
The gene-counting method can be accomplished in either of two ways: by calculating the proportions of genotypes or by directly counting the number of alleles from the genotypes themselves. We describe these two “gene-counting” approaches separately for convenience, but they are really the same. The choice is dictated by the type of genotype or phenotype information available and the composition of the population or of the sample data.
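The “reset” produced by one generation of random mating after two populations are united can be sketched numerically. This is an illustration under stated assumptions: the genotype counts below are hypothetical (the values shown in Figure 20.4 are not reproduced in the text), and the function names are invented for this sketch. Only the population sizes, 500 each and 1000 combined, come from the text.

```python
def allele_frequency_a1(counts):
    """Frequency of A1 by gene counting: two A1 alleles per A1A1
    individual, one per A1A2 individual, over 2N total alleles."""
    n = sum(counts.values())
    return (2 * counts["A1A1"] + counts["A1A2"]) / (2 * n)


def random_mating(p):
    """Genotype frequencies after one generation of random mating."""
    q = 1 - p
    return {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}


# Hypothetical genotype counts for two populations of 500 individuals
pop1 = {"A1A1": 320, "A1A2": 160, "A2A2": 20}   # f(A1) = 0.8
pop2 = {"A1A1": 20, "A1A2": 160, "A2A2": 320}   # f(A1) = 0.2

# Unite the populations by adding the genotype counts
merged = {g: pop1[g] + pop2[g] for g in pop1}
n = sum(merged.values())

observed = {g: c / n for g, c in merged.items()}  # not in H-W proportions
p = allele_frequency_a1(merged)                   # new allele frequency
next_generation = random_mating(p)                # H-W proportions
```

With these assumed counts, the merged population has fewer heterozygotes than the H-W expectation for its allele frequency; the expected proportions appear after a single generation of random mating, while the allele frequency itself stays the same.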
730 CHAPTER 20 Population Genetics and Evolution at the Population, Species, and Molecular Levels
The Genotype Proportion Method The first approach to gene counting is called the genotype proportion method. This approach calculates allele frequencies (f) as already demonstrated in one of the examples above, by adding the frequency of the homozygotes for the allele and one-half the frequency of the heterozygotes carrying the allele. For instance, suppose that a population has the following composition: B1B1 = 0.64, B1B2 = 0.32, B2B2 = 0.04. Applying the genotype proportion method, the frequency of B1 is the sum of the frequency of B1B1 plus one-half the frequency of B1B2 heterozygotes. In this case, f(B1) = p = (0.64) + [(0.5)(0.32)] = 0.80. Similarly, for B2, the allele frequency is calculated by adding the frequency of B2B2 and one-half the frequency of B1B2, or f(B2) = q = (0.04) + [(0.5)(0.32)] = 0.20. For this example, notice that p + q = 0.80 + 0.20 = 1.0.
The Allele-Counting Method The second approach to the gene-counting method is called the allele-counting method. As an example of the allele-counting method, consider the human MN blood group system, a codominant system produced by two alleles, M and N. Both alleles are present in all human populations and produce three blood group phenotypes: type M, type MN, and type N. Each blood group has a corresponding genotype. Individuals with blood type M or blood type N have homozygous genotypes MM and NN, respectively, and the blood type MN is produced by the MN genotype. MN blood group testing of 1482 members of a Japanese population produced the following results:
Blood group | M | MN | N | Total
Number | 406 | 744 | 332 | 1482
The allele frequency calculation recognizes that each of the 1482 people in the sample carries two alleles of the gene and that there are (2)(1482) = 2964 alleles represented in the sample. The frequency of each allele is determined by counting the two alleles of that type from each homozygote and the single allele of that type from each heterozygote and dividing the result by the total alleles in the sample. The allele frequencies are therefore f(M) = [(2)(406) + (744)]/2964 = 0.525 and f(N) = [(2)(332) + (744)]/2964 = 0.475.
The Square Root Method The alternative approach for allele frequency determination in populations is the square root method. It is used only when the gene has two alleles, one dominant and one recessive; the condition or trait of interest is recessive; and the investigator knows or can assume that the population is in H-W equilibrium. In the human autosomal recessive disorder cystic fibrosis, for example, one allele (cf) is recessive and therefore is evident only in the homozygous genotype. When the recessive allele is in a heterozygous genotype, it is “hidden” by the dominant allele (CF). In a circumstance like this, the dominant phenotype consists of two genotypes, CFCF and CFcf. In contrast, the recessive phenotype is produced only by the homozygous recessive genotype cfcf. The correspondence of the recessive phenotype and homozygous genotype allows use of Hardy–Weinberg principles to estimate the frequency of the recessive allele by taking the square root of the recessive homozygous genotype frequency. In the U.S. population, the frequency of cystic fibrosis among newborn infants is approximately 1 in 2000. Where f(CF) = p and f(cf) = q, f(cfcf) = $q^2$ = 0.0005. The value of q is thus estimated as the square root of 0.0005, or q = 0.022; that is, about 2.2 percent.
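The three calculations described above (genotype proportion, allele counting, and square root) can be reproduced in a few lines. This is a minimal sketch using the numbers given in the text; the function names are hypothetical and chosen only for this illustration.

```python
import math


def freq_from_genotype_proportions(f_homozygote, f_heterozygote):
    """Genotype proportion method: homozygote frequency plus
    one-half the heterozygote frequency."""
    return f_homozygote + 0.5 * f_heterozygote


def freq_from_allele_counts(n_homozygotes, n_heterozygotes, n_individuals):
    """Allele-counting method: two alleles per homozygote, one per
    heterozygote, divided by the 2N alleles in the sample."""
    return (2 * n_homozygotes + n_heterozygotes) / (2 * n_individuals)


def recessive_freq_by_square_root(f_recessive_phenotype):
    """Square root method: valid only if H-W equilibrium is assumed."""
    return math.sqrt(f_recessive_phenotype)


# Genotype proportion example: B1B1 = 0.64, B1B2 = 0.32, B2B2 = 0.04
p_b1 = freq_from_genotype_proportions(0.64, 0.32)
q_b2 = freq_from_genotype_proportions(0.04, 0.32)

# MN blood group sample: 406 M, 744 MN, 332 N (1482 people)
f_m = freq_from_allele_counts(406, 744, 1482)
f_n = freq_from_allele_counts(332, 744, 1482)

# Cystic fibrosis: about 1 affected newborn in 2000
q_cf = recessive_freq_by_square_root(1 / 2000)
```

Rounded to three decimal places, `f_m`, `f_n`, and `q_cf` agree with the values 0.525, 0.475, and 0.022 quoted in the text. The first two functions make no assumption about equilibrium, whereas the third depends on it.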
With f(cf) determined, the frequency of CF is estimated as f(CF) = p = 1 - q = 1.0 - 0.022 = 0.978. Then, according to the Hardy–Weinberg principle, the population frequency of carriers is f(CFcf) = 2pq = 2(0.978)(0.022) = 0.043. In other words, approximately 4.3 percent of the population, or about 1 in 23 people, carry a recessive mutant allele for cystic fibrosis. The frequency of carriers of cystic fibrosis is of practical importance for determining the chance that a person could pass the allele on to his or her progeny. Estimates like this can be particularly valuable in genetic counseling situations, where it is desirable to know the probability that a person who has a dominant phenotype might be a heterozygous carrier of a recessive allele. Genetic Analysis 20.1 provides more practice in calculating allele frequencies and applying the H-W equilibrium.

The Hardy–Weinberg Equilibrium for More than Two Alleles

Having examined the application of the H-W equilibrium to genes with two alleles, we can now consider the more complex case of a gene that has more than two alleles. We shall limit our discussion to three alleles, whose frequencies are represented by the variables p, q, and r, where p + q + r = 1.0, and where the trinomial expansion (p + q + r)² represents random mating and predicts the distribution of alleles in genotypes. Assuming that the population is in H-W equilibrium, the frequencies of the six resulting genotypes are predicted to be as listed in Table 20.3a, and the sum of genotype frequencies resulting from the trinomial expansion is (p + q + r)² = p² + 2pq + q² + 2pr + r² + 2qr = 1.0.

The human ABO blood group system provides an opportunity for applying the H-W equilibrium to a gene with three alleles (see Section 4.1). Recall that among the three alleles producing ABO blood types (I^A, I^B, and i), I^A and I^B exhibit dominance over i but are codominant to one another. These allelic relationships result in four blood types from the six genotypes (see Figure 4.3). Using f(I^A) = p, f(I^B) = q, and f(i) = r, along with data reporting the frequencies of each blood type in a population as type O = 46%, type A = 37%, type B = 13%, and type AB = 4%, we can estimate the frequency of each allele by applying a version of the square root method. Table 20.3b shows the calculations of the genotype frequencies from allele frequencies. They are derived as follows:

Step 1. Blood type O is found with recessive homozygous genotypes, and the frequency of the blood type is r² = 0.46. The square root of 0.46 = r; thus, the allele frequency is f(i) = r = 0.68.
Step 2. The combined frequency of blood types A and O is p² + 2pr + r² = (p + r)², so f(I^A) = p is estimated by the square root of the combined frequency of A plus the frequency of O, minus r. The calculation is f(I^A) = p = √(0.37 + 0.46) - r = 0.91 - 0.68 = 0.23.
Step 3. Having estimated p and r, we can solve for q by q = 1 - (p + r) = 1 - (0.23 + 0.68) = 0.09.
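The three-step square root method for the ABO alleles can be checked numerically. The following is a minimal sketch, not part of the original text; the function names `abo_allele_frequencies` and `predicted_blood_types` are hypothetical, and the only inputs are the blood type frequencies quoted above (type A = 0.37, type O = 0.46). The second function runs the trinomial expansion forward to see whether the estimated allele frequencies reproduce the reported frequencies of types B and AB.

```python
import math


def abo_allele_frequencies(freq_a, freq_o):
    """Estimate p = f(I^A), q = f(I^B), r = f(i) by the square root method."""
    r = math.sqrt(freq_o)                 # Step 1: type O is ii, frequency r^2
    p = math.sqrt(freq_a + freq_o) - r    # Step 2: A + O = (p + r)^2
    q = 1.0 - (p + r)                     # Step 3: p + q + r = 1
    return p, q, r


def predicted_blood_types(p, q, r):
    """Expected blood type frequencies from the trinomial (p + q + r)^2."""
    return {
        "A": p * p + 2 * p * r,
        "B": q * q + 2 * q * r,
        "AB": 2 * p * q,
        "O": r * r,
    }


p, q, r = abo_allele_frequencies(freq_a=0.37, freq_o=0.46)
types = predicted_blood_types(p, q, r)
print(round(p, 2), round(q, 2), round(r, 2))
print({name: round(value, 2) for name, value in types.items()})
```

With these inputs the estimates round to p = 0.23, q = 0.09, and r = 0.68, and the predicted frequencies of types B and AB round to 0.13 and 0.04, in line with the reported 13% and 4%.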
Evaluate
1. Identify the topic this problem addresses and the nature of the required answer.
   This problem addresses the determination of allele frequencies from population data and the determination of expected genotype frequencies under assumptions of the H-W equilibrium.
2. Identify the critical information given in the problem.
   The number of individuals with each blood type is given, and the blood type is identified as an autosomal codominant trait.

Deduce
3. Determine the genotype corresponding to each blood group.
   For this autosomal codominant trait, blood type M individuals have the genotype MM, those with blood type N are NN, and MN individuals are MN.
4. Calculate the frequency of each blood type in the sample.
   TIP: The frequency of each genotype is the number of people with the genotype over the total sample size.
   Blood type M is 342/1029 = 0.332, MN is 500/1029 = 0.486, and N is 187/1029 = 0.182.

Solve
Answer a
5. Calculate allele frequencies using the genotype proportion method.
   The frequencies are
   f(M) = (0.332) + [(0.5)(0.486)] = 0.575 and
   f(N) = (0.182) + [(0.5)(0.486)] = 0.425
6. Calculate the allele frequencies by the allele-counting method.
   TIP: If the allele frequencies are calculated correctly, their sum will be 1.0.
   For the sample of 1029 people, there are 2058 alleles. The allele frequencies are
   f(M) = [(2)(342) + 500]/2058 = 0.575 and
   f(N) = [(2)(187) + 500]/2058 = 0.425

Answer b
7. Determine the expected genotype frequencies and the number of individuals with each genotype under Hardy–Weinberg assumptions.
   TIP: Assume f(M) = p and f(N) = q, and expand the binomial equation (p + q)² = p² + 2pq + q².
   The expected genotype frequencies are
   MM = (0.575)² = 0.33; (0.33)(1029) = 339.57
   MN = 2[(0.575)(0.425)] = 0.49; (0.49)(1029) = 504.21 and
   NN = (0.425)² = 0.18; (0.18)(1029) = 185.22
For more practice, see Problems 17, 18, 21, and 25.
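The allele-counting method and the Hardy–Weinberg expectations from this Genetic Analysis can be reproduced in a few lines. This is an illustrative sketch only; the function names are hypothetical, and the counts (342 M, 500 MN, 187 N) are the ones used in the solution steps above. Because the code does not round intermediate values, its expected counts differ slightly from the worked answer, which rounds genotype frequencies to two decimals before multiplying by 1029.

```python
def allele_frequencies_by_counting(n_mm, n_mn, n_nn):
    """Allele-counting method for a codominant two-allele trait."""
    total_alleles = 2 * (n_mm + n_mn + n_nn)
    f_m = (2 * n_mm + n_mn) / total_alleles   # each MM carries two M alleles
    f_n = (2 * n_nn + n_mn) / total_alleles   # each NN carries two N alleles
    return f_m, f_n


def hardy_weinberg_expected(p, q, sample_size):
    """Expected genotype counts from (p + q)^2 = p^2 + 2pq + q^2."""
    return {
        "MM": p * p * sample_size,
        "MN": 2 * p * q * sample_size,
        "NN": q * q * sample_size,
    }


f_m, f_n = allele_frequencies_by_counting(342, 500, 187)
expected = hardy_weinberg_expected(f_m, f_n, 342 + 500 + 187)
print(round(f_m, 3), round(f_n, 3))
print({genotype: round(count, 1) for genotype, count in expected.items()})
```

The frequencies round to f(M) = 0.575 and f(N) = 0.425, and the expected counts always sum to the sample size, which is a convenient arithmetic check.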
significant deviation are most often either small population size, substantial migration in or out of the population, or nonrandom mating. We discuss these effects in following sections.

The H-W equilibrium has application beyond the mere examination of populations. Some of the most interesting applications are seen in forensic genetics, for instance in crime scene analysis of DNA or in paternity assessment. Application Chapter E: Forensic Genetics explores these applications.

20.2 Natural Selection Operates through Differential Reproductive Fitness within a Population

Application of the H-W equilibrium to idealized populations reveals that allele frequencies, along with the frequencies of genotypes, are maintained when the population mates at random and in the absence of the action of evolutionary
mechanisms. But what happens to allele frequencies when evolution does occur? The simple answer is that allele frequencies change, and along with them genotype frequencies are altered. The evolutionary impact can be quantified by determining the change in allele frequencies. An implicit component of the description of evolution as change in allele frequencies in a population over time is the presence of inherited genetic diversity. If there is no genetic diversity, there can be no evolution.

In this section, we look at the effects of different mechanisms of natural selection on allele frequencies and H-W equilibrium. In later sections, we examine how the other evolutionary processes (mutation, migration or gene flow, and genetic drift) affect allele frequencies and H-W equilibrium in populations (see Section 1.5).

Differential Reproductive Fitness and Relative Fitness

Natural selection results from the differential reproductive success of organisms in the population. Organisms that leave more offspring distribute more copies of their alleles to the next generation, and this increases the frequency of the alleles that those most successful reproducers pass on. Natural selection usually operates as a result of differences in anatomical, physiological, behavioral, or other traits passed to progeny by the more successful reproducers and not present in organisms that are less successful at reproduction. The most successful individuals may survive to reproductive age at higher rates than other population members, they may reproduce at higher rates, or both. This phenomenon is called differential reproductive fitness, and it is a central feature of natural selection. The consequence of differential reproductive fitness is that more of some alleles than others are passed to the next generation, and this imbalance changes allele frequencies in the population over time. On this basis, natural selection is sometimes said to “favor the most fit” organisms in the population, meaning those with the highest reproductive fitness among the organisms in the population.

Reproductive fitness is not an idealized concept or value. It is a real consequence of inherited variation operated on by natural selection, causing the most fit among a generation of organisms to produce more offspring. A common way to measure the intensity of natural selection is to determine the impact of differential reproduction on the next generation. This involves use of the relative fitness (w) of organisms, a value that quantifies the reproductive success of other genotypes relative to the most favored genotype. Since this is a relative comparison, organisms with the greatest reproductive success have a relative fitness of w = 1.0. The genotypes that reproduce less successfully than the most favored genotype have a relative fitness of less than w = 1.0. These less fit genotypes have their relative fitness reduced by a proportion called the selection coefficient (s). The selection coefficient identifies the proportionate difference between the fitnesses of organisms with different traits. For example, if an organism not having the favored trait reproduces 80 percent as well as the organism with the trait, the selection coefficient is s = 0.2, and the relative fitness of the organism is expressed as w = 1 - s, or 1 - 0.2 = 0.8. If other organisms experience yet a different level of relative fitness, a second selection coefficient, designated t, is used. Where an organism with one genotype is most fit and organisms with either of two other genotypes experience reduced fitness, the relative fitness values for the two less fit genotypes are expressed as w = 1 - s and w = 1 - t.

Directional Natural Selection

In the pattern of natural selection called directional natural selection (directional selection, for short), the favored phenotype has a homozygous genotype. Organisms with this phenotype have higher relative fitness than other phenotypes in the population. Natural selection favoring one homozygous genotype produces a directional change in allele frequencies that increases the favored allele frequency and decreases others.

In the directional selection example that follows, assume alleles B1 and B2 are codominant. The codominant relationship of these alleles will result in one genotype that occurs in organisms with the highest relative fitness and in reduced fitness in organisms with the other genotypes. In this example, the allele frequencies are f(B1) = 0.6 and f(B2) = 0.4, there are 1000 members of the population, the favored phenotype has a relative fitness of w = 1.0, and the other phenotypes have different relative fitness values of w = 0.80 and w = 0.40. The genetic profile of the population, therefore, is as follows:

Genotype               B1B1   B1B2   B2B2
Frequency              0.36   0.48   0.16
Number                 360    480    160
Relative fitness (w)   1.0    0.80   0.40

As this table shows, the B1B1 organisms have the highest relative fitness (w = 1.0). In comparison, B1B2 organisms have s = 0.20 and w = 1 - s = 0.80, and organisms with the B2B2 genotype have a selection coefficient of t = 0.60 and a relative fitness of w = 1 - t = 0.40.

Given this profile as a starting point, the impact of natural selection on the population is computed in two steps. First, assuming natural selection has its effect before organisms reach reproductive age, the surviving number of organisms of each genotype is calculated by multiplying the original number of each genotype by the relative fitness value of the genotype. In this case the numbers of survivors of each genotype are B1B1 = (1.0)(360) = 360, B1B2 = (0.80)(480) = 384, and B2B2 = (0.40)(160) = 64. In this hypothetical population, 808 organisms of the original 1000 remain after natural selection.

The second step is determination of the allele frequencies after natural selection and of the genotype
frequencies in the next generation. In this case, the frequencies are most readily calculated using the allele-counting method, since we can identify the genotype of each survivor. There are a total of 1616 alleles in the 808 survivors, and the allele frequencies after natural selection are f(B1) = [(2)(360) + 384]/1616 = 1104/1616 = 0.683, and f(B2) = [(2)(64) + 384]/1616 = 512/1616 = 0.317. If we assume that random mating takes place among the survivors and that no evolutionary mechanism other than natural selection is operating, the genotype frequencies in the next generation are f(B1B1) = (0.683)² = 0.467, f(B1B2) = 2(0.683)(0.317) = 0.433, and f(B2B2) = (0.317)² = 0.100.

The changes in allele frequencies are symbolized by the Greek delta (∆) and found by taking the absolute value of the difference between the original allele frequency and the new allele frequency. For this example, in which B1 has increased and B2 has decreased, the values are ∆B1 = |0.683 - 0.60| = 0.083 and ∆B2 = |0.317 - 0.40| = 0.083. If this pattern of natural selection continues for enough generations, the B1 allele will eventually become fixed at f(B1) = 1.0, and B2 will be eliminated, so that its final frequency will be f(B2) = 0.0. Once an allele is either fixed (f = 1.0) or eliminated (f = 0.0), natural selection can no longer change the frequency. Population allele frequencies of 0.0 or 1.0 can, however, be changed by migration and mutation. Figure 20.5 illustrates that directional selection favoring B1 increases the frequency of that allele at a pace determined by the intensity of natural selection.

[Figure 20.5 The consequences of the intensity of natural selection on allele frequency. (a) The curves illustrate the relationship between the rate of change in f(B1) and the intensity of natural selection (x-axis: Generation, 0 to 1000; y-axis: Frequency of allele B1, 0.0 to 1.0). (b) Relative fitness values for natural selection of different intensities:

Selection strength   B1B1   B1B2    B2B2
Strong               1.0    0.90    0.80
                     1.0    0.98    0.96
                     1.0    0.99    0.98
                     1.0    0.995   0.990
Weak                 1.0    0.998   0.996
]

The concept of relative fitness values can be applied to populations in several ways. Table 20.4 illustrates a case of natural selection against a homozygous recessive genotype. In this case, frequencies of f(B) = 0.50 and f(b) = 0.50 are subjected to natural selection against bb, where w(bb) = 0.0 and w(Bb) = w(BB) = 1.0. No bb individuals survive to reproductive age, thus removing 25 percent of the population. When the relative genotype frequencies are determined using their new proportions in the surviving reproductive population, f(B) and f(b) are calculated to be f(B) = 0.667 and f(b) = 0.333. Among the progeny in generation 1, genotype frequencies are f(BB) = 0.445, f(Bb) = 0.444, and f(bb) = 0.111.

Table 20.4 A Model of Directional Selection against a Recessive Lethal Allele

Genotype                                  BB                  Bb                  bb
Frequency                                 0.25                0.50                0.25
Relative fitness (w)                      1.0                 1.0                 0.0
Survivors after selection (total = 0.75)  0.25                0.50                0.00
Relative genotype frequencies             0.25/0.75 = 0.333   0.50/0.75 = 0.667   0.00

Estimated allele frequencies after natural selection:
f(B) = (0.333) + (0.5)(0.667) = 0.667
f(b) = (0) + (0.5)(0.667) = 0.333
Estimated genotype frequencies after reproduction:
f(BB) = (0.667)² = 0.445
f(Bb) = 2(0.667)(0.333) = 0.444
f(bb) = (0.333)² = 0.111

Directional natural selection against the homozygous recessive genotype causes the frequency of the dominant allele to increase and the frequency of the recessive allele to decrease. Eventually, the recessive allele may be eliminated from the population gene pool. The recessive allele is not eliminated quickly, however, and its frequency changes slowly, especially as the allele gets less frequent. The slow pace of evolutionary change at low allele frequencies is due to the smaller number of recessive homozygotes in the population.

Numerous directional selection experiments, taking place over the last several decades of research, demonstrate support for the theoretical predictions for populations under directional selection. A 1981 study by Douglas Cavener and Michael Clegg examined four subpopulations of Drosophila melanogaster for 50 generations to test the effectiveness of artificial directional selection at increasing the frequency of the allele AdhF of the alcohol dehydrogenase (Adh) gene. The enzyme product of AdhF breaks ethanol down rapidly. An original population with an AdhF frequency of 0.38 was
divided into four subpopulations of equal size. Two subpopulations reared on ethanol-rich food (population 1 and population 2) showed progressive increases in the frequency of AdhF over 50 generations (Figure 20.6). In contrast, control populations (control 1 and control 2), which were reared on food without ethanol, showed an overall upward (control 1) and downward (control 2) drift of AdhF frequency.

A similar effect is seen in the action of strong directional natural selection in human populations. Two independent reports published in 2010, one by Xin Yi and colleagues and the other by Tatum Simonson and colleagues, describe the rapid evolutionary changes that have occurred in the last 5000 years in native Tibetans who have adapted to low oxygen conditions in the high-altitude environment of the Himalayan mountains. Strong directional natural selection has operated in favor of certain alleles of multiple genes that increase oxygen utilization and improve oxygen transport and metabolism.

Natural Selection Favoring Heterozygotes

A pattern of natural selection that can produce and maintain genetic diversity in populations is seen when the heterozygous genotype is favored. The consequence of natural selection favoring the heterozygote is a balanced polymorphism, in which alleles reach stable equilibrium frequencies that are maintained in a steady state, balancing the selective pressures favoring the maintenance of a mutant allele when it occurs in a heterozygote but acting against it when it occurs in a homozygous genotype.

Table 20.5 depicts a natural selection scheme favoring heterozygotes. In this example, the relative fitness values are based on the heterozygous genotype (Cc) being 1.0, the relative fitness of CC being 0.65, and the fitness of cc being 0.20, indicating that few homozygotes with the cc genotype survive to reproductive age. The example assumes that the allele frequencies are initially equal, that is, f(C) = f(c) = 0.50 in generation 0. One generation of natural selection changes the allele frequencies to f(C) = 0.579 and f(c) = 0.421. The table shows calculations illustrating the action of natural selection in the production of generation 1.

Table 20.5 A Model of Natural Selection Favoring the Heterozygous Genotype

Genotype                                     CC                      Cc                    cc
Frequency                                    0.25                    0.50                  0.25
Relative fitness                             0.65                    1.0                   0.20
Survivors after selection (total = 0.7125)   0.1625                  0.50                  0.05
Relative genotype frequencies                0.1625/0.7125 = 0.228   0.50/0.7125 = 0.702   0.05/0.7125 = 0.070

New allele frequencies after natural selection:
f(C) = 0.579
f(c) = 0.421
Genotype frequencies after reproduction:
f(CC) = (0.579)² = 0.335
f(Cc) = 2[(0.579)(0.421)] = 0.488
f(cc) = (0.421)² = 0.177

Natural selection operating in favor of heterozygotes will eventually lead to a balanced polymorphism. (We explore another example of this pattern of natural selection in the Case Study at the end of the chapter.) Once attained, the equilibrium frequencies of the alleles will be maintained in a balanced polymorphism as long as natural selection remains steady. Population geneticists can predict the stable equilibrium frequencies of alleles in a balanced polymorphism using the relative intensity of natural selection against the homozygous genotypes. Using the variables s and t to

[Figure 20.6: frequency of AdhF (y-axis) for Population 1 and Population 2 in the high-ethanol environment.]
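The one-generation selection calculations in this section (the B1/B2 directional example, Table 20.4, and Table 20.5) all follow the same recipe: weight each genotype frequency by its relative fitness, renormalize among the survivors, count alleles, and then apply random mating. The sketch below is one way to express that recipe; the function name `select_one_generation` and the tuple layout are assumptions made for illustration, not notation from the text.

```python
def select_one_generation(genotype_freqs, fitness):
    """Apply one round of selection, then random mating.

    genotype_freqs and fitness are (AA, Aa, aa) tuples.
    Returns (p, q, next_generation_genotype_freqs).
    """
    survivors = [f * w for f, w in zip(genotype_freqs, fitness)]
    total = sum(survivors)                      # fraction of population surviving
    relative = [s / total for s in survivors]   # genotype frequencies among survivors
    p = relative[0] + 0.5 * relative[1]         # genotype proportion method
    q = relative[2] + 0.5 * relative[1]
    next_generation = (p * p, 2 * p * q, q * q) # Hardy-Weinberg proportions
    return p, q, next_generation


# Directional selection on codominant alleles B1 and B2
p_dir, q_dir, _ = select_one_generation((0.36, 0.48, 0.16), (1.0, 0.80, 0.40))

# Table 20.4: recessive lethal allele
p_let, q_let, next_let = select_one_generation((0.25, 0.50, 0.25), (1.0, 1.0, 0.0))

# Table 20.5: heterozygote favored
p_het, q_het, next_het = select_one_generation((0.25, 0.50, 0.25), (0.65, 1.0, 0.20))

print(round(p_dir, 3), round(q_dir, 3))
print(round(p_let, 3), round(q_let, 3))
print(round(p_het, 3), round(q_het, 3), round(next_het[1], 3))
```

The three calls give allele frequencies that round to 0.683 and 0.317, 0.667 and 0.333, and 0.579 and 0.421, and the heterozygote frequency 2(0.579)(0.421) comes out at about 0.488. Because the code does not round intermediate values, genotype frequencies can differ from the tabulated ones in the third decimal place.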
Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   This problem is about the effects of natural selection on the frequencies of two chromosome forms, AR and ST. The answer requires an explanation of the pattern of natural selection and a calculation to determine the ultimate frequencies of the chromosome forms.
2. Identify the critical information given in the problem.
   The relative fitness values are given, and these can be used to determine the final frequencies of AR and ST.

Deduce
3. Examine the relative fitness values for each genotype, and calculate the selection coefficients (s and t) against each genotype.
   TIP: Subtract the relative fitness of a genotype from 1.0 to determine the selection coefficients s and t.
   The relative fitness value for the heterozygous genotype is 1.0, and the relative fitnesses of the homozygous genotypes are lower. The selection coefficient s operating against AR/AR is 1.0 - 0.65 = 0.35. The selection coefficient t operating against ST/ST is 1.0 - 0.50 = 0.50.
4. Consider how the relative fitness values can be used to calculate the final frequencies of AR and ST.
   A ratio of the selection coefficients operating against each homozygous genotype can be used to calculate the equilibrium frequency of each of the chromosome forms, with pE = t/(s + t) and qE = s/(s + t).

Solve
Answer a
5. Describe the natural selection pattern operating on these genotypes.
   This is an example of heterozygous advantage, and both chromosome forms are expected to remain in the population at equilibrium values determined by the relative strength of natural selection against each form.

Answer b
6. Determine the equilibrium frequencies of each chromosome form.
   PITFALL: Double-check your arithmetic by making sure that the sum of the equilibrium frequencies you calculate is 1.0.
   If the equilibrium frequency of AR is pE and of ST is qE, the equilibrium frequencies are pE = 0.50/(0.35 + 0.50) = 0.588 and qE = 0.35/(0.35 + 0.50) = 0.412.
For additional practice see Problems 4, 11, and 24.
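The equilibrium formulas pE = t/(s + t) and qE = s/(s + t) can be checked against a direct simulation of heterozygote advantage. The following sketch is illustrative only: the function names are hypothetical, and the recurrence assumes an infinite, randomly mating population with constant fitness values 1 - s, 1, and 1 - t, which is the idealized setting of this Genetic Analysis rather than a description of any real population.

```python
def equilibrium_frequencies(s, t):
    """Predicted balanced-polymorphism frequencies under heterozygote advantage."""
    return t / (s + t), s / (s + t)


def iterate_selection(p, s, t, generations):
    """Repeatedly apply selection with fitnesses (1 - s, 1, 1 - t)."""
    for _ in range(generations):
        q = 1.0 - p
        mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
        # Frequency of the first allele among survivors
        p = (p * p * (1 - s) + p * q) / mean_fitness
    return p


p_e, q_e = equilibrium_frequencies(s=0.35, t=0.50)
from_low = iterate_selection(0.10, 0.35, 0.50, generations=200)
from_high = iterate_selection(0.90, 0.35, 0.50, generations=200)
print(round(p_e, 3), round(q_e, 3))
print(round(from_low, 3), round(from_high, 3))
```

Starting from either a low or a high initial frequency, the simulated frequency approaches the same value as the formula (about 0.588), which is what makes the polymorphism balanced.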
20.3 Mutation Diversifies Gene Pools

Mutation is the ultimate source of all new genetic variation in populations, and the genetic variation it generates is an indispensable component of evolution. By itself, however, gene mutation is a very slow evolutionary process because its effect on allele frequencies in populations is small and gradual. For example, if mutation converts one in every 10,000 A1 alleles to A2 alleles each generation, a population containing f(A1) = 0.90 and f(A2) = 0.10 in generation 0 will have frequencies f(A1) = 0.81 and f(A2) = 0.19 after 1000 generations, assuming no effects from the other evolutionary processes.

An additional reason that mutation alone is a slow evolutionary process has to do with the two directions in which mutation can affect any given allele. The forward mutation
rate (μ) pertains to mutations that create a new A2 allele by mutation of A1, whereas the reverse mutation rate (ν), also known as the reversion rate, pertains to mutation of alleles in the opposite direction, A2 to A1. Forward and reverse mutation can create a balanced equilibrium, given a sufficient number of generations and the absence of other evolutionary processes.

Quantifying the Effects of Mutation on Allele Frequencies

In the absence of other evolutionary effects, the consequences of forward and reverse mutation (reversion) for allele frequencies in a population can be quantified. If f(A1) = p and f(A2) = q, the effect of forward mutation on f(A1) is described by the value μp, and the effect of reversion on f(A2) is described by νq. These two expressions identify, respectively, the rate at which A2 alleles are created from A1 by forward mutation and the rate at which A2 alleles are reverted to A1. In each generation, the change in the frequency of A2 is quantified by the expression ∆q (“delta q”) that is calculated as ∆q = μp - νq. Over an infinite number of generations in a theoretical population where μ and ν are constant and no other evolutionary processes are operating, allele frequency equilibrium is established.

The equilibrium frequencies of alleles subject only to mutation and reversion are a ratio of the frequencies of the respective events. Since the equilibrium frequencies are purely a function of the ratios of the rates at which new copies of an allele are added and removed from the population gene pool, they are calculated as pE = ν/(μ + ν) and qE = μ/(μ + ν). In a theoretical population where f(A1) = 0.99, f(A2) = 0.01, μ = 2 × 10⁻⁶, and ν = 3 × 10⁻⁸, ∆q is expressed as ∆q = [(2 × 10⁻⁶)(0.99) - (3 × 10⁻⁸)(0.01)] = 1.98 × 10⁻⁶. This small change gradually increases f(A2) and decreases f(A1), leading eventually to equilibrium allele frequencies. When equilibrium is achieved by the interaction of forward and reverse mutation rates in this population, the allele frequencies will be

pE = 3 × 10⁻⁸/(2 × 10⁻⁶ + 3 × 10⁻⁸) = 0.015 and
qE = 2 × 10⁻⁶/(2 × 10⁻⁶ + 3 × 10⁻⁸) = 0.985

These are stable allele frequencies that, once achieved, will be maintained as long as the forward and reverse mutation rates stay the same and no other evolutionary process intervenes.

Mutation–Selection Balance

Unlike the theoretical population just described, mutations in the real world are commonly subject to natural selection. In cases where the deleterious mutation is recessive, the mutant allele is masked by the wild-type dominant allele in heterozygous genotypes. Recessive mutant alleles are subjected to natural selection only when they occur in the homozygous recessive genotype. This results in the persistence of recessive mutant alleles in most populations at a frequency somewhat greater than the mutation frequency.

Under these circumstances, the frequency of mutant alleles in a population is a balance of the intensity of natural selection against the mutant and the frequency of mutation of the gene. This expression is called the mutation–selection balance, and it determines the equilibrium frequency of the mutant allele (qE) by considering the rate of elimination of deleterious alleles by natural selection (s) and the rate at which new mutant alleles are generated (μ).

Consider the following situation for a recessive lethal mutation.

Genotype           A1A1   A1A2   A2A2
Relative fitness   1      1      1 - s

Here, the equilibrium frequency of the recessive allele (qE) is calculated as the balance between selection against a recessive genotype (s) and the rate of mutation (μ):

qE = √(μ/s)

This expression predicts that when selection against the recessive genotype is complete (i.e., s = 1.0), the equilibrium frequency of the mutant allele is approximately the square root of the mutation rate. When the selection coefficient is less than 1.0, the equilibrium frequency is greater than the square root of the mutation frequency.

In the case of complete selection against a lethal dominant mutant allele B2, the relative fitness values of the genotypes are as follows.

Genotype           B1B1   B1B2    B2B2
Relative fitness   1      1 - s   1 - s

In this case, qE = μ. In other words, when s = 1.0 against a lethal dominant mutation, the equilibrium frequency of the mutant allele is equal to the mutation frequency.

Numerous examples of mutation–selection balance have been investigated in organisms, including humans. Several studies of human hereditary disease alleles reveal that recessive mutant alleles are maintained in populations at frequencies predicted by calculating the mutation–selection balance.

20.4 Gene Flow Occurs by the Movement of Organisms and Genes between Populations

In evolutionary terms, gene flow, also known as migration, refers to the movement of alleles into and out of populations. It can bring novel alleles into a population, it can increase the frequency of alleles already present in a population, or
it can remove alleles from a population. These events can potentially have the immediate effect of changing allele frequencies in a population. Gene flow brought about by the addition of new organisms to an existing population generates a new population, identified as an admixed population, consisting of members from the two formerly distinct populations. In more familiar terms, you can think of gene flow as the consequence of the migration of organisms into a new population or the emigration of organisms out of a population. These organisms carry their genes with them as they move, creating a flow of genes into or out of a population.

Effects of Gene Flow

Gene flow has two principal effects on populations. First, in the short run, gene flow can cause the admixed population to have a different frequency of alleles, particularly if the starting allele frequencies in one of the participating populations differ from those in the other and if the number of immigrants constitutes a large proportion of the admixed population. Second, in the long run, gene flow acts to equalize frequencies of alleles between populations that remain in genetic contact by the exchange of population members back and forth between the populations. This exchange can also slow genetic divergence of populations and block speciation. Let’s look at how both of these effects are explained.

The change in allele frequencies produced in an admixed population by gene flow from population 1 into population 2 can be described by the island model of migration that depicts a one-way process of gene flow, that is, from a mainland population to an island population. In the example illustrated in Figure 20.7a, gene flow changes allele frequencies by reducing f(A1) on the island from 1.0 to 0.60 and increasing f(A2) from 0.0 to 0.40. In this example, gene flow has produced an almost instantaneous evolutionary change (Figure 20.7b). The admixed population has allele frequencies of f(A1) = 0.60 and f(A2) = 0.40, but the genotypes are not in H-W equilibrium immediately following migration. A single generation of random mating, however, will bring the genotype frequencies into ratios consistent with the H-W equilibrium: A1A1 = 0.36, A1A2 = 0.48, and A2A2 = 0.16.

[Figure 20.7 The island model of migration. (a) The island model of migration: 800 migrants from a mainland (continent) population with A1 = 0.50 and A2 = 0.50 join an original island population (n = 200) with A1 = 1.0 and A2 = 0.0, producing an admixed island population (n = 1000) with A1 = 0.60 and A2 = 0.40. (b) Consequence of migration:

Island population                            A1A1   A1A2   A2A2
Original (n = 200)                           200    0      0
Admixed (n = 1000)                           400    400    200
Genotype frequencies in admixed population   0.36   0.48   0.16

Allele frequencies in admixed population: f(A1) = 0.60, f(A2) = 0.40]

The impact of gene flow on allele frequencies in an admixed population is expressed by a formula that calculates pN, the new value of p, as the weighted average of the allele frequency among island residents and mainland immigrants. The expression uses pI and pC to represent f(A1) in the original island and mainland populations, respectively. The formula identifies the fraction of individuals or alleles from the mainland population as m, and the fraction contributed by island residents as 1 - m. The value of pN as a result of gene flow is pN = (1 - m)(pI) + (m)(pC). Applying this formula to our example in Figure 20.7, we find pN = (0.20)(1.0) + (0.80)(0.50) = 0.60.

Examples of gene flow abound in animals and plants. For example, bees can facilitate gene flow between plant populations by collecting pollen from plants in one population of plants and depositing it on flowers in a different population. Gene flow can also occur through the action of the organisms themselves. As an example, the escape of farm-raised salmon from their ocean pen can lead to their reproducing with wild salmon.

Examples of gene flow in humans exist as well, but one example, harking back to events that affect the composition of the present-day human genome, is of particular note. The Neanderthals were an archaic human lineage that was distributed across Europe and large parts of Asia from about 400,000 to approximately 30,000 years ago. A second archaic human lineage, the Denisovans, also inhabited parts of Europe and Asia. Beginning about 70,000 to 80,000 years ago, a new human lineage (the lineage that would displace Neanderthals and Denisovans and give rise to all contemporary human populations) migrated out of Africa and into Europe and Asia. Evidence from the sequencing of ancient Neanderthal DNA, ancient Denisovan DNA, and modern human genomes reveals that the genomes of many present-day humans contain small amounts of DNA that originated in Neanderthals or Denisovans. On average this DNA, a consequence of gene flow from Neanderthals and Denisovans, makes up approximately 2 to 4% of the genome in a living human. Application Chapter D: Human Evolutionary Genetics discusses more details of this analysis.
Allele Frequency Equilibrium and Equalization
We have just seen that gene flow can produce rapid evolutionary change in the allele frequencies of populations. In the short term, the effect of gene flow is determined by the change in the frequency of p in the new gene pool of the island population. This value, ∆pI, is the difference in allele frequency before and after migration, and is defined as ∆pI = pN - pI. Substituting the formula for pN and simplifying gives ∆pI = [(1 - m)(pI) + (m)(pC)] - pI = m(pC - pI). Allele frequency equilibrium occurs when ∆pI = 0; thus, at equilibrium, m(pC - pI) = 0, indicating that p remains constant either when there is no migration (m = 0) or when p in the island gene pool equals the allele frequency in the mainland gene pool (pI = pC).
Population and evolutionary biologists use this reasoning to conclude that gene flow has a homogenizing, or equalizing, effect on allele frequencies among participating populations. By this mechanism, gene flow maintains genetic contact between populations and can thus prevent evolutionary divergence of populations. In broader evolutionary terms, gene flow hinders the establishment of the reproductive isolation that is an important component of evolutionary divergence between populations and of potential speciation.
20.5 Genetic Drift Causes Allele Frequency Change by Sampling Error
ulation is not likely to contain all alleles in exactly the same frequencies as in the larger population. Genetic drift affects all populations, but it is especially prominent in small populations in which a small number of gametes unite to produce each subsequent generation.
To appreciate the cause and consequences of genetic drift, picture a gene pool with alleles at frequencies f(A1) = f(A2) = 0.50 from which two separate samples are drawn. In sample one, 20 alleles are drawn at random, whereas in the second sample, 1000 alleles are drawn. These two separate draws represent the alleles that, in the two respective cases, unite to form the next generation. In the first sample, containing 20 alleles, each allele represents 5 percent (one allele out of 20) of the total for the next generation, whereas in the 1000-allele sample, each allele only represents 1/1000 of the alleles in the next generation. Any deviation from exactly 10 A1 alleles and 10 A2 alleles in the first sample will substantially change allele frequencies in the next generation. If, for example, the draw of 20 alleles contains 12 A1 alleles and 8 A2 alleles, the allele frequencies in the next generation will be f(A1) = 12/20 = 0.60 and f(A2) = 8/20 = 0.40. A change of such magnitude can easily occur by chance in the small sample, but it is very unlikely to occur in the larger sample of 1000 alleles.
Sampling errors of the kind described for the first sample can randomly raise or lower the frequency of an allele in a small population each generation. Once the allele frequencies are changed, the next generation, when it reproduces, has the new allele frequencies as a starting point. Over multiple generations, the frequency of an allele in a small population will randomly fluctuate, or “drift,” sometimes increasing and sometimes decreasing, due to nothing more than the chance deviations in small random samples.
Allele frequency changes due to genetic drift are random. In the absence of any other evolutionary influence, and given a sufficient number of generations, allele frequencies will drift until, ultimately, one allele reaches fixation at a frequency of 1.0 and all other alleles are eliminated. Figure 20.8 illustrates four different simulations of genetic drift of an allele in experimental populations and shows how the result of genetic drift for 30 generations can vary among populations that are initially identical. Each experimental population begins with 20 organisms and maintains that number throughout the 30 generations. The initial starting frequency of the allele is 0.50 in each population, so there is no frequency bias that favors or disfavors the allele at the beginning of the simulations.
[Figure 20.8 plot: allele frequency (0.0 to 1.0) versus generation (0 to 30) for Populations 1 through 4]
Figure 20.8 Genetic drift of an allele frequency. Four simulated populations each start with a frequency of 0.50 for a hypothetical allele whose frequency fluctuates randomly in each population over 30 generations. The allele eventually becomes fixed in population 1, is eliminated in population 4, and is still present in populations 2 and 3 at distinct frequencies.
Q: Based on the results shown, write a general description of the impact of genetic drift on allele frequencies in a population.
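The kind of simulation summarized in Figure 20.8 can be approximated in a few lines of Python. This is a hedged sketch under stated assumptions, not the procedure used to produce the figure: it assumes diploid organisms (so 20 organisms carry 40 gene copies), non-overlapping generations, and random sampling of gene copies at the current allele frequency. The function name and parameters are hypothetical.

```python
import random


def simulate_drift(n_individuals=20, p0=0.50, generations=30, seed=None):
    """Return the allele frequency in each generation, starting with p0.

    Assumption: each generation is formed by drawing 2 * n_individuals
    gene copies at random, each being the focal allele with probability
    equal to its frequency in the previous generation.
    """
    rng = random.Random(seed)
    n_copies = 2 * n_individuals  # diploid assumption
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Random sampling of gene copies is the source of sampling error.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
    return trajectory


# Four replicate populations that start out identical, as in the figure.
for population in range(1, 5):
    final = simulate_drift(seed=population)[-1]
    print(f"Population {population}: f(allele) after 30 generations = {final:.3f}")
```

Because the sampling is random, different seeds give different trajectories. Once the frequency reaches 0.0 or 1.0 it can no longer change in this model, which corresponds to loss or fixation of the allele.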
The loss of genetic diversity from a genetic bottleneck can be quantified in two ways: first, by determining the percentage of polymorphic loci in the population, and second, by determining the percentage of loci that are heterozygous in an average individual.
Genetic bottlenecks can affect single populations, or they can affect an entire species. An example of the latter case would be a near-extinction event such as the one that affected the northern elephant seal (Mirounga angustirostris). This animal was historically distributed along the western coast of North America, in numbers that exceeded 150,000 in the mid-1800s. Extensive hunting devastated the rookeries where young elephant seals were raised, and by 1884 fewer than 100 elephant seals remained. Some biologists have estimated that the surviving population may have been as small as 20 individuals. The entire remaining population bred at an isolated rookery on Guadalupe Island, about 200 miles off the western shore of Baja California. Elephant seal protection measures put in place by the U.S. and Mexican governments in the early 1900s led to population growth and the reestablishment of additional rookeries. Today, the northern elephant seal remains a protected species that has returned to its historic population size of approximately 150,000 individuals.
In 1974, Robert Selander and his colleagues collected blood samples from 159 northern elephant seals from five populations and examined 24 blood protein and enzyme genes for evidence of genetic variation. All 24 genes were monomorphic, and the single allele of each gene was identical in all five populations! About 20 years later, A. Rus Hoelzel and colleagues expanded the genetic survey of northern elephant seals to include 43 genes in 61 individuals from the five populations. They also found no genetic variation. Additionally, Hoelzel and colleagues examined variation of mitochondrial DNA in northern elephant seals and found a low level of sequence variation in two distinctive mitochondrial DNA haplotypes that had frequencies of 0.725 and 0.275. The extremely limited genetic variation in northern elephant seals is wholly consistent with the historical genetic bottleneck that left very little genetic variation in the surviving population members.
Inbreeding, mating between related individuals, is a form of nonrandom mating that alters the distribution of alleles into genotypes.
The Coefficient of Inbreeding
Inbreeding, also known as consanguineous mating (consanguineous means “with blood”), is mating between related individuals who share a greater proportion of alleles with one another than with random members of a population. The principal genetic consequences of inbreeding are an increase in the frequency of homozygous genotypes in a population and a decrease in the frequency of heterozygous genotypes relative to the frequencies expected from random matings. The likelihood of homozygosity is increased because related organisms share alleles and are thus more likely to produce homozygotes, especially when the alleles involved are rare in the general population. Inbreeding does not change allele frequencies. Instead, it systematically redistributes alleles into genotypes in a manner that increases homozygosity and reduces heterozygosity relative to the frequencies expected under H-W equilibrium.
Inbreeding is a normal reproductive process for self-fertilizing plants and for some animals that reproduce by self-fertilization. The effect of self-fertilization on genotype proportions is shown in Table 20.6, where a heterozygous organism self-fertilizes and produces genotypes in generation 1 in a 1:2:1 ratio. Self-fertilization of generation 1 individuals produces a generation 2 that has an overall increase in the frequency of both homozygous genotypes and a decrease of one-half in the frequency of the heterozygous genotype. The decrease in heterozygous frequency of one-half occurs each generation. By generation 4, a little more than 6 percent of the progeny are heterozygous, and more than 93 percent are homozygous. Note, however, that the allele frequencies of A1 and A2 remain unchanged at f(A1) = f(A2) = 0.50 in each generation.
Among sexually reproducing organisms, the effect of inbreeding is similar, but it takes place over a larger number of generations since the proportion of organisms in a population participating in consanguineous matings is generally low. The population geneticist Sewall Wright investigated
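The halving of heterozygosity under repeated self-fertilization described for Table 20.6 can be reproduced with exact fractions. The following Python sketch is illustrative only; it assumes a single starting A1A2 heterozygote, complete self-fertilization in every generation, and no selection, and the function name is hypothetical.

```python
from fractions import Fraction


def selfing_genotype_frequencies(generations):
    """Return a list of (A1A1, A1A2, A2A2) frequencies, generation 0 first.

    Assumption: generation 0 is a single self-fertilizing heterozygote.
    Each selfed heterozygote yields 1/4 A1A1, 1/2 A1A2, and 1/4 A2A2;
    selfed homozygotes breed true.
    """
    a1a1, a1a2, a2a2 = Fraction(0), Fraction(1), Fraction(0)
    history = [(a1a1, a1a2, a2a2)]
    for _ in range(generations):
        a1a1, a1a2, a2a2 = a1a1 + a1a2 / 4, a1a2 / 2, a2a2 + a1a2 / 4
        history.append((a1a1, a1a2, a2a2))
    return history


for generation, (hom1, het, hom2) in enumerate(selfing_genotype_frequencies(4)):
    f_a1 = hom1 + het / 2  # allele frequency of A1
    print(generation, f"heterozygous = {float(het):.4f}", f"f(A1) = {float(f_a1):.2f}")
```

In generation 4 the heterozygous fraction is 1/16 (6.25 percent) and the homozygous fraction is 15/16 (93.75 percent), while f(A1) stays at 0.50 throughout, in line with the statement that inbreeding redistributes alleles among genotypes without changing allele frequencies.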
by adding the probability of the four complete loops (one for each allele) that could link an allele in a common ancestor to an inbred homozygous IBD descendant. In this case, F = (1/2)^6 + (1/2)^6 + (1/2)^6 + (1/2)^6 = 1/16. The value can also be determined as F = 4(1/2)^6 = 1/16. Genetic Analysis 20.3 demonstrates another computation of an inbreeding coefficient.
First-cousin mating is a form of inbreeding that is relatively common in many human societies and is common in mammals in general. It can have negative genetic outcomes in the form of infants with recessive conditions due to homozygosity for recessive alleles that are very rare in a population (i.e., q = 0.005 or less). In such cases there can be a 20- to 30-fold increase in the likelihood that a first-cousin mating will produce a child with a recessive phenotype compared with the risk by random mating. However, when the recessive allele frequency is as common as q = 0.01, for example, the chance of producing a recessive homozygote from a first-cousin mating is only a few times more likely than the chance of producing a recessive homozygote by random mating. The effect disappears as the frequency of q in the population increases further.
Inbreeding Depression
The genetic consequences of inbreeding for populations are an increase in the frequency of homozygous genotypes and a decrease in the frequency of heterozygous genotypes. One immediate impact of these consequences is seen when small, captive populations of organisms are bred to perpetuate a nearly extinct species. The increased frequency of homozygosity can lead to a phenomenon known as inbreeding depression, the reduction in fitness of inbred organisms, often as a result of the reduced level of genetic heterozygosity. The reduced fitness associated with inbreeding depression can be due either to an increase in the proportion of deleterious homozygous genotypes or to the higher fitness of heterozygotes.
Inbreeding and inbreeding depression have real-world consequences for the planet’s biodiversity and for efforts to preserve nearly extinct species. One of several strategies adopted by biological scientists and others interested in preserving nearly extinct species is the design of captive breeding programs. These programs are part of conservation genetics, a branch of population genetics that designs, conducts, and monitors captive breeding programs with the intent of maintaining vanishing populations. One of the principal areas of concern for managers of captive breeding programs is the magnitude of inbreeding coefficients and the danger of inbreeding depression for the captive breeding populations. Captive breeding program managers attempt to avoid the negative consequences of inbreeding depression by designing mating strategies that include as little inbreeding as possible.
The magnitude of inbreeding depression depends on the organism. Among plants that naturally reproduce by self-fertilization, the inbreeding depression is small. Many bird species also experience only relatively minor inbreeding depression. This lack of negative consequence has been particularly beneficial in captive breeding programs that have bred bird species such as the California condor and then reintroduced the birds into their natural environment.
In contrast to birds and plants, however, mammals experience severe inbreeding depression. The scientific literature contains about 20 reports on inbreeding and inbreeding depression from captive mammal breeding programs. The reports outline that inbreeding depression is a serious issue, resulting in reduced reproductive success of captive animals, reduced litter size, decreased longevity, and reduced survival of infant and juvenile animals. To maximize the chances of success in captive breeding programs for mammals, matings are carefully managed to avoid mating inbred animals when possible and to minimize F by using just one inbred animal in a mating when the use of an inbred animal is necessary or cannot be avoided.
20.7 New Species Evolve by Reproductive Isolation
Our discussion to this point has focused on microevolution, that is, evolution operating at the population level. In this section, we broaden our perspective to examine evolution at the species level and above.
The most widely used definition of a species, and the definition we use for purposes of this discussion, is the biological species concept (BSC). It was developed in 1942 by the biologist Ernst Mayr, who also made important contributions to the modern synthesis of evolution (see Section 1.5). Mayr stated that from a biological perspective a species could be described as a group of organisms capable of interbreeding with one another but isolated from members of other species. By this definition, the alleles carried by a species stay within the confines of the species and are not exchanged with other species. This definition presents some problems for application in the real world. One problem is that the BSC cannot be used when one is dealing with fossilized remains or extinct species. A second problem is the difficulty in some cases of discovering whether or not two organisms are capable of reproducing. Third, the BSC cannot be applied to organisms that do not engage in sexual reproduction. And, finally, the assumption of the BSC is violated by organisms capable of interspecies hybridization. A well-known example is the mating of a male donkey (2n = 62) and a female horse (2n = 64) to produce the infertile hybrid known as a mule. The mule gets 31 chromosomes from the donkey parent and 32 chromosomes from the horse parent for a total of 63 chromosomes. Mules are
infertile due to their odd number of chromosomes that cannot properly segregate to form gametes (see Section 10.3).
Given the potential difficulties of applying the BSC, alternatives have been developed. One alternative is the morphospecies concept, which defines species based exclusively on morphology. A second alternative is the phylogenetic species concept, which defines a species as the smallest recognizable group with a unique evolutionary history.
Processes of Speciation
Charles Darwin was the first to describe the concept that existing species evolve from preexisting species. In his famous 1859 book, On the Origin of Species by Means of Natural Selection, he laid out two guiding principles of species formation that are still considered fundamental aspects of macroevolution. First, Darwin proposed that hereditary variation is present in all species and controls the phenotypic variability in each species. Second, Darwin proposed that natural selection allows species members with favored phenotypic attributes to survive and reproduce in greater numbers than species members with other phenotypes. Darwin described his model combining these principles as “the theory of descent with modification through variation and natural selection.” In other words, Darwin viewed inherited variation and the operation of natural selection as the elements essential to the transformation of one species into another.
Innumerable biological investigations in the last 150 years have verified and elaborated upon Darwin’s original proposals and have quantified the effects and the interplay of each of the four evolutionary processes (natural selection, mutation, migration, and genetic drift) on speciation.
GENETIC ANALYSIS 20.3
PROBLEM The pedigree shown here depicts crosses performed as part of an antelope captive-breeding program. Use the pedigree information to calculate the coefficient of inbreeding (F) for the mating of IV-1 and III-3 that produces the animal identified as V-1.
[Pedigree figure: generation I (individuals 1 and 2), generation II (individuals 1 to 4), generation III (individuals 1 to 5), generation IV (individual 1), generation V (individual 1)]
BREAK IT DOWN: Each allele transmission probability is 1/2. Individual V-1 has two common ancestors, either of whom could be the source of an allele that is IBD (p. 742).
Evaluate
1. Identify the topic of this problem and the nature of the required answer.
   Solution: This problem concerns determination of the coefficient of inbreeding (F) for a specific mating.
2. Identify the critical information given in the problem.
   Solution: The pedigree depicting the common ancestry of the related animals is given.
Deduce
3. Count the number of transmission events that must occur for an allele to be identical by descent (IBD) in V-1.
   Solution: Counting from a common ancestor to individual V-1, there are seven transmission steps required to produce an allele that is IBD.
4. Identify the transmission probability for each step of transmission.
   Solution: For an autosomal allele, the transmission probability is 1/2.
5. Identify the total number of alleles of an autosomal gene in the common ancestors of V-1.
   Solution: There are two common ancestors (I-1 and I-2) for the inbred individual (V-1). There are two alleles per gene in each common ancestor, for a total of four alleles at each locus.
Solve
6. Calculate the coefficient of inbreeding for this pedigree.
   Solution: The coefficient of inbreeding is F = 4(1/2)^7 = 1/32.
For more practice, see Problems 33–36.
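The two inbreeding coefficients worked out in this chapter, F = 4(1/2)^6 = 1/16 for the first-cousin case and F = 4(1/2)^7 = 1/32 in Genetic Analysis 20.3, follow the same pattern: the number of alleles carried by the common ancestors multiplied by (1/2) raised to the number of transmission steps in a loop. The Python sketch below is a simplified illustration, not a general pedigree algorithm. It assumes that every loop has the same number of transmission steps and that the common ancestors are not themselves inbred; the function name is hypothetical.

```python
from fractions import Fraction


def inbreeding_coefficient(transmission_steps, ancestral_alleles=4):
    """Return F = ancestral_alleles * (1/2) ** transmission_steps.

    transmission_steps -- transmission events in each complete loop
    ancestral_alleles  -- alleles at one locus across the common ancestors
                          (two ancestors with two alleles each gives 4)
    """
    return ancestral_alleles * Fraction(1, 2) ** transmission_steps


print(inbreeding_coefficient(6))  # 1/16, the first-cousin example
print(inbreeding_coefficient(7))  # 1/32, Genetic Analysis 20.3
```

Using exact fractions keeps the results in the same form as the worked solutions; each additional transmission step halves F.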
The clear picture of speciation that emerges from these studies is that the evolutionary lineages leading from ancestral organisms to descendant forms are almost never simple, straight lines of descent. Instead, the evolutionary history of modern species is filled with side branches that died out because a species, once developed, could not adapt to new environments or was displaced by competing species. It can be tempting to look backward into the evolutionary past and identify a linear step-by-step procession leading to modern species, but this perspective minimizes the occurrence of adaptive changes that led to evolutionary “dead ends.” More important, the backward-looking approach ignores a major reality of evolution: Evolutionary history is far more like a multibranched bush than like a tree with long, straight branches connecting past and present.
[Figure 20.11 diagram: branching tree descending from Hyracotherium; labeled branches include Palaeotherium, Propalaeotherium, Pachynolophus, Anchitherium, Sinohippus, Miohippus, Mesohippus, Megahippus, Hipparion, Neohipparion, Archaeohippus, Callippus, and Equus (horse, zebra, donkey), with Equus marked as a monophyletic group]
Figure 20.11 Evolution of the genus Equus. This multibranched evolutionary tree shows the monophyletic group that includes the modern species of the genus Equus, along with a few of the nearly 200 branches of the phylogeny descending from Hyracotherium.
The evolutionary history of modern horses and their living relatives, zebras and donkeys (all three being members of the genus Equus), is an example of the typical complexity of evolutionary history (Figure 20.11). One can trace a lineage leading more or less directly from Hyracotherium in the early Eocene (about 54 million years ago) to modern Equus, but this would ignore the many other branches of the evolutionary tree that did not produce modern-day organisms.
The evolutionary tree leading to the modern species of Equus illustrates the complex patterns of relationships that can occur as new species evolve. The figure illustrates a phylogenetic tree that is inferred from the physical characteristics identified in fossil remains. In identifying the evolution of horses, characteristics of the skull, teeth, and hoof are particularly important in determining which ancestral
traits present in an ancestor correlate with derived traits present in a descendant. The modern species of the genus Equus form a monophyletic group of the modern species and their common ancestor.
DNA sequences can also be used to determine phylogenetic relationships. Recall from the Case Study in Chapter 1 (pp. 24–26) that the relationship of an extinct relative of the zebra called the quagga was determined by collecting DNA from preserved quagga hides and comparing it to DNA from zebra species. Whether phylogenies are constructed using morphologic traits or DNA sequence, they share two essential features: (1) inherited variation controls critical phenotypic variation, and (2) morphology and genome content evolve through evolutionary processes.
Reproductive Isolation and Speciation
Evolutionary change at the species level is driven by reproductive isolation that can result from any morphological, behavioral, or geographic condition or set of conditions that prevents one population from breeding with others. Reproductively isolated populations adapt separately to their particular circumstances, and divergence is a likely consequence. In each environment, differential reproductive success driven by natural selection allows the better-adapted organisms to leave more progeny. Reproductive isolation is an important component for both cladogenesis and anagenesis, although the precise mechanisms of isolation may differ.
The concept of cladogenesis and reproductive isolation of species derives from work by Theodosius Dobzhansky, Ernst Mayr, and other evolutionary biologists who recognized that new species can form when reproductive barriers prevent the exchange of genes between populations. In describing the necessity of reproductive isolation in this process, two mechanisms are identified (Table 20.7). Prezygotic mechanisms of reproductive isolation are those that prevent mating between members of different species or prevent the formation of a zygote following interspecies mating. On the other hand, postzygotic mechanisms of reproductive isolation result in the failure of a fertilized zygote to survive, or result in sterile offspring of an interspecies mating. These mechanisms of reproductive and genetic isolation lead to allopatric speciation or sympatric speciation.
Allopatric Speciation In allopatric speciation, populations are separated by a physical barrier. New species can develop in separate geographic locations as a consequence of their reproductive isolation. Two principal mechanisms create the separations that lead to reproductive isolation: (1) physical separation of a segment of a large population by a physical barrier that prevents gene flow and (2) colonization of new territory (Figure 20.12). Geographic events such as the advance of a glacier, the emergence of a mountain range, a change in the flow pattern of a river, or the erosion of a canyon are typical of the kinds of physical changes that lead to reproductive isolation and species diversification. An example of this kind of geographic separation and species development is found in the American Southwest, where the formation of the Grand Canyon beginning 5 to 6 million years ago split an ancestral species of ground squirrel and led to its eventual diversification into two distinct species. Today, Ammospermophilus leucurus is a gray-colored ground squirrel found on the north rim of the Grand Canyon, whereas squirrels on the south rim of the canyon are members of the chestnut-colored Ammospermophilus harrisii.
The colonization model of allopatric speciation predicts that new species diversify following colonization of new habitats. The diversification of Drosophila species on the Hawaiian Islands is a case study of this mechanism (Figure 20.13). The Hawaiian Islands are part of a long chain of landmasses and submarine structures that stretch in a northwest-to-southeast direction and are produced by the movement of the Pacific tectonic plate over a volcanic hotspot that lies in the earth’s mantle beneath it. As the plate slides toward the west, new islands are produced by volcanic activity of the hotspot.
Prezygotic Mechanisms
Behavioral isolation: Sexual behaviors in different species are incompatible, or sexual attraction is lacking between them.
Gametic isolation: Mating takes place between different species, but the gametes fail to unite with one another due to differences in gamete compatibility or to failure of male gametes to survive until fertilization of female gametes.
Geographic isolation: Species reside in separate geographic locations or are separated by geographic features that prevent their contact.
Habitat isolation: Species inhabit different ecosystems that prevent them from coming into contact.
Mechanical isolation: Male and female genitalia or reproductive structures of different species are anatomically incompatible.
Temporal isolation: Timing of reproductive ability or receptivity in different species is incompatible.
Postzygotic Mechanisms
Hybrid breakdown: Viable and fertile interspecies hybrids form, but after the F1 generation the fitness of the progeny of hybrids is less than that of progeny from nonhybrids.
Hybrid inviability: The fertilized zygote of an interspecies mating fails to survive gestation.
Hybrid sterility: Interspecies hybrids are viable but infertile.
The oldest of the islands are Niihau and Kauai to the northwest; the youngest island is Hawaii, which is still growing by volcanic eruptions of Mauna Loa and Kilauea.
In 2005, James Bonacum and his colleagues examined genetic and morphologic data in numerous Hawaiian Drosophila species to test the allopatric speciation model. They found that the most closely related species occur on adjacent islands and that the phylogenetic pattern of species formation corresponds to the pattern of emergence of islands. These results provide support and documentation for the model of allopatric speciation by colonization.
from a wild diploid grass to its contemporary allohexaploid form (see Figure 10.12). Animals that develop nocturnal or diurnal patterns of activity that make them more likely to encounter only those other members of the population that are active at the same time are another example of potential sympatric speciation. Similarly, changes in the seasonality of reproduction can restrict organisms to reproducing only during certain times of the year. Organisms living in the same geographic area that do not have the same reproductive seasonality will be unable to mate.