
Proc. Natl. Acad. Sci. USA
Vol. 87, pp. 8995-8999, November 1990
Genetics

Exon trapping: A genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA
(retroviral vectors/RNA splicing/human genetics)

GEOFFREY M. DUYK*†‡, SUWON KIM*, RICHARD M. MYERS§¶, AND DAVID R. COX*†¶


Departments of *Pediatrics, †Psychiatry, §Physiology, and ¶Biochemistry/Biophysics, The University of California at San Francisco, 513 Parnassus Avenue, San Francisco, CA 94143-0554
Communicated by Thomas P. Maniatis, July 31, 1990

ABSTRACT Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened ~20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons.

Classic genetic analysis of mouse and human has identified many mutant loci for which the biochemical basis of the given phenotype has not been defined. Construction of detailed genetic and physical maps delineate the minimal region within which a gene of interest may be found. The flanking markers demarcating the boundaries of this region are often separated by several hundred kilobase pairs, and candidate genes are selected on the basis of their location between these markers. This approach has been successful in the identification of a number of genes, including those for Duchenne muscular dystrophy (1) and cystic fibrosis (2).

In spite of these successes, the identification and recovery of transcribed sequences remain important technical problems. Experience gained in the cloning of the genes noted above and efforts to recover all transcription units in the mouse and human major histocompatibility complex (3-5) have established three screening strategies for genes from cloned genomic DNA. In vertebrates, most constitutively expressed genes and some regulated coding sequences are marked at their 5' ends by distinctive regions containing a high density of hypomethylated CpG residues (6). These "CpG islands" are identified by the clustering of recognition sites for rare-cutting restriction enzymes and are confirmed by testing with methylation-sensitive and -insensitive isoschizomers (6). A second method for identifying coding sequences is interspecies cross-hybridization ("zooblots"; ref. 1). Many, but not all, unique and low-copy sequences conserved between species represent genes and can be detected by Southern hybridization techniques. A third method is direct screening of cDNA libraries or Northern blots with whole phage or cosmid clones. The relative merits of these three approaches have been reviewed (7). Other strategies, including DNA sequencing and genetic screens for open reading frames (8), enhancers (9), or promoters (10, 11) are available, but their value appears limited for identifying genes in long segments of cloned genomic DNA.

Mapping and recovering the complete set of transcribed sequences, corresponding to the predicted 2-4% of the genome representing coding sequence, will require additional methods. In this paper, we describe a strategy, exon trapping, that facilitates the recovery of transcribed sequences in cloned mammalian genomic DNA through the functional identification of cis-acting sequences required for RNA splicing.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: HBG, human β-globin; X-Gal, 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside; IPTG, isopropyl β-D-thiogalactopyranoside; SA, splice acceptor; SD, splice donor; IVS, intervening sequence; α-β-Gal, α-complementing factor of E. coli β-galactosidase; SV40, simian virus 40; PCR, polymerase chain reaction.

‡To whom reprint requests should be addressed at: HSE 1556, Box 0554, University of California, 513 Parnassus Avenue, San Francisco, CA 94143.

MATERIALS AND METHODS

Bacteria and Cell Lines. Escherichia coli DH5α was used in all instances unless otherwise noted (12). E. coli GM48, a Dam methylase-deficient strain (12), was used to prepare pETV-SD DNA. High-efficiency competent bacterial cells were prepared by the method of Hanahan with the modifications of Sambrook et al. (12). Cell lines ψ-2, PA-317, and COS were grown and maintained as described (13-15).

Plasmids. pETV-SD is described in Fig. 1A. pETV-SD:HBG(+) contains exon 2 of the HBG gene (nucleotide positions 192-477; ref. 17) cloned into the Bcl I site of pETV-SD in the sense orientation. pETV-SD:HBG(-) is a similar plasmid with the entire exon-intron-exon motif in the opposite (antisense) orientation. pETV-SD:RGRex.5 plasmids contain a 1.75-kb BamHI-Bgl II fragment containing exon 5 of the rat glucocorticoid receptor gene (18) cloned into the Bcl I site of pETV-SD in the sense (+) or antisense (-) orientation. pETV:HLA1.5 plasmids contain a 1.527-kb BstYI fragment containing exons 4-6 of the HLA-A2 gene (19) cloned into the Bcl I site of pETV-SD in the sense (+) or antisense (-) orientation.

For construction of the pETV-SD:HLA library and subclones, the plasmid pHLA-A2 (19), a 5.1-kb genomic subclone in pUC9 containing a complete copy of the HLA-A2 allele, was digested to completion with BstYI and ligated to Bcl I-digested pETV-SD. The ligation mixture was digested with Bcl I and transformed into competent E. coli DH5α. This step significantly reduces the background of colonies with no inserts, since vector ligated to itself is sensitive to cleavage by Bcl I. Ligation of insert to vector destroys the Bcl I site, so that recombinant molecules are not linearized. In addition, cleavage of internal Bcl I sites in the insert is prevented by in vivo Dam methylation of cleavage sites. Analysis of 100 colonies demonstrated that the library contained all expected DNA fragments.

Exon Trapping. Each confluent culture (100-mm plate) of ψ-2 cells was split 1:5 and the next day the cells were transfected with 20 μg of plasmid DNA (pETV-SD or one of its derivatives) in the presence of Lipofectin reagent (ref. 20; BRL). Three days later, cells were split 1:10 into medium

supplemented with G418 (GIBCO) at 1 mg/ml. After 3-5 days, the G418 concentration was halved and cells were grown as a mixed population for 7 days. Approximately 500 foci per plate were typically observed at this time. This mixed population of cells was grown for an additional 3 days in the absence of G418, the medium was filtered (0.45-μm filter; Nalge Co., New York), and Polybrene (8 μg/ml; Sigma) was added to the filtrate.

PA-317 cells at 20% confluency were infected with 10 ml of the virus-containing filtrate. Four hours later, the filtrate was replaced with medium and the infection protocol was repeated the next morning. After this second round of infection, G418 was used to select for virus-producing colonies. The virus-containing medium from these colonies was used to infect COS cells by using the infection protocol described above. Forty-eight hours after infection of COS cells, amplified episomal DNA was isolated by the procedure of Hirt (21). The recovered episomal DNA was digested with Dpn I, which cleaves any contaminating Dam-methylated plasmid DNA. Episomes replicated in COS cells are insensitive to cleavage as they are not methylated at Dam sites. The digested mixture was used to transform E. coli DH5α cells and transformants were selected on LB agar plates containing kanamycin (50 μg/ml), 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-Gal, 0.002%), and isopropyl β-D-thiogalactopyranoside (IPTG, 0.2 mM).

Lac- transformants (white colonies) were patched onto LB/kanamycin/X-Gal/IPTG plates, transferred to nitrocellulose filters, and hybridized to an HBG exon 1-specific probe. Colonies that hybridized to the probe contained plasmids that potentially underwent splicing events and were subsequently analyzed by DNA sequencing. Those that did not hybridize to the probe had undergone gross rearrangement and were not characterized further. DNA was sequenced by the dideoxynucleotide chain-termination method with the Sequenase version 2.0 kit (United States Biochemical). DNA sequencing was primed by using the HBG exon I-specific oligodeoxynucleotide HBG I (5'-GGAGAAGTCTGCCGTTACTG-3').

Polymerase Chain Reaction (PCR). A PCR assay for delineating spliced versus unspliced products of recovered pETV-SD:HBG(+/-) was developed. Reaction conditions were as described (22). Reactions were run for 30 cycles with the following parameters: denaturation at 93°C, 1 min; annealing at 56°C, 1 min; extension at 70°C, 3 min. Primers were HBG I and HBG II (5'-CCTTCACCTTAGGGTTGCCC-3'), and the sizes of the PCR products were determined by agarose gel electrophoresis. A PCR with a recovered clone containing a "spliced" insert as template results in a 165-bp product and an "unspliced" clone results in a 700-bp product.

FIG. 1. Maps of pETV-SD and the exon-trap cassette. (Upper) Plasmid pETV-SD is a derivative of pDOL- (16), a retroviral shuttle vector with deletion of the naturally occurring splice donor (SD) and splice acceptor (SA) sites. The polyoma origin of replication and early region were replaced with a 1.67-kilobase pair (kb) Nhe I-Pvu I fragment containing a ColE1 origin of replication and a chloramphenicol acetyltransferase (CAT) gene. A 583-base-pair (bp) Bgl II-Sal I fragment containing the exon-trap cassette was inserted between unique BamHI and Sal I sites, resulting in the 7-kb vector pETV-SD. LTR, long terminal repeat; star, retroviral packaging site; open box from +1 to SD, human β-globin (HBG) exon 1 (from +54 to +142); black box, gene encoding the α-complementing factor of E. coli β-galactosidase (α-β-Gal); SV40, simian virus 40 origin of replication/early promoter; neo, aminoglycoside phosphoribosyltransferase gene from E. coli Tn5; ori, ColE1 origin of replication; CAT, CAT gene of E. coli Tn9. The first nucleotide of the exon-trap cassette is defined as position +1 in pETV-SD. Arrows indicate direction of transcription. H, HindIII; HI, BamHI; B, Bcl I; Bs, BstXI; S, Sal I; RI, EcoRI; X, Xho I; C, Cla I; N, Nhe I; Xb, Xba I; P, Pvu I; SII, Sac II. (Lower) Exon-trap cassette. Starting with the 5' Bgl II site, the cassette (583 bp) consists of positions +54 to +192 of the HBG gene in the sense orientation, inserted at an EcoRV site of a polylinker, and a 404-bp fragment encoding the α-β-Gal gene followed sequentially by BamHI, BstXI, Bcl I, and Sal I sites. The Bgl II site was destroyed during insertion of the exon-trap cassette into the vector. The cassette contains the wild-type HBG SD with exon and intron sequences previously demonstrated to be required for efficient splicing (17). The SA-site complex for HBG exon II has been deleted and replaced with the α-β-Gal gene. Ex. 1, exon 1 of the HBG gene from positions +54 to +142; IVS, the HBG intervening sequence 1 from positions +143 to +192. The BamHI, BstXI, Bcl I, and Sal I sites in the cassette are unique in pETV-SD.

RESULTS

Experimental Strategy. pETV-SD is an exon-trap vector that identifies functional SA sites encoded in cloned genomic DNA fragments. Since most genes undergo RNA splicing, such sites serve as identifiers for the majority of genes.

(i) Genomic DNA to be tested is "shotgun-cloned" into the vector pETV-SD downstream from the exon trap (Fig. 1). The exon trap consists of a functional SD from the HBG gene and an IVS incorporating the α-β-Gal gene followed by a multiple cloning site (Fig. 1). The vector also contains the cis-essential components for retroviral replication, the SV40 and ColE1 origins of replication, and the Tn5 neo gene. The latter confers kanamycin resistance in bacterial cells and G418 resistance in animal cells.

(ii) Pooled plasmid DNA from this shotgun cloning is transfected into an ecotropic retroviral packaging cell line, ψ-2 (13). Retroviral packaging cell lines provide the protein products required for propagation of the vector as a retrovirus. The retroviral DNA is transcribed in vivo and transcripts derived from recombinant molecules that contain a functional SA may undergo a splicing event with loss of the marked IVS (Fig. 2).

(iii) Both spliced and unspliced viral RNAs are packaged into virions, harvested from the medium, and used to infect the amphotropic retroviral packaging cell line PA-317 (14). This step results in an additional round of retroviral replication and produces viral stocks of increased titer capable of infecting COS cells (14, 15, 21). Splicing of cloned genomic

inserts in retroviral vectors is inefficient and this second round of replication increases the likelihood that a splicing event will occur (21).

(iv) Virus isolated from this second cell line is used to infect COS cells, which constitutively produce SV40 large tumor (T) antigen (15). The viral RNA genome is reverse-transcribed and is then amplified as a circular DNA episome due to the presence of the SV40 origin of replication in the vector (Fig. 1).

(v) The replicated episomal DNA is recovered from the COS cells, digested with Dpn I, and transformed into bacterial cells. Transformants are selected on plates containing kanamycin and X-Gal, a chromogenic substrate for β-galactosidase. Hydrolysis of X-Gal by β-galactosidase produces the characteristic blue color indicative of a Lac+ phenotype, whereas colonies that do not contain functional β-galactosidase are white.

(vi) Only white colonies (KanR/Lac-) are studied further. White colonies result either from inactivation of the α-β-Gal gene by a mutational event or from the loss of the gene as a consequence of a splicing event during the RNA phase of the retroviral life cycle (Fig. 2).

(vii) Most mutations are deletions of the exon-trap portion of the vector (see below) and are rapidly detected by colony hybridization with an HBG probe (donor exon).

(viii) Bona fide splicing events are verified by direct DNA sequencing primed from within the exon of the splice donor. Correct splicing is indicated by precise removal of the genetically marked IVS and joining of the HBG exon to the "trapped" exon. This candidate exon is an identifier for a potential gene.

FIG. 2. Life cycle of pETV-SD. A functional SA (enclosed in dashed box) has been cloned adjacent to the exon-trap cassette (vertical open box and horizontal black box). The primary transcription product of the integrated provirus may be packaged directly into progeny virions or may be spliced, with concomitant loss of the marked IVS, and then packaged into progeny virions. When pETV-SD contains a cloned insert without a functional SA, the primary transcript is packaged into the virion without loss of the marked IVS. LTR, long terminal repeat; Ψ, retrovirus packaging site; vertical open box, HBG exon 1; SD, SD of HBG exon I; horizontal black box, α-β-Gal gene; dashed box, randomly inserted fragment containing a functional SA; stippled box, candidate exon; RU5, 5' leader of retrovirus transcript; U3R, 3' segment of retrovirus mRNA; AAAAAn, poly A tail.

Initial Tests with Vector Alone. In initial experiments, we tested whether pETV-SD could be transmitted as a retrovirus in animal cells and recovered as a plasmid in E. coli. Following transfection of ψ-2 cells with pETV-SD or its derivatives and infection of PA-317 cells, populations of virus-producing cells regularly yielded titers of 10^2-10^4 colony-forming units/ml. Following infection of COS cells with this virus, episomal DNA from a 100-mm plate of COS cells typically yielded 100-1000 bacterial colonies.

Of the 84 colonies examined, 73 were blue (KanR/Lac+; Table 1). However, a significant number (11/84) were white (KanR/Lac-; Table 1). These white colonies could have arisen by functional inactivation of the α-β-Gal gene by mutation or by loss of function due to cryptic splicing.
Table 1. Summary of results obtained in exon trapping experiments

Clones        No. white/total    No. retaining exon 1/no. tested    No. with spliced products/no. sequenced
Vector        11/84 (13.1%)      2/11 (18.2%)                       0/2 (0%)
HBG(+)        385/2030 (19%)     73/82 (89%)                        72/73 (98.6%)
HBG(-)        47/504 (9.3%)      7/47 (14.9%)                       0/7 (0%)
RGR5(+)       17/63 (27%)        13/17 (76.5%)                      9/13 (69.2%)
RGR5(-)       19/107 (17.8%)     3/19 (15.8%)                       0/3 (0%)
HLA1.5(+)     26/47 (55.3%)      23/26 (88.5%)                      23/23 (100%)
HLA1.5(-)     75/625 (12%)       1/75 (1.3%)                        0/1 (0%)
HLA library   345/2686 (12.8%)   86/345 (24.9%)                     69/86 (80.2%)

Column 1: "Vector" refers to pETV-SD and the remainder of the column refers to inserts cloned into this vector. Column 2: Ratio of recovered KanR/Lac- (white) colonies to total number of KanR (white plus blue) colonies. Column 3: Ratio of KanR/Lac- (white) colonies that hybridize to the HBG exon 1 probe to the total number of KanR/Lac- (white) colonies tested. Column 4: The numerator represents the number of KanR/Lac- (white) colonies that have undergone a splicing reaction resulting in the loss of the marked IVS and ligation of the HBG exon 1 SD to a novel SA. The denominator represents the total number of KanR/Lac- (white) colonies maintaining exon 1 of HBG that were tested.
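
As a quick consistency check on Table 1, the following Python sketch recomputes each percentage from the raw colony counts and pools the antisense constructs. The counts are transcribed from the table; the names `TABLE_1`, `pct`, `summarize`, and `pooled_antisense` are illustrative assumptions and are not part of the original study.

```python
# Counts transcribed from Table 1, per clone:
# (white, total), (retaining exon 1, tested), (spliced, sequenced)
TABLE_1 = {
    "Vector":      ((11, 84),    (2, 11),   (0, 2)),
    "HBG(+)":      ((385, 2030), (73, 82),  (72, 73)),
    "HBG(-)":      ((47, 504),   (7, 47),   (0, 7)),
    "RGR5(+)":     ((17, 63),    (13, 17),  (9, 13)),
    "RGR5(-)":     ((19, 107),   (3, 19),   (0, 3)),
    "HLA1.5(+)":   ((26, 47),    (23, 26),  (23, 23)),
    "HLA1.5(-)":   ((75, 625),   (1, 75),   (0, 1)),
    "HLA library": ((345, 2686), (86, 345), (69, 86)),
}


def pct(numerator, denominator):
    """Return a ratio as a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


def summarize(table):
    """Map each clone to its three percentages, in column order."""
    return {
        clone: tuple(pct(n, d) for n, d in ratios)
        for clone, ratios in table.items()
    }


def pooled_antisense(table):
    """Pool the antisense (-) constructs.

    Returns (white colonies that lost exon 1, white colonies tested),
    i.e. the colonies scored as gross rearrangements by hybridization.
    """
    rows = [ratios for clone, ratios in table.items() if clone.endswith("(-)")]
    retained = sum(r[1][0] for r in rows)
    tested = sum(r[1][1] for r in rows)
    return tested - retained, tested


if __name__ == "__main__":
    for clone, (white, retained, spliced) in summarize(TABLE_1).items():
        print(f"{clone:12s} white {white:5.1f}%  "
              f"exon 1 retained {retained:5.1f}%  spliced {spliced:5.1f}%")
    lost, tested = pooled_antisense(TABLE_1)
    print(f"antisense rearrangements: {lost}/{tested}")
```

Pooling the three antisense constructs in this way gives 130 of 141 white colonies lacking the donor exon, which agrees with the figure quoted in the text for gross rearrangements.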

Cryptic splicing is the utilization of RNA sequences as splice sites that are not normally used in correct processing of wild-type pre-mRNA (23). Restriction mapping of plasmid DNA recovered from these white colonies indicated that the majority (9/11) were grossly rearranged (Table 1). Southern hybridization indicated that the rearrangements deleted the entire cassette. The 2 white colonies that were not grossly rearranged were analyzed by DNA sequencing. In both cases, the HBG SD and exon/intron boundary were intact. Thus, cryptic splicing had not occurred and inactivation of the α-β-Gal gene was due to a point mutation or small rearrangement not detected by Southern blotting or restriction mapping. This relatively high frequency of both gross rearrangements and apparent point mutation is consistent with previous reports (24-26).

Tests with Defined Exons. A set of experiments was performed to determine whether the exon trapping strategy could detect exons in segments of well-characterized cloned genes. Three derivatives of pETV-SD were constructed: pETV-SD:HBG(+), pETV-SD:HLA1.5(+), and pETV-SD:RGR(+). Analogous plasmids with the same DNA fragments in the antisense orientation relative to the exon trap cassette were also constructed. These plasmid DNA molecules were individually processed and analyzed as described above. In view of the observation that the predominant class of mutation in the previous experiment resulted in the loss of the exon-trap cassette by gross rearrangement, we first analyzed white colonies obtained in these experiments by hybridization to a donor exon probe. White colonies that hybridized to the probe were analyzed further by DNA sequencing to determine whether they resulted from genuine splicing events or were the consequence of a mutational event.

In each of the sense-orientation pETV-SD derivatives, recovered exons were correctly spliced (Table 1). For example, when pETV-SD:HBG(+) was used, 2030 colonies were obtained and 385 (19%) were white. We further analyzed 82 of these white colonies by the hybridization screen, and 73 (89%) hybridized to a probe that detects the donor exon in the exon-trap cassette. DNA sequence analysis of 10 of these 73 colonies indicated that all 10 were the result of a precise splicing event. A rapid PCR assay was used to analyze the remainder of these white colonies, and the combined data indicated that 72/73 (98.6%) were the result of bona fide splicing events (Table 1). The plasmid DNA not representing a splice event had an intact exon/intron boundary in the exon-trap cassette, indicating that it did not arise as a consequence of cryptic splicing.

pETV-SD derivatives containing HBG, HLA, and glucocorticoid receptor exons in the antisense orientation relative to the exon-trap cassette were also tested. In all cases, white colonies arose as a consequence of mutation rather than cryptic splicing events (Table 1). Furthermore, almost all (130/141) of these mutation events were gross rearrangements and were detected by colony hybridization. The remaining 11 (7.8%) white colonies were sequenced and found to have an intact exon/intron boundary in the exon-trap cassette, demonstrating that they arose as a consequence of mutation rather than cryptic splicing.

Exon Trapping from a Mixture of Cloned DNA Fragments. We designed a reconstruction experiment to determine whether or not the exon trapping strategy could be used to retrieve candidate exons from a mixture of cloned genomic DNA fragments. A 7.8-kb pUC9-derived plasmid containing a 5.1-kb cloned genomic DNA fragment of the HLA-A2 gene was shotgun-cloned into pETV-SD (19). The resulting plasmid library contained at least nine different DNA fragments cloned in both orientations. On the basis of experimental design, only 2 of the 18 different classes of plasmid-derived inserts should be recovered as spliced products during the exon trapping procedure. Plasmid DNA from the pooled library was processed as described above. White colonies numbering 345 out of 2686 total colonies (12.8%) were identified and screened for gross rearrangements by colony hybridization (Table 1). The donor exon was maintained by 86 (24.9%) of these colonies and these were analyzed further. A single "G reaction" of DNA sequencing was performed on plasmid DNA from these colonies, which allowed the sorting into groups based upon the pattern of guanine residues. Plasmids recovered from white colonies that yield a G sequence pattern characteristic of the donor exon/intron boundary are the products of a mutation rather than a splicing event. In contrast, a G pattern showing precise loss of the IVS from the donor exon in the exon-trap cassette is diagnostic of a genuine splicing event. G-reaction DNA sequencing identified five groups of plasmids. The exon/intron boundary was retained by 17/86 (19.8%) of these white colonies and thus they arose as a consequence of mutation rather than splicing. The remaining four groups of plasmids represent apparent splicing events (Table 2) and the complete DNA sequence from a single member from each of these groups was obtained. Plasmids in groups A (24/69) and B (5/69), respectively, contain HLA-A2 exons III and IV. The retrieval of these exons was predicted by the experimental design. However, plasmids in groups C (21/69) and D (19/69) contain inserts that arose by cryptic splicing events. Group C plasmids contain the donor exon of the exon-trap cassette adjacent to a cryptic SA from the antisense orientation of exon III from HLA-A2 (Table 2). The SA in plasmids from group D is from the β-lactamase gene of pUC9 (27).

Table 2. Recovered splice junctions from HLA library

Class                   Site (position)   Sequence
pETV-SD                 SD (142)          GGCCCTGGGAG | gttggtatca
A (HLA exon III)        SA (1239)         GGCCCTGGGAG | GTTCTCACAC
B (HLA exon IV)         SA (2114)         GGCCCTGGGAG | ACGCCCCAAA
C [HLA exon III (-)]    SA (1480)         GGCCCTGGGAG | GTATCTGCGG
D (pUC9)                SA (1688)         GGCCCTGGGAG | TTGCCTGACT

Uppercase letters indicate exon sequences and lowercase letters indicate intron sequences. Vertical line defines exon/intron or exon/exon boundaries. Predicted SD and SA sites are based upon comparison of DNA sequences of cDNA and genomic clones (17, 19, 27) or inferred from this study. The numbers correspond to nucleotide positions and are based upon published sequence data (17, 19, 27). pETV-SD represents the exon/intron sequence at its SD site in the absence of a splicing event. This sequence corresponds to the wild-type exon 1/intron 1 boundary of the HBG gene (17). Groups A-D correspond to the DNA sequences at the junction between the donor exon derived from pETV-SD and the trapped exons. In all cases, the predicted HBG SD was used. Groups A and B correspond to HLA exons III and IV and, in each case, the established SA sites were used. Groups C and D correspond to cryptic SA sites. Group C is found within the predicted antisense orientation of HLA exon III, and group D is found in pUC9.

DISCUSSION

We have used exon trapping to screen ~20 kb for the presence of known SA sites. The screen identified all four of the predicted wild-type SA sites, whereas only two cryptic splicing events were identified. Consistent with previous reports (21, 24, 26, 28), splicing is not 100% efficient in this retrovirus-based system; however, it occurs frequently enough so that it is readily detected (Table 1).

The development of exon trapping is based on the work of Cepko et al. (24) and others (21, 26, 28), who demonstrated that transmissible retroviral shuttle vectors can be used to generate and recover cDNA versions of defined genomic

inserts. Similar strategies have been used to map splice sites in a DNA virus, where spliced products were identified by sequencing or restriction enzyme analysis of all recovered clones (26). This approach is practical only for the analysis of well-characterized genes where transcriptional orientation and exon/intron boundaries are known. In contrast, exon trapping was designed to suggest the presence of genes and recover them from long stretches of genomic DNA. In this scheme, genomic DNA fragments are cloned adjacent to a well-characterized SD and a marked IVS. Potential splicing events are initially identified by a genetic screen in bacteria that detects loss of the marked IVS, and that eliminates the requirement for physical characterization of the insert DNA in every recovered clone. Further analysis is required to demonstrate that a particular recovered clone represents a gene.

Confirmation that a candidate exon is part of a gene and not a consequence of cryptic splicing requires the identification of a transcript. A highly specific exon probe may be recovered from the exon-trap vector by using the PCR. Transcripts can be identified by using this probe to screen Northern blots or cDNA libraries. This probe can also be used to screen a "zoo blot" (1) to determine evolutionary conservation of the putative exon. Such evidence is suggestive of the presence of a gene and is important information if initial screens fail to detect a transcript. In these instances, DNA sequence information obtained during analysis enhances the search for transcripts. For example, a PCR assay based on this sequence can be used to prescreen multiple cDNA libraries or to clone transcripts of very low abundance.

Although we were able to use exon trapping to recover all predicted exons in several different genes, some exons will not be recovered with this method. For example, a small percentage of known genes do not contain introns and therefore will be missed by this screen (23). In addition, some splicing events are temporally regulated or tissue-specific and may not occur in the packaging cell lines used in exon trapping. However, most genes with regulated splicing events also have introns that are constitutively removed, and these would be identified by exon trapping. Finally, because not all DNA sequences are propagated equally well in retrovirus vectors (21), it is possible that the library of recovered clones may not be fully representative of the exons in the starting cloned genomic DNA. Since most genes are composed of multiple exons and the identification of a gene requires the recovery of only a single exon, this consideration should not be a limiting factor.

Exon trapping is a genetic screen that utilizes SA sites as identifiers of candidate exons within cloned mammalian genomic DNA sequences. These candidate exons are ideally suited for establishing the presence of a gene in a cosmid or λ phage insert and facilitating the subsequent isolation of this gene. Our current experience suggests that as many as 20 cosmids can be screened concurrently in a 4-week period. This screen of uncharacterized cosmids will determine the utility of SA signals as identifiers of candidate coding sequences.

We thank R. Cone, R. C. Mulligan, and H. Orr for providing materials and A. Krainer, T. Maniatis, and S. Donner for helpful discussions. This work was supported by grants from the Wills Foundation (to G.M.D. and to R.M.M. and D.R.C.) and from the National Institutes of Health (to R.M.M. and D.R.C.).

1. Monaco, A. P., Neve, R. L., Colletti-Feener, C., Bertelson, C. J., Kurnit, D. M. & Kunkel, L. M. (1986) Nature (London) 316, 336-338.
2. Rommens, J. M., Iannuzzi, M. C., Kerem, B.-S., Drumm, M. L., Melmer, G., Dean, M., Rozmahel, R., Cole, J. L., Kennedy, D., Hidaka, N., Zsiga, M., Buchwald, M., Riordan, J. R., Tsui, L.-C. & Collins, F. S. (1989) Science 245, 1059-1065.
3. Abe, K., Wei, J.-F., Wei, F.-S., Hsu, Y.-C., Uehara, H., Artz, K. & Bennett, D. (1988) EMBO J. 7, 3441-3449.
4. Spies, T., Blanck, G., Bresnahan, M., Sands, J. & Strominger, J. L. (1989) Science 243, 214-217.
5. Sargent, C. A., Dunham, I. & Campbell, R. D. (1989) EMBO J. 8, 2305-2312.
6. Bird, A. P. (1987) Trends Genet. 3, 342-347.
7. Bell, J. (1989) Trends Genet. 5, 289-290.
8. Gray, M. R., Colot, H. V., Guarente, L. & Rosbash, M. (1982) Proc. Natl. Acad. Sci. USA 79, 6598-6602.
9. Weber, F., de Villiers, J. & Schaffner, W. (1984) Cell 36, 983-992.
10. Allen, N. D., Cran, D. G., Barton, S. C., Hettle, S., Reik, T. & Surani, M. A. (1988) Nature (London) 333, 852-855.
11. Gossler, A., Joyner, A. L., Rossant, J. & Skarnes, W. C. (1989) Science 244, 463-465.
12. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY), 2nd Ed.
13. Mann, R., Mulligan, R. C. & Baltimore, D. (1983) Cell 33, 153-159.
14. Miller, A. D., Law, M.-F. & Verma, I. M. (1985) Mol. Cell. Biol. 5, 431-437.
15. Gluzman, Y. (1981) Cell 23, 175-182.
16. Korman, A., Frantz, J. D., Strominger, J. L. & Mulligan, R. C. (1987) Proc. Natl. Acad. Sci. USA 84, 2150-2154.
17. Reed, R. & Maniatis, T. (1986) Cell 46, 681-690.
18. Miesfeld, R., Rusconi, S., Godowski, P. J., Maler, B. A., Okret, S., Wikstrom, A.-C., Gustafsson, J.-A. & Yamamoto, K. R. (1987) Cell 46, 389-399.
19. Koller, B. H. & Orr, H. T. (1985) J. Immunol. 134, 2727-2733.
20. Felgner, P. L., Gadek, T. R., Holm, M., Roman, R., Chan, H. W., Wenz, M., Northrop, J. P., Ringold, G. M. & Danielsen, M. (1987) Proc. Natl. Acad. Sci. USA 84, 7413-7417.
21. Brown, A. M. C. & Scott, M. R. D. (1987) Retroviral Vectors in DNA Cloning: A Practical Approach, ed. Glover, D. M. (IRL, Oxford), Vol. 3, pp. 189-212.
22. Sheffield, V. C., Cox, D. R., Lerman, L. S. & Myers, R. M. (1989) Proc. Natl. Acad. Sci. USA 86, 232-236.
23. Smith, C. W. J., Patton, J. G. & Nadal-Ginard, B. (1989) Annu. Rev. Genet. 23, 527-577.
24. Cepko, C. L., Roberts, B. E. & Mulligan, R. C. (1984) Cell 37, 1053-1062.
25. Dougherty, J. & Temin, H. M. (1986) Mol. Cell. Biol. 6, 4387-4395.
26. Dostatni, N., Yaniv, M., Danos, O. & Mulligan, R. C. (1988) J. Gen. Virol. 69, 3093-3100.
27. Yanisch-Perron, C., Vieira, J. & Messing, J. (1985) Gene 33, 103-119.
28. Sorge, J. & Hughes, S. H. (1982) J. Mol. Appl. Genet. 1, 547-549.
