2016 Article 3280


Sutton et al. BMC Genomics (2016) 17:948
DOI 10.1186/s12864-016-3280-3

RESEARCH ARTICLE Open Access

Identification of genes for engineering the male germline of Aedes aegypti and Ceratitis capitata
Elizabeth R. Sutton1,2,5, Yachuan Yu3,6, Sebastian M. Shimeld1, Helen White-Cooper3* and Luke Alphey1,2,4*

Abstract
Background: Synthetic biology approaches are promising new strategies for control of pest insects that transmit
disease and cause agricultural damage. These strategies require characterised modular components that can direct
appropriate expression of effector sequences, with components conserved across species being particularly useful.
The goal of this study was to identify genes from which new potential components could be derived for
manipulation of the male germline in two major pest species, the mosquito Aedes aegypti and the tephritid fruit fly
Ceratitis capitata.
Results: Using RNA-seq data from staged testis samples, we identified several candidate genes with testis-specific
expression and suitable expression timing for use of their regulatory regions in synthetic control constructs. We also
developed a novel computational pipeline to identify candidate genes with testis-specific splicing from this data;
use of alternative splicing is another method for restricting expression in synthetic systems. Some of the genes
identified display testis-specific expression or splicing that is conserved across species; these are particularly
promising candidates for construct development.
Conclusions: In this study we have identified a set of genes with testis-specific expression or splicing. In addition
to their interest from a basic biology perspective, these findings provide a basis from which to develop synthetic
systems to control important pest insects via manipulation of the male germline.
Keywords: Synthetic biology, Pest insect, Male germline, RNA-seq, Aedes aegypti, Ceratitis capitata

* Correspondence: [email protected]; [email protected]
3 School of Biosciences, Cardiff University, Cardiff CF10 3AX, UK
1 Department of Zoology, University of Oxford, Oxford OX1 3PS, UK
Full list of author information is available at the end of the article

© The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background
Insects pose large problems for human health and agriculture; several major global diseases are transmitted by insect vectors, and huge losses in food production occur due to insect pests.
Current strategies for insect control have a number of disadvantages, such as effects on non-target species and development of resistance to insecticides [1]. Alternative synthetic biology approaches are being developed in which the control agent is a modified version of the pest insect itself. These modified insects carry a genetic system that results in the death of some or all of their descendants, so that when released modified insects mate with wild counterparts, population suppression occurs. Such strategies require characterised modular components that can direct appropriate expression of effector sequences (protein-coding sequences or functional RNAs, for example). Conserved components that can be used across multiple species are particularly useful. However, for many applications there are few if any such components available. The goal of this study was to identify genes that could provide potential components for manipulation of the male germline in two major pest species, the mosquito Aedes aegypti (L.) and the tephritid fruit fly Ceratitis capitata (Wiedemann).
These species were selected because of their importance to public health and agriculture, respectively. Ae. aegypti vectors a number of viral diseases including dengue fever [2], the most prevalent mosquito-borne viral disease, with an estimated 390 million infections per year [3]. There is no specific therapeutic or prophylactic treatment, and no licensed vaccine, meaning vector control is currently the
only option for prevention. C. capitata (Mediterranean fruit fly, medfly) is a widespread, economically important agricultural insect pest, affecting over 250 types of crop [4]. The choice of these two distantly related species also allowed us to search for genes that may be conserved across multiple species.
Genetic insect control systems require expression of the effector transgene in a particular tissue and/or at a particular developmental stage, and usually require that the transgene not be expressed elsewhere or at another time. While some genetic control methods and strains have been successfully developed based on ubiquitous or targeted expression in somatic tissues [5–13], for several potential strategies, germline-specific transgene expression is required, male germline-specific expression being of particular interest. These include sex-ratio distortion systems, which involve the release of males carrying a transgene whose product selectively destroys sperm that would result in female offspring. The resultant skewing of the sex ratio towards males would lead to population suppression [14–17]. Other approaches would eliminate sperm production [15], or lead to the death of embryos fertilised by sperm from modified males [17].
Though many types of regulatory element might in principle be used, in practice expression is usually controlled by the choice of promoter. Alternative splicing cassettes may also be used, either with a non-specific or specific promoter. For example, sex-specific alternative splicing has been used to achieve female-specific expression in C. capitata [8], olive fly [9], pink bollworm and diamondback moth [10], and to add additional specificity to an already sex-biased promoter in Ae. aegypti [11], Ae. albopictus [12] and Anopheles stephensi [13]. Analogous components to drive germline-specific expression, particularly in males, would be useful for the applications described above.
Several insect genes with testis-specific expression have been identified, often first in Drosophila melanogaster (Meigen), for example β2-tubulin [18]. Homologues of β2-tubulin have been identified and the promoters found to drive testis-specific expression in other species, including Anopheles gambiae [19], Ae. aegypti [20] and C. capitata [21]. However, studies on D. melanogaster suggest that expression timing in male germline cells must be taken into consideration. In D. melanogaster, transcription is repressed with the onset of the meiotic divisions [22, 23]. Barring a few exceptions [24–26], genes whose protein product is required after this transcriptional repression are transcribed in primary spermatocytes, before the meiotic divisions; the transcripts are then stored and translated as required [27]. Though not studied in detail for other insects, the major changes to chromatin structure at meiosis and subsequently suggest this may be a general phenomenon. Testis-specific bipartite synthetic genetic systems involving transcription factors (e.g. GAL4 or tTA, both widely used in insect synthetic biology [28, 29]) would therefore require regulatory regions (promoters and/or UTRs) that drive pre-meiotic protein expression, otherwise the transcription factor would not be translated early enough to drive transcription of its target (Fig. 1). While promoters may control tissue specificity, it is likely that timing of translation is controlled by UTRs (though in prokaryotes translation has been shown to be affected by promoter sequences [30]), so identification of both promoters and UTRs is likely to be important.
High-throughput transcriptional profiling [31] and subtractive hybridisation [32] studies have recently yielded several potential testis-specific transcripts in Ae. aegypti. However, to our knowledge, no studies have been performed with sufficient time resolution to determine the activity of regulatory regions at different stages of spermatogenesis. Information on insect testis-specific splicing is even more sparse; testis-specific splice forms of the genes achi and vis have been discovered in D. melanogaster [33], but no testis-specific splice forms have been identified, to our knowledge, in Ae. aegypti, C. capitata or any other pest insect.
In this study we performed RNA-seq on staged testis samples from Ae. aegypti and C. capitata, to identify genes with testis-specific expression peaking early in spermatogenesis, whose regulatory regions are therefore candidates for driving pre-meiotic protein expression. We also developed a novel computational pipeline to identify testis-specific splice forms that could potentially provide additional tools for germline-specific genetic systems. By comparing results from the two species, we have attempted to identify conserved components that may function in constructs across multiple species. In addition to their use in applied synthetic biology, these elements are also interesting from a basic biology perspective.

Results
RNA sequencing and read alignment
RNA sequencing was performed on eight samples: two Ae. aegypti and four C. capitata dissected testis samples representing different spermatogenesis stages, an Ae. aegypti gonadectomised male sample, and a C. capitata ovary sample. The two Ae. aegypti testis samples were generated by bisecting testes and will be referred to as “early” and “late”. The four C. capitata testis samples constituted early spermatocytes, late spermatocytes, round spermatids and elongated spermatids, respectively. Sequencing was performed using the Illumina Genome Analyzer II platform with single reads of 73 nucleotides. In total 255,090,176 reads were generated for the eight samples, corresponding to 18.6 Gb of data, with 89.1% of Ae. aegypti reads and 89.8% of C. capitata reads aligning to the corresponding genome (see Additional file 1 for more details).
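
As a quick consistency check on the sequencing totals quoted in the RNA sequencing paragraph, the short Python sketch below multiplies the reported read count by the reported read length. The function and variable names are illustrative only, and reading "Gb" as 10^9 nucleotides is an assumption.

```python
# Consistency check for the sequencing yield reported in the text.
# Assumption: "Gb" means gigabases, i.e. 1e9 nucleotides.
TOTAL_READS = 255_090_176   # reads across all eight samples (from the text)
READ_LENGTH_NT = 73         # single-read length in nucleotides (from the text)


def total_gigabases(n_reads: int, read_length: int) -> float:
    """Return the total number of sequenced bases in units of 1e9 nucleotides."""
    return n_reads * read_length / 1e9


yield_gb = total_gigabases(TOTAL_READS, READ_LENGTH_NT)
print(f"{yield_gb:.1f} Gb")
```

Under that assumption the product rounds to the 18.6 Gb figure given in the text.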

Fig. 1 Importance of pre-meiotic protein expression in bipartite synthetic genetic systems. If transcription is repressed from meiosis onwards,
post-meiotic translation of the transcription factor in a bipartite expression system is not adequate for expression of the target transgene (a).
Expression of the transgene requires translation of the transcription factor before meiosis such that the target transgene is transcribed before
transcriptional repression at meiosis (b)

Data from C. capitata female and Ae. aegypti female and ovary samples from other experiments were downloaded from the Sequence Read Archive (SRA) [34]. The Ae. aegypti female sample was gonadectomised; an ovary sample was therefore used in addition so that data from all female tissues were present. The C. capitata female sample was not gonadectomised, but an ovary sample was still sequenced and included in the analysis, as many genes expressed in the testis could potentially also be expressed in the ovary, and their detection may be impeded by the large amount of other tissue in a whole female sample. The Ae. aegypti ovary and female samples were from recently fed females (~24 h post-blood meal), as these will include transcripts expressed during oogenesis, thus enabling elimination of genes expressed in both male and female gametogenesis. The number of reads in these samples and the proportion aligning to the corresponding genome are shown in Additional file 1.

Identification of candidate testis-specifically expressed genes
Candidate testis-specifically expressed genes were identified from the total set of predicted genes by running a custom Python script on the output of the standard TopHat-Cufflinks-Cuffdiff RNA-seq analysis pipeline, and applying various filtering steps (described below) to maximise sensitivity whilst removing unsuitable genes and minimising false positives.
An expression level of 10 FPKM (fragments per kilobase of exon per million fragments mapped) in the early sample for Ae. aegypti and the early spermatocytes sample for C. capitata was chosen as a threshold for candidates. A threshold was set as predicted genes with low expression are more likely to be false positives, and also regulatory elements associated with relatively strong expression are desired for use in synthetic constructs; 10 FPKM is the boundary between low and moderate expression for D. melanogaster RNA-seq data on FlyBase [35]. The threshold for expression in samples other than testis (gonadectomised male, ovary and female) was not set at zero, to allow for some noise in the data, but rather at 1 FPKM, based on quantification of the known testis-specifically expressed genes can, comr, nht and Taf12L in D. melanogaster (data not shown).
Many potential candidates appeared to be short non-coding RNAs. Quantification of short non-coding RNAs is likely to be inaccurate in a protocol using polyA selection. Therefore the only genes taken forward for further analysis were those that either coincided with a locus already annotated as a protein-coding gene, or novel predicted genes that were over 1 kb in length.
After application of the filtering steps above, predicted testis-specifically expressed genes with higher expression in early spermatogenesis than in late spermatogenesis were identified. For Ae. aegypti, 57 candidate early genes were identified, out of a total of 388 predicted testis-specifically expressed genes with expression above 10 FPKM in the early sample. For C. capitata, 68 candidate early genes were identified, out of a total of 667 predicted testis-specifically expressed genes with expression above 10 FPKM in early spermatocytes.
For each species, the top ten candidates in order of expression level in the earliest testis sample were taken forward for experimental testing. Genes encoding proteins associated with transposable elements were excluded, as there are likely to be multiple copies of these in the genome, and it would be difficult to design PCR primers that would target only one. For Ae. aegypti, one additional
candidate was also taken forward, as a homologue of the gene was identified as a candidate in C. capitata; candidates that are conserved between species may simplify construct generation in different species. Lists of the candidate genes tested, and the annotated loci that they correspond to, if any, can be seen in Additional file 2.

Experimental testing of candidate testis-specifically expressed genes
RT-PCR
Reverse transcriptase PCR (RT-PCR) for the selected candidates was performed on total RNA derived from testis, gonadectomised male, ovary and gonadectomised female samples, to confirm that the candidates were testis-specifically expressed in adults. For some candidates, the RT-PCR results suggested that there was also expression of the gene in other tissues, mostly ovary, and one candidate failed to produce a positive result in the testis sample. However, the results supported the prediction of testis-specific expression for several candidates (Figs. 2 and 3), discussed below.
Three candidate Ae. aegypti genes (Fig. 2a), corresponding to the annotated loci AAEL001333, AAEL009267 and AAEL012239, and three candidate C. capitata genes (Fig. 3a), corresponding to the annotated loci LOC101449084, LOC101451785 and LOC101459316, displayed the expected outcome of RT-PCR amplification from testis and no amplification from other samples, and were taken forward for further testing. Four additional candidate Ae. aegypti genes (Fig. 2b), corresponding to the annotated loci AAEL003021, AAEL006665, AAEL010265 and AAEL010268, and four additional candidate C. capitata genes (Fig. 3b), corresponding to the annotated loci LOC101449780, LOC101457895, LOC101459689 and LOC101462854, were also taken forward despite some amplification in non-testis samples. In these cases the quantity of product from the non-testis samples was low, and in some cases the product could have resulted from amplification of contaminating gDNA.

qRT-PCR
Quantitative RT-PCR (qRT-PCR) for the candidate genes taken forward for further testing was performed on staged testis samples (early and late samples for Ae. aegypti, spermatocytes and spermatids samples for C. capitata), to confirm that the candidate genes displayed the desired expression pattern of higher expression early in spermatogenesis (Figs. 4 and 5). Gonadectomised male, ovary and gonadectomised female samples were also used in the qRT-PCR to quantify the level of expression in these tissues, if any. Candidates with a low level of non-testis expression may still be usable for restricting expression to the testis, particularly in combination with other strategies, such as use of testis-specific splicing.
The timing was confirmed for all Ae. aegypti candidates (Fig. 4) except AAEL009267, for which the qRT-PCR failed, and for four of the C. capitata candidates (Fig. 5). For the other three C. capitata candidates, LOC101449084, LOC101457895 and LOC101459689, no expression was detected in spermatocytes. The results for all the Ae. aegypti candidates except AAEL012239 suggested some expression in non-testis tissues, but this was at a low level compared to that in testis, and in four of the five cases amplification could have resulted from contaminating

[Figure 2 image: RT-PCR gels, four lanes (1-4) per gene, with a lane key (legible entries: gonadectomised male, gonadectomised female) and a ladder key (200, 400, 600, 800 and 1000 bp). Panel a: AAEL001333, AAEL009267, AAEL012239. Panel b: AAEL003021, AAEL006665, AAEL012065, AAEL010268. Arrows mark the expected PCR product size; asterisks mark bands that could be product from contaminating gDNA.]
Fig. 2 Gels showing PCR results for Ae. aegypti expression candidates. a Candidates for which no band of the expected size for the testis sample
could be seen in non-testis samples. b Candidates for which a band of the expected size for the testis sample could be seen in a non-testis sample,
but it was faint and in the cases indicated by asterisks, could have resulted from contaminating gDNA. Expected PCR product sizes are indicated with
arrows. In some cases bands of other sizes are of the expected size for products amplified from contaminating gDNA. Other bands of unexpected sizes
may represent isoforms that were not predicted, or non-specific amplification
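
The expression-based filter described under "Identification of candidate testis-specifically expressed genes" can be sketched in Python as follows. This is a minimal illustration, not the authors' script: the `Gene` record, its field names and the example values are hypothetical, the two-stage early/late comparison is a simplification of the four C. capitata stages, and whether each threshold is inclusive or exclusive is an assumption, since the text gives only the cut-off values.

```python
from dataclasses import dataclass
from typing import List

# Cut-off values quoted in the text; inclusive vs exclusive is an assumption.
TESTIS_MIN_FPKM = 10.0       # earliest testis sample
NON_TESTIS_MAX_FPKM = 1.0    # gonadectomised male, ovary, female
NOVEL_MIN_LENGTH_BP = 1000   # unannotated predicted genes must be over 1 kb


@dataclass
class Gene:
    """Hypothetical per-gene expression summary (field names are illustrative)."""
    name: str
    early_fpkm: float             # earliest testis sample
    late_fpkm: float              # latest testis sample
    non_testis_fpkm: List[float]  # one value per non-testis sample
    protein_coding_locus: bool    # coincides with an annotated protein-coding gene
    length_bp: int


def is_testis_specific(gene: Gene) -> bool:
    """Apply the expression, noise and gene-type filters."""
    expressed = gene.early_fpkm > TESTIS_MIN_FPKM
    quiet_elsewhere = all(f <= NON_TESTIS_MAX_FPKM for f in gene.non_testis_fpkm)
    usable_type = gene.protein_coding_locus or gene.length_bp > NOVEL_MIN_LENGTH_BP
    return expressed and quiet_elsewhere and usable_type


def is_early_candidate(gene: Gene) -> bool:
    """Testis-specific genes with higher expression early than late."""
    return is_testis_specific(gene) and gene.early_fpkm > gene.late_fpkm


def top_candidates(genes: List[Gene], n: int = 10) -> List[Gene]:
    """Rank early candidates by expression in the earliest testis sample."""
    early = [g for g in genes if is_early_candidate(g)]
    return sorted(early, key=lambda g: g.early_fpkm, reverse=True)[:n]


# Made-up example values, for illustration only.
genes = [
    Gene("geneA", 85.0, 12.0, [0.2, 0.0, 0.5], True, 2400),
    Gene("geneB", 40.0, 90.0, [0.1, 0.3, 0.0], True, 1800),   # peaks late
    Gene("geneC", 55.0, 5.0, [0.0, 6.0, 0.4], True, 3100),    # expressed in ovary
    Gene("geneD", 30.0, 2.0, [0.0, 0.0, 0.0], False, 600),    # short novel gene
    Gene("geneE", 120.0, 30.0, [0.9, 0.1, 0.0], False, 1500),
]
print([g.name for g in top_candidates(genes)])
```

The exclusion of transposable-element-associated genes mentioned in the text is not modelled here, because it depends on annotation details the text does not specify.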

[Figure 3 image: RT-PCR gels, four lanes (1-4) per gene, with a lane key (legible entries: gonadectomised male, gonadectomised female) and a ladder key (200, 400, 600, 800 and 1000 bp). Panel a: LOC101449084, LOC101451785, LOC101459316. Panel b: LOC101449780, LOC101457895, LOC101459689, LOC101462854. Arrows mark the expected PCR product size; asterisks mark bands that could be product from contaminating gDNA.]

Fig. 3 Gels showing PCR results for C. capitata expression candidates. Presented as for Fig. 2

gDNA. For all the C. capitata candidates, no expression was detected in non-testis samples.

Identification of candidate testis-specifically spliced genes
Similarly to the candidate testis-specifically expressed genes, analysis was performed on RNA-seq data from two staged testis samples in Ae. aegypti and four staged testis samples in C. capitata, along with gonadectomised male, ovary and female samples to identify genes with testis-specific splice forms.
Candidate testis-specifically spliced genes were identified from the total set of predicted genes by running a custom Python script on the output of the standard TopHat-Cufflinks-Cuffdiff RNA-seq analysis pipeline, and applying various filtering steps (described below) to maximise sensitivity whilst removing unsuitable genes and minimising false positives. These filtering steps may exclude some valid genes, but for the intended downstream application it is not necessary to identify all testis-specifically spliced genes; it was more important to minimise false positives.
An expression level of 10 FPKM (in the early sample for Ae. aegypti and the early spermatocytes sample for C. capitata) was chosen as a threshold for the predicted testis-specific splice forms, using the same rationale as discussed for the candidate testis-specifically expressed genes. The threshold for expression of predicted testis-specific splice forms in tissues other than testis was not set at zero, to allow for some noise in the data, but rather at 0.4 FPKM, based on quantification of the known

[Figure 4 image: bar chart of relative expression (y-axis) for AAEL001333, AAEL012239, AAEL003021, AAEL006665, AAEL012065 and AAEL010268, with an inset on a 0 to 0.02 scale. Series: Early sperm, Late sperm, Male carcass, Ovary, Female carcass.]

Fig. 4 Relative expression levels in different tissues for Ae. aegypti expression candidates, determined using qRT-PCR. Results for AAEL012239 are
shown inset, as the expression level for this gene was too low to view at the same scale as for the other genes. * Primers could also have amplified
from gDNA, so apparent low expression in non-testis tissues could be a result of gDNA contamination

[Figure 5 image: bar chart of relative expression (y-axis, 0 to 120) for LOC101451785, LOC101459316, LOC101449780 and LOC101462854. Series: Spermatocytes, Spermatids, Male carcass, Ovary, Female carcass.]

Fig. 5 Relative expression levels in different tissues for C. capitata expression candidates, determined using qRT-PCR

testis-specifically spliced transcripts from the genes achi and vis in a D. melanogaster dataset (data not shown). It was also required that at least one other splice form of the gene was expressed in at least one other sample (gonadectomised male, ovary or female) at a level of 10 FPKM or above, to distinguish testis-specific splicing from testis-specific expression.
In addition to the above expression thresholds, a threshold for exon-exon junction coverage was set to minimise false positives; only introns with more than 10 reads spanning the exon-exon junction were taken forward. False positives may also arise due to low coverage in a particular sample, causing incorrect assembly of a transcript in this sample, for example with a few nucleotides missing at the end, and giving the appearance of alternative splicing. To minimise this source of error, the only introns taken forward were those differing by more than 20 bp at one end at least from introns in other transcripts from the same gene. Finally, only candidates for which the predicted testis-specific intron was within an annotated gene were taken forward, to avoid false positives that are in fact intergenic regions but predicted as introns due to incorrect merging of transcripts during assembly. Using these parameters, 27 and 33 candidate testis-specifically spliced genes were identified for Ae. aegypti and C. capitata respectively.

[Figure 6 image: schematic of RT-PCR reactions in testis and other tissues. Panel a: reactions with exon-exon junction primers and reactions with multiple splice form primers. Panel b: reactions with exon-exon junction primers and reactions with other splice form primers.]
Fig. 6 RT-PCR testing of candidate testis-specifically spliced genes. Expression of the predicted testis-specific splice form was assessed using primers designed to span the predicted testis-specific exon-exon junction. Expression of other splice forms was assessed using additional primers targeting either multiple splice forms (both the predicted testis-specific splice form and other splice forms) but yielding products of different sizes (a), or other splice forms only (b). Note that primers amplifying splice forms other than the predicted testis-specific splice form may still yield a product in testis samples, as these splice forms may be expressed in the testis in addition to the testis-specific splice form. The splice forms illustrated here are simplified examples

Experimental validation of testis-specific splicing required distinguishing between splice forms using RT-PCR. The primer design strategy used is illustrated in Fig. 6. The specificity of the predicted testis-specific splice forms was tested using primers spanning the predicted testis-specific exon-exon junction. Candidates for which primers
could also be designed common to both predicted testis-specific and other splice forms were preferred; these allowed additional testing of testis-specificity of the predicted testis-specific splice form, as they should yield products of different sizes in testis and other tissues (Fig. 6a). There were only a small number of these, so all were taken forward for experimental testing. There were further candidates for which primers common to both predicted testis-specific and other splice forms could not be designed (Fig. 6b); for each species the top five of these candidates in order of ascending intron size were taken forward for experimental validation. For C. capitata, one additional candidate was also taken forward, as a homologue of the gene was identified as a candidate in Ae. aegypti; as mentioned above, candidates that are conserved between species may simplify construct generation in different species. Lists of the candidate genes tested, and the annotated loci that they correspond to, if any, can be seen in Additional file 2.

Experimental testing of candidate testis-specifically spliced genes
RT-PCR
RT-PCR for the selected candidates was performed on testis, gonadectomised male, ovary and gonadectomised female samples, to confirm that the candidates were testis-specifically spliced. The primer design strategy used is illustrated in Fig. 6.
The PCR results varied between candidate genes. For some candidates the predicted testis-specific splice form was not detected, for others it was detected in samples other than testis, and for others no splice forms at all were detected in samples other than testis, suggesting that the gene is testis-specifically expressed rather than differentially spliced. However, the results supported the prediction of testis-specific splicing for some candidates (Figs. 7 and 8), discussed below.
Five Ae. aegypti candidate introns (Fig. 7a), within the annotated loci AAEL000028, AAEL001898, AAEL008110, AAEL012262 and AAEL018211, and four C. capitata candidate introns (Fig. 8a), within the annotated loci LOC101449153, LOC101450641, LOC101457260 and LOC101459514, displayed the expected outcome of a positive PCR result for the predicted testis-specific splice form in testis only, and a positive PCR result for other splice forms in other tissues. These nine candidates were taken forward for further testing. Two additional Ae. aegypti candidate introns (Fig. 7b), within the annotated loci AAEL011153 and AAEL018350, and three additional C. capitata candidate introns (Fig. 8b), within the annotated loci LOC101449153, LOC101452861 and LOC101459514, were also taken forward despite positive PCR results for the predicted testis-specific splice form in non-testis samples, as the quantity of product from the non-testis samples was low. Candidates with a low expression in non-testis tissues of the putative testis-specific splice form relative to other splice forms could potentially still be useful for the intended application.

qRT-PCR
The suitability of a testis-specific intron for use in a synthetic construct as discussed above will be affected by the proportions of different splice forms for the corresponding gene in the testis. There may be other splice forms expressed in the testis in addition to the testis-specific splice form. If used to direct testis-specific expression of a coding region, the higher the proportion of the testis-specific splice form compared to other splice forms, the higher the proportion of primary transcripts processed into the splice variant of interest (the testis-specific splice variant). If most transcripts are not of the testis-specific splice form and retain the testis-specific intron, there may be insufficient production of functional transgene product. In order to determine splice form proportions in the testis for the candidates taken forward for further testing, qRT-PCR was performed (Figs. 9 and 10). Gonadectomised male, ovary and gonadectomised female samples were also used in the qRT-PCR to determine the expression level of the predicted testis-specific splice form in these tissues, if any, relative to the expression level of other splice forms. While complete absence of expression of the predicted testis-specific splice form in non-testis tissues would be preferred, candidates with a low level of non-testis expression of the predicted testis-specific splice form relative to other splice forms may still be usable for synthetic biology applications, particularly in combination with other strategies, such as use of testis-specific regulatory regions, for restricting expression to the testis.
The qRT-PCR for the C. capitata candidate introns within the annotated locus LOC101459514 failed to produce meaningful results, with calculations suggesting negative expression of some splice forms, so these introns were excluded. Based on the qRT-PCR results for the other candidates, the estimated proportion of the testis-specific splice form out of all splice forms in the testis ranged from 0.4 to 95% in Ae. aegypti (Fig. 9) and 0.24–69% in C. capitata (Fig. 10). Candidates at the lower ends of these ranges are unlikely to be suitable for use in a synthetic construct. For example, the results suggest that for AAEL001898, only 0.4% of mature transcripts in the testis would retain the intron, and thus only 0.4% of transcripts would be of the desired form if this intron were used to direct testis-specific expression of a coding region. However, candidates at the higher ends of the ranges are more likely to be suitable, and will be taken forward for testing in synthetic constructs. In some cases the qRT-PCR results suggested expression of the testis-specific splice form in non-testis samples, but this

[Figure 7 image: RT-PCR gels, four lanes (1-4) per candidate, with a lane key (legible entries: gonadectomised male, gonadectomised female) and a ladder key (200, 400, 600, 800 and 1000 bp). For each candidate, the upper row shows reactions with exon-exon junction primers and the lower row shows reactions with other splice form or multiple splice form primers. Panel a: AAEL000028, AAEL001898, AAEL008110, AAEL012262, AAEL018211. Panel b: AAEL011153, AAEL018350. Arrows indicate the expected PCR product size.]

Fig. 7 Gels showing PCR results for Ae. aegypti splicing candidates. a Candidates for which no band of the expected size for the predicted testis-specific
splice form could be seen in non-testis samples. b Candidates for which a band of the expected size for the predicted testis-specific splice form could be
seen in a non-testis sample, but it was only faint. Expected PCR product sizes are indicated with arrows. Bands of unexpected sizes may represent other
splice forms that were not predicted, or non-specific amplification
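
The filtering criteria for candidate testis-specific splice forms, described under "Identification of candidate testis-specifically spliced genes", can be sketched in Python as below. This is a hedged illustration rather than the authors' pipeline: the `SpliceCandidate` record, its field names, the coordinate representation and the example values are all hypothetical. Treating the 10 FPKM testis threshold and the 0.4 FPKM non-testis threshold as inclusive is an assumption, as is the reading that the candidate intron must differ by more than 20 bp at one end or the other from every other intron of the gene.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Intron = Tuple[int, int]  # (start, end) coordinates; representation is an assumption

MIN_TESTIS_FPKM = 10.0      # testis-specific form, earliest testis sample
MAX_NON_TESTIS_FPKM = 0.4   # testis-specific form, non-testis samples
MIN_OTHER_FORM_FPKM = 10.0  # another splice form, "10 FPKM or above"
MIN_JUNCTION_READS = 10     # "more than 10 reads" spanning the junction
MIN_END_DIFF_BP = 20        # "more than 20 bp" difference at one end at least


@dataclass
class SpliceCandidate:
    """Hypothetical summary of one predicted testis-specific intron."""
    gene: str
    intron: Intron
    junction_reads: int
    testis_fpkm: float
    non_testis_fpkm: List[float]              # testis-specific form, per non-testis sample
    other_forms_non_testis_fpkm: List[float]  # other splice forms, per non-testis sample
    other_introns: List[Intron]               # introns of other transcripts, same gene
    in_annotated_gene: bool


def is_distinct(intron: Intron, other: Intron) -> bool:
    """True if the two introns differ by more than 20 bp at one end or the other."""
    return (abs(intron[0] - other[0]) > MIN_END_DIFF_BP
            or abs(intron[1] - other[1]) > MIN_END_DIFF_BP)


def passes_filters(c: SpliceCandidate) -> bool:
    """Apply the expression, coverage, distinctness and annotation filters."""
    return (
        c.testis_fpkm >= MIN_TESTIS_FPKM
        and all(f <= MAX_NON_TESTIS_FPKM for f in c.non_testis_fpkm)
        and any(f >= MIN_OTHER_FORM_FPKM for f in c.other_forms_non_testis_fpkm)
        and c.junction_reads > MIN_JUNCTION_READS
        and all(is_distinct(c.intron, o) for o in c.other_introns)
        and c.in_annotated_gene
    )


# Made-up example values, for illustration only.
good = SpliceCandidate("geneX", (1000, 1500), 42, 35.0, [0.0, 0.1, 0.3],
                       [22.0, 4.0, 15.0], [(1000, 1800), (2500, 2900)], True)
near_duplicate = replace(good, other_introns=[(1005, 1490)])  # likely assembly noise
low_coverage = replace(good, junction_reads=10)               # not more than 10 reads

for cand in (good, near_duplicate, low_coverage):
    print(cand.gene, passes_filters(cand))
```

The requirement that another splice form be expressed outside the testis is what separates testis-specific splicing from testis-specific expression in this sketch, mirroring the rationale given in the text.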

was mostly at a very low level (<1%) relative to the expression of other splice forms in these tissues. For the C. capitata candidates LOC101450641 and LOC101457260 the results suggested that 17–100% of the splice forms in non-testis samples were actually the predicted testis-specific splice form. However, the expression of all splice forms in these non-testis samples was low compared to expression in the testis, so relative errors in quantification are likely to be higher.

Inter-species conservation
To determine whether any of the candidates we identified were conserved between species, tBLASTx searches were performed, using the candidate sequences from one species as queries and all transcripts predicted by Cufflinks from the other species as a database. A D. melanogaster dataset was also used as a database, to provide further confidence in conservation, and also because more supporting information is available on D. melanogaster genes.

These BLAST searches revealed one set of homologous testis-specifically expressed candidates and one set of homologous testis-specifically spliced candidates, with conservation between all three species in each case. The Ae. aegypti testis-specifically expressed candidate corresponding to the annotated locus AAEL009267 and the C. capitata testis-specifically expressed candidate corresponding to the annotated locus LOC101459316 are homologous, and both show homology to a D. melanogaster gene, CG7691, that was also identified as testis-specifically expressed, with higher expression early in spermatogenesis. AAEL009267 is annotated as a hypothetical protein, while LOC101459316 and CG7691 are predicted zinc finger proteins. The expression timing of AAEL009267 could not be confirmed due to a failed qRT-PCR, but higher expression of LOC101459316 in early spermatogenesis was confirmed. The Ae. aegypti testis-specifically spliced candidate corresponding to the annotated locus AAEL008110 (centrosomin) and the C. capitata testis-specifically spliced candidate corresponding to the annotated locus LOC101449153 (centrosomin-like) are homologous, and both show homology to the D. melanogaster gene for centrosomin, which is involved in centrosome assembly and is known to have a role in spermatogenesis and display testis-specific splicing in this species [36]. However, it should be noted that qRT-PCR results suggested low abundance in testis of the predicted testis-specific splice form compared to other splice forms

[Fig. 8 gel images not reproduced. Panel a: LOC101449153 (intron 1), LOC101450641, LOC101457260, LOC101459514 (intron 1). Panel b: LOC101449153 (intron 3), LOC101452861, LOC101459514 (intron 2). Each candidate has four lanes (1 to 4), shown for exon-exon junction primers and for other/multiple splice form primers. Lane key entries recoverable from the image: gonadectomised male, gonadectomised female. Ladder key: 200, 400, 600, 800 and 1000 bp. Arrows indicate the expected PCR product size.]
Fig. 8 Gels showing PCR results for C. capitata splicing candidates. Presented as for Fig. 7

[Fig. 9 bar charts not reproduced. Panels: AAEL000028, AAEL001898, AAEL008110 (intron 1), AAEL012262, AAEL018211 (intron 2), AAEL011153, AAEL018350. Y-axis: relative expression. X-axis: testis, male carcass, ovary, female carcass. Series: testis-specific splice form, other splice forms. Some panels include an inset showing testis only.]

Fig. 9 Relative expression levels in different tissues for predicted testis-specific and other splice forms of Ae. aegypti splicing candidates, determined
using qRT-PCR. Where expression levels in the testis are too low to view at the same scale as for the other splice forms, results for testis are shown
inset. The relative expression value of the testis-specific splice form is set at 1 in all cases. Error bars show +/− standard error of the mean for two
technical replicates
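
The scaling described in this caption can be illustrated with a short Python sketch. It is not the authors' calculation: it assumes an efficiency-corrected quantity of the form efficiency^(-Ct), normalisation to the geometric mean of two reference genes, and rescaling so that the testis-specific splice form in testis equals 1. The function names and all Ct values are hypothetical.

```python
from math import sqrt


def quantity(efficiency, ct):
    """Efficiency-corrected quantity; efficiency 2.0 means perfect doubling per cycle."""
    return efficiency ** (-ct)


def normalised_expression(target, reference_a, reference_b):
    """Target quantity divided by the geometric mean of two reference-gene quantities.

    Each argument is an (efficiency, Ct) pair.
    """
    reference = sqrt(quantity(*reference_a) * quantity(*reference_b))
    return quantity(*target) / reference


# Made-up (efficiency, Ct) pairs: target, reference gene A, reference gene B.
measurements = {
    ("testis", "testis_specific"): ((2.0, 22.0), (2.0, 20.0), (2.0, 20.0)),
    ("testis", "other"): ((2.0, 24.0), (2.0, 20.0), (2.0, 20.0)),
    ("male_carcass", "other"): ((2.0, 21.0), (2.0, 20.0), (2.0, 20.0)),
}

raw = {key: normalised_expression(*values) for key, values in measurements.items()}

# Rescale so the testis-specific splice form in testis is exactly 1.
anchor = raw[("testis", "testis_specific")]
relative = {key: value / anchor for key, value in raw.items()}

for key, value in sorted(relative.items()):
    print(key, round(value, 3))
```

With this scaling, a value below 1 for the other splice forms in testis means the testis-specific form is the more abundant of the two in that tissue.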

[Fig. 10 bar charts not reproduced. Panels: LOC101449153 (intron 1), LOC101450641, LOC101457260, LOC101449153 (intron 3), LOC101452861. Y-axis: relative expression. X-axis: testis, male carcass, ovary, female carcass. Series: testis-specific splice form, other splice forms. One panel includes an inset showing testis only.]


Fig. 10 Relative expression levels in different tissues for predicted testis-specific and other splice forms of C. capitata splicing candidates, determined
using qRT-PCR. Presented as for Fig. 9

for both AAEL008110 and LOC101449153, and thus they may not be suitable for use in synthetic constructs for the reasons discussed above.

Discussion
In this study, we have used RNA-seq data to identify testis-specifically expressed and spliced genes in the disease vector Ae. aegypti and the agricultural pest C. capitata. This genome-wide approach represents an advance on previous efforts to find regions for use in insect control constructs, which attempted to identify candidates on an individual basis, often based on distant homology to D. melanogaster genes. Using testis samples corresponding to different developmental stages gave sufficient resolution to select testis-specifically expressed genes with expression levels highest in early spermatogenesis, likely to be useful for pre-meiotic protein expression in bipartite synthetic genetic systems such as those involving GAL4-UAS or tTA-tetO.

To identify testis-specific splicing, we have developed a novel computational pipeline for this type of analysis. Whilst the majority of the work is performed by the pre-existing Tuxedo suite of programs, these do not produce finished analyses with regards to alternative splicing, but rather an intermediate output that requires further computation to produce user-friendly candidate lists and sequences. Our pipeline combines this software with custom-written Python scripts to achieve this. Unlike other methods for identifying differential splicing from RNA-seq data [37], it identifies splice forms generated by all types of splice event, not only exon skipping. The outputs are particularly tailored for subsequent experimental testing, containing intron flanking sequences along with numbered exon-exon junction positions to facilitate PCR primer design, and alignments for all transcripts of each gene. In addition to its application here to identify testis-specific splicing, the pipeline could be applied to other sample sets, for example to identify splice forms specific to other tissues, developmental stages, disease states or external conditions.

Based on RNA-seq analysis, we identified a number of candidate testis-specifically expressed genes with expression highest in early spermatogenesis (57 in Ae. aegypti and 68 in C. capitata). These comprised a minority of the total number of testis-specifically expressed genes with expression in early spermatogenesis (388 for Ae. aegypti and 667 for C. capitata), suggesting that most testis-specifically expressed genes do not exhibit suitable expression timing, and so using samples with sufficient time resolution as we have done is important in identifying suitable candidates for some types of synthetic control systems. We also identified a number of candidate testis-specifically spliced genes (27 in Ae. aegypti and 33 in C. capitata). Testing the top candidates with

RT-PCR validated the expression profiles of six Ae. aegypti and four C. capitata testis-specifically expressed genes, and seven Ae. aegypti and five C. capitata testis-specifically spliced genes, although the testis-specifically spliced genes may not all have suitable splice form ratios. The suitability of these candidates for any particular application should be confirmed with functional testing.

Our findings complement those of Akbari et al. [38], who identified regulatory regions specific to the female germline in Ae. aegypti. These regions could be used to direct ovary-specific expression in strategies such as Medea and UDMEL, which have been shown to be capable of driving population replacement in Drosophila [39, 40], while the testis-specific regulatory regions that we have identified could be used in alternative strategies such as sex distortion and paternal effect systems. Testis-specific expression could also be achieved with the testis-specific introns that we have identified. These could be used to achieve testis-specific expression on their own or in combination with testis-specific regulatory elements, or even with regulatory elements active in the testis but not testis-specific. This would allow a wider choice of regulatory elements. The genes that we have identified would also provide a choice of expression levels in a synthetic construct, given the varying expression levels for the testis-specifically expressed genes and varying splice form ratios for the testis-specifically spliced genes. This may be useful as different applications utilising testis-specific expression may require different expression levels.

Some of the genes display testis-specific expression or splicing that is consistent between Ae. aegypti, C. capitata and D. melanogaster. Conserved genes such as these can be particularly useful; they can simplify construct generation across different species, as it is possible that the same or similar sequences may be used for multiple species. Conservation of a regulatory element does not necessarily imply that it will function in the same way across species; there are examples where regulatory elements from one species have failed to drive transgene expression with the same strength or specificity in another species as in the native species, despite the presence of orthologous elements in the non-native species. For example, the D. melanogaster Actin-5C promoter was much less active in a transient expression assay in the cricket G. bimaculatus than the native actin promoter [41], and displayed a more restricted tissue distribution in transformed Ae. aegypti than in D. melanogaster [42]. Regulatory elements from the D. melanogaster gene sry-α failed to drive expression in C. capitata [43], and in an example of attempted testis-specific expression, regulatory elements from the vasa gene in An. gambiae failed to drive expression in Ae. aegypti [38]. However, there are many cases of successful inter-species function of regulatory elements for driving targeted transgene expression in insect control systems. For example, the Ae. aegypti and Ae. albopictus Actin-4 promoters have been used interchangeably to generate a female-specific flightless phenotype in both species [12], and the An. gambiae β2-tubulin promoter has been used to drive testis-specific expression in an An. stephensi transgenic sexing strain [19]. Inter-species functionality of alternative splicing has also been demonstrated; female-specific lethality was achieved in olive fly and D. melanogaster using the C. capitata sex-specifically spliced tra intron [9] and in diamondback moth using the pink bollworm sex-specifically spliced dsx intron [10]. Even if inter-species function is not conserved, identifying conserved genes that share a feature of interest facilitates a candidate gene approach to isolating an endogenous element with the desired characteristics.

A potential disadvantage of conserved sequences is that they might also function in non-target species. Transfer to non-target species could theoretically occur by hybridisation or horizontal gene transfer, though for insect species to be able to form fertile hybrids they would need to be very closely related, and most molecular elements from one would likely function to some degree in the other. Horizontal gene transfer between divergent insect species is extremely rare, though detectable over evolutionary timescales for transposons, for example. The consequences of such hypothetical transfer would vary considerably by application, being potentially more significant for highly invasive gene drive systems, much less so for self-limiting strategies such as male-sterile systems. Our approach allows the isolation of both more- and less-conserved sequences from a species of interest, as appropriate.

Conclusions
In this study we have used RNA-seq data to identify a number of genes with testis-specific expression or splicing potentially suitable to provide molecular components for use in synthetic control systems involving manipulation of the male germline. Some genes displayed conservation of expression or splicing behaviour across species; these may be particularly promising candidates for further investigation. Overall, our findings provide the beginnings of a comprehensive toolkit for male germline expression in synthetic control systems for pest insects.

Methods
Insects
Ae. aegypti of the Asian wild-type strain (originating from Jinjang, Selangor, Malaysia, colonised by the Institute of Medical Research (Kuala Lumpur, Malaysia) in 1975, from which a colony at Oxitec was established in 2003) [44] were reared under standard conditions, at 27 ± 2°C and 70 ± 10% relative humidity with a 12:12 h light:dark cycle. Larvae were reared in trays and fed with Tetramin®

(Tetra GmbH, Germany). Males and females for experiments were separated as pupae. Adults were maintained in cages with ad libitum access to a 10% sucrose solution supplemented with 14 U mL⁻¹ penicillin and 14 μg mL⁻¹ streptomycin (Sigma-Aldrich, UK). Adult females for both colony maintenance and experimental work were fed defibrinated horse blood (TCS Biosciences Ltd., UK) 3–5 days after eclosion.

C. capitata of the Toliman wild-type strain (originating from Guatemala, colonised in 1990) were reared under standard conditions, at 26 ± 1°C and 65 ± 10% relative humidity with a 12:12 h light:dark cycle. Larvae and adults were kept in plastic containers with ad libitum access to a Drosophila diet containing maize meal, sucrose and yeast. Pupae were allowed to eclose in a Petri dish containing sand. Males and females were separated shortly after eclosion, before mating.

RNA extraction and cDNA synthesis
For RNA sequencing, total RNA was extracted from staged testis samples, prepared from 3 day old virgin males dissected in phosphate-buffered saline. Tissues from multiple individuals were pooled for each sample. For Ae. aegypti, two staged testis samples (referred to as “early” and “late”) were prepared by bisecting testes; the apical region contains cysts of male germline cells in earlier stages of development, up to late spermatocytes, and the basal region contains spermatid cysts in later stages of development (Fig. 11a–c). Both of these samples also contained somatic cells from the testis sheath. An Ae. aegypti gonadectomised male sample was also prepared from the same males used for the testis samples. For C. capitata, four staged testis samples were prepared (early spermatocytes, late spermatocytes, round spermatids and elongated spermatids) by spilling cysts out of the testes and examining isolated cysts with a Nikon Eclipse Ti-S inverted microscope (Fig. 11d–h). Cysts at specific stages were identified based on cell size and morphology, and collected manually with a pulled-out Pasteur pipette. A C. capitata ovary sample was also prepared, from 5 day old virgin females dissected in phosphate-buffered saline. Total RNA was extracted using TRIzol® (Life Technologies Ltd, Paisley, UK), according to the manufacturer’s instructions. The samples used for RNA-seq are summarised in Table 1. Microscope images illustrating testis dissections are shown in Fig. 11.

For RT-PCR, total RNA was extracted from testis and gonadectomised male samples, prepared from 0 to 3 day old virgin males, and from ovary and gonadectomised female samples, prepared from 4 to 6 day old virgin females. Ae. aegypti females were dissected approximately 24 h post-blood meal (PBM). For qRT-PCR, total RNA was also extracted from staged testis samples, prepared as described above for RNA sequencing, except that in this instance only two samples (spermatocytes and spermatids) were prepared for C. capitata. Tissues from multiple individuals were pooled for each sample. Samples were either stored in RNALater (Qiagen, Manchester, UK) or lysis buffer (Life Technologies Ltd, Paisley, UK or Norgen Biotek Corp., Ontario, Canada) at −20°C until RNA extraction, or RNA was extracted immediately. Total RNA was extracted using either a Norgen Total RNA Purification Kit (Norgen Biotek Corp., Ontario, Canada) or an Ambion RNAqueous Kit (Life Technologies Ltd, Paisley, UK) according to the manufacturer’s instructions. cDNA for PCR was synthesised using a RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, Pittsburgh, USA) with random hexamer primers according to the manufacturer’s instructions. The samples used for RT-PCR are summarised in Table 2.

Fig. 11 Microscope images illustrating preparation of staged testis samples. a Whole testis from Ae. aegypti pupa. b Apical region of Ae. aegypti
testis after bisection; used to generate “early” sample. c Basal region of Ae. aegypti testis after bisection; used to generate “late” sample. d Whole
testis from C. capitata pupa. e Isolated C. capitata early spermatocytes. f Isolated C. capitata late spermatocytes. g Isolated C. capitata early
spermatids. h Isolated C. capitata late spermatids. Scale bar is 100 μm in all panels

Table 1 Samples used for RNA-seq analysis

Samples from other studies for which data were downloaded from the SRA are highlighted in grey
PBM post-blood meal

RNA sequencing
Library preparation, including polyA selection, was performed using an Illumina TruSeq RNA Sample Preparation Kit (Illumina, UK), according to the manufacturer’s instructions. Sequencing was performed by the NGS facility at Glasgow Polyomics (University of Glasgow, UK) using the Illumina Genome Analyzer II platform, with single reads of 73 nucleotides.

Data from other studies
RNA-seq data from other studies were downloaded from the SRA [34]. This comprised data from a C. capitata female sample, and data from Ae. aegypti ovary and gonadectomised samples, published in Akbari et al. (2013) [31]. The samples used are summarised in Table 1.

Sequence data processing
The overall quality of the sequencing reads was assessed using FastQC (v0.10.1) [45]. Raw reads were processed to remove adapter sequence using FASTA/Q Clipper from the FASTX_Toolkit [46] and sequences of poor quality using Sickle [47]. Reference indexes of the Ae. aegypti genome assembly AaegL2 [48] (obtained from VectorBase) and the C. capitata genome assembly Ccap_1.0 [49] (obtained from NCBI) were constructed using Bowtie2 [50]. Trimmed reads were aligned to these indexes using TopHat2 (v2.0.9) [51]. Transcript assemblies were created from the alignments using the reference annotation based transcript assembly method [52] with Cufflinks (v2.1.1) [53] followed by Cuffmerge (v1.0.0) [54]. Transcript expression in each sample was quantified using Cuffdiff2 (v2.1.1) [55].
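
As an illustration of how the per-transcript quantification could be read into a script, here is a minimal Python 3 sketch. It is not taken from the authors' scripts (which were tested with Python 2.7.3). It assumes the tab-separated layout of Cuffdiff's isoform FPKM tracking output, in which each sample contributes a `<sample>_FPKM` column; the function name and the example rows are hypothetical, and the example shows only a subset of the columns a real file contains.

```python
import csv
import io


def read_isoform_fpkm(handle):
    """Return {isoform_id: {"gene": gene_id, "fpkm": {sample: value}}}."""
    table = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        # Collect every "<sample>_FPKM" column, keyed by the sample name.
        fpkm = {
            column[: -len("_FPKM")]: float(value)
            for column, value in row.items()
            if column.endswith("_FPKM")
        }
        table[row["tracking_id"]] = {"gene": row["gene_id"], "fpkm": fpkm}
    return table


# Tiny made-up example in the assumed layout (values are illustrative only).
EXAMPLE = (
    "tracking_id\tgene_id\ttestis_FPKM\ttestis_status\tcarcass_FPKM\tcarcass_status\n"
    "TCONS_00000001\tXLOC_000001\t42.5\tOK\t0.0\tOK\n"
    "TCONS_00000002\tXLOC_000001\t3.0\tOK\t18.2\tOK\n"
)

isoforms = read_isoform_fpkm(io.StringIO(EXAMPLE))
print(isoforms["TCONS_00000001"]["fpkm"])
```

Keeping the values keyed by sample name makes it straightforward to compare testis against non-testis samples in later filtering steps.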

Table 2 Samples used for RT-PCR analysis

Additional samples used for qRT-PCR analysis only are highlighted in grey
PBM post-blood meal

Fig. 12 Computational pipeline for identification of candidate testis-specifically spliced genes. RNA-seq reads were mapped to the relevant reference
genome using TopHat. Transcript assemblies were generated using Cufflinks. Transcript expression was quantified using Cuffdiff. The output of these steps,
along with user-defined threshold FPKM values, was used as input for a custom Python program. Custom Python scripts in combination with bedtools
were used to output a list of candidates with associated information used for further filtering, such as exon-exon junction coverage and expression values,
as well as sequences in a convenient format for primer design – intron flanking sequences, and alignments of all splice forms for each gene

Identification of candidate testis-specifically expressed and spliced genes
Candidate testis-specifically expressed and spliced genes were identified from the output of Cuffdiff2, and their sequences obtained, using custom Python scripts in combination with bedtools (version 2.16.2) [56]. An outline of the use of this pipeline to identify candidate testis-specifically spliced genes is illustrated in Fig. 12, and the documentation for the Python scripts is provided in Additional file 3. Filtering steps were applied as described in the results section. For steps requiring alignment of sequences, Geneious (7.0.5) [57] was used.

For the candidate testis-specifically expressed genes, only genes with higher expression in the samples from early stages of spermatogenesis (the early sample for Ae. aegypti; the early spermatocytes sample for C. capitata) than in the samples from the later stages (the late sample for Ae. aegypti; the mean expression in the late spermatocytes, round spermatids and elongated spermatids samples for C. capitata) were taken forward.

Inter-species comparison
To determine whether candidates were conserved between species, BLAST analysis was performed. Sequences of all transcripts predicted by Cufflinks were extracted using a custom Python script in combination with bedtools (version 2.16.2) [56]. BLAST databases were created from these sequences using the makeblastdb tool. tBLASTx searches were then performed using the transcript sequences from one species to query a BLAST database from another species. A threshold E value of 0.001 was used.

Experimental testing of candidates
RT-PCR primers were designed using Primer-BLAST [58]. RT-PCR was performed on a TGradient thermocycler (Biometra, Goettingen, Germany) using a PCRBIO kit (PCR Biosystems Ltd, London, UK), according to the manufacturer’s instructions. The reaction parameters were: 95°C for 30 s, 2 cycles of 95°C for 15 s, 55°C for 30 s and 72°C for 2 min, 33 cycles of 95°C for 15 s, 55°C for 15 s and 72°C for 30 s, and finally 72°C for 1 min. Reactions with primers targeting RpL22 transcripts were performed as positive controls. RT-PCR products were visualized on 1.5–2% agarose gels.

qRT-PCR was performed on an Mx3500P instrument (Stratagene, La Jolla, USA) using iQ™ SYBR® Green Supermix (Bio-Rad, Hemel Hempstead, UK) according to the manufacturer’s instructions. Reactions were performed with serial dilutions to determine primer efficiency. Reactions with primers targeting α-tubulin and 18S rRNA transcripts in Ae. aegypti and α-tubulin and Rps17-like transcripts in C. capitata were performed for normalisation. The reaction parameters were: 95°C for 5 min, followed by 40 cycles of 95°C for 15 s, 55°C for 15 s and 60°C for 15 s.

For the candidate testis-specifically expressed genes, expression was calculated relative to an expression level in the early (for Ae. aegypti) or early spermatocytes (for C. capitata) samples of 1000 for the geometric mean of the two genes used for normalisation. For the candidate testis-specifically spliced genes, expression was calculated relative to an expression level in the testis of 1 for the predicted testis-specific splice form.

To confirm that the PCR results reflected the predicted candidates, PCR products were purified using a QIAquick PCR Purification Kit (Qiagen, Manchester, UK) and sent for sequencing by GATC Biotech (Cologne, Germany). PCR primer sequences are available in Additional file 4.

Additional files
Additional file 1: Sequencing and alignment statistics. Table of sequencing and alignment statistics. (XLSX 47 kb)
Additional file 2: Candidate genes tested experimentally. Tables of candidate Ae. aegypti and C. capitata testis-specifically expressed and testis-specifically spliced genes tested experimentally. (XLSX 42 kb)

Additional file 3: Documentation for Python scripts. Documentation for custom Python scripts used to identify candidate testis-specifically expressed and spliced genes. (PDF 178 kb)
Additional file 4: Primer sequences. List of primer sequences used in the study. (PDF 90 kb)

Abbreviations
Ae.: Aedes; An.: Anopheles; C.: Ceratitis; D.: Drosophila; FPKM: Fragments per kilobase per million reads; Kb: Kilobase; PBM: Post-blood meal; PCR: Polymerase chain reaction; qRT-PCR: Quantitative reverse transcriptase polymerase chain reaction; RT-PCR: Reverse transcriptase polymerase chain reaction; tTA: Tetracycline-repressible transactivator; UAS: Upstream activating sequence

Acknowledgements
We thank the Molecular Biology Team at Oxitec Ltd, particularly T. Dafa’alla, for assistance with molecular work, and R. Turkel and R. Asadi for assistance with dissections. We thank K. Matzen and S. Warner for guidance and comments on the manuscript.

Funding
This work was supported by grant BB/L004445/1 from the UK Biotechnology and Biological Sciences Research Council (BBSRC). ERS was supported by a BBSRC Industrial CASE studentship BB/J012696/1. LA is supported by core funding from the BBSRC to the Pirbright Institute (BBS/E/I/00001892).

Availability of data and materials
The raw sequencing data supporting the conclusions of this article are available in the Sequence Read Archive (SRA), under accession number SRP075464 (http://www.ncbi.nlm.nih.gov/sra/SRP075464).
Other datasets supporting the conclusions of this article are included within the article and its additional files.
The scripts used in this article and their documentation are available at https://github.com/ElizabethSutton/RNA-seq_analysis_tools. They are written in Python, and require Python to run. They have been tested only with Python 2.7.3. They should run on any operating system with Python, but have been tested only with Linux. They are provided with an MIT license and are freely available to use.

Authors’ contributions
HWC and LA were responsible for the initial design and co-ordination of the study. YY performed dissections and extraction of RNA, and library preparations for sequencing. ERS performed data analysis and further laboratory work. SMS assisted with data analysis. ERS wrote the initial draft manuscript. All authors edited the draft, and read and approved the final manuscript.

Competing interests
LA has equity interest in Intrexon Inc., which acquired Oxitec in 2015.

Consent for publication
Not applicable.

Ethics approval and consent to participate
Not applicable.

Author details
1 Department of Zoology, University of Oxford, Oxford OX1 3PS, UK. 2 Oxitec Ltd, Milton Park, Abingdon OX14 4RX, UK. 3 School of Biosciences, Cardiff University, Cardiff CF10 3AX, UK. 4 The Pirbright Institute, Pirbright GU24 0NF, UK. 5 Present address: Sistemic, West of Scotland Science Park, Glasgow G20 0SP, UK. 6 Present address: The Beatson Institute for Cancer Research, CRUK, Glasgow G61 1BD, UK.

Received: 26 May 2016 Accepted: 9 November 2016

References
1. Hemingway J, Field L, Vontas J. An overview of insecticide resistance. Science. 2002;298(5591):96–7.
2. Gubler DJ. Epidemic dengue/dengue hemorrhagic fever as a public health, social and economic problem in the 21st century. Trends Microbiol. 2002;10(2):100–3.
3. Bhatt S, Gething PW, Brady OJ, Messina JP, Farlow AW, Moyes CL, et al. The global distribution and burden of dengue. Nature. 2013;496(7446):504–7.
4. Gong P, Epton MJ, Fu G, Scaife S, Hiscox A, Condon KC, et al. A dominant lethal genetic system for autocidal control of the Mediterranean fruitfly. Nat Biotechnol. 2005;23(4):453–6.
5. Phuc HK, Andreasen MH, Burton RS, Vass C, Epton MJ, Pape G, et al. Late-acting dominant lethal genetic systems and mosquito control. BMC Biol. 2007;5:11.
6. Harris AF, Nimmo D, McKemey AR, Kelly N, Scaife S, Donnelly CA, et al. Field performance of engineered male mosquitoes. Nat Biotechnol. 2011;29:1034–7.
7. Harris AF, McKemey AR, Nimmo D, Curtis Z, Black I, Morgan SA, et al. Successful suppression of a field mosquito population by sustained release of engineered male mosquitoes. Nat Biotechnol. 2012;30:828–30.
8. Fu G, Condon KC, Epton MJ, Gong P, Jin L, Condon GC, et al. Female-specific insect lethality engineered using alternative splicing. Nat Biotechnol. 2007;25(3):353–7.
9. Ant T, Koukidou M, Rempoulakis P, Gong HF, Economopoulos A, Vontas J, et al. Control of the olive fruit fly using genetics-enhanced sterile insect technique. BMC Biol. 2012;10:51.
10. Jin L, Walker AS, Fu G, Harvey-Samuel T, Dafa'alla T, Miles A, et al. Engineered female-specific lethality for control of pest Lepidoptera. ACS Synth Biol. 2013;2(3):160–6.
11. Fu G, Lees RS, Nimmo D, Aw D, Jin L, Gray P, et al. Female-specific flightless phenotype for mosquito control. Proc Natl Acad Sci U S A. 2010;107(10):4550–4.
12. Labbé GM, Scaife S, Morgan SA, Curtis ZH, Alphey L. Female-Specific Flightless (fsRIDL) Phenotype for Control of Aedes albopictus. PLoS Negl Trop Dis. 2012;6(7):e1724.
13. Marinotti O, Jasinskiene N, Fazekas A, Scaife S, Fu G, Mattingly ST, et al. Development of a population suppression strain of the human malaria vector. Malar J. 2013;12(1):142.
14. Burt A. Site-specific selfish genes as tools for the control and genetic engineering of natural populations. Proc Biol Sci. 2003;270(1518):921–8.
15. Catteruccia F, Crisanti A, Wimmer EA. Transgenic technologies to induce sterility. Malar J. 2009;8 Suppl 2:S7.
16. Galizi R, Doyle LA, Menichelli M, Bernardini F, Deredec A, Burt A, et al. A synthetic sex ratio distortion system for the control of the human malaria mosquito. Nat Commun. 2014;5:3977.
17. Windbichler N, Papathanos PA, Crisanti A. Targeting the X chromosome during spermatogenesis induces Y chromosome transmission ratio distortion and early dominant embryo lethality in Anopheles gambiae. PLoS Genet. 2008;4(12):e1000291.
18. Kemphues KJ, Raff RA, Kaufman TC, Raff EC. Mutation in a structural gene for a beta-tubulin specific to testis in Drosophila melanogaster. Proc Natl Acad Sci U S A. 1979;76(8):3991–5.
19. Catteruccia F, Benton JP, Crisanti A. An Anopheles transgenic sexing strain for vector control. Nat Biotechnol. 2005;23(11):1414–7.
20. Smith RC, Walter MF, Hice RH, O'Brochta DA, Atkinson PW. Testis-specific expression of the beta2 tubulin promoter of Aedes aegypti and its application as a genetic sex-separation marker. Insect Mol Biol. 2007;16(1):61–71.
21. Scolari F, Schetelig MF, Bertin S, Malacrida AR, Gasperi G, Wimmer EA. Fluorescent sperm marking to improve the fight against the pest insect Ceratitis capitata (Wiedemann; Diptera: Tephritidae). Nat Biotechnol. 2008;25(1):76–84.
22. Olivieri G, Olivieri A. Autoradiographic study of nucleic acid synthesis during spermatogenesis in Drosophila melanogaster. Mutat Res. 1965;2(4):366–80.
23. Gould-Somero M, Holland L. The timing of RNA synthesis for spermiogenesis in organ cultures of Drosophila melanogaster testes. Wilhelm Rouxs Arch. 1974;174:133–48.
24. Barreau C, Benson E, White-Cooper H. Comet and cup genes in Drosophila spermatogenesis: the first demonstration of post-meiotic transcription. Biochem Soc Trans. 2008;36(Pt 3):540–2.
25. Barreau C, Benson E, Gudmannsdottir E, Newton F, White-Cooper H. Post-meiotic transcription in Drosophila testes. Development. 2008;135(11):1897–902.
26. Vibranovski MD, Chalopin DS, Lopes HF, Long M, Karr TL. Direct evidence for postmeiotic transcription during Drosophila melanogaster spermatogenesis. Genetics. 2010;186(1):431–3.
27. Schafer M, Nayernia K, Engel W, Schafer U. Translational control in spermatogenesis. Dev Biol. 1995;172(2):344–52.
28. Wimmer EA. Innovations: applications of insect transgenesis. Nat Rev Genet. 2003;4(3):225–32.
29. Thomas DD, Donnelly CA, Wood RJ, Alphey LS. Insect population control using a dominant, repressible, lethal genetic system. Science. 2000;287(5462):2474–6.

30. Zid BM, O’Shea EK. Promoter sequences direct cytoplasmic localization and translation of mRNAs during starvation in yeast. Nature. 2014;514(7520):117–21.
31. Akbari OS, Antoshechkin I, Amrhein H, Williams B, Diloreto R, Sandler J, et al. The Developmental Transcriptome of the Mosquito Aedes aegypti, an Invasive Species and Major Arbovirus Vector. G3 (Bethesda). 2013;3(9):1493–509.
32. Whyard S, Erdelyan CN, Partridge AL, Singh AD, Beebe NW, Capina R. Silencing the buzz: a new approach to population suppression of mosquitoes by feeding larvae double-stranded RNAs. Parasit Vectors. 2015;8(1):716.
33. Ayyar S, Jiang J, Collu A, White-Cooper H, White RA. Drosophila TGIF is essential for developmentally regulated transcription in spermatogenesis. Development. 2003;130(13):2841–52.
34. Sequence Read Archive. http://www.ncbi.nlm.nih.gov/sra. Accessed 24 May 2016.
35. Flybase. www.flybase.org. Accessed 24 May 2016.
36. Li K, Xu EY, Cecil JK, Turner FR, Megraw TL, Kaufman TC. Drosophila centrosomin protein is required for male meiosis and assembly of the flagellar axoneme. J Cell Biol. 1998;141(2):455–67.
37. Shen S, Park JW, Huang J, Dittmar KA, Lu ZX, Zhou Q, et al. MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data. Nucleic Acids Res. 2012;40(8):e61.
38. Akbari OS, Papathanos PA, Sandler JE, Kennedy K, Hay BA. Identification of germline transcriptional regulatory elements in Aedes aegypti. Sci Rep. 2014;4:3954.
39. Akbari OS, Chen CH, Marshall JM, Huang H, Antoshechkin I, Hay BA. Novel Synthetic Medea Selfish Genetic Elements Drive Population Replacement in Drosophila; a Theoretical Exploration of Medea-Dependent Population Suppression. ACS Synth Biol. 2014;3(12):915–28.
40. Chen CH, Huang H, Ward CM, Su JT, Schaeffer LV, Guo M, et al. A synthetic maternal-effect selfish genetic element drives population replacement in Drosophila. Science. 2007;316(5824):597–600.
41. Zhang H, Shinmyo Y, Hirose A, Mito T, Inoue Y, Ohuchi H, et al. Extrachromosomal transposition of the transposable element Minos in embryos of the cricket Gryllus bimaculatus. Dev Growth Differ. 2002;44(5):409–17.
42. Pinkerton AC, Michel K, O’Brochta DA, Atkinson PW. Green fluorescent protein as a genetic marker in transgenic Aedes aegypti. Insect Mol Biol. 2000;9(1):1–10.
43. Schetelig MF, Caceres C, Zacharopoulou A, Franz G, Wimmer EA. Conditional embryonic lethality to improve the sterile insect technique in Ceratitis capitata (Diptera: Tephritidae). BMC Biol. 2009;7:4.
44. Bargielowski I, Nimmo D, Alphey L, Koella JC. Comparison of life history characteristics of the genetically modified OX513A line and a wild type strain of Aedes aegypti. PLoS One. 2011;6(6):e20699.
45. Andrews S. FastQC: A quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc. Accessed 24 May 2016.
46. Hannon Lab. FASTX Toolkit. http://hannonlab.cshl.edu/fastx_toolkit/index.html. Accessed 24 May 2016.
47. Joshi NA, Fass JN. Sickle. A sliding-window, adaptive, quality-based trimming tool for FastQ files. https://github.com/najoshi/sickle. Accessed 24 May 2016.
48. Nene V, Wortman JR, Lawson D, Haas B, Kodira C, Tu ZJ, et al. Genome sequence of Aedes aegypti, a major arbovirus vector. Science. 2007;316(5832):1718–23.
49. Papanicolaou A, Schetelig MF, Arensburger P, Atkinson PW, Benoit JB, Bourtzis K, et al. The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species. Genome Biol. 2016;17(1):192.
50. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9.
51. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):R36.
52. Roberts A, Pimentel H, Trapnell C, Pachter L. Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics. 2011;27(17):2325–9.
53. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.
54. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–78.
55. Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 2013;31(1):46–53.
56. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
57. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–9.
58. Ye J, Coulouris G, Zaretskaya I, Cutcutache I, Rozen S, Madden TL. Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics. 2012;13:134.
