Comparative analysis of amphibian genomes: an emerging resource for basic and applied research

Tiffany A. Kosch1, Andrew J. Crawford2, Rachel Lockridge Mueller3, Katharina C. Wollenberg Valero4,
Megan L. Power4, Ariel Rodríguez5, Lauren A. O’Connell6, Neil D. Young1, and Lee F. Skerratt1

1 Faculty of Science, University of Melbourne, Melbourne, Australia
2 Departamento de Ciencias Biológicas, Universidad de los Andes, Bogotá, Colombia
3 Department of Biology, Colorado State University, Colorado, USA
4 School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
5 Institute of Zoology, University of Veterinary Medicine of Hannover, Hannover, Germany
6 Department of Biology, Stanford University, California, USA
bioRxiv preprint doi: https://doi.org/10.1101/2023.02.27.530355; this version posted June 27, 2024. The copyright holder for this preprint
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
available under aCC-BY-NC-ND 4.0 International license.
ABSTRACT
Amphibians are the most threatened group of vertebrates and are in dire need of conservation
intervention to ensure their continued survival. They exhibit unique features including a high
diversity of reproductive strategies, permeable and specialized skin capable of producing toxins and
antimicrobial compounds, multiple genetic mechanisms of sex determination, and in some lineages,
the ability to regenerate limbs and organs. Although genomics approaches would shed light on these
unique traits and aid conservation, sequencing and assembly of amphibian genomes have lagged
behind other taxa due to their comparatively large genome sizes. Fortunately, the development of
long-read sequencing technologies and initiatives has led to a recent burst of new amphibian
genome assemblies. Although growing, the field of amphibian genomics suffers from the lack of
annotation resources, tools for working with challenging genomes, and lack of high-quality [...]
genomes to evaluate their usefulness for functional genomics research. We report considerable
variation in genome assembly quality and completeness, and report some of the highest [...]
association between transposable element content and climatic variables. Our analysis provides
evidence of conserved genome synteny despite the long divergence times of this group, but we also [...]
discuss sequencing gaps in the phylogeny and suggest key targets for future sequencing endeavors. [...]
conservation.

KEYWORDS
[...] genome synteny

INTRODUCTION
Amphibians are an ancient lineage of vertebrates that predate amniotes by more than 100 million
years. Despite the considerable age of this lineage, amphibians are now the most threatened group
of vertebrates, with more than 40% of species threatened by factors such as habitat change,
disease, and over-exploitation (IUCN, 2022; Scheele et al., 2019). Notably, many of these threats are
hard to reverse, suggesting that novel approaches that utilize genomic resources may lead to
improved management decisions for some of the most endangered taxa (Kosch et al., 2022; Scheele
et al., 2014).

We are only just beginning to understand the genetic basis of many of the unique features of
amphibians. Amphibians exhibit a high diversity of reproductive strategies including biphasic and
direct development, uniparental and biparental care, mouth and gastric brooding, and foam nesting
(Brown et al., 2010; Nunes-de-Almeida et al., 2021; Schulte et al., 2020). They also have specialized
skin capable of producing complex compounds of interest for the discovery and development of
antimicrobial drugs and analgesics (Daly et al., 2000; De Angelis et al., 2021; Liu et al., 2020).
Amphibians occur across habitat types from rainforests to deserts, freshwater streams to salt
marshes, and tropical to arctic climates (Duellman, 1999), but it is unclear how this ecological
diversity is reflected in genome composition. One potential link is the number of transposable
elements (TEs) present in the genome. TEs have a huge impact on the structure and function of
eukaryotic genomes, and amphibians have some of the highest TE content among vertebrates.
There is increasing evidence that TE activity, and thus their relative proportion in genomes, is
influenced by abiotic factors (Pimpinelli & Piacentini, 2020). This in turn highlights their potential role
in the regulation of genetic mechanisms responsible for environmental adaptation (Casacuberta &
González, 2013; Pappalardo et al., 2021). Salamanders are an important resource for transplant and
regeneration research due to their ability to regenerate limbs and internal organs (Elewa et al., 2017;
Nowoshilow et al., 2018). Amphibians also share many immune components with mammals,
making them an important model resource for immunology (Paiola et al., 2023; Robert, 2020).

Despite the obvious value of amphibian genomes for research on ecology, evolution, medicine, and
improving their conservation, until recently, the generation of amphibian reference genomes has
been markedly slower than for other vertebrates (Hotaling et al., 2021a; Womack et al., 2022). This lag
can be attributed to high costs and the computational challenges of assembling their often large and
complex genomes (Sun et al., 2020). Recent advances in sequencing technologies, such as long-read
sequencing, and in assembly algorithms that incorporate hybrid approaches have circumvented many of
these challenges, leading to a surge of high-quality, chromosome-level reference genomes. The next
challenge will be developing the tools for annotation and comparative analyses of these large
genomes.

In this study, we provide a synthesis of all available amphibian reference genome assemblies, 51 at
the time of our analysis, with the number growing every day. We evaluate assembly quality,
sequencing technology, gene completeness, transposable element and repeat content and its [...]

Genomes
A search of the NCBI genome website for the term “amphibians”, conducted on August 25,
2023, returned 90 amphibian genomes from 68 species. All genome files in fasta format
were downloaded for assessment. Sixteen salamander genomes (Pyron et al., 2024) were excluded
from our analyses due to their high degree of incompleteness (i.e., <10% of genome assembled). Of
the remaining genomes, one genome was selected for each species for subsequent analysis. If there
was more than one draft of a genome, the most recent draft and/or the primary haplotype was
selected. In cases where there were multiple versions sequenced by different groups, the
genome with the fewest scaffolds was selected. Entire genomes (including uncharacterized contigs
but excluding mitochondrial genomes) were used for assessment unless indicated otherwise.

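The selection rules above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the `Assembly` record, its field names, and the order in which "primary haplotype" and "most recent draft" are applied are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assembly:
    # All field names are hypothetical; they are not NCBI field names.
    accession: str
    species: str
    submitter: str
    released: date
    is_primary: bool
    scaffolds: int


def pick_per_species(assemblies):
    """Return one assembly per species following the rules described in the text."""
    by_species = {}
    for asm in assemblies:
        by_species.setdefault(asm.species, []).append(asm)

    chosen = {}
    for species, group in by_species.items():
        # Within one submitting group: prefer the primary haplotype,
        # then the most recent draft (ordering is an assumption).
        best_per_submitter = {}
        for asm in group:
            current = best_per_submitter.get(asm.submitter)
            if current is None or (asm.is_primary, asm.released) > (
                current.is_primary,
                current.released,
            ):
                best_per_submitter[asm.submitter] = asm
        # Across submitting groups: keep the assembly with the fewest scaffolds.
        chosen[species] = min(best_per_submitter.values(), key=lambda a: a.scaffolds)
    return chosen


demo = [
    Assembly("A1", "Bufo bufo", "lab1", date(2020, 1, 1), True, 500),
    Assembly("A2", "Bufo bufo", "lab1", date(2022, 1, 1), True, 300),
    Assembly("A3", "Bufo bufo", "lab2", date(2021, 1, 1), True, 100),
    Assembly("A4", "Rana muscosa", "lab3", date(2021, 1, 1), False, 50),
    Assembly("A5", "Rana muscosa", "lab3", date(2021, 1, 1), True, 80),
]
selected = pick_per_species(demo)
```

In this toy input the accessions and submitters are invented; the function keeps the lab2 assembly for Bufo bufo (fewest scaffolds) and the primary haplotype for Rana muscosa.
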
The genome databases NCBI Genomes, NCBI RefSeq (O'Leary et al., 2016), Ensembl (Cunningham et al.,
2022), UCSC Genome Browser (Lee et al., 2022), and Genomes on a Tree (GoaT) (Sotero-Caio et al.,
2021) were searched for information on the 51 amphibian species with reference genomes, including
chromosome number, annotation data, proteome availability, C-value, and sequencing technology.
Sequencing strategy was classified as “short-single” for Illumina-only sequencing, “long-single” for
sequencing using long-read technologies (e.g., PacBio and Oxford Nanopore), and “hybrid” for
sequencing strategies combining more than one approach (e.g., PacBio and Hi-C).

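As an illustration of the three-way classification just described, the sketch below maps a list of technologies to a strategy label. The technology sets and the function name are assumptions for the example; the text names only Illumina, PacBio, Oxford Nanopore, and Hi-C.

```python
# Hypothetical lookup sets; extend them to match the metadata actually collected.
SHORT_READ = {"illumina"}
LONG_READ = {"pacbio", "oxford nanopore"}


def classify_strategy(technologies):
    """Label an assembly as short-single, long-single, or hybrid."""
    techs = {t.strip().lower() for t in technologies}
    if not techs:
        raise ValueError("no sequencing technology recorded")
    if len(techs) > 1:
        # More than one approach (e.g., PacBio plus Hi-C) counts as hybrid.
        return "hybrid"
    (tech,) = techs
    if tech in SHORT_READ:
        return "short-single"
    if tech in LONG_READ:
        return "long-single"
    raise ValueError(f"unrecognized technology: {tech}")


labels = [
    classify_strategy(["Illumina"]),
    classify_strategy(["Oxford Nanopore"]),
    classify_strategy(["PacBio", "Hi-C"]),
]
```

Raising an error for an unrecognized single technology, rather than guessing a label, keeps ambiguous metadata visible for manual review.
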
A search for amphibian proteome datasets on NCBI RefSeq (O'Leary et al., 2016), Ensembl
(Cunningham et al., 2022), and UCSC Genome Browser (Lee et al., 2022) databases on June 24, 2022 [...]

A search of the NCBI Organelle database on February 15, 2023, using the search term “amphibian”
resulted in 353 mitochondrial genomes belonging to 345 species (Table S11). Seventeen
mitochondrial genomes overlapped with the amphibian nuclear genomes analyzed in this study.

The GoaT online database (Sotero-Caio et al., 2021) was searched on August 28, 2023, to summarize
genomes in progress or publicly available using the search terms “tax_tree(Amphibia) AND [...]
tax_rank(species) AND sequencing_status=insdc_open”. The same search terms were used to
summarize publicly available genomes for mammals, birds, and non-avian reptiles with the [...]

Genome quality assessment was performed with the BBMap (v. 39.01) “statswrapper.sh” bash script
(https://github.com/BioInfoTools/BBMap). This tool generates metrics such as genome size, contig
N50, and scaffold count. Benchmarking Universal Single-Copy Orthologs (BUSCO) were summarized
with the BUSCO tool (v. 5.1.2) (Manni et al., 2021) using the OrthoDB Tetrapoda ortholog library (v.
odb10) (Kriventseva et al., 2018) (N=5310 orthologs) with the option “-m genome”. Percentage of
the genome assembled to chromosomes was calculated with a custom bash script that computes the
genome length assigned to chromosomes and divides it by the “assembly length” value computed by
BBMap.

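To make two of these metrics concrete, the sketch below computes N50 under its standard definition and the percentage of assembly length assigned to chromosomes. It is an illustration in Python, not the BBMap implementation or the custom bash script used in the study; the function names and the toy sequence names are assumptions.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequences supplied")


def percent_on_chromosomes(seq_lengths, chromosome_names):
    """Share of total assembly length carried by sequences named as chromosomes."""
    total = sum(seq_lengths.values())
    on_chr = sum(
        length for name, length in seq_lengths.items() if name in chromosome_names
    )
    return 100.0 * on_chr / total


# Toy assembly: two chromosomes plus one unplaced scaffold (names are hypothetical).
toy = {"chr1": 80, "chr2": 15, "scaffold_7": 5}
toy_n50 = n50(toy.values())
toy_pct = percent_on_chromosomes(toy, {"chr1", "chr2"})
```

The same `n50` function applies to contigs or scaffolds, depending on which list of lengths is passed in.
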
A species-to-family correspondence table was obtained from Jetz and Pyron (2018)
(https://vertlife.org/files_20170703/) and was filtered to include only the species with the longest
nucleotide sequence per family. This taxon subset was used to obtain a subset of 100 phylogenetic
trees from the posterior distribution of the Jetz and Pyron (2018) dataset, as available
from http://vertlife.org/phylosubsets. A consensus tree from these 100 trees was then obtained
using treeannotator (v2.7.5) (settings - target tree type: maximum clade credibility, node heights:
median, burn-in percentage: 0, posterior probability limit: 0.0) (Drummond & Rambaut, 2007). The
species names of the tree tips were then substituted with the corresponding family names using the [...]
the aid of the species-to-family correspondence table, which was updated with the most recent
classification available in AmphibiaWeb (https://amphibiaweb.org) and the Amphibian Species of the [...]

Repeats were de novo modelled with RepeatModeler (Apptainer v. 1.2.3) (Flynn et al., 2020).
Genomes were then annotated using RepeatMasker (v. 4.1.2-p1) (Smit et al., 2013) with a
concatenated library of genome-specific repeats generated from RepeatModeler and the Dfam
amphibian repeat library (v. Dfam.h5) (Storer et al., 2021). Before annotation, any previous soft
masking of the genomes was reversed. The results were summarized using custom bash and R
scripts.

Occurrence data for the 51 species were downloaded from the Global Biodiversity Information
Facility (GBIF) (https://www.gbif.org/; last accessed February 2024; full DOIs for each occurrence
data set in Table S5). In addition, due to the putative involvement of temperature in TE activity,
BioClim variables associated with temperature (Bio1-Bio11) were obtained for the 51 amphibian
species (Table S6). As previous studies have explored the relationship between amphibian genome
size and environmental variables (Liedtke et al., 2022), here we focused on the relationship between
temperature variables, elevation, and amphibian transposable elements. The influence of these
bioclimatic variables (after removing highly collinear variables, see Supplementary Methods) on
transposable element content (summarized into three groups: proportion of total transposable
elements (TEs), proportion of retroelements, and proportion of DNA transposons) was modelled
using Bayesian mixed effect models (Hadfield, 2010). To correct for body size, log-transformed body
size was included in the model structure with log-transformed Bio2 (mean diurnal range), Bio4
(temperature seasonality), Bio8 (mean temperature of wettest quarter), Bio10 (mean temperature of
warmest quarter), and elevation. Models were also corrected for phylogenetic non-independence
(Figure S5, see Supplementary Methods for further information) with phylogenetically independent
contrasts (Felsenstein, 1985; Garland Jr et al., 1992) plotted with [...] and [...] (Revell, 2024).

Synteny of BUSCO genes for chromosome-level assemblies was analyzed with the R package GENESPACE
(v. 1.1.4) (Lovell et al., 2022), which uses OrthoFinder (v. 2.5.4) (Emms & Kelly, 2019) to infer
orthology. Synteny was analyzed using BUSCO “full_table.tsv” results files that were reformatted for
GENESPACE input using a custom bash script. Synteny plots were generated for all chromosome-level
assemblies, all anuran chromosome-level assemblies, the two salamander genomes, and the
three caecilian genomes using the GENESPACE plotting tool “plot_riparian”. Chromosomes with
reversed orientation compared to the reference genome were inverted to improve visualization.

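The reformatting step can be sketched as follows. This is a Python illustration rather than the custom bash script used in the study. It assumes the tab-separated BUSCO column order (BUSCO ID, status, sequence, gene start, gene end, ...) and a four-column BED-like target (sequence, start, end, gene ID); restricting the output to "Complete" entries is also an assumption made here so that each ortholog appears once.

```python
def busco_table_to_bed(lines):
    """Convert BUSCO full_table.tsv lines to sorted (seq, start, end, id) tuples."""
    rows = []
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Missing BUSCOs have no coordinates; only complete single-copy hits are kept.
        if len(fields) < 5 or fields[1] != "Complete":
            continue
        busco_id, _status, seq, start, end = fields[:5]
        # Ensure start <= end regardless of how the coordinates were reported.
        lo, hi = sorted((int(start), int(end)))
        rows.append((seq, lo, hi, busco_id))
    rows.sort(key=lambda r: (r[0], r[1]))
    return rows


# Invented example rows; the IDs and coordinates are placeholders.
sample = [
    "# Busco id\tStatus\tSequence\tGene Start\tGene End\tStrand\tScore\tLength\n",
    "100at32523\tComplete\tchr2\t500\t900\t+\t812.1\t320\n",
    "200at32523\tMissing\n",
    "300at32523\tDuplicated\tchr1\t10\t50\t-\t300.0\t40\n",
    "400at32523\tComplete\tchr1\t2000\t1200\t-\t500.0\t200\n",
]
bed_rows = busco_table_to_bed(sample)
```

Each tuple can then be written out tab-separated, one file per genome, before the files are passed to the synteny tool.
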
Regression analyses, ANOVAs, and Student’s t-tests for comparing genome quality measurements
were conducted with the R statistics package (v. 4.1.2) (Team, 2013) in R Studio (v. 2022.02.3) (Team,
2022). The genome quality measures contig N50 and scaffold count were log transformed prior to
analysis. R-scripts for statistical analysis and plotting are available on GitHub at
https://doi.org/10.5281/zenodo.7679280.

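For readers who want to see how a regression F statistic of the kind reported below arises, here is a minimal Python sketch of a simple linear regression on a log-transformed predictor. The study itself used R; the function name and the data values here are invented for illustration. With one predictor, the residual degrees of freedom are n - 2, which for 51 genomes would give 49.

```python
import math


def regression_f(x, y):
    """Simple linear regression; returns (slope, F statistic, residual df)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * a for a in x]
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    ss_residual = sum((b - f) ** 2 for b, f in zip(y, fitted))
    df_residual = n - 2
    # Undefined for a perfect fit (ss_residual == 0); not handled in this sketch.
    f_stat = ss_regression / (ss_residual / df_residual)
    return slope, f_stat, df_residual


# Invented example: contig N50 (bp) versus percent complete BUSCOs.
contig_n50 = [1e3, 1e4, 1e5, 1e6, 1e7]
busco_pct = [21.0, 39.0, 62.0, 78.0, 96.0]
log_n50 = [math.log10(v) for v in contig_n50]
slope, f_stat, df_resid = regression_f(log_n50, busco_pct)
```

Log-transforming the predictor first keeps assemblies whose N50 values span several orders of magnitude from being dominated by the largest values.
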
RESULTS
[...] with a variety of sequencing technologies, including Illumina (NextSeq, HiSeq), PacBio (RS II, Sequel),
and Oxford Nanopore. Sequenced genomes represented 25 of 73 amphibian families, with reference
genomes distributed unevenly across the phylogeny (Fig. 1). For example, there are only two
salamander genomes representing the 798 extant species and no genomes representing anuran families
such as Leiopelmatidae or Hyperoliidae, yet there are seven Ranidae and six Pipidae genomes (Fig. 1).

Genome assembly length ranged from 0.48 Gb in Scaphiopus couchii to 28.21 Gb in Ambystoma
mexicanum and was strongly positively associated with C-value estimates of genome size (F(49) = 330.5,
p < 1 × 10^-15) (Table 1, Fig. S1). Twenty-eight of these genomes were assembled to the chromosome
level, of which the percentage of the genome assigned to chromosomes ranged from 63.88 to 99.99%
(Table 1). Percentage of the genome assigned to chromosomes was positively associated with contig
N50 (F(26) = 8.6, p = 0.007) and read length (t(29.2) = 3.07, p = 0.005) and negatively associated with the
number of scaffolds (F(26) = 25.2, p < 0.00001). There are additionally mitochondrial genome
assemblies for 345 species, of which 17 had nuclear reference genomes. Eleven of the species with [...]

The quality of the amphibian genomes varied considerably (Table 1). Genomes generated with short-
read technologies were of lower quality than long-read or hybrid genome assemblies, as indicated by
significantly lower contig N50s (F(2,48) = 26.91, p < 10^-6), percentage of complete Benchmarking
Universal Single-Copy Ortholog (BUSCO) genes (Fig. S3; F(2,48) = 10.52, p < 0.001), and higher scaffold [...]

Contig N50 ranged from 362 bp in S. couchii to 45.59 Mb in Pleurodeles waltl with a median of 611.23
Kb. Scaffold count varied considerably from 17 in Spea bombifrons to more than four million in
Bombina variegata with a median of 6.66 Kb (Table 1). Benchmarking Universal Single-Copy
Orthologs (BUSCO) scores ranged from 0.7 to 99.5% completeness (Tables 1, S1; Fig. 2) and were
positively associated with contig N50 (F(49) = 82.6, p < 10^-10; Fig. S2) and scaffold count (F(49) = 66.04, p <
10^-8). Most genomes had low percentages of duplicate BUSCO genes (< 6%), suggesting they may be
diploid, except for Ranitomeya imitator and the known tetraploid species, X. laevis and X. borealis [...]

Repeat content
Overall identified repeat percentage of the genomes ranged from 23% in Platyplectrum ornatum to
82% in Oophaga sylvatica and was positively associated with genome size (F(49) = 13.24, p = 0.0006)
(Table 1; Fig. S3). Repeat content varied across genomes, with the anurans Pseudophryne corroboree,
Bombina bombina, and O. sylvatica dominated by Long Terminal Repeats (LTRs), the three caecilians
dominated by Long Interspersed Nuclear Elements (LINEs), and many of the ranid and bufonid
anurans dominated by DNA transposons (Fig. 3; Tables S2-S4). The salamander genomes of Ambystoma
mexicanum and Pleurodeles waltl had fewer repeats than might be predicted given their large sizes [...]

The proportion of repeats that could be classified by RepeatMasker ranged from 7.4% in P. ornatum
to 47.8% in P. corroboree (Table S1) and was positively associated with the genome quality measures
contig N50 (F(49) = 23.49, p = 0.001), scaffold count (F(49) = 8.71, p = 0.005), and percent BUSCO complete
(F(49) = 10.27, p = 0.002). The ability to classify repeats was also positively associated with read length,
with longer reads resulting in better classification (t(35.622) = 4.73, p < 0.001).

[...] proportion of transposable elements and environmental variables. Controlling for phylogenetic
relationships (by estimating Pagel’s lambda, λ; de Villemereuil & Nakagawa, 2014), including body
size as a covariate (Spearman correlation with transposable element content = -0.772, p < 0.001), and
excluding the three globally invasive species (Rhinella marina, X. laevis, and Lithobates catesbeianus),
our analysis revealed a significant influence (pMCMC = 0.014) of Bio8 (mean temperature of the
wettest quarter) on the proportion of total transposable elements (Figs 4, S5; Table S8). Inclusion of
these three invasive species did not change this relationship (Table S7). Further analysis indicated
that the relationship with Bio8 was not specific to a particular class of transposable elements, such as
retroelements or DNA transposons (Tables S9 and S10). Phylogenetic signal (Pagel’s lambda, λ) was
moderate when considering total transposable elements and retroelements (0.555; Table S7) and
increased when we considered retroelements and DNA transposons alone (0.616 and 0.649; Table
S9).

Genome synteny of BUSCO genes was highly conserved within amphibian orders (caecilians, Fig. S7;
caudates, Fig. S8; and anurans, Fig. S9) but was less conserved across the amphibian orders (Figs 5,
S6). However, chromosome naming was inconsistent across all taxa (Figs 5, S6-S9). For example, X.
tropicalis chr1 is chr12 in Leptobrachium ailaonicum (but not L. leishanense), chr2 in Bufo bufo (but
not Bufo gargarizans) (Fig. S9), and most of the chromosomes for the two salamander genomes (Fig.
S8). Orientation of chromosomes was also inconsistent, including between species of the same genus
(e.g., Bufo, Leptobrachium) (Fig. S9) and among the three caecilians (Fig. S7). Multiple inversions
were evident, including between chr3 of pipids (Xenopus tropicalis and Hymenochirus boettgeri) and
other anurans (chromosomes 1, 2, 3, 4, or 10), caecilians (chr3 and chr4/5/6), and even within
species of the same genus (chr7 Bufo gargarizans, chr9 B. bufo; Figs 5, S7, S9). There was also
evidence of several chromosomal fissions, including the separation of chr1 of Leptobrachium
leishanense into chr3 and chr6 in Pyxicephalus adspersus and into chr3 and chr7 in Engystomops
pustulosus; however, this chromosome remained mostly intact in the other anuran genomes (Fig.
S9).
DISCUSSION
In this study we analyzed 51 amphibian reference genomes from the public domain to evaluate their
content and usefulness for functional genetics research (Fig. 1, Table 1). There are considerably
fewer reference genomes for amphibians than exist for birds (N=754), mammals (N=406), and non-
avian reptiles (N=108). This scarcity of reference genomes results in many gaps in genome
representation across the amphibian tree of life, including many entirely unrepresented groups and
only two genomes representing the entire order Caudata (but see Myers & Pyron, 2024). The
unrepresented families include many of interest from a conservation perspective due to their high
number of IUCN Red List Critically Endangered species (e.g., Cryptobranchidae, Plethodontidae,
Strabomantidae, and Craugastoridae) (IUCN, 2022). However, our search of the Genomes on a Tree
(GoaT) database (Sotero-Caio et al., 2021) indicated that there are a further 20 amphibian genome
assemblies in progress (15 anurans, 5 caudates; Table S10), indicating that this resource will be [...]

The quality and completeness of the genomes in our dataset varied considerably (e.g., Fig. 2). Much
of this variation can be attributed to the sequencing technology used to generate them, with short-
read sequencing approaches resulting in lower completeness and continuity (Fig. S2). These impacts
are a recognized limitation of short-read sequencing and have been reported to impact genome
quality in taxa from insects (Hotaling et al., 2021b) to other vertebrates (Rhie et al., 2021), but have
likely had a disproportionate impact in amphibian genomes due to the difficulty of assembling
genomes with high repeat content (Sun et al., 2020). Fortunately, most ongoing sequencing efforts
now use long-read or hybrid sequencing approaches (i.e., that incorporate scaffolding technologies
such as Hi-C sequencing), which, along with improved sequencing algorithms, should result in higher
quality amphibian genomes (Hotaling et al., 2021a; Lawniczak et al., 2022; Rhie et al., 2021).

The variation we report here in genome quality, contiguity, and completeness may impact the value
of the genomes for functional genomics research. However, the improvements in all these measures
seen with the utilization of long-read technologies or hybrid assemblies suggest that genome quality
will continue to improve as these approaches are used more frequently. Genome quality (i.e., high
continuity, contiguity, accuracy, and completeness; Rhie et al., 2021) is critical for applications such as
quantitative genetics, where assembly errors can lead to incorrect inferences in genetic association or
genetic prediction. Quality also enhances the usefulness of genomes. For example, highly contiguous
chromosome-level assemblies decrease computational requirements for downstream analyses such [...]

One of the most intriguing features of amphibian genomes is the huge range they exhibit in size
(Biscotti et al., 2019). This was exemplified in our dataset, where assembly length ranged from 0.48
Gb in Scaphiopus couchii to 28 Gb in Ambystoma mexicanum. Why gigantic genomes exist in some
species, but not others, remains a key evolutionary question (Kapusta et al., 2017; Wang et al., 2021).
Explanations include differences in genome-level processes (e.g., insertion and deletion rates)
(Frahry et al., 2015; Sun et al., 2012b), development (e.g., developmental rate and complexity)
(Gregory, 2002; Liedtke et al., 2018), physiology (e.g., water loss) (Johnson et al., 2021), body size
(e.g., miniaturization) (Decena-Segarra et al., 2020), and demography (e.g., effective population size)
(Liedtke et al., 2018; Lynch & Walsh, 2007) (but see Mohlhenrich & Mueller, 2016). As more
amphibian genomes become available, these hypotheses can be more rigorously evaluated.

We report some of the largest estimates of repeat content of any vertebrate (82% in Oophaga
sylvatica and 77% in Rana muscosa), exceeded only by the Australian lungfish at 90% (Meyer et al.,
2021). As expected, genome size was correlated with repeat content, affirming that much of the
variation in amphibian genome size is due to an excess of repeats and transposable elements rather
than coding regions (Biscotti et al., 2019; Lamichhaney et al., 2021; Zuo et al., 2023).

In contrast to mammals, whose repeat landscape is mainly dominated by LTR retrotransposons (Platt
et al., 2018), amphibian repeat content varied considerably, with some species dominated by DNA
transposons (as previously reported; Suda et al., 2022; Zuo et al., 2023) and others by non-LTR
retrotransposons, including the three caecilian genomes, which were dominated by LINEs. This agrees
with genomic and transcriptomic data from the caecilian Ichthyophis bannanicus, where LINEs
were the second most abundant type of repeat (26% of the genome) behind Dictyostelium
intermediate repeat sequences (DIRS) (30%) (Wang et al., 2021); this is a similar percentage of LINEs
to what we report in the three caecilian genomes in this study (19 to 26%) (Table S4).

These disparities in repeat percentage and content likely reflect differing evolutionary histories
among species, as indicated by three of the four congeneric species pairs in our dataset having
similar values (i.e., Bufo, Leptobrachium, and Xenopus; but not Oophaga). The differences we
observed between Oophaga pumilio and O. sylvatica are likely due to assembly quality rather than genome
content, given that these two genomes were sequenced with different technologies and have
dramatically different genome qualities (e.g., contig N50s of 5.8 vs. 97.8 Kbp, respectively).

A considerable proportion of the repeats could not be classified. This was likely due to incorrect
classification (e.g., genes categorized as repeats) and the lack of good amphibian-specific repeat
resources (Ou et al., 2019) for classification via nucleotide sequence homology. The majority of
curated amphibian repeat libraries are generated in reference to Xenopus species (e.g., Dfam); the
large divergence times of this genus from the other amphibian species suggest that this may be a
contributing factor to the lack of classification. However, we also report many unclassified repeats in [...]

The largest genomes in our dataset, from the caudates A. mexicanum and P. waltl, had fewer repeats
than predicted given their size (Fig. S4) (Nowoshilow et al., 2018). This may be due, in part, to the
Dfam (Storer et al., 2021) library used for repeat annotation being anuran-based; however, we did
not observe this trend in the three caecilian genomes in our dataset. Also, we performed de novo
annotation of these genomes, which should have captured repetitive elements missing from Dfam.
More likely, this low number of repeats reflects low deletion rates and, thus, persistence of repeats
in the genome for extremely long periods of time, leading to their mutational decay into unique
sequences whose repetitive origin is obscured (Frahry et al., 2015; Keinath et al., 2015; Novák et al., [...]

We also show that amphibian species that inhabit warm climates, particularly during months with
high precipitation, have a greater proportion of transposable elements. This observed trend does not
appear to be driven by a specific group of transposons, suggesting it may be caused by climatic
factors. Recent studies indicate that transposable elements exhibit greater activity in hotter climates
(Baduel et al., 2021), with an increasing number of studies suggesting that increased transposable element
activity contributes to genetic diversification and facilitates species adaptation (Li et al., 2018;
Schrader & Schmitz, 2019; Stapley et al., 2015). The pattern observed here likewise suggests the
potential for heightened transposable element activity and may help explain transposable element
accumulation and potentially the higher evolutionary rates observed in the genomes of tropical [...]

Our study is the first to examine chromosomal synteny across all amphibian orders. We show that
overall synteny of amphibian genomes is relatively conserved, particularly within orders (Figs 5 and
S7). This aligns with previous results from anurans that reported conserved genome organization in
this group (Bredeson et al., 2021; Wu et al., 2022). However, chromosome content and number
varied across species, apparently as a result of multiple chromosomal
fusions and fissions (e.g., Fig. 5). Chromosomal rearrangements have occurred throughout vertebrate
evolution, including the hypothesized fusion of microchromosomes in the ancestor of tetrapods to
create the larger macrochromosomes seen in amphibians and mammals, and their subsequent fission
to create the microchromosomes of modern birds and non-avian reptiles (Waters et al., 2021).

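One simple way to screen for candidate fusions and fissions of the kind described above is to tabulate, for each chromosome in one species, which chromosomes in a second species carry its shared single-copy orthologs. The sketch below is illustrative only and is not the pipeline used in this study; the function name `chromosome_links`, the `min_share` threshold, and the toy BUSCO-to-chromosome assignments are assumptions.

```python
from collections import Counter, defaultdict


def chromosome_links(busco_to_chr_a, busco_to_chr_b, min_share=0.2):
    """Map each species-A chromosome to the species-B chromosomes that hold
    at least `min_share` of its shared single-copy orthologs."""
    pair_counts = Counter()
    totals = Counter()
    for busco, chr_a in busco_to_chr_a.items():
        chr_b = busco_to_chr_b.get(busco)
        if chr_b is None:  # ortholog not found in species B
            continue
        pair_counts[(chr_a, chr_b)] += 1
        totals[chr_a] += 1
    links = defaultdict(list)
    for (chr_a, chr_b), n in sorted(pair_counts.items()):
        if n / totals[chr_a] >= min_share:
            links[chr_a].append(chr_b)
    return dict(links)


# Hypothetical ortholog placements (not data from this study).
species_a = {"b1": "A1", "b2": "A1", "b3": "A1", "b4": "A1", "b5": "A1",
             "b6": "A1", "b7": "A2", "b8": "A2"}
species_b = {"b1": "B1", "b2": "B1", "b3": "B1", "b4": "B2", "b5": "B2",
             "b6": "B2", "b7": "B3", "b8": "B3"}

links = chromosome_links(species_a, species_b)
# Chromosomes linked to more than one partner are fusion/fission candidates.
candidates = sorted(c for c, partners in links.items() if len(partners) > 1)
print(links, candidates)
```

A one-to-many link only flags a candidate; as the following paragraph notes, such signals can also arise from assembly errors and need independent confirmation.
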
Some of the structural rearrangements we detected may be due to assembly errors and should be
evaluated in future assemblies using long-read scaffolding approaches (e.g., Oxford Nanopore
sequencing), chromosome conformation capture technologies (e.g., Hi-C), or chromosome mapping
approaches (e.g., FISH). We also identified incongruities in chromosome naming and orientation
caused by differences in assembly methods. These were apparent even among species of the same
genus (e.g., Bufo). We suggest that existing genome annotations be revised to improve
congruity and that future assemblies be curated consistently against high-quality reference genomes

Conclusions
New sequencing technologies and assembly algorithms have produced a substantial number of genomes
for comparative analyses spanning the amphibian phylogeny. This has already begun to yield
important insights into the evolution (Lamichhaney et al., 2021; Wu et al., 2022), development
(Schloissnig et al., 2021; Stuckert et al., 2021), sex determination (Hime et al., 2019; Ma & Veltsos,
2021), and unique features (Fischer et al., 2019; Nowoshilow et al., 2018; Seidl et al., 2019) of this

The increased availability of amphibian genomes can also aid conservation efforts in this highly
threatened group by facilitating research on genome-wide functional diversity, which can be used to
inform management decisions such as genetic rescue or targeted genetic intervention for species
threatened by habitat loss or chytridiomycosis (Chestnut et al., 2014; Kosch et al., 2022).
Additionally, well-annotated genomes can be used to create eDNA assays for population monitoring

Future research efforts should focus on generating more reference genomes to fill the gaps in the
amphibian phylogeny and on identifying genetic traits that confer an advantage against threats. Efforts
should also be made to increase the quality of genomes and expand transcriptome and annotation
databases. We suggest that these efforts strive to follow the recommendations of initiatives such as
the Earth BioGenome Project (Lawniczak et al., 2022), the Darwin Tree of Life Project (Blaxter et al.,
2022), and the Threatened Species Initiative (Hogg et al., 2022) to sequence at least one
representative from each family to ensure taxonomic coverage. Species selection should prioritize
species of interest for understanding valuable functional genetic traits; for example, for the purpose

ACKNOWLEDGEMENTS
The research of T.A.K., N.D.Y., and L.F.S. was supported by The University of Melbourne’s Research
Computing Services and the Petascale Campus Initiative. T.A.K. and L.F.S. were supported by
Australian Research Council Grants (FT190100462, LP200301370). K.W.V. and M.L.P. were funded by
the European Union (ERC, MolStressH2O, 101044202). Views and opinions expressed are, however,
those of the author(s) only and do not necessarily reflect those of the European Union or the
European Research Council Executive Agency. Neither the European Union nor the granting authority
can be held responsible for them. We are grateful to A. Stuckert, N. Brajuka, J. Sproul, and E. Tescari
for their advice on repeat modelling, and J.T. Li for providing suggestions on synteny analyses. We

REFERENCES
Aganezov, S., Yan, S. M., Soto, D. C., Kirsche, M., Zarate, S., Avdeyev, P., Taylor, D. J., Shafin,
    K., Shumate, A., Xiao, C., Wagner, J., McDaniel, J., Olson, N. D., Sauria, M. E. G.,
    Vollger, M. R., Rhie, A., Meredith, M., Martin, S., Lee, J., Koren, S., Rosenfeld, J. A.,
    Paten, B., Layer, R., Chin, C. S., Sedlazeck, F. J., Hansen, N. F., Miller, D. E., Phillippy, A.
    M., Miga, K. H., McCoy, R. C., Dennis, M. Y., Zook, J. M., & Schatz, M. C. (2022). A
    complete reference genome improves analysis of human genetic variation. Science,
    376(6588), eabl3533. doi:10.1126/science.abl3533
Baduel, P., Leduque, B., Ignace, A., Gy, I., Gil Jr, J., Loudet, O., Colot, V., & Quadrana, L.
    (2021). Genetic and environmental modulation of transposition shapes the
    evolutionary potential of Arabidopsis thaliana. Genome Biology, 22(1), 138.
Biscotti, M. A., Carducci, F., Olmo, E., & Canapa, A. (2019). Vertebrate Genome Size and the
    Impact of Transposable Elements in Genome Evolution. In P. Pontarotti (Ed.),
    Evolution, Origin of Life, Concepts and Methods (pp. 233-251). Cham: Springer
    International Publishing.
Blaxter, M., Archibald, J. M., Childers, A. K., Coddington, J. A., Crandall, K. A., Di Palma, F.,
    Durbin, R., Edwards, S. V., Graves, J. A. M., Hackett, K. J., Hall, N., Jarvis, E. D.,
    Johnson, R. N., Karlsson, E. K., Kress, W. J., Kuraku, S., Lawniczak, M. K. N.,
    Lindblad-Toh, K., Lopez, J. V., Moran, N. A., Robinson, G. E., Ryder, O. A., Shapiro, B., Soltis, P.
    S., Warnow, T., Zhang, G., & Lewin, H. A. (2022). Why sequence all eukaryotes?
Hotaling, S., Kelley, J. L., & Frandsen, P. B. (2021a). Toward a genome sequence for every
    animal: Where are we now? Proceedings of the National Academy of Sciences,
    118(52), e2109019118. doi:10.1073/pnas.2109019118
Hotaling, S., Sproul, J. S., Heckenhauer, J., Powell, A., Larracuente, A. M., Pauls, S. U., Kelley,
    J. L., & Frandsen, P. B. (2021b). Long-reads are revolutionizing 20 years of insect
    genome sequencing. Genome Biology and Evolution. doi:10.1093/gbe/evab138
IUCN. (2022). The IUCN Red List of Threatened Species. Version 2022-2. Accessed on [6
    January 2023]. https://www.iucnredlist.org.
Jetz, W., & Pyron, R. A. (2018). The interplay of past diversification and evolutionary isolation
    with present imperilment across the amphibian tree of life. Nature Ecology &
    Evolution, 2(5), 850-858. doi:10.1038/s41559-018-0515-5
Johnson, B. B., Searle, J. B., & Sparks, J. P. (2021). Genome size influences adaptive plasticity
    of water loss, but not metabolic rates, in lungless salamanders. Journal of
    Experimental Biology, 224(8), jeb242196. doi:10.1242/jeb.242196
Kapusta, A., Suh, A., & Feschotte, C. (2017). Dynamics of genome size evolution in birds and
    mammals. Proceedings of the National Academy of Sciences, 114(8), E1460-E1469.
    doi:10.1073/pnas.1616702114
Keinath, M. C., Timoshevskiy, V. A., Timoshevskaya, N. Y., Tsonis, P. A., Voss, S. R., & Smith, J.
    J. (2015). Initial characterization of the large genome of the salamander Ambystoma
    mexicanum using shotgun and laser capture chromosome sequencing. Scientific
    Reports, 5(1), 16413. doi:10.1038/srep16413
Kosch, T. A., Waddle, A. W., Cooper, C. A., Zenger, K. R., Garrick, D. J., Berger, L., & Skerratt,
    L. F. (2022). Genetic approaches for increasing fitness in endangered species. Trends
    in Ecology & Evolution, 37(4), 332-345. doi:10.1016/j.tree.2021.12.003
Kriventseva, E. V., Kuznetsov, D., Tegenfeldt, F., Manni, M., Dias, R., Simão, F. A., & Zdobnov,
    E. M. (2018). OrthoDB v10: sampling the diversity of animal, plant, fungal, protist,
    bacterial and viral genomes for evolutionary and functional annotations of orthologs.
    Nucleic Acids Research, 47(D1), D807-D811. doi:10.1093/nar/gky1053
Lamichhaney, S., Catullo, R., Keogh, J. S., Clulow, S., Edwards, S. V., & Ezaz, T. (2021). A
    bird-like genome from a frog: Mechanisms of genome size reduction in the ornate
    burrowing frog, Platyplectrum ornatum. Proceedings of the National Academy of
    Sciences, 118(11), e2011649118. doi:10.1073/pnas.2011649118
Lawniczak, M. K. N., Durbin, R., Flicek, P., Lindblad-Toh, K., Wei, X., Archibald, J. M., Baker,
    W. J., Belov, K., Blaxter, M. L., Marques Bonet, T., Childers, A. K., Coddington, J. A.,
    Crandall, K. A., Crawford, A. J., Davey, R. P., Di Palma, F., Fang, Q., Haerty, W., Hall, N.,
    Hoff, K. J., Howe, K., Jarvis, E. D., Johnson, W. E., Johnson, R. N., Kersey, P. J., Liu, X.,
    Lopez, J. V., Myers, E. W., Pettersson, O. V., Phillippy, A. M., Poelchau, M. F., Pruitt, K.
    D., Rhie, A., Castilla-Rubio, J. C., Sahu, S. K., Salmon, N. A., Soltis, P. S., Swarbreck, D.,
    Thibaud-Nissen, F., Wang, S., Wegrzyn, J. L., Zhang, G., Zhang, H., Lewin, H. A., &
    Richards, S. (2022). Standards recommendations for the Earth BioGenome Project.
    Proceedings of the National Academy of Sciences, 119(4), e2115639118.
    doi:10.1073/pnas.2115639118
Lee, B. T., Barber, G. P., Benet-Pagès, A., Casper, J., Clawson, H., Diekhans, M., Fischer, C.,
    Gonzalez, J. N., Hinrichs, A. S., Lee, C. M., Muthuraman, P., Nassar, L. R., Nguy, B.,
    Pereira, T., Perez, G., Raney, B. J., Rosenbloom, K. R., Schmelter, D., Speir, M. L., Wick,
    B. D., Zweig, A. S., Haussler, D., Kuhn, R. M., Haeussler, M., & Kent, W. J. (2022). The
    UCSC Genome Browser database: 2022 update. Nucleic Acids Research, 50(D1),
    D1115-D1122. doi:10.1093/nar/gkab959
Li, Z.-W., Hou, X.-H., Chen, J.-F., Xu, Y.-C., Wu, Q., González, J., & Guo, Y.-L. (2018).
    Transposable elements contribute to the adaptation of Arabidopsis thaliana. Genome
    Biology and Evolution, 10(8), 2140-2150. doi:10.1093/gbe/evy171
Liedtke, H. C., Cruz, F., Gómez-Garrido, J., Fuentes Palacios, D., Marcet-Houben, M., Gut, M.,
    Alioto, T., Gabaldón, T., & Gomez-Mestre, I. (2022). Chromosome-level assembly,
    annotation and phylome of Pelobates cultripes, the western spadefoot toad. DNA
    Research, 29(3), dsac013.
Liedtke, H. C., Gower, D. J., Wilkinson, M., & Gomez-Mestre, I. (2018). Macroevolutionary
    shift in the size of amphibian genomes and the role of life history and climate. Nature
    Ecology & Evolution, 2(11), 1792-1799. doi:10.1038/s41559-018-0674-4
Liu, Y., Shi, D., Wang, J., Chen, X., Zhou, M., Xi, X., Cheng, J., Ma, C., Chen, T., & Shaw, C.
    (2020). A novel amphibian antimicrobial peptide, phylloseptin-PV1, exhibits effective
    anti-staphylococcal activity without inducing either hepatic or renal toxicity in mice.
    Frontiers in Microbiology, 11, 565158.
Lovell, J. T., Sreedasyam, A., Schranz, M. E., Wilson, M., Carlson, J. W., Harkess, A., Emms, D.,
    Goodstein, D. M., & Schmutz, J. (2022). GENESPACE tracks regions of interest and
    gene copy number variation across multiple genomes. eLife, 11, e78526.
    doi:10.7554/eLife.78526
Lynch, M., & Walsh, B. (2007). The origins of genome architecture. Sunderland: Sinauer
    Associates.
Ma, W.-J., & Veltsos, P. (2021). The Diversity and Evolution of Sex Chromosomes in Frogs.
    Genes, 12(4), 483. doi:10.3390/genes12040483
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A., & Zdobnov, E. M. (2021). BUSCO
    Update: Novel and Streamlined Workflows along with Broader and Deeper
    Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes.
    Molecular Biology and Evolution, 38(10), 4647-4654. doi:10.1093/molbev/msab199
Meyer, A., Schloissnig, S., Franchini, P., Du, K., Woltering, J. M., Irisarri, I., Wong, W. Y.,
    Nowoshilow, S., Kneitz, S., Kawaguchi, A., Fabrizius, A., Xiong, P., Dechaud, C., Spaink,
    H. P., Volff, J.-N., Simakov, O., Burmester, T., Tanaka, E. M., & Schartl, M. (2021).
    Giant lungfish genome elucidates the conquest of land by vertebrates. Nature,
    590(7845), 284-289. doi:10.1038/s41586-021-03198-8
Mohlhenrich, E. R., & Mueller, R. L. (2016). Genetic drift and mutational hazard in the
    evolution of salamander genomic gigantism. Evolution, 70(12), 2865-2878.
Myers, E. A., & Pyron, R. A. (2024). The first complete assembly for a lungless urodelan with
    a “miniaturized” genome, the Northern Dusky Salamander (Plethodontidae:
    Desmognathus fuscus). bioRxiv, 2024.04.30.591895.
    doi:10.1101/2024.04.30.591895
Novák, P., Guignard, M. S., Neumann, P., Kelly, L. J., Mlinarec, J., Koblížková, A., Dodsworth,
    S., Kovařík, A., Pellicer, J., Wang, W., Macas, J., Leitch, I. J., & Leitch, A. R. (2020).
    Repeat-sequence turnover shifts fundamentally in species with large genomes.
    Nature Plants, 6(11), 1325-1329. doi:10.1038/s41477-020-00785-x
Nowoshilow, S., Schloissnig, S., Fei, J.-F., Dahl, A., Pang, A. W. C., Pippel, M., Winkler, S.,
    Hastie, A. R., Young, G., Roscito, J. G., Falcon, F., Knapp, D., Powell, S., Cruz, A., Cao,
    H., Habermann, B., Hiller, M., Tanaka, E. M., & Myers, E. W. (2018). The axolotl
    genome and the evolution of key tissue formation regulators. Nature.
    doi:10.1038/nature25458
Nunes-de-Almeida, C. H. L., Haddad, C. F. B., & Toledo, L. F. (2021). A revised classification of
    the amphibian reproductive modes. Salamandra, 57(3), 413-427.
O'Leary, N. A., Wright, M. W., Brister, J. R., Ciufo, S., Haddad, D., McVeigh, R., Rajput, B.,
    Robbertse, B., Smith-White, B., Ako-Adjei, D., Astashyn, A., Badretdin, A., Bao, Y.,
    Blinkova, O., Brover, V., Chetvernin, V., Choi, J., Cox, E., Ermolaeva, O., Farrell, C. M.,
    Goldfarb, T., Gupta, T., Haft, D., Hatcher, E., Hlavina, W., Joardar, V. S., Kodali, V. K.,
    Li, W., Maglott, D., Masterson, P., McGarvey, K. M., Murphy, M. R., O'Neill, K., Pujar,
    S., Rangwala, S. H., Rausch, D., Riddick, L. D., Schoch, C., Shkeda, A., Storz, S. S., Sun,
    H., Thibaud-Nissen, F., Tolstoy, I., Tully, R. E., Vatsan, A. R., Wallin, C., Webb, D., Wu,
    W., Landrum, M. J., Kimchi, A., Tatusova, T., DiCuccio, M., Kitts, P., Murphy, T. D., &
    Pruitt, K. D. (2016). Reference sequence (RefSeq) database at NCBI: current status,
    taxonomic expansion, and functional annotation. Nucleic Acids Research, 44(D1),
    D733-745. doi:10.1093/nar/gkv1189
Ou, S., Su, W., Liao, Y., Chougule, K., Agda, J. R. A., Hellinga, A. J., Lugo, C. S. B., Elliott, T. A.,
    Ware, D., Peterson, T., Jiang, N., Hirsch, C. N., & Hufford, M. B. (2019). Benchmarking
    transposable element annotation methods for creation of a streamlined,
    comprehensive pipeline. Genome Biology, 20(1), 275. doi:10.1186/s13059-019-1905-y
Paiola, M., Dimitrakopoulou, D., Pavelka, M. S., & Robert, J. (2023). Amphibians as a model
    to study the role of immune cell heterogeneity in host and mycobacterial
    interactions. Developmental & Comparative Immunology, 139, 104594.
    doi:10.1016/j.dci.2022.104594
Pappalardo, A. M., Ferrito, V., Biscotti, M. A., Canapa, A., & Capriglione, T. (2021).
    Transposable Elements and Stress in Vertebrates: An Overview. Int J Mol Sci, 22(4).
    doi:10.3390/ijms22041970
Pimpinelli, S., & Piacentini, L. (2020). Environmental change and the evolution of genomes:
    Transposable elements as translators of phenotypic plasticity into genotypic
    variability. Functional Ecology, 34(2), 428-441.
Platt, R. N., 2nd, Vandewege, M. W., & Ray, D. A. (2018). Mammalian transposable elements
    and their impacts on genome evolution. Chromosome Res, 26(1-2), 25-43.
    doi:10.1007/s10577-017-9570-z
Pyron, R. A., Pirro, S., Hains, T., Colston, T. J., Myers, E. A., O'Connell, K. A., & Beamer, D. A.
    (2024). The Draft Genome Sequences of 50 Salamander species (Caudata, Amphibia).
    Biodiversity Genomes, 2024. doi:10.56179/001c.116891
Pyron, R. A., & Wiens, J. J. (2013). Large-scale phylogenetic analyses reveal the causes of high
    tropical amphibian diversity. Proceedings of the Royal Society B: Biological Sciences,
    280(1770), 20131622.
Revell, L. J. (2024). phytools 2.0: an updated R ecosystem for phylogenetic comparative
    methods (and other things). PeerJ, 12, e16505. doi:10.7717/peerj.16505
Rhie, A., McCarthy, S. A., Fedrigo, O., Damas, J., Formenti, G., Koren, S., Uliano-Silva, M.,
    Chow, W., Fungtammasan, A., Kim, J., Lee, C., Ko, B. J., Chaisson, M., Gedman, G. L.,
    Cantin, L. J., Thibaud-Nissen, F., Haggerty, L., Bista, I., Smith, M., Haase, B.,
    Mountcastle, J., Winkler, S., Paez, S., Howard, J., Vernes, S. C., Lama, T. M., Grutzner,
    F., Warren, W. C., Balakrishnan, C. N., Burt, D., George, J. M., Biegler, M. T., Iorns, D.,
    Digby, A., Eason, D., Robertson, B., Edwards, T., Wilkinson, M., Turner, G., Meyer, A.,
    Kautt, A. F., Franchini, P., Detrich, H. W., Svardal, H., Wagner, M., Naylor, G. J. P.,
    Pippel, M., Malinsky, M., Mooney, M., Simbirsky, M., Hannigan, B. T., Pesout, T.,
    Houck, M., Misuraca, A., Kingan, S. B., Hall, R., Kronenberg, Z., Sović, I., Dunn, C.,
    Ning, Z., Hastie, A., Lee, J., Selvaraj, S., Green, R. E., Putnam, N. H., Gut, I., Ghurye, J.,
    Garrison, E., Sims, Y., Collins, J., Pelan, S., Torrance, J., Tracey, A., Wood, J., Dagnew,
    R. E., Guan, D., London, S. E., Clayton, D. F., Mello, C. V., Friedrich, S. R., Lovell, P. V.,
    Osipova, E., Al-Ajli, F. O., Secomandi, S., Kim, H., Theofanopoulou, C., Hiller, M., Zhou,
    Y., Harris, R. S., Makova, K. D., Medvedev, P., Hoffman, J., Masterson, P., Clark, K.,
    Martin, F., Howe, K., Flicek, P., Walenz, B. P., Kwak, W., Clawson, H., Diekhans, M.,
    Nassar, L., Paten, B., Kraus, R. H. S., Crawford, A. J., Gilbert, M. T. P., Zhang, G.,
    Venkatesh, B., Murphy, R. W., Koepfli, K.-P., Shapiro, B., Johnson, W. E., Di Palma, F.,
    Marques-Bonet, T., Teeling, E. C., Warnow, T., Graves, J. M., Ryder, O. A., Haussler,
    D., O’Brien, S. J., Korlach, J., Lewin, H. A., Howe, K., Myers, E. W., Durbin, R., Phillippy,
    A. M., & Jarvis, E. D. (2021). Towards complete and error-free genome assemblies of
    all vertebrate species. Nature, 592(7856), 737-746. doi:10.1038/s41586-021-03451-0
Robert, J. (2020). Experimental platform using the amphibian Xenopus laevis for research in
    fundamental and medical immunology. Cold Spring Harbor Protocols, 2020(7),
    pdb.top106625.
Saeed, M., Rais, M., Akram, A., Williams, M. R., Kellner, K. F., Hashsham, S. A., & Davis, D. R.
    (2022). Development and validation of an eDNA protocol for monitoring endemic
    Asian spiny frogs in the Himalayan region of Pakistan. Scientific Reports, 12(1), 5624.
    doi:10.1038/s41598-022-09084-1
Scheele, B. C., Hunter, D. A., Grogan, L. F., Berger, L., Kolby, J. E., McFadden, M. S.,
    Marantelli, G., Skerratt, L. F., & Driscoll, D. A. (2014). Interventions for Reducing
    Extinction Risk in Chytridiomycosis-Threatened Amphibians. Conservation Biology,
    28(5), 1195-1205. doi:10.1111/cobi.12322
Scheele, B. C., Pasmans, F., Skerratt, L. F., Berger, L., Martel, A., Beukema, W., Acevedo, A.
    A., Burrowes, P. A., Carvalho, T., Catenazzi, A., De la Riva, I., Fisher, M. C., Flechas, S.
    V., Foster, C. N., Frías-Álvarez, P., Garner, T. W. J., Gratwicke, B., Guayasamin, J. M.,
    Hirschfeld, M., Kolby, J. E., Kosch, T. A., La Marca, E., Lindenmayer, D. B., Lips, K. R.,
    Longo, A. V., Maneyro, R., McDonald, C. A., Mendelson, J., Palacios-Rodriguez, P.,
    Parra-Olea, G., Richards-Zawacki, C. L., Rödel, M.-O., Rovito, S. M., Soto-Azat, C.,
    Toledo, L. F., Voyles, J., Weldon, C., Whitfield, S. M., Wilkinson, M., Zamudio, K. R., &
    Canessa, S. (2019). Amphibian fungal panzootic causes catastrophic and ongoing loss
    of biodiversity. Science, 363(6434), 1459-1463. doi:10.1126/science.aav0379
Schloissnig, S., Kawaguchi, A., Nowoshilow, S., Falcon, F., Otsuki, L., Tardivo, P.,
    Timoshevskaya, N., Keinath, M. C., Smith, J. J., & Voss, S. R. (2021). The giant axolotl
    genome uncovers the evolution, scaling, and transcriptional control of complex gene
    loci. Proceedings of the National Academy of Sciences, 118(15).
Schrader, L., & Schmitz, J. (2019). The impact of transposable elements in adaptive
    evolution. Molecular Ecology, 28(6), 1537-1549.
Schulte, L. M., Ringler, E., Rojas, B., & Stynoski, J. L. (2020). Developments in Amphibian
    Parental Care Research: History, Present Advances, and Future Perspectives.
    Herpetological Monographs, 34(1), 71-97.
    doi:10.1655/HERPMONOGRAPHS-D-19-00002.1
Seidl, F., Levis, N. A., Schell, R., Pfennig, D. W., Pfennig, K. S., & Ehrenreich, I. M. (2019).
    Genome of Spea multiplicata, a Rapidly Developing, Phenotypically Plastic, and
    Desert-Adapted Spadefoot Toad. G3: Genes|Genomes|Genetics, 9(12), 3909-3919.
    doi:10.1534/g3.119.400705
Smit, A., Hubley, R., & Green, P. (2013). RepeatMasker Open-4.0. 2013-2015. Retrieved from
    <http://www.repeatmasker.org>
Sotero-Caio, C., Challis, R., Kumar, S., & Blaxter, M. (2021). Genomes on a Tree (GoaT): A
    centralized resource for eukaryotic genome sequencing initiatives. Biodiversity
    Information Science and Standards.
Stapley, J., Santure, A. W., & Dennis, S. R. (2015). Transposable elements as agents of rapid
    adaptation may explain the genetic paradox of invasive species. Molecular Ecology,
    24(9), 2241-2252.
Storer, J., Hubley, R., Rosen, J., Wheeler, T. J., & Smit, A. F. (2021). The Dfam community
    resource of transposable element families, sequence models, and genome
    annotations. Mobile DNA, 12(1), 1-14.
Stuckert, A. M., Chouteau, M., McClure, M., LaPolice, T. M., Linderoth, T., Nielsen, R.,
    Summers, K., & MacManes, M. D. (2021). The genomics of mimicry: gene expression
    throughout development provides insights into convergent and divergent
    phenotypes in a Müllerian mimicry system. Molecular Ecology, 30(16), 4039-4061.
Suda, K., Hayashi, S. R., Tamura, K., Takamatsu, N., & Ito, M. (2022). Activation of DNA
    Transposons and Evolution of piRNA Genes Through Interspecific Hybridization in
    Xenopus Frogs. Frontiers in Genetics, 13. doi:10.3389/fgene.2022.766424
Sun, C., López Arriaza, J. R., & Mueller, R. L. (2012a). Slow DNA loss in the gigantic genomes
    of salamanders. Genome Biology and Evolution, 4(12), 1340-1348.
Sun, C., Shepard, D. B., Chong, R. A., López Arriaza, J., Hall, K., Castoe, T. A., Feschotte, C.,
    Pollock, D. D., & Mueller, R. L. (2012b). LTR retrotransposons contribute to genomic
    gigantism in plethodontid salamanders. Genome Biology and Evolution, 4(2), 168-183.
Sun, Y.-B., Zhang, Y., & Wang, K. (2020). Perspectives on studying molecular adaptations of
    amphibians in the genomic era. Zoological Research, 41(4), 351.
Team, R. (2022). RStudio: integrated development for R. Boston, MA: RStudio. In: Inc.
Team, R. C. (2013). R: A language and environment for statistical computing. R Foundation
    for Statistical Computing, Vienna, Austria. http://www.R-project.org/.
Tymowska, J., & Fischberg, M. (1973). Chromosome complements of the genus Xenopus.
    Chromosoma, 44(3), 335-342. doi:10.1007/BF00291027
Wang, J., Itgen, M. W., Wang, H., Gong, Y., Jiang, J., Li, J., Sun, C., Sessions, S. K., & Mueller,
    R. L. (2021). Gigantic Genomes Provide Empirical Tests of Transposable Element
    Dynamics Models. Genomics, Proteomics & Bioinformatics.
    doi:10.1016/j.gpb.2020.11.005
Waters, P. D., Patel, H. R., Ruiz-Herrera, A., Álvarez-González, L., Lister, N. C., Simakov, O.,
    Ezaz, T., Kaur, P., Frere, C., Grützner, F., Georges, A., & Graves, J. A. M. (2021).
    Microchromosomes are building blocks of bird, reptile, and mammal chromosomes.
    Proceedings of the National Academy of Sciences, 118(45), e2112494118.
    doi:10.1073/pnas.2112494118
Womack, M. C., Steigerwald, E., Blackburn, D. C., Cannatella, D. C., Catenazzi, A., Che, J., Koo,
    M. S., McGuire, J. A., Ron, S. R., Spencer, C. L., Vredenburg, V. T., & Tarvin, R. D.
    (2022). State of the Amphibia 2020: A Review of Five Years of Amphibian Research
    and Existing Resources. Ichthyology & Herpetology, 110(4), 638-661, 624.
Wu, W., Gao, Y. D., Jiang, D. C., Lei, J., Ren, J. L., Liao, W. B., Deng, C., Wang, Z., Hillis, D. M.,
    Zhang, Y. P., & Li, J. T. (2022). Genomic adaptations for arboreal locomotion in Asian
    flying treefrogs. Proc Natl Acad Sci U S A, 119(13), e2116342119.
    doi:10.1073/pnas.2116342119
Zuo, B., Nneji, L. M., & Sun, Y.-B. (2023). Comparative genomics reveals insights into anuran
    genome size evolution. BMC Genomics, 24(1), 379. doi:10.1186/s12864-023-09499-8

(https://www.ncbi.nlm.nih.gov/genome/).
Code:
All original code has been deposited on GitHub and is publicly available at
(https://doi.org/10.5281/zenodo.7679280).

Conceptualization, T.A.K., A.J.C., L.A.O., A.R., and K.C.W.V.; methodology, T.A.K., N.D.Y., R.L.M., and
A.R.; formal analysis, T.A.K., M.L.P.; investigation, T.A.K., N.D.Y., R.L.M., K.C.W.V., M.L.P.; resources,
L.A.O. and A.R.; writing – original draft, T.A.K.; writing – review & editing, all authors; project

Figure 1. Phylogenetic tree of amphibian families. Amphibian families with representative genomes
are highlighted and numbers indicate genome counts per family. (Green) anurans, (blue) caecilians,
and (orange) salamanders. Engystomops pustulosus (Family) image was taken by B. Gratwicke; other
amphibian images were licensed to T. Kosch by Adobe Stock and Shutterstock.
Figure 2. BUSCO (Benchmarking Universal Single-Copy Orthologs) assessment results for amphibian
genomes.
Figure 3. Repeat content across the amphibian genomes. (LINEs) long interspersed nuclear elements,
(LTRs) long terminal repeats, and (SINEs) short interspersed nuclear elements.
Figure 4. Phylogenetic independent contrasts (PICs) between the proportion of transposable element
content relative to genome size and Bio8 (representing mean temperature of the wettest quarter).
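
For readers unfamiliar with phylogenetic independent contrasts, the sketch below implements the standard contrast calculation on a small bifurcating tree and estimates the slope of a regression through the origin between two sets of contrasts. It is a pure-Python illustration under stated assumptions, not the code used to produce Figure 4; the tree, branch lengths, species names, and trait values are all hypothetical.

```python
import math


def contrasts(node, trait):
    """Return (estimated node value, added branch length, standardized contrasts).

    A tip is a species name (str); an internal node is a pair of
    (child, branch_length) tuples. Tip branch lengths must be positive.
    """
    if isinstance(node, str):
        return trait[node], 0.0, []
    (left, bl_left), (right, bl_right) = node
    x_l, extra_l, c_l = contrasts(left, trait)
    x_r, extra_r, c_r = contrasts(right, trait)
    # Branches below estimated nodes are lengthened to reflect estimation error.
    v_l, v_r = bl_left + extra_l, bl_right + extra_r
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)
    # Node value: average of child values weighted by inverse branch length.
    value = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)
    extra = v_l * v_r / (v_l + v_r)
    return value, extra, c_l + c_r + [contrast]


def pic_slope(tree, x, y):
    """Slope of the regression through the origin of y contrasts on x contrasts."""
    cx = contrasts(tree, x)[2]
    cy = contrasts(tree, y)[2]
    return sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)


# Hypothetical tree ((sp_a:1, sp_b:1):1, (sp_c:1, sp_d:1):1) and trait values.
clade_ab = (("sp_a", 1.0), ("sp_b", 1.0))
clade_cd = (("sp_c", 1.0), ("sp_d", 1.0))
tree = ((clade_ab, 1.0), (clade_cd, 1.0))
bio8 = {"sp_a": 12.0, "sp_b": 16.0, "sp_c": 22.0, "sp_d": 26.0}
te_prop = {"sp_a": 0.30, "sp_b": 0.38, "sp_c": 0.45, "sp_d": 0.60}
print(pic_slope(tree, bio8, te_prop))
```

A tree with n tips yields n - 1 contrasts, and the regression is forced through the origin because the sign of each contrast is arbitrary.
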
Figure 5. Synteny plot of BUSCOs (Benchmarking Universal Single-Copy Orthologs) for representative
amphibian chromosome-level genomes. The phylogenetic tree was created with Timetree.org. The
reference genome is Ambystoma mexicanum. Asterisks (*) indicate inverted chromosomes. Chromosomes
without BUSCOs were excluded from the plot.