
Climate Services 18 (2020) 100167


Original research article

GCMeval – An interactive tool for evaluation and selection of climate model ensembles

Kajsa M. Parding a,*, Andreas Dobler a, Carol F. McSweeney b, Oskar A. Landgren a, Rasmus Benestad a, Helene B. Erlandsen a, Abdelkader Mezghani a, Hilppa Gregow c, Olle Räty c, Elisabeth Viktor d, Juliane El Zohbi d, Ole B. Christensen e, Harilaos Loukos f

a Norwegian Meteorological Institute, Oslo, Norway
b UK MetOffice, United Kingdom
c Finnish Meteorological Institute, Helsinki, Finland
d Climate Service Center Germany (GERICS), Helmholtz-Zentrum Geesthacht, Germany
e Danish Meteorological Institute, Copenhagen, Denmark
f The Climate Data Factory, Paris, France

ABSTRACT

We present an interactive tool for selection and evaluation of global climate models. The tool is implemented as a web application using the "Shiny" R-package and is available at https://gcmeval.met.no. Through this tool, climate models of the CMIP5 and CMIP6 ensembles can be ranked and compared based on their representation of the present climate, with user-determined weights indicating the importance of different regions, seasons, climate variables, and skill scores. The ranking can be used to eliminate the climate models with the poorest representation of the present climate. As further guidance, the projected regional mean temperature and precipitation changes for all climate models are displayed in a scatterplot for a chosen season, emission scenario, and time horizon. Ranks and projected changes for a subset of climate models are compared to the whole ensemble. Subsets can be selected interactively, and the tool provides on-the-fly visualizations. The combined information of the projected climate change and model ranking can be used as an aid to select subsets of climate models that adequately represent both the present-day climate and the range of possible future outcomes. Allowing weighting of different metrics introduces a subjective element in the evaluation process and demonstrates how this can affect the ranking. The tool also illustrates how the range of projected future climate change is sensitive to the choice of models.

Practical Implications

While a climate model simulation is a single realization of a possible future, an ensemble of simulations gives a more complete vision of potential climate change. Generally, including all available simulations gives the most robust estimates of uncertainties. For practical reasons, only a subset of models is typically processed in impact or regional climate modelling studies. This can result in a skewed and incomplete representation of climate change.

There are many ways of selecting models, based on varying philosophies and applications, where model interdependency, simulations of past and future climate, and personal experience can be taken into account. This selection process is often not well documented, weakening the authoritativeness of studies on future climate simulations and related impacts.

With the GCMeval tool, a model selection from the CMIP5 and CMIP6 ensembles can be made in a transparent and reproducible way. The selection can be documented by listing the selected weightings and resulting model rankings, and the relative spread of future climate change compared to the full ensemble. In addition, results from impact or regional climate model studies can be put into context, showing how the selection of climate models influences the representation of climate change.

1. Introduction

In the last few decades, an increasing number of research groups have developed and applied global climate models (GCMs) to answer questions about earth-system processes, climate change impacts, and adaptation. Many of these experiments are organized through the Coupled Model Intercomparison Projects (CMIPs, Meehl et al., 2007; Taylor et al., 2012), which provide a common experiment protocol and infrastructure, including emission scenarios, forcing data, and output requirements. The fifth CMIP (CMIP5) included over 20 modeling groups running more than 40 models, producing several petabytes of data (Taylor et al., 2012), and CMIP6 (Eyring et al., 2016) is already


* Corresponding author.
E-mail address: [email protected] (K.M. Parding).

https://doi.org/10.1016/j.cliser.2020.100167
Available online 16 March 2020
2405-8807/ © 2020 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/BY/4.0/).

larger and still increasing. GCM results are associated with uncertainties owing to natural climate variability, unknown socio-economic developments, and model differences (Hawkins and Sutton, 2009). Thus, there is a wide range of possible outcomes of climate change, and an ensemble of model simulations is needed to capture the uncertainty of the projected future.

In impact and regional climate model (RCM) studies, GCM simulations are processed and downscaled to higher spatial and temporal resolutions. Due to computational limitations, data availability, and compatibility, only a subset of available GCMs is typically considered in such studies. For instance, the Inter-sectoral Impact Model Inter-comparison Project (ISI-MIP, Warszawski et al., 2014) selected five models from the full CMIP5 ensemble, and the COordinated Regional climate Downscaling EXperiment for Europe (EURO-CORDEX, Jacob et al., 2014) currently provides data downscaled from about a dozen GCM simulations. Fernández et al. (2018) have shown that in European RCM studies, certain GCMs have been favoured over others.

There are many criteria by which a subset of models can be selected, for instance based on the skill in reproducing past climate (Pierce et al., 2009) and the range of projected climate changes (Immerzeel et al., 2013; Warszawski et al., 2014; McSweeney et al., 2015). Others have implemented automated algorithms based on clustering of climate extreme indices to identify a representative subset of climate models (Molteni et al., 2006; Cannon, 2015; Farjad et al., 2019).

However, no small set of climate models can represent the full range of possibilities for all variables, regions and seasons. Turco et al. (2013) showed that working with a small subset of models may result in inconsistent climate change signals. For ISI-MIP, McSweeney and Jones (2016) found that the five models used represent 50–90% of the full range of temperature projections and 30–80% of precipitation projections, depending on the season and region of interest. In another study on the effects of subsampling the CMIP5 ensemble, Mezghani et al. (2019) demonstrated that the difference between various combinations of small subsets of GCM simulations can be as large as the climate change signal itself.

In a survey conducted in the context of the Copernicus Climate Change Service tender DECM (Data Evaluation for Climate Models, C3S_51 Lot4), two types of climate data users were identified: data users and product users (Jacob et al., 2017; Viktor et al., 2018; Zahid et al., 2019). The data users were asked whether information on the performance of climate models would be useful for their work; 85% of all the participating data users answered yes. The data users were also asked which kind of information regarding climate model performance would be useful, and the two highest scoring answers were the performance compared with observations (46% of all answers) and the performance compared with other climate models (36% of all answers).

Here we present an interactive web-based tool, GCMeval, serving the demand for information on climate model performance, thereby supporting informed decision making. The intended user of GCMeval is embodied by the hypothetical persona "Donna Data" described in Jacob et al. (2017). She usually works at a university or a research facility. If she is not working in research, she is employed by the agricultural or forestry sector or by water management. She obtains her data preferably from global research initiatives such as the CMIP5 archive. She wants free and easy access to raw or only slightly post-processed data, since she starts trusting the information by conducting her own investigations, e.g., with a tool such as GCMeval.

The GCMeval tool analyzes the performance of GCMs from the CMIP5 and CMIP6 ensembles under current climate conditions and illustrates the spread of the projected temperature and precipitation changes of a subset of models relative to the whole ensemble. GCMeval is inspired by the principles and concepts described in McSweeney and Jones (2016) and Gleckler et al. (2008). McSweeney and Jones (2016) concluded that at least 20 models would be required to capture 90% of the range of the projections, but that a strategic selection of fewer ensemble members could lead to a suitable representation of climate change regionally. The purpose of GCMeval is thus 1) to serve as guidance for a strategic climate model selection in cases where the full ensemble cannot be used, 2) to explore how different user priorities (season, focus region, variable, metric) result in different model rankings, and 3) to evaluate the consequence of the selection of GCMs compared to using all available model simulations in terms of future climate change. An additional use of GCMeval is to evaluate "post hoc" an existing subset of GCMs, similar to the evaluation of the ISI-MIP ensemble by McSweeney and Jones (2016). In many cases, a subset has been selected for practical reasons, e.g., availability, and when applying the data it is useful to understand how the choice of GCMs influences the representation of the future climate.

GCMeval is accessible online at https://gcmeval.met.no. The source code is open source and available at https://github.com/metno/gcmeval. It can be run locally as an R application (R Core Team, 2014) and the code may be further developed to suit specific user requirements.

2. Data and method

2.1. Global climate simulations

For CMIP5 (Taylor et al., 2012) we used projections of monthly mean temperature and precipitation, downloaded from the KNMI Climate Explorer (https://climexp.knmi.nl), which provides GCM output regridded from its native resolution to a common 2.5° × 2.5° grid. Two future climate change scenarios were considered, following different representative concentration pathways (RCPs, Moss et al., 2010) of greenhouse gases: RCP4.5, which is a stabilization scenario in which climate policies are invoked to limit emissions (Thomson et al., 2011), and RCP8.5, which is a business-as-usual scenario in which an absence of climate change policies leads to higher future greenhouse gas emissions (Riahi et al., 2011). The included GCM simulations from CMIP5 are displayed in Table 1. Only GCM runs available from KNMI for both temperature and precipitation were included, resulting in 77 simulations for RCP8.5 and 105 for RCP4.5.

Data from CMIP6 (Eyring et al., 2016) were downloaded through the Earth System Grid Federation (ESGF) nodes. The 25 currently included simulations are displayed in Table 2. One future scenario was considered, the shared socioeconomic pathway SSP5-8.5 (Riahi et al., 2017), which is a worst-case scenario in which the world fails to implement any climate policy, resulting in an end-of-century radiative forcing of 8.5 W m−2.

The ensembles include multiple runs from some GCMs, and the simulations are distinguished by their "rip" or "ripf" index, which denotes the initial states (realization r), initialization methods (i), physics versions (p), and for CMIP6 the forcing data set (f) (see "rip" in Table 1 and "ripf" in Table 2).

While the RCP4.5 and RCP8.5 simulations differ only in the scenario assumptions, the SSP5-8.5 simulations are based on new models and model versions. The resulting global temperature and precipitation changes from the three scenarios are shown in Fig. 1 for all included simulations.

2.2. Reference data

The GCMs are evaluated based on various reference data sets. Currently, there are evaluation statistics calculated using precipitation and temperature data from two global reanalyses of the European Centre for Medium-Range Weather Forecasts, ERA5 (Hersbach and Dee, 2016) and ERA-Interim (Dee et al., 2011), and precipitation from the Global Precipitation Climatology Project (GPCP, version 2.3) data set (Adler et al., 2003). The data from ERA5 is available at a resolution of about 31 km, while ERA-Interim provides data with approximately 80 km resolution. The GPCP data provides estimations of monthly precipitation on a 2.5° grid, combining stations, satellites, and sounding


Table 1
List of the CMIP5 model runs included in the GCMeval tool (77 for RCP8.5 and 105 for RCP4.5). The rip index refers to the realization, initialization method, and physics version used.

ACCESS1-0: r1i1p1
ACCESS1.3: r1i1p1
bcc-csm1-1-m: r1i1p1*
bcc-csm1-1: r1i1p1
BNU-ESM: r1i1p1
CanESM2: r1i1p1, r2i1p1, r3i1p1, r4i1p1, r5i1p1
CCSM4: r1i1p1, r2i1p1, r3i1p1, r4i1p1, r5i1p1, r6i1p1
CESM1-BGC: r1i1p1
CESM1-CAM5: r1i1p1, r2i1p1, r3i1p1*
CMCC-CM: r1i1p1
CMCC-CMS: r1i1p1
CNRM-CM5: r1i1p1, r2i1p1†, r4i1p1†, r6i1p1†, r10i1p1†
CSIRO-Mk3-6-0: r1i1p1, r2i1p1, r3i1p1, r4i1p1, r5i1p1, r6i1p1, r7i1p1, r8i1p1, r9i1p1, r10i1p1
EC-EARTH: r2i1p1, r8i1p1, r9i1p1, r12i1p1
FGOALS-g2: r1i1p1
FIO-ESM: r1i1p1, r2i1p1, r3i1p1
GFDL-CM3: r1i1p1
GFDL-ESM2G: r1i1p1
GFDL-ESM2M: r1i1p1
GISS-E2-H-CC: r1i1p1*
GISS-E2-H: r1i1p1, r1i1p2, r1i1p3, r2i1p1*, r2i1p2*, r2i1p3*, r3i1p1*, r3i1p2*, r3i1p3*, r4i1p1*, r4i1p2*, r4i1p3*, r5i1p1*, r5i1p2*, r5i1p3*
GISS-E2-R-CC: r1i1p1*
GISS-E2-R: r1i1p1, r1i1p2, r1i1p3, r2i1p1*, r2i1p2*, r2i1p3*, r3i1p1*, r3i1p2*, r3i1p3*, r4i1p1*, r4i1p2*, r4i1p3*, r5i1p1*, r5i1p2*, r5i1p3*, r6i1p1*, r6i1p3*
HadGEM2-AO: r1i1p1
HadGEM2-CC: r1i1p1
HadGEM2-ES: r1i1p1, r2i1p1, r3i1p1, r4i1p1
inmcm4: r1i1p1
IPSL-CM5A-LR: r1i1p1, r2i1p1, r3i1p1, r4i1p1
IPSL-CM5A-MR: r1i1p1
IPSL-CM5B-LR: r1i1p1
MIROC-ESM-CHEM: r1i1p1
MIROC-ESM: r1i1p1
MIROC5: r1i1p1, r2i1p1, r3i1p1
MPI-ESM-LR: r1i1p1, r2i1p1, r3i1p1
MPI-ESM-MR: r1i1p1, r2i1p1*, r3i1p1*
MRI-CGCM3: r1i1p1
NorESM1-M: r1i1p1
NorESM1-ME: r1i1p1

* Only available for RCP4.5.
† Only available for RCP8.5.

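The "rip"/"ripf" variant labels listed in Tables 1 and 2 follow a regular pattern and are easy to decompose programmatically. A minimal sketch of such parsing (our illustration in Python; the helper name is ours and GCMeval itself is written in R):

```python
import re

def parse_ripf(index):
    """Split a CMIP 'rip'/'ripf' index (e.g. 'r1i1p1' or 'r1i1p1f1')
    into realization (r), initialization (i), physics (p) and,
    for CMIP6, forcing (f) components."""
    m = re.fullmatch(r"r(\d+)i(\d+)p(\d+)(?:f(\d+))?", index)
    if m is None:
        raise ValueError(f"not a rip/ripf index: {index!r}")
    r, i, p, f = m.groups()
    out = {"realization": int(r), "initialization": int(i), "physics": int(p)}
    if f is not None:          # the forcing part is present only for CMIP6
        out["forcing"] = int(f)
    return out

print(parse_ripf("r6i1p1"))    # CMIP5-style rip index
print(parse_ripf("r2i1p1f1"))  # CMIP6-style ripf index
```

Runs of the same GCM that differ only in this index sample internal variability (and, for different `p` values, physics settings) rather than being independent models.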
observations. All three data sets cover both land and sea and are available for the selected reference period (1981–2010).

Table 2
List of the 25 CMIP6 model runs included in the GCMeval tool (as of today), where ripf refers to the realization, initialization method, physics version and forcing index.

BCC-CSM2-MR: r1i1p1f1
CAMS-CSM1-0: r1i1p1f1, r2i1p1f1
CanESM5: r1i1p1f1, r2i1p1f1
CESM2-WACCM: r1i1p1f1
CESM2: r1i1p1f1, r2i1p1f1
EC-Earth3-Veg: r1i1p1f1, r2i1p1f1
FGOALS-g3: r1i1p1f1
FIO-ESM-2-0: r1i1p1f1
GFDL-ESM4: r1i1p1f1
INM-CM4-8: r1i1p1f1
INM-CM5-0: r1i1p1f1
IPSL-CM6A-LR: r1i1p1f1
KACE-1-0-G: r1i1p1f1
MIROC6: r1i1p1f1, r2i1p1f1
MPI-ESM1-2-HR: r1i1p1f1
MPI-ESM1-2-LR: r1i1p1f1
MRI-ESM2-0: r1i1p1f1
NESM3: r1i1p1f1, r2i1p1f1
NorESM2-LM: r1i1p1f1

2.3. The GCMeval tool

GCMeval consists of an interactive front-end and a back-end that contains the code to calculate statistics and extract metadata. The front-end tool was created in the R environment using the Shiny (Chang et al., 2017) package for web application development. The tool can be accessed online at https://gcmeval.met.no. The source code can be found at the Github repository https://github.com/metno/gcmeval.

The GCMeval front-end provides a ranking of the GCMs based on two user-defined focus regions and weights representing the relative importance of those regions as well as different variables (temperature and precipitation), seasons (annual mean, winter, spring, summer, autumn), and skill scores. The included skill scores are the bias, the spatial correlation, the spatial standard deviation ratio, and the root mean square error (RMSE) of the mean annual cycle. A description of the skill scores and ranking procedure is given in Sections 2.4 and 2.5. The possible weights are: "not considered" (weight 0), "important" (weight 1), and "very important" (weight 2). The regions are chosen from a predefined list, which includes the full domain (global) and the 33 IPCC AR5 reference regions defined in Christensen et al. (2013) (see http://www.ipcc-data.org/guidelines/pages/ar5_regions.html). The reference data set against which the GCM simulations are evaluated can be set under "Advanced Settings".

GCMeval also includes a scatterplot showing the projected mean change of precipitation and temperature in the two focus regions for a selected time frame (present day (1981–2010) to near future (2021–2050) or far future (2071–2100)), with the option of including box plots representing the distribution of the projected climate change (see, e.g., Fig. 1). This allows the user to evaluate their choice of models in terms of the projected climate change and possibly add simulations to better represent the projected temperature and precipitation of the complete ensemble.

Climate models can be selected from a list in the "Ensemble selection" menu or by clicking the corresponding marker in the scatterplots (this option does not work on mobile devices). There is also a "random" button that allows the user to select a random ensemble and a "best" button to choose the best performing climate models based on the ranking. To further allow the user to include or exclude climate models in the base ensemble based on their expert knowledge, the user can deselect simulations in the "Advanced Settings".

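The projected changes shown in the scatterplot are, in essence, differences of 30-year period means of a regional series. A minimal sketch of that computation (our Python illustration with toy data, not the GCMeval source, which is written in R):

```python
import numpy as np

def projected_change(monthly, years, base=(1981, 2010), future=(2071, 2100)):
    """Mean change of a regional monthly series between two 30-year periods."""
    monthly = np.asarray(monthly, dtype=float)
    years = np.asarray(years)
    base_mask = (years >= base[0]) & (years <= base[1])
    fut_mask = (years >= future[0]) & (years <= future[1])
    return monthly[fut_mask].mean() - monthly[base_mask].mean()

# Toy example: a linear warming of 0.03 degrees per year from 1981 to 2100.
years = np.repeat(np.arange(1981, 2101), 12)   # 12 monthly values per year
temp = 10.0 + 0.03 * (years - 1981)            # regional mean temperature
dT = projected_change(temp, years)             # 0.03 * 90-year offset = 2.7
```

Computed once per model run and per variable, these (ΔT, ΔP) pairs are exactly the coordinates of the scatterplot markers; the box plots summarize their distribution over the ensemble.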

Fig. 1. Scatterplots showing the global annual mean temperature and precipitation change from the present day (1981–2010) to the far future (2071–2100). The figure includes GCM runs for emission scenarios RCP4.5 (purple) and RCP8.5 (orange) from the CMIP5 ensemble, and SSP5-8.5 (blue) for the CMIP6 ensemble. Each marker represents one GCM run and boxplots depict the distributions of projected climate change for each emission scenario.

Table 4
The 10 best ranked CMIP5 simulations for case study 1 using the weights listed in Table 3 for setup A and B.

Rank | Setup A (GCM, rip)     | Setup B (GCM, rip)
1    | CCSM4 r6i1p1           | MPI-ESM-LR r3i1p1
2    | CCSM4 r3i1p1           | MPI-ESM-LR r1i1p1
3    | CESM1-CAM5 r2i1p1      | MPI-ESM-LR r2i1p1
4    | CCSM4 r1i1p1           | MPI-ESM-MR r1i1p1
5    | EC-EARTH r9i1p1        | EC-EARTH r9i1p1
6    | EC-EARTH r2i1p1        | EC-EARTH r8i1p1
7    | CCSM4 r5i1p1           | HadGEM2-ES r1i1p1
8    | EC-EARTH r12i1p1       | EC-EARTH r12i1p1
9    | CESM1-CAM5 r1i1p1      | ACCESS1.3 r1i1p1
10   | CCSM4 r2i1p1           | CCSM4 r3i1p1

2.4. Skill scores

To evaluate the performance of the GCMs, a set of skill scores highlighting different aspects of performance has been selected. The following skill scores have been calculated for all available reference data sets for the common reference period 1981–2010:

• The absolute bias of the GCM compared to a reference data set, i.e., the difference between the mean values of the two data sets.
• The spatial correlation of the GCM and reference data. This metric indicates if the model and observations have a similar structure in space.
• The ratio of the spatial standard deviations of the GCM and reference data (σGCM/σref), i.e., a comparison between the spatial variability of the data sets.
• The root mean square error (RMSE) of the mean annual cycle of the GCM compared to the reference data. The RMSE represents the averaged magnitude of the differences in the annual cycles of the GCM and reference data.

2.5. Ranking of the models

A ranking of the model simulations is made based on the user-defined weights for the focus regions, variables, seasons, and skill scores. The ranking method can be summarized as follows:

Fig. 2. Definition of the regions "North Europe" (red) and "Central Europe" (blue) used in the case studies (following Christensen et al., 2013).

Table 3
Weights used in case study 1 on climate change in North Europe (Section 3). WA and WB show the importance of the parameters in setup A and B, respectively. Very important equals weight 2, important weight 1, and not considered weight 0.

Parameter            | WA        | WB             | Type
North Europe         | very important | very important | Focus region
Global               | important | important      | Focus region
Temperature          | important | very important | Variable
Precipitation        | important | important      | Variable
Annual mean          | important | important      | Season
Winter (DJF)         | important | not considered | Season
Spring (MAM)         | important | not considered | Season
Summer (JJA)         | important | very important | Season
Autumn (SON)         | important | not considered | Season
Bias                 | important | very important | Skill score
Spatial correlation  | important | important      | Skill score
Spatial sd. ratio    | important | important      | Skill score
RMSE of annual cycle | important | important      | Skill score

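To make the skill scores (Section 2.4) and the weighted rank aggregation (Section 2.5) concrete, the sketch below computes the four scores for one model field and combines per-score ranks into a weighted rank sum. This is our Python illustration, not the GCMeval source (which is in R); in particular, turning the correlation and the standard deviation ratio into "lower is better" distances is our choice so that all scores rank in the same direction.

```python
import numpy as np

def skill_scores(gcm, ref, gcm_cycle, ref_cycle):
    """Four GCMeval-style skill scores for one model, variable and region.
    gcm, ref: 2-D climatological fields; *_cycle: 12-value mean annual cycles."""
    return {
        "bias": abs(gcm.mean() - ref.mean()),
        # 1 - spatial correlation, so that lower is better for every score
        "one_minus_corr": 1.0 - np.corrcoef(gcm.ravel(), ref.ravel())[0, 1],
        # distance of the spatial sd ratio sigma_gcm / sigma_ref from 1
        "sd_ratio_dist": abs(gcm.std() / ref.std() - 1.0),
        "rmse_annual_cycle": float(np.sqrt(np.mean((gcm_cycle - ref_cycle) ** 2))),
    }

def weighted_ranking(score_table, weights):
    """Steps 1-3 of Section 2.5: rank per score, weight, sum, re-rank.
    score_table: {model: {score_name: value}}; lower values rank better."""
    models = list(score_table)
    totals = {m: 0.0 for m in models}
    for score_name, w in weights.items():
        order = sorted(models, key=lambda m: score_table[m][score_name])
        for rank, m in enumerate(order, start=1):   # rank 1 = best
            totals[m] += w * rank
    final = sorted(models, key=lambda m: totals[m])
    return {m: r for r, m in enumerate(final, start=1)}

# Toy example: made-up scores for three models, bias weighted "very important".
scores = {"GCM-A": {"bias": 0.2, "rmse": 1.0},
          "GCM-B": {"bias": 0.5, "rmse": 0.4},
          "GCM-C": {"bias": 0.9, "rmse": 2.0}}
print(weighted_ranking(scores, {"bias": 2, "rmse": 1}))
```

In the tool, the score table spans every combination of focus region, variable, season, and skill score, and entries with weight 0 ("not considered") simply drop out of the sum.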

Fig. 3. Scatterplots showing the annual mean temperature and precipitation change from the present day (1981–2010) to the far future (2071–2100) in North Europe (see Fig. 2) under the RCP8.5 emission scenario. Each marker represents a GCM run from the CMIP5 ensemble. The larger points with black outline indicate a selected subset of models. The color scale represents the ranking of the models from best (green) to worst (pink). The orange and white box plots show the distributions of the full ensemble and the selected models, respectively. The panels show a) the 10 best ranked GCM runs for setup A, b) the 10 best ranked GCM runs for setup B, c) 10 GCMs selected to better represent the statistical characteristics of the full ensemble with the ranking from setup A.


1. A ranking of the model simulations is made for the individual skill scores for each region, variable, and season.
2. The rankings are then multiplied by the corresponding weights, and the sum is calculated for each model simulation.
3. A new ranking of the model simulations is made based on the sum of the weighted rankings.

3. Case studies

3.1. Case 1: Ensemble selection and evaluation

To demonstrate the value of the GCMeval tool, we show a possible application to select GCMs for a study of projected climate change in the region "North Europe" (Fig. 2). Let's assume we are trying to find suitable GCMs from the CMIP5 ensemble for a downscaling study focusing on changes in summer temperature, and we need the ensemble to be small enough to be processed with a regional climate model. Here we are selecting an ensemble of 10 projections, although depending on available resources a smaller or larger subset might have been chosen. We are testing two different setups: A) weighting all variables, seasons, and skill scores equally, and B) focusing on a specific variable (temperature), season (summer), and skill score (bias). In both setups, the primary focus region is "North Europe" with a weight of 2 (very important), while the global performance is included as a secondary region with weight 1 (important). For the other parameters, the settings are as listed in Table 3.

To derive the ranking, GPCP data was used as the reference for precipitation and ERA5 data for temperature. For A, the 10 best performing simulations include several CCSM4, CESM1-CAM5, CESM1-BGC, and EC-EARTH runs (Table 4). For B, focusing on summer temperature, the selection now includes several MPI-ESM-LR and EC-EARTH, as well as MPI-ESM-MR, CCSM4, CESM1-CAM5 and HadGEM2-ES runs.

The scatterplots of future climate change in North Europe (Fig. 3a and b) show that a selection based solely on the model ranking does not represent the full range of possible climate change. For A, the spread of the annual mean climate change (present day to far future, assuming RCP8.5) of the 10 best performing simulations was only 16% of the whole ensemble for temperature and 40% for precipitation. For B, the relative spread of the 10 best ranked simulations was 30% for temperature and 24% for precipitation. In both A and B, the 10 best ranked GCM runs underestimate the ensemble mean temperature change by almost 1 °C compared to the full CMIP5 ensemble.

Based on the combined information of the rankings of setup A and B and the scatterplot, we can manually select a new subset of models. Many different considerations may go into this selection. Here, we try to fulfill the following: i) conserving the statistical properties (spread and mean) of the temperature and precipitation change within the ensemble; ii) not including the worst ranked models; iii) not including more than one simulation from the same GCM. Based on these constraints, 10 GCMs are selected representing both setup A and B. The new ensemble includes models ranked from 1 to 57 (Table 5). The covered annual mean climate change in Northern Europe from the present to the far future is now 83% for temperature and 100% for precipitation compared to the whole CMIP5 ensemble (Fig. 3c). It thus includes the complete precipitation range but not the most extreme temperature changes.

Table 5
The 10 selected CMIP5 simulations following the considerations of case study 1 and their ranks for setup A and B (weights as in Table 3).

GCM             | rip    | Rank (setup A) | Rank (setup B)
BNU-ESM         | r1i1p1 | 57 | 44
CanESM2         | r5i1p1 | 44 | 37
CCSM4           | r3i1p1 | 2  | 8
EC-EARTH        | r9i1p1 | 5  | 5
FIO-ESM         | r1i1p1 | 47 | 43
GFDL-CM3        | r1i1p1 | 23 | 25
HadGEM2-ES      | r1i1p1 | 21 | 10
IPSL-CM5A-MR    | r1i1p1 | 31 | 30
MIROC-ESM-CHEM  | r1i1p1 | 46 | 49
MPI-ESM-LR      | r3i1p1 | 16 | 1

3.2. Case 2: "Post-hoc" evaluation of the EURO-CORDEX GCM ensemble

The EURO-CORDEX initiative is a coordinated effort to downscale global simulations over Europe (Jacob et al., 2014). Here we evaluate the range of future climate change in the subset of GCMs used in EURO-CORDEX 11 km (EUR11) RCP4.5 and RCP8.5 projections (Table 6). Our analysis focuses on the "Central Europe" and "global" regions for the "far future" (2071–2100) time horizon. Note that GCMeval is based on the CMIP5 projections available in the KNMI climate explorer (Table 1), which includes all GCMs but not all realizations used in EURO-CORDEX. This includes 9 of 12 simulations in the EUR11 ensemble (see Table 6).

Table 6
CMIP5 simulations in the EURO-CORDEX 11 km ensemble for emission scenarios RCP4.5 and RCP8.5. The rip is the realization, initialization method, and physics version.

CanESM2: r1i1p1
CNRM-CM5: r1i1p1, r8i1p1*
EC-EARTH: r1i1p1*, r3i1p1*, r12i1p1
HadGEM2-ES: r1i1p1
IPSL-CM5A-MR: r1i1p1
MIROC5: r1i1p1
MPI-ESM-LR: r1i1p1, r2i1p1
NorESM1-M: r1i1p1

* Simulations not yet available in GCMeval.

For RCP4.5, the spread in future climate change of the EUR11 ensemble covers 39% of the range of the CMIP5 ensemble for temperature and 35% for precipitation (Fig. 4a). The corresponding range of future climate change for the whole globe is 58% and 49% of the CMIP5 ensemble for temperature and precipitation, respectively (Fig. 4b).

For RCP8.5, the EUR11 ensemble covers 61% of the range of future temperature change and 61% of precipitation change in Central Europe compared to all CMIP5 ensemble members (Fig. 5a). For the whole globe, the covered range of projected climate change is 64% for temperature and 68% for precipitation (Fig. 5b).

4. Discussion

The ensemble selection includes many subjective choices, both in deciding regions and weights, and in how to use the resulting ranking and the spread of future climate change. There are many strategies, variables, and skill scores not included in the GCMeval tool that could have been considered and may have changed the ranking and selection. Our code is open source (https://github.com/metno/gcmeval) and can be further developed to include other parameters, strategies, metrics, and reference data sets for ensemble selection and evaluation.

Working with a small subset of models (e.g., 5–10) makes it difficult to preserve the statistical properties of a large ensemble of available GCMs. If for practical reasons it is unavoidable, one should be careful not to over-interpret statistical quantities like the standard deviation or quantiles when evaluating a small ensemble. In such cases, it might be more appropriate to consider the subset of model projections as a set of scenarios that sample the range of projections, rather than representing the whole ensemble. There are many possible strategies to maximize the spread of climate change within the subset of models, e.g., to select from the four corners and the center of the scatterplot (Lutz et al., 2016; Ruane and McDermid, 2017). However, since increasing temperatures


Fig. 4. Scatterplots showing the mean temperature and precipitation change from the present day (1981–2010) to the far future (2071–2100) following RCP4.5 in a)
Central Europe (see Fig. 2) and b) globally. Each marker represents a GCM run from the CMIP5 ensemble. The larger transparent markers with black outline indicate
GCMs currently downscaled in EURO-CORDEX. The purple and white box plots show the distributions of the full ensemble and the selected models, respectively.

are, at least on a global scale, associated with increasing precipitation, the wet-cold and dry-warm corners are not as well defined as the wet-warm and cold-dry corners. In regions where there is a strong correlation between temperature change and precipitation change (as in case study 1 of this paper), one might select models from the wet-warm and cold-dry end, and some runs from the middle. It would also be possible to implement an automated algorithm in GCMeval to select a subset of models, e.g., by clustering or taking the representation of past and future climate change into consideration. A drawback of automated selection is that it hides the subjective nature of the ensemble selection process.

Although the ranking of models provides some information about the ability of the GCMs to reproduce aspects of the past climate, it is not obvious how this ranking should be used. A good representation of present-day climate does not guarantee good quality in future projections (Räisänen, 2007; Knutti et al., 2010; McSweeney et al., 2015). Furthermore, the ranking is relative to the other models and does not alone indicate whether a model is "good enough" to be included. It is possible that all the models perform almost equally well, or equally poorly, and we could not tell from the ranking.

Due to the variety of user aims and interests, there is no objective way to define a single optimal sub-ensemble, even for a specific region. The grounds for selection should be clearly documented, and the selected models should be put in context with the whole ensemble. This can be done either by explicitly taking some extreme models into account, or by discussing the possible outcome when taking them into account.

The CMIP5 and CMIP6 ensembles are ensembles of opportunity, and the members are not all independent of each other (Knutti et al., 2013). They include several runs from the same GCMs, and different GCMs can be related because of shared code and development history. Ensuring independence would require a more in-depth model genealogy and clustering (e.g., as in Knutti et al., 2013) that is beyond the scope of our work. It is important to keep in mind that, to have model biases independent of each other, common formulations describing the basic physics should be shared among the models, while aspects which are more uncertain should be different. However, it may be difficult to say which parts of the models are part of a common core, and which are more uncertain and should be unique.

In the case studies presented, the limited coverage of climate change


Fig. 5. As Fig. 4, but for RCP8.5.

In the case studies presented, the limited coverage of climate change in the subsets is partly due to the FIO-ESM projections (r1i1p1, r2i1p1, and r3i1p1), which are located closer to the cold/dry corner of the scatterplots than the rest of the CMIP5 ensemble in North and Central Europe (Figs. 4a, 5a and 3). For instance, in Central Europe, the FIO-ESM simulations are the only ones to indicate a decreasing temperature under RCP4.5 (Fig. 4). In North Europe, these projections display considerably smaller changes in temperature and precipitation compared to the other models (Fig. 3). Thus, including or excluding FIO-ESM has a large impact on the range of projected climate change covered by the ensemble. The reason why FIO-ESM produced such different results compared to the other CMIP5 members, especially for North Europe, is the collapse of the Atlantic Meridional Overturning Circulation in the projections (Drijfhout et al., 2015). Although this may seem unlikely, and one could argue that such obvious outliers are not representative of the larger ensemble and should be excluded, it is not physically impossible. Hence, there is a strong case to include this possibility in the subset.

There are other interactive tools available for exploring climate model data. One of the largest recent efforts to provide online climate model analysis is the development of the CDS toolbox, built as part of the Climate Data Store (https://cds.climate.copernicus.eu). The CDS toolbox allows users to write their own applications for data analysis and visualization using Python, and in this sense it also acts as a development environment. The toolbox has been designed to serve a broad audience and, while examples of how to write applications exist, it requires some skill and knowledge of climate data and programming. Another example of a set of interactive tools is provided on the Climate Analytics web page (https://climateanalytics.org/tools), where easy-to-use web tools are offered for non-expert users. However, global model simulations are not covered by these applications, as they are designed for illustrating regional climate change for specific sectors and geographical locations. GCMeval fills a gap in the realm of online tools by providing on-the-fly visualization of performance and future projections from global climate models, with no programming required.

5. Conclusion

We present the interactive tool GCMeval with the intention of helping users of GCM data to make informed decisions about an ensemble selection. The tool demonstrates and quantifies two important points: i) how the selection of models influences the representation of future climate projections, ii) how the ranking of the models depends


on the focus region, season, variable, skill score, and the importance assigned to these aspects.

When there is a need to select GCM simulations from a large ensemble, for instance in impact or regional climate modelling, the choice of GCMs is a trade-off between good performance in the past and projected climate change. Selecting only the best-performing models will likely limit the spread of projected climate change, while considering the whole ensemble may not be possible, for instance due to computational limitations or unavailable data. GCMeval provides information about both the representation of past climate and future climate change, but it is ultimately up to the user to decide what to do with this information.

CRediT authorship contribution statement

Kajsa M. Parding: Software, Methodology, Validation, Formal analysis, Writing - original draft. Andreas Dobler: Software, Methodology, Validation, Formal analysis, Writing - original draft. Carol F. McSweeney: Conceptualization, Methodology, Writing - review & editing. Oskar A. Landgren: Software, Validation, Formal analysis, Writing - original draft. Rasmus Benestad: Supervision, Funding acquisition, Writing - review & editing. Helene B. Erlandsen: Writing - review & editing. Abdelkader Mezghani: Methodology, Software, Writing - review & editing. Hilppa Gregow: Project administration, Funding acquisition, Writing - review & editing. Olle Räty: Software, Methodology, Writing - review & editing. Elisabeth Viktor: Investigation, Writing - review & editing. Juliane El Zohbi: Investigation, Writing - review & editing. Ole B. Christensen: Writing - review & editing. Harilaos Loukos: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We acknowledge the World Climate Research Programme, which, through its Working Group on Coupled Modelling, coordinated and promoted CMIP5 and CMIP6. We thank the climate modeling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies who support CMIP5, CMIP6 and ESGF.

References

Adler, R.F., Huffman, G.J., Chang, A., Ferraro, R., Xie, P.P., Janowiak, J., Rudolf, B., Schneider, U., Curtis, S., Bolvin, D., et al., 2003. The version-2 Global Precipitation Climatology Project (GPCP) monthly precipitation analysis (1979–present). J. Hydrometeorol. 4, 1147–1167. https://doi.org/10.1175/1525-7541(2003)004<1147:TVGPCP>2.0.CO;2.
Cannon, A.J., 2015. Selecting GCM scenarios that span the range of changes in a multi-model ensemble: application to CMIP5 climate extremes indices. J. Clim. 28, 1260–1267. https://doi.org/10.1175/JCLI-D-14-00636.1.
Chang, W., Cheng, J., Allaire, J., Xie, Y., McPherson, J., 2017. shiny: Web Application Framework for R. URL: https://CRAN.R-project.org/package=shiny. R package version 1.0.3.
Christensen, J., Krishna Kumar, K., Aldrian, E., An, S.I., Cavalcanti, I., de Castro, M., Dong, W., Goswami, P., Hall, A., Kanyanga, J., Kitoh, A., Kossin, J., Lau, N.C., Renwick, J., Stephenson, D., Xie, S.P., Zhou, T., 2013. Climate Phenomena and their Relevance for Future Regional Climate Change. In: Stocker, T., Qin, D., Plattner, G.K., Tignor, M., Allen, S., Boschung, J., Nauels, A., Xia, Y., Bex, V., Midgley, P. (Eds.), Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, book section 14, pp. 1217–1308. https://doi.org/10.1017/CBO9781107415324.028.
Dee, D.P., Uppala, S.M., Simmons, A.J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M.A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A.C.M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A.J., Haimberger, L., Healy, S.B., Hersbach, H., Hólm, E.V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A.P., Monge-Sanz, B.M., Morcrette, J.J., Park, B.K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.N., Vitart, F., 2011. The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Q. J. R. Meteorol. Soc. 137, 553–597. https://doi.org/10.1002/qj.828.
Drijfhout, S., Bathiany, S., Beaulieu, C., Brovkin, V., Claussen, M., Huntingford, C., Scheffer, M., Sgubin, G., Swingedouw, D., 2015. Catalogue of abrupt shifts in intergovernmental panel on climate change climate models. Proc. Nat. Acad. Sci. 112, E5777–E5786.
Eyring, V., Bony, S., Meehl, G.A., Senior, C.A., Stevens, B., Stouffer, R.J., Taylor, K.E., 2016. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev. 9, 1937–1958.
Farjad, B., Gupta, A., Sartipizadeh, H., Cannon, A.J., 2019. A novel approach for selecting extreme climate change scenarios for climate change impact studies. Sci. Total Environ. 678, 476–485.
Fernández, J., Frías, M., Cabos, W., Cofiño, A., Domínguez, M., Fita, L., Gaertner, M., García-Díez, M., Gutiérrez, J., Jiménez-Guerrero, P., et al., 2018. Consistency of climate change projections from multiple global and regional model intercomparison projects. Clim. Dyn. 1–18.
Gleckler, P.J., Taylor, K.E., Doutriaux, C., 2008. Performance metrics for climate models. J. Geophys. Res. 113. https://doi.org/10.1029/2007JD008972.
Hawkins, E., Sutton, R., 2009. The potential to narrow uncertainty in regional climate predictions. Bull. Amer. Meteor. Soc. 90, 1095.
Hersbach, H., Dee, D., 2016. ERA5 reanalysis is in production. ECMWF Newsletter 147. ECMWF, Reading, United Kingdom. URL: www.ecmwf.int/sites/default/files/elibrary/2016/16299newsletterno147spring2016.pdf.
Immerzeel, W.W., Pellicciotti, F., Bierkens, M.F.P., 2013. Rising river flows throughout the twenty-first century in two Himalayan glacierized watersheds. Nat. Geosci. 6, 742–745. https://doi.org/10.1038/ngeo1896.
Jacob, D., Otto, J., Viktor, E., 2017. Deliverable 1.4: Analysis of survey. Technical Report C3S_D51_LOT4.1.4.4_201710_Analysis of Survey_v1, official reference number service contract: 2016/C3S_51_Lot4_FMI/SC1. Finnish Meteorological Institute, available upon request from C3S.
Jacob, D., Petersen, J., Eggert, B., Alias, A., Christensen, O.B., Bouwer, L.M., Braun, A., Colette, A., Déqué, M., Georgievski, G., et al., 2014. EURO-CORDEX: new high-resolution climate change projections for European impact research. Regional Environ. Change 14, 563–578.
Knutti, R., Furrer, R., Tebaldi, C., Cermak, J., Meehl, G.A., 2010. Challenges in combining projections from multiple climate models. J. Clim. 23, 2739–2758.
Knutti, R., Masson, D., Gettelman, A., 2013. Climate model genealogy: generation CMIP5 and how we got there. Geophys. Res. Lett. 40, 1194–1199. https://doi.org/10.1002/grl.50256.
Lutz, A.F., ter Maat, H.W., Shrestha, A.B., Wester, P., Immerzeel, W.W., 2016. Selecting representative climate models for climate change impact studies: an advanced envelope-based selection approach. Int. J. Clim. 36, 3988–4005. https://doi.org/10.1002/joc.4608.
McSweeney, C.F., Jones, R.G., 2016. How representative is the spread of climate projections from the 5 CMIP5 GCMs used in ISI-MIP? Clim. Serv. 1, 24–29. https://doi.org/10.1016/j.cliser.2016.02.001.
McSweeney, C.F., Jones, R.G., Lee, R.W., Powell, D.P., 2015. Selecting CMIP5 GCMs for downscaling over multiple regions. Clim. Dyn. 44, 3237–3260. https://doi.org/10.1007/s00382-014-2418-8.
Meehl, G.A., Covey, C., Delworth, T., Latif, M., McAvaney, B., Mitchell, J.F.B., Stouffer, R.J., Taylor, K.E., 2007. The WCRP CMIP3 multimodel dataset: a new era in climate change research. Bull. Am. Meteor. Soc. 88, 1383–1394.
Mezghani, A., Dobler, A., Benestad, R., Haugen, J.E., Parding, K.M., Piniewski, M., Kundzewicz, Z.W., 2019. Sub-sampling impact on the climate change signal over Poland based on simulations from statistical and dynamical downscaling. J. Appl. Meteorol. Climatol. https://doi.org/10.1175/JAMC-D-18-0179.1.
Molteni, F., Buizza, R., Marsigli, C., Montanin, A., Nerozzi, F., Paccagnella, T., 2006. A strategy for high-resolution ensemble prediction. I: Definition of representative members and global-model experiments. Q. J. R. Meteorol. Soc. 127, 2069–2094. https://doi.org/10.1002/qj.49712757612.
Moss, R.J., Edmonds, J.A., Hibbard, K.A., Manning, M.R., Rose, S.K., van Vuuren, D.P., Carter, T.R., Emori, S., Kainuma, M., Kram, T., Meehl, G.A., Mitchell, J.F.B., Nakicenovic, N., Riahi, K., Smith, S.J., Stouffer, R.J., Thomson, A.M., Weyant, J.P., Wilbanks, T.J., 2010. The next generation of scenarios for climate change research and assessment. Nature 463, 747–756.
Pierce, D.W., Barnett, T.P., Santer, B.D., Glecker, P.J., 2009. Selecting global climate models for regional climate change studies. PNAS 106, 8441–8446. https://doi.org/10.1073/pnas.0900094106.
R Core Team, 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Räisänen, J., 2007. How reliable are climate models? Tellus A: Dynamic Meteorology and Oceanography 59, 2–29.
Riahi, K., Rao, S., Krey, V., Cho, C., Chirkov, V., Fischer, G., Kindermann, G., 2011. RCP 8.5: a scenario of comparatively high greenhouse gas emissions. Clim. Change 109. https://doi.org/10.1007/s10584-011-0149-y.
Riahi, K., Van Vuuren, D.P., Kriegler, E., Edmonds, J., O'Neill, B.C., Fujimori, S., Bauer, N., Calvin, K., Dellink, R., Fricko, O., et al., 2017. The shared socioeconomic pathways and their energy, land use, and greenhouse gas emissions implications: an overview. Global Environ. Change 42, 153–168.
Ruane, A.C., McDermid, S.P., 2017. Selection of a representative subset of global climate models that captures the profile of regional changes for integrated climate impacts assessment. Earth Perspectives 4. https://doi.org/10.1186/s40322-017-0036-4.
Taylor, K.E., Stouffer, R.J., Meehl, G.A., 2012. An overview of CMIP5 and the experiment


design. Bull. Am. Meteorol. Soc. 93, 485–498. https://doi.org/10.1175/BAMS-D-11-00094.1.
Thomson, A.M., Calvin, K.V., Smith, S.J., Kyle, P., Volke, A., Patel, P., Delgado-Arias, S., Bond-Lamberty, B., Wise, M.A., Clarke, L.E., Edmonds, J.A., 2011. RCP4.5: a pathway for stabilization of radiative forcing by 2100. Clim. Change 109. https://doi.org/10.1007/s10584-011-0151-4.
Turco, M., Sanna, A., Herrera, S., Llasat, M.C., Gutiérrez, J.M., 2013. Large biases and inconsistent climate change signals in ENSEMBLES regional projections. Clim. Change 120, 859–869.
Viktor, E., Otto, J., Brune, M., Jacob, D., 2018. Data Evaluation for Climate Models, Key Survey results. Technical Report. DECM (C3S_51 Lot4). URL: https://www.gerics.de/imperia/md/content/csc/projekte/decm_survey_summary_shading_corr.pdf.
Warszawski, L., Frieler, K., Huber, V., Piontek, F., Serdeczny, O., Schewe, J., 2014. The inter-sectoral impact model intercomparison project (ISI-MIP): project framework. PNAS 111, 3228–3232. https://doi.org/10.1073/pnas.1312330110.
Zahid, M., El Zohbi, J., Viktor, E., Rechid, D., Schuck-Zøller, S., Keup-Thiel, E., Jacob, D., 2019. What does quality mean to climate data users/providers and how to enable them to evaluate the quality of climate model data and derived products? Submitted to Handbook of Climate Services, under review.
