Thesis
KIEL
Semester: 6
Student ID: 1110282
First Supervisor: Prof. Dr. Markus Haas
Second Supervisor: Prof. Dr. Stephan Reitz
Master’s Thesis
for the Master’s Program
MSc Quantitative Finance
September 2019
Contents

List of Abbreviations
1 Introduction
3 Methodology
4 Model implementation
4.1 SETAR
4.2 LSTAR
4.3 SLFN
5 Forecast results
6 Conclusion
References
Appendix A Tables
Appendix B Figures
List of Tables

5.15 P-values of Giacomini-White test with quadratic loss, LSTAR against AR
A.5 Jarque-Bera test on return
A.8 Threshold grid search in the SETAR model, DJI time series
A.9 Threshold grid search in the SETAR model, NASDAQ time series
A.10 Threshold grid search in the SETAR model, NYSE time series
A.11 Threshold grid search in the SETAR model, S&P time series
A.21 Threshold grid search in the LSTAR model, DJI time series
A.22 Threshold grid search in the LSTAR model, NASDAQ time series
A.23 Threshold grid search in the LSTAR model, NYSE time series
A.24 Threshold grid search in the LSTAR model, S&P time series
A.27 LSTAR parameter estimate for the NYSE time series
A.33 Grid search for SLFN model estimation and selection for DJI return
A.34 Grid search for SLFN model estimation and selection for NASDAQ return
A.35 Grid search for SLFN model estimation and selection for NYSE return
A.36 Grid search for SLFN model estimation and selection for S&P return
A.43 SETAR detailed in-sample fit

List of Figures

B.18 SLFN NASDAQ return forecasts comparison
List of Abbreviations

S&P Standard and Poor's
STAR Smooth Transition Autoregressive
TAR Threshold Autoregressive
1. Introduction
Research on stock return predictability has long been a focal point for academics and finance practitioners, and for obvious reasons. On the practitioners' side, models that provide reliable return forecasts are vital for advising on and enhancing investment strategies. On the academic side, the topic of stock return predictability leads to the efficient market hypothesis. In fact, the ability to understand the nature of stock return predictability has major implications for tests of market efficiency, in the sense that such understanding is useful for building realistic asset pricing models that better explain return time series (Rapach & Zhou 2013, p. 330).
Predicting stock returns has proven to be a very difficult task. Return series inherently contain an unpredictable component, so that even the best forecasting model can only explain a small part of their behaviour. Timmermann (2018, p. 2) argues that competition among market participants implies that if a successful model for predicting returns is discovered, it will be readily adopted by traders, and the dissemination of the model will cause prices to move unpredictably. Nevertheless, some studies suggest that stock returns are to some extent predictable, either from their own history or from publicly available information. Rational asset pricing theory, for instance, suggests that stock return predictability can result from exposure to time-varying aggregate risk; a model that successfully captures this time-varying aggregate risk premium will therefore remain successful over time (Rapach & Zhou 2013, p. 330).
be autocorrelated. This time-varying property of volatility is also referred to as volatility clustering. Moreover, raw returns appear to display little or no autocorrelation, meaning that the linear relation between consecutive returns is very small, without excluding the possibility of a nonlinear relation. Likewise, the existence of frequent structural breaks and behavioural changes in financial time series is well documented (see Franses and van Dijk 2000, p. 5-19 for more details).
These stylized facts of returns have created a need for models that better reflect such features. While the theory of linear models is well established, linear models fail to capture the nonlinear characteristics of financial time series and do not produce reliable forecasts. We therefore consider nonlinear time series models, which are still the subject of intensive research, not because we believe that they deliver the desired out-of-sample forecast performance, but simply because they can capture some of the nonlinear patterns in returns and might provide improved out-of-sample forecasts relative to linear models. In addition, we consider artificial neural network models, which have proven to yield useful results in a range of applications across various fields.
The main aim of this thesis is to compare the predictive power of some of the most prominent nonlinear time series models and artificial neural network models in forecasting stock index returns. More explicitly, we examine whether there is a predictive gain or loss in using nonlinear models in lieu of a benchmark linear model (the autoregressive model, AR) when forecasting returns of four major stock indexes, namely the DJI, the NASDAQ, the NYSE and the S&P 500. Our attention is restricted to multi-step point forecasts. We use three different models, namely the self-exciting threshold autoregressive (SETAR) model, the logistic smooth transition autoregressive (LSTAR) model and the single hidden layer feedforward network (SLFN). We restrict our attention to past returns as the only explanatory variable in order to study return predictability from its own history. The work is structured as follows: section 2 is dedicated to the topic of forecasting with nonlinear models and forecast comparison, section 3, the methodology, discusses the different models used to generate forecasts, section 4 concerns model implementation, the results are presented in section 5, and section 6 concludes.
where F(x_t; θ) is the skeleton of the model under consideration. The one-step ahead forecast is obtained, as in the linear model, as

r̂_{t+1} = E(r_{t+1} | I_t) = F(x_{t+1}; θ). (2.2)

Equation 2.2 is hence an unbiased forecast of r_{t+1} given the information set I_t. The relevant information is contained in x_{t+1} = (1, r_t, r_{t−1}, …, r_{t−(p−1)})′.
Turning to longer forecast horizons, obtaining E(r_{t+h} | I_t) for h > 1 becomes more involved. We use the two-step-ahead case to illustrate the problem. We have

r̂_{t+2|t} = E(r_{t+2} | I_t) = E{ F(x^f_{t+2}; θ) + ε_{t+2} | I_t } = E{ F(x^f_{t+2}; θ) | I_t } (2.3)

where x^f_{t+2} = (1, r̂_{t+1|t} + ε_{t+1}, r_t, …, r_{t−(p−2)})′. The exact expression for 2.3 is

r̂_{t+2|t} = E{ F(x^f_{t+2}; θ) | I_t } = ∫_{−∞}^{∞} F(x^f_{t+2}; θ) φ(z) dz. (2.4)
• The naïve approach: this approach comes down to simply setting ε_{t+1} = 0 and using the skeleton, but it yields biased forecasts. We have

r̂^n_{t+2|t} = F(x^{fn}_{t+2}; θ) (2.5)

where x^{fn}_{t+2} = (1, r̂_{t+1|t}, r_t, …, r_{t−(p−2)})′.
In the context of time series forecasting, there has been a gradual shift from forecasting with linear models to forecasting with nonlinear models. A number of authors have studied whether using nonlinear models can improve upon forecasts obtained from linear models, and the results are quite mixed.
classes of models: linear autoregressions with and without unit root pre-test,
exponential smoothing, ANN and STAR models. Autoregressions with unit
root pre-test achieve the best overall performance. However, the study sug-
gests that this performance can be increased if forecast combination with other
methods is used.
Bradley and Jansen (2004) model stock returns and industrial production as nonlinear, regime-dependent variables. Various nonlinear models are used to generate out-of-sample forecasts of both variables, and the results are compared to forecasts from a linear model. The finding is that the linear model performs as well as or better than any of the nonlinear specifications for stock returns, while for industrial production two of the nonlinear specifications provide better results than the linear model.
Teräsvirta, van Dijk, and Medeiros (2005) provide an empirical study of whether careful modelling can improve the forecast accuracy of nonlinear models relative to linear ones. 47 monthly macroeconomic variables of the G7 economies are examined and three models are considered: the linear autoregressive model, the smooth transition autoregressive model and artificial neural networks. The findings are mixed for the ANN, in the sense that ANNs obtained using Bayesian regularization perform better than the AR, but only in the long term. On the other hand, the STAR model outperforms linear autoregressive models, demonstrating that careful modelling of the nonlinear model is necessary.
Lim and Hooy (2013) study the source and persistence of nonlinear predictability in the stock markets of the G7 countries. Evidence of local nonlinear predictability is detected by applying the BDS test to AR-filtered returns in rolling estimation windows. In order to identify the source of the nonlinear predictability, the BDS test is then applied to AR-GARCH-filtered returns in rolling windows. Even after taking conditional heteroskedasticity into account, evidence of nonlinear predictability emerges during some short time intervals in all markets, thus contradicting the weak form of the market efficiency hypothesis.
root mean squared forecast error (RMSFE) and the mean absolute forecast error (MAFE), respectively given by

RMSFE = √( P^{−1} Σ_{t=T+1}^{T+P} e_t² ) (2.8)

MAFE = P^{−1} Σ_{t=T+1}^{T+P} |e_t| (2.9)
where e_t is the forecast error. The better model is the one with the smaller loss. In order to make the comparison easy, we use an approach based on relative accuracy, which means that for each model we compute

R(r_t, r̂_t) = l(r_t, r̂_t) / L(r_t, r̂_t) (2.10)

where l(r_t, r̂_t) is the loss from a given nonlinear model and L(r_t, r̂_t) is the loss from the benchmark model. A value R(r_t, r̂_t) > 1 means that the AR model performs better than the compared nonlinear model, and vice versa.
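As an illustration, the loss measures 2.8 and 2.9 and the relative accuracy ratio 2.10 can be computed as follows. All forecast values are hypothetical, and the zero forecast merely stands in for the benchmark model:

```python
import numpy as np

def rmsfe(actual, forecast):
    """Root mean squared forecast error, equation (2.8)."""
    e = np.asarray(actual) - np.asarray(forecast)
    return np.sqrt(np.mean(e ** 2))

def mafe(actual, forecast):
    """Mean absolute forecast error, equation (2.9)."""
    e = np.asarray(actual) - np.asarray(forecast)
    return np.mean(np.abs(e))

actual = np.array([0.01, -0.02, 0.005, 0.0])
f_nonlinear = np.array([0.008, -0.015, 0.004, 0.001])  # toy nonlinear forecasts
f_ar = np.array([0.0, 0.0, 0.0, 0.0])                  # stand-in benchmark

# Relative accuracy, equation (2.10): loss(nonlinear) / loss(benchmark).
R = rmsfe(actual, f_nonlinear) / rmsfe(actual, f_ar)
# R > 1 would mean the AR benchmark wins under this loss.
```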
the forecaster at the time of the prediction, such as the estimation procedure to choose. Evaluating forecasting methods rather than models is important because all the elements of the method can affect forecast performance. The main reason why the GW test is used in this thesis, and its main advantage, is that it can be applied to compare forecasts from nested models. We should keep in mind, however, that the GW test assumes that forecasts are obtained using rolling window estimators, and this can lead to a substantial decrease in statistical power (Elliott and Timmermann 2016, p. 104). To the best of my knowledge, no published paper employs the GW test to compare forecasts from nonlinear models against forecasts from a benchmark nested linear model, and this research is probably the first to do so.
So far, the methods presented to assess forecast accuracy are based on measures of the distance between forecasts and realizations; put differently, they concentrate on the magnitude of forecast errors and can be considered quantitative measures of forecast accuracy. However, regime switching models, built around the idea of moving from one state of the world to another, may be better suited to predicting the direction of future movements of a time series. One way of capturing this idea is to use an evaluation criterion based on how often the sign of the return is correctly predicted. We refer to such a procedure as a qualitative measure of forecast accuracy and consider two market timing tests to evaluate how well the models predict return movements: the Directional Accuracy (DA) test of Pesaran and Timmermann (1992) and the Excess Profitability (EP) test of Anatolyev and Gerko (2005). Both tests are described below. Such methods of evaluating forecasts can be of interest to investors who care about the future direction of returns rather than the magnitude of their changes.
opposite.
H_0 : E[ L_{t+h}(Y_{t+h}, f̂_{m,t}) − L_{t+h}(Y_{t+h}, ĝ_{m,t}) | I_t ] ≡ E[ ΔL_{m,t+h} | I_t ] = 0 (2.11)
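As an illustration of this null hypothesis, a minimal unconditional version of the test, with an instrument set containing only a constant, collapses to a t-test that the mean loss differential is zero. This is only a sketch: the full conditional GW test adds further instruments such as lagged loss differentials, and uses a HAC variance estimator for horizons h > 1:

```python
import numpy as np
from scipy import stats

def gw_test_unconditional(loss_a, loss_b):
    """Simplified, unconditional Giacomini-White-style test (instruments = {1}):
    a t-test that E[dL] = 0, where dL is the loss differential. The full
    conditional test would regress dL on a richer instrument set and use a
    HAC variance for multi-step forecasts."""
    d = np.asarray(loss_a) - np.asarray(loss_b)
    n = len(d)
    t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df=n - 1))
    return t_stat, p_value

# Toy losses with a loss differential that averages exactly to zero.
loss_a = [1.0, 2.0, 3.0, 4.0]
loss_b = [2.0, 1.0, 4.0, 3.0]
t_stat, p_value = gw_test_unconditional(loss_a, loss_b)
```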
where P̂ is the proportion of times that the sign of the true withheld value is correctly predicted, P̂_* is the estimate of the probability of correctly predicting the sign of the true withheld value, and V(·) is a consistent estimate of the sample variance. Note that if all the signs of the true withheld values, or all the signs of the forecasts, are the same, the PT statistic is undefined. For more details, we refer to Pesaran & Timmermann (1992).
EP = (A_T − B_T) / √(V̂_EP) ∼ N(0, 1) asymptotically (2.15)

where B_T = ( (1/T) Σ_{t=1}^{T} sign(r̂_t) ) ( (1/T) Σ_{t=1}^{T} r_t ), A_T is the expected one-period return of the trading strategy that buys the stock when the predicted return is positive and sells it when the predicted return is negative, and V̂_EP is the estimate of the variance of A_T − B_T (see Anatolyev & Gerko (2005) for more details).
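The ingredients A_T and B_T of the EP statistic can be sketched as follows. The variance estimate V̂_EP is omitted, and all numbers are invented toy values:

```python
import numpy as np

def excess_profitability(actual, forecast):
    """Sketch of the Anatolyev-Gerko EP ingredients: A_T is the mean return
    of the sign-trading strategy (buy on positive forecast, sell on negative),
    B_T is its expected value under no predictability (equation (2.15)).
    The studentizing variance V_EP is deliberately omitted here."""
    r = np.asarray(actual)
    s = np.sign(forecast)
    A_T = np.mean(s * r)             # trading-strategy average return
    B_T = np.mean(s) * np.mean(r)    # benchmark term
    return A_T - B_T

actual = np.array([0.01, -0.02, 0.03])     # realized returns (toy)
forecast = np.array([0.005, -0.01, 0.002]) # forecasts with correct signs
ep = excess_profitability(actual, forecast)
```

A positive value indicates that trading on the forecast signs earns more than the no-predictability benchmark; the full test divides by √(V̂_EP) to obtain the asymptotically standard normal statistic.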
3. Methodology
This section reviews the different models whose forecasts are studied. For each model, three parts are discussed: model representation, model estimation and model selection. The logarithmic return r_t is employed, calculated as

r_t = ln(P_t) − ln(P_{t−1}) (3.1)
where Pt is the stock price at time t. Throughout this thesis, the stock index
return will often simply be referred to as return.
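As a minimal illustration, equation 3.1 applied to a toy price series:

```python
import numpy as np

# Equation (3.1): r_t = ln(P_t) - ln(P_{t-1}), on an invented price series.
prices = np.array([100.0, 101.0, 99.5, 100.5])
returns = np.diff(np.log(prices))  # one fewer observation than prices
```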
where φ_j′ = (φ_{0,j}, φ_{1,j}, …, φ_{p,j}) and x_t = (1, r_{t−1}, …, r_{t−p})′. From this it is clear that estimators of the parameter φ = (φ_1′, φ_2′)′ in the two-regime switching model can be obtained by CLS as

φ̂(c) = ( Σ_{t=1}^{n} x_t(c) x_t(c)′ )^{−1} ( Σ_{t=1}^{n} x_t(c) r_t ) (3.4)
for a given threshold value c. Moreover, x_t(c) = (x_t′ I[r_{t−d} ≤ c], x_t′ I[r_{t−d} > c])′ (see Franses and van Dijk (2000, p. 84) for more details). The threshold value c is chosen so that the residual variance is minimized, i.e.

ĉ = argmin_{c ∈ C} σ̂²(c) (3.5)

with C being the set of all allowable threshold values. C should be chosen such that each regime contains enough observations; a popular choice is to leave at least 15% of the observations in each regime. Chan (1993) shows that this procedure produces a consistent estimate of c. Equation 3.4 then becomes

φ̂(ĉ) = ( Σ_{t=1}^{n} x_t(ĉ) x_t(ĉ)′ )^{−1} ( Σ_{t=1}^{n} x_t(ĉ) r_t ) (3.6)
G(r_{t−1}; γ, c) = 1 / ( 1 + exp(−γ [r_{t−1} − c]) ) (3.8)

In the LSTAR model, the focus lies in estimating the parameter vector θ = (φ_1′, φ_2′, γ, c)′. The estimation is done by nonlinear least squares (NLS), i.e.

θ̂ = argmin_θ Σ_{t=1}^{n} [y_t − F(x_t; θ)]². (3.9)
where x_t(γ, c) = (x_t′ (1 − G(y_{t−1}; γ, c)), x_t′ G(y_{t−1}; γ, c))′. In order to find the optimal estimates of γ and c, we perform a two-dimensional grid search over different combinations of γ and c and select the pair of estimates for which the residual variance is minimized. For the threshold variable, the delay is set to d = 1, for the same reason as in the SETAR model, and it is incremented by one whenever the algorithm for obtaining estimates of γ and c does not converge.
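The two-dimensional grid search can be sketched as follows, exploiting the fact that conditional on (γ, c) the LSTAR model is linear in φ, so the residual variance at each grid point is available by OLS. The grid values and data below are placeholders:

```python
import numpy as np

def lstar_grid_nls(r, gammas, cs):
    """Two-dimensional grid search for (gamma, c) in an LSTAR(1) sketch:
    for every grid point, compute the logistic transition (equation (3.8)),
    concentrate out phi by OLS, and keep the pair minimising the
    residual variance."""
    y, lag = r[1:], r[:-1]
    best = None
    for g in gammas:
        for c in cs:
            G = 1.0 / (1.0 + np.exp(-g * (lag - c)))  # transition function
            X = np.column_stack([(1 - G), (1 - G) * lag, G, G * lag])
            phi, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ phi
            s2 = resid @ resid / len(y)
            if best is None or s2 < best[0]:
                best = (s2, g, c)
    return best  # (residual variance, gamma, c)

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 300)  # placeholder data; in practice, a return series
s2, g_hat, c_hat = lstar_grid_nls(x, gammas=[1, 5, 10], cs=[-0.5, 0.0, 0.5])
```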
For both the SETAR and the LSTAR model, we assume that the residuals are standard normally distributed, so that the estimates can be interpreted as maximum likelihood (ML) estimates (Franses and van Dijk 2000, p. 84 and p. 90). When it comes to lag order selection, since our aim is forecasting, we use the Akaike information criterion (AIC). This approach is preferred over the existing alternative of studying the ACF and the PACF because it takes into account lags that are jointly significant. Given upper bounds p_1 and p_2 for the number of lags in each regime and given the set C, the selected lag order is the one that minimizes the AIC. An obvious drawback of this approach is that it is computationally demanding, as the model has to be estimated for many different combinations of p_1 and p_2. Lastly, the BFGS algorithm is applied to estimate both models.
The model described by 3.11 consists of three different layers. First, an input layer consisting of input units, which are multiplied by connection strengths γ_{i0}. The second layer is the hidden layer, composed of q hidden units with activation functions G(·). The third, the output layer, is in our case the response variable in equation 3.11.
r_t = g(x_t, ξ) + η_t (3.12)

where g(x_t, ξ) is a continuous function, it can be shown that 3.11 can approximate any function g(x_t, ξ) to any desired degree of accuracy, provided the number of hidden units is sufficiently large. Mathematically, rewriting 3.11 as

r_t = F(x_t; θ) + ε_t (3.13)

it can be proved that for any continuous function g(x_t, ξ), every compact subset K of R^K, and every δ > 0, there is an ANN F(x_t; θ) such that (for references and details see Franses and van Dijk (2000, p. 208), Cybenko (1989, p. 308-312), Hornik, Stinchcombe, and White (1990, p. 556-557)).
Consequently, the SLFN can be used to approximate any nonlinear relationship between r_t and its lagged values. Parameter estimation can be accomplished by minimizing the residual sum of squares,

θ̂ = argmin_θ Σ_{t=1}^{n} [y_t − F(x_t; θ)]² (3.16)
Any conventional nonlinear least squares algorithm can be used to solve 3.16 and obtain estimates of θ. While it is common to solve 3.16 by residual backpropagation, that approach requires carefully choosing a stopping rule for training and can therefore easily lead to overfitting; we therefore stick to the BFGS algorithm, as with the TAR models. It is also important to point out that ANN models are usually thought of as approximation models rather than models that capture the underlying data generating process. Consequently, ANN models are inherently misspecified (Franses and van Dijk 2000, p. 217-218). Another drawback of ANN models is the risk of overfitting: by increasing the number of hidden units in the hidden layer, it is possible to obtain an almost perfect in-sample fit, but a perfect in-sample fit does not guarantee improved out-of-sample forecast performance. For model selection, no universal rules currently exist for choosing the most appropriate model in practical applications. We come back to this issue later on.
Finally, for the activation function G(·), we use the logistic function given by

G(z_t; γ) = 1 / ( 1 + exp(−γ z_t) ) (3.17)
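A hypothetical Python sketch of an SLFN with logistic hidden units, fitted by minimizing the residual sum of squares 3.16 with BFGS as described above. The weight packing and the simulated data are invented for illustration, with γ absorbed into the hidden-layer weights:

```python
import numpy as np
from scipy.optimize import minimize

def slfn_forecast(theta, X, q):
    """Single-hidden-layer feedforward network with logistic activations
    (equation (3.17) with gamma absorbed into the weights). theta packs
    q + 1 output weights followed by q * (p + 1) hidden-unit weights."""
    p = X.shape[1]
    beta = theta[: q + 1]                       # output bias + weights
    W = theta[q + 1 :].reshape(q, p + 1)        # hidden-unit weights
    Z = np.column_stack([np.ones(len(X)), X])   # add intercept to inputs
    H = 1.0 / (1.0 + np.exp(-Z @ W.T))          # logistic hidden units
    return beta[0] + H @ beta[1:]

def fit_slfn(X, y, q=2, seed=0):
    """Minimise the residual sum of squares (equation (3.16)) with BFGS,
    the approach used in the thesis instead of backpropagation."""
    rng = np.random.default_rng(seed)
    theta0 = rng.normal(0, 0.1, (q + 1) + q * (X.shape[1] + 1))
    obj = lambda th: np.sum((y - slfn_forecast(th, X, q)) ** 2)
    return minimize(obj, theta0, method="BFGS").x

# Toy nonlinear data; in practice X would hold lagged returns.
rng = np.random.default_rng(3)
X = rng.normal(0, 1, (200, 2))
y = np.tanh(X[:, 0]) - 0.5 * X[:, 1] + rng.normal(0, 0.1, 200)
theta_hat = fit_slfn(X, y, q=2)
fitted = slfn_forecast(theta_hat, X, 2)
```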
4. Model implementation
in this case is 19 days) is calculated for each stock index. The return data are divided into two sets: the first, from 01/2000 to 12/2017, is used for model specification and estimation, and the second, covering the forecast period of January 2018, is used to evaluate the predictive accuracy of the models. Figure 4.1 shows plots of the different return series, in which the autocorrelated volatility (volatility clustering) can be observed.
and employ the BDS test for this purpose. This test uses the correlation integral to analyse the spatial dependence of the series by embedding the observed data in m-dimensional space. The results of the test are reported in tables A.1 to A.4 in the appendix. For all four time series, the test strongly rejects the null hypothesis of linearity for all combinations of the embedding dimension m and epsilon. The next important property investigated is stationarity, tested through the augmented Dickey-Fuller test and the Phillips-Perron test. Testing stationarity is relevant to our work because regressions involving nonstationary data are spurious and make no sense; nonstationarity implies a permanent deviation from equilibrium, which is hard to interpret economically. P-values for both tests are reported in table A.6 of the appendix, and the null of nonstationarity is rejected at the 5% significance level.
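The core regression behind the augmented Dickey-Fuller test can be sketched as follows. This is a simplified version without augmentation lags; the critical values of the statistic are nonstandard and omitted here (roughly −2.86 at the 5% level with a constant):

```python
import numpy as np

def dickey_fuller_tstat(y):
    """t-statistic of rho in the regression dy_t = alpha + rho * y_{t-1} + e_t,
    the core of the (augmented) Dickey-Fuller unit root test. Strongly
    negative values speak against a unit root, i.e. for stationarity."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ b
    s2 = resid @ resid / (len(dy) - 2)           # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)            # OLS coefficient covariance
    return b[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(4)
returns = rng.normal(0, 0.01, 500)   # stationary, return-like series
walk = np.cumsum(returns)            # random walk (unit root), price-like
t_stationary = dickey_fuller_tstat(returns)
t_walk = dickey_fuller_tstat(walk)
```

The stationary series yields a strongly negative statistic, while the random walk stays near the unit-root region, mirroring the distinction between prices and returns.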
Table 4.1 reports some descriptive statistics of the data. The kurtosis measure in particular suggests excess kurtosis, and the estimated skewness reflects the asymmetry discussed earlier, stemming from the fact that large negative returns occur more often than large positive returns. The null hypothesis of normality was tested against the alternative of nonnormality through the Jarque-Bera test, which uses the fact that a normally distributed random variable has zero skewness and a kurtosis equal to 3. The null hypothesis of normality is rejected at the 5% significance level, meaning that the return series are not normally distributed. The results of the Jarque-Bera test are reported in table A.5 in the appendix. Moreover, Q-Q plots and histograms comparing the empirical distribution of returns to the normal distribution, visualizing the nonnormality, are reported in figures B.1 and B.2 in the appendix. The ACF and PACF are shown in figures B.7 and B.8 in the appendix. The immediate observation is the smoothly decaying autocorrelations, a property that can be well captured by autoregressive models. After examining these statistical properties of the data, we now turn to the implementation of the different models.
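The Jarque-Bera logic can be illustrated with scipy on simulated data, where Student-t draws mimic the fat tails typical of return series:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
normal_sample = rng.normal(0, 1, 2000)           # zero skew, kurtosis near 3
heavy_tailed = rng.standard_t(df=3, size=2000)   # fat tails, return-like

# Jarque-Bera jointly tests skewness = 0 and kurtosis = 3.
jb_n, p_n = stats.jarque_bera(normal_sample)
jb_t, p_t = stats.jarque_bera(heavy_tailed)
# The heavy-tailed sample produces a far larger statistic and a tiny p-value.
```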
4.1 SETAR
Put differently, we test the null of linearity against the alternative of two-regime SETAR nonlinearity. The main issue in testing for SETAR-type nonlinearity springs from a nuisance parameter that is unidentified under the null hypothesis: the SETAR model contains an extra parameter, the threshold, which is not restricted under the null hypothesis and which is not present in the linear model. Thus, the asymptotic distribution of the test statistic tends to be non-standard, with no analytical expression available, as conventional statistical theory cannot be applied. Chan (1991) defines a likelihood ratio to test the restriction in the null hypothesis. Using this threshold nonlinearity test, the null hypothesis of linearity is strongly rejected at the 5% significance level for all return time series (see table A.7 in the appendix). We now proceed to the next stage of estimating the SETAR model, following the method outlined in the methodology section.
In order to estimate the model, we first need to specify the maximum autoregressive order for both regimes. A high maximum order would be beneficial in the sense that it provides a wider scope for model selection. However, this comes at a considerable computational cost, since the estimation of a SETAR model involves grid approximations to find the optimal threshold c. Therefore, the maximum order for both regimes is set to 4. The algorithm employed to estimate the model searches a wide range of possible threshold values, keeping at least 15% of the observations in each regime as suggested by Chan (1993). The AIC is used to select the most appropriate model. The results of the grid search for all return series are reported in tables A.8 to A.11 in the appendix, and plots of the grid searches are reported in figure B.3 in the appendix. In figure 4.2, we show regime-switching plots of all return time series. The SETAR model is estimated via CLS and the results are reported in tables A.12 to A.15 in the appendix.
Figure 4.2: Regime-switching plots of the different return time series (columnwise: DJI, NASDAQ, NYSE, S&P) in the SETAR model
4.2 LSTAR
The first step in modelling with the LSTAR consists of testing for the existence of LSTAR nonlinearity in the data. Again, the main issue in this procedure is the unidentified nuisance parameter problem encountered in the case of the SETAR model, which in the LSTAR model stems from the fact that the threshold c and the parameter γ in the logistic function are not identified under the null hypothesis. Luukkonen, Saikkonen and Teräsvirta (1988) circumvent this problem by using a third-order Taylor expansion of the logistic function around γ = 0; the logic behind setting γ = 0 is that for γ = 0 the LSTAR model collapses to an AR model. The auxiliary regression can be written as

where X_t = (1, y_{t−1}, y_{t−2}, …, y_{t−p}), z_t is the threshold variable and the β_i are functions of the original model parameters. Thus, the null of linearity against the alternative of LSTAR nonlinearity can be written as
H_0 : γ = 0
H_1 : γ ≠ 0 (4.4)
autoregressive order is 4. The results of the grid search for all return series are reported in tables A.21 to A.24 in the appendix. In figure 4.3, we present regime-switching plots of all return time series. The LSTAR model is estimated via NLS and the results are reported in tables A.25 to A.28 in the appendix.
Figure 4.3: Regime-switching plots of the different return time series (columnwise: DJI, NASDAQ, NYSE, S&P) in the LSTAR model
none could be validated by the Eitrheim-Teräsvirta test for remaining nonlinearity, suggesting that considering additional regimes would be beneficial in terms of exploiting the nonlinearity in the data. Results of the test are reported in tables A.29 to A.32.
4.3 SLFN
For model specification, as pointed out before, there are no universally accepted schemes. Two decisions have to be made, namely the number of input units and the number of neurons. For the number of input units, we argue that return time series display day-of-the-week seasonality and therefore set it equal to five, each input unit representing a trading day of the week (Saturday and Sunday are excluded). The next step is to decide on the number of neurons. For the fixed number of input units, 5 lags in this case, the SLFN was estimated one hundred times, with the number of neurons varying from 1 to 100. The estimated models were subsequently tested with Teräsvirta's neural network test for neglected nonlinearity, and the null hypothesis of linearity in mean, against the alternative of nonlinearity in mean, was strongly rejected at the 5% significance level for every specification of the SLFN, with all 100 p-values arbitrarily close to zero. Results of this experiment are reported in table A.38 in the appendix. Such results do not come as a surprise if we recall from Franses and van Dijk (2000, p. 217-218) that ANNs are inherently misspecified. Teräsvirta's neural network test for neglected nonlinearity uses the same trick as the LSTAR linearity test to circumvent the nuisance parameter problem: a Taylor expansion of the activation function (more details on the test are provided in Teräsvirta, Lin, and Granger (1993)). This approach would have allowed us to select the model which most strongly fails to reject the null of linearity in mean (highest p-value). Unfortunately, all models are misspecified according to this test. The Jarque-Bera test also shows that the residuals are not normally distributed (table A.37).
5. Forecast results
As noted previously, multi-step forecasts in nonlinear time series models are obtained via four different methods: naïve, Monte Carlo, bootstrap and block bootstrap. We also recall that we have defined quantitative measures of fit, which include R², adjusted R², the RMSFE, the MAFE and the GW test, and qualitative measures of fit, which include the directional accuracy test and the excess profitability test. R² and adjusted R² measure in-sample fit, while the others measure out-of-sample fit. In total, 52 forecasts were generated: 16 forecasts through each nonlinear model and 4 forecasts through the AR model.
First, the SETAR model was fitted to the data and forecasts were obtained. R² and adjusted R² in the AR model were greater than in the SETAR model for all return time series, suggesting that the AR model explains the variation in returns better than the SETAR model; these results can be found in table 5.1. Turning to the out-of-sample fit, the SETAR model uniformly dominates the AR model in the naïve approach, both in terms of RMSFE and MAFE.
For the bootstrap, the SETAR dominates again, except for the MAFE of the DJI return. In the block bootstrap, there is no clear winner, as the results are mixed. On the other hand, the AR model provides better forecasts than the SETAR model in terms of MAFE in the Monte Carlo approach, while the results are mixed for the RMSFE. The GW test with a quadratic loss function fails to reject the null of equal predictive ability of all the SETAR forecasts compared to the forecasts from the AR model at the 5% significance level. The same result is found with the absolute loss function, with the only exception being the naïve forecast of the NYSE return. This suggests that the gain or loss in precision from forecasting returns with the SETAR model rather than the AR model is not significant. The results of the relative loss are reported in tables 5.2 to 5.5 and the results of the GW test in tables 5.6 and 5.7.
Table 5.1: The in-sample fit differential between the SETAR and AR models suggests that the AR model explains a bigger share of the variance in returns than the SETAR model (SETAR-AR)
Table 5.2: Relative out-of-sample fit (loss from SETAR-naïve/loss from AR)
Table 5.3: Relative out-of-sample fit (loss from SETAR-bootstrap/loss from AR)
Table 5.4: Relative out-of-sample fit (loss from SETAR-block bootstrap/loss from AR)
Table 5.5: Relative out-of-sample fit (loss from SETAR-Monte Carlo/loss from AR)
Table 5.6: P-values of Giacomini-White test with quadratic loss, SETAR against AR
Table 5.7: P-values of Giacomini-White test with absolute loss, SETAR against AR
Table 5.9: P-values of excess profitability test on SETAR and AR forecasts
Turning to the LSTAR model, the results are quite mixed for the in-sample fit. The AR model performs better than the LSTAR model in explaining the variation in the DJI and NASDAQ returns for both R² and adjusted R², while the LSTAR model dominates for the NYSE and S&P returns. In-sample fit results are provided in table 5.10. Coming to the out-of-sample forecasts,
forecasts from the AR model uniformly dominate the LSTAR naïve approach in terms of both RMSFE and MAFE. The results obtained with the bootstrap favour the LSTAR in terms of MAFE, while there is no winner in terms of RMSFE. In the block bootstrap and Monte Carlo approaches, there is also no winner. The first impression from these quantitative measures thus seems to slightly favour the AR model in general. Most importantly, however, the GW test using a quadratic loss function tells us that the loss differentials between the forecasts from the two models are not significant at the 5% significance level. The same result is found for the absolute loss function, with the only exception being the naïve forecast of the DJI. Again, the conclusion is similar to the SETAR case: overall, the loss differential between the forecasts from the two models is not significant. Results for the RMSFE and MAFE can be found in tables 5.11 to 5.14, and the p-values of the GW test are reported in tables 5.15 and 5.16 for the quadratic and absolute loss functions respectively.
Table 5.10: For the in-sample fit differential between the LSTAR and AR models, the results are rather mixed
Table 5.11: Relative out-of-sample fit (loss from LSTAR-naïve/loss from AR)
Table 5.12: Relative out-of-sample fit (loss from LSTAR-bootstrap/loss from AR)
Table 5.13: Relative out-of-sample fit (loss from LSTAR-block bootstrap/loss from AR)
Table 5.14: Relative out-of-sample fit (loss from LSTAR-Monte Carlo/loss from AR)
Table 5.15: P-values of Giacomini-White test with quadratic loss, LSTAR against AR
Table 5.16: P-values of Giacomini-White test with absolute loss, LSTAR against AR
Qualitative measures of out-of-sample fit tell a similar story. The
directional accuracy test fails to reject the null of independence between
actual and predicted returns for all the data at the 5% significance
level, meaning that the SLFN also fails to predict the future signs of
returns. The null of conditional independence is likewise not rejected in
the excess profitability test, meaning that no profit can be generated
solely from the signs of the generated forecasts. P-values of the
directional accuracy and the excess profitability test are reported in
tables 5.26 and 5.27 respectively.
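The directional accuracy test referred to above is the nonparametric test of Pesaran and Timmermann (1992). A minimal Python sketch of its statistic, on hypothetical simulated data (the thesis computations were done in R), is:

```python
import numpy as np
from scipy import stats

def pesaran_timmermann(actual, forecast):
    """Pesaran-Timmermann (1992) directional accuracy test.
    Under the null, actual and predicted signs are independent;
    the statistic is asymptotically standard normal."""
    y = np.asarray(actual)
    f = np.asarray(forecast)
    n = y.size
    p_hat = np.mean(np.sign(y) == np.sign(f))   # observed hit rate
    py = np.mean(y > 0)
    pf = np.mean(f > 0)
    p_star = py * pf + (1 - py) * (1 - pf)      # hit rate under independence
    var_hat = p_star * (1 - p_star) / n
    var_star = ((2 * py - 1) ** 2 * pf * (1 - pf)
                + (2 * pf - 1) ** 2 * py * (1 - py)
                + 4 * py * pf * (1 - py) * (1 - pf) / n) / n
    # Note: var_hat - var_star can be non-positive in degenerate cases
    # (e.g. all signs identical); typical return data avoid this.
    stat = (p_hat - p_star) / np.sqrt(var_hat - var_star)
    p_value = stats.norm.sf(stat)  # one-sided: accuracy above chance
    return stat, p_value
```

A forecast that tracks the actual series closely produces a large positive statistic, while sign-independent forecasts leave the null unrejected, which is the outcome reported above.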
Table 5.19: In-sample fit differential between the SLFN and AR models; the results are
rather mixed
Table 5.20: Relative out-of-sample fit (loss from SLFN-naïve/loss from AR)
Table 5.21: Relative out-of-sample fit (loss from SLFN-bootstrap/loss from AR)
Table 5.22: Relative out-of-sample fit (loss from SLFN-block bootstrap/loss from AR)
Table 5.23: Relative out-of-sample fit (loss from SLFN-Monte Carlo/loss from AR)
Table 5.24: P-values of Giacomini-White test with quadratic loss, SLFN against AR
Table 5.25: P-values of Giacomini-White test with absolute loss, SLFN against AR
In summary, for the in-sample fit, the AR model did better than the SETAR
model in explaining volatility in the return time series, while against
the LSTAR and SLFN models the results are rather mixed. Turning to the
out-of-sample fit, and most importantly, the GW test suggests that overall
none of the nonlinear models forecasts returns better or worse than the
AR model, as the null of equal predictive ability could not be rejected in
more than 90% of the cases. All relative measures of out-of-sample
performance are also very close to 1, already hinting at equal predictive
accuracy. Furthermore, the directional accuracy and excess profit tests
suggest that future return signs cannot be forecast reliably and that no
profit can be derived by basing investment decisions solely on the
predicted return signs. These results are in line with those of other
authors who have studied the same topics, such as Bradley and Jansen
(2004) and Ferrara, Marcellino, and Mogliani (2015), among others.
A few reasons can be identified to explain the failure of the nonlinear
models to forecast returns or to outperform the linear model:
• The third reason may be that two regimes alone are not enough to
capture a significant portion of the nonlinearity. In fact, the test
of remaining nonlinearity in the LSTAR model rejected the null of
two-regime adequacy, suggesting that nonlinearity remains in the
residuals. The same result was found for the test of neglected
nonlinearity in the SLFN.
• The fourth reason, and arguably the most important one, is the
efficient market hypothesis, which in its strong form holds that the
market is informationally efficient, so that no predictability can be
exploited to generate risk-adjusted profit.
might for instance be interested in knowing how much of the volatility in
returns is explained by a given model. For this purpose, we provide
R-squared and adjusted R-squared values for the different models and time
series in tables A.43 to A.45 in the appendix, and individual RMSFE and
MAFE values for the different forecast methods are reported in tables
A.46 to A.57 in the appendix. Lastly, plots of the in-sample fit for all
the models are reported in figures B.21 to B.24.
A few suggestions can be taken into account to improve the current
results. First, applying rolling-window estimation could be beneficial,
as it mitigates the effect of parameter change in the time series
(Teräsvirta, van Dijk, & Medeiros, 2005, p. 772). Furthermore, a forecast
combination approach could lead to better results, as the literature
extensively shows that combining forecasts delivers better performance
than relying on a single best forecast; this is in line with the
recommendation of Stock and Watson (1999). Moreover, since two regimes
were found to exploit the nonlinearity in returns poorly, particularly in
the LSTAR model, using more than two regimes could prove beneficial.
Applying Bayesian methods might also be appealing, as they provide a
coherent framework for handling model instability, model uncertainty and
parameter estimation error (Teräsvirta, 2018, p. 13). Lastly, considering
additional explanatory variables, such as the price-earnings ratio,
aggregate output, the dividend-price ratio, interest rates and the
dividend pay-out ratio, in order to exploit any information they might
contain, could be helpful as well.
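The first two suggestions can be illustrated together. The sketch below (Python, with a hypothetical AR(1) forecaster and a naive mean benchmark; it is not the thesis's R code) rolls a fixed-length estimation window through the sample and averages the one-step forecasts of several models with equal weights:

```python
import numpy as np

def ar1_forecast(window):
    """One-step forecast from an AR(1) fitted by OLS on `window`."""
    x, y = window[:-1], window[1:]
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0] + beta[1] * window[-1]

def mean_forecast(window):
    """Naive benchmark: the window mean."""
    return float(np.mean(window))

def rolling_combined(series, window_size, models):
    """Rolling-window one-step forecasts from each model, combined
    with equal weights in the spirit of Stock and Watson (1999)."""
    series = np.asarray(series, dtype=float)
    forecasts = []
    for t in range(window_size, series.size):
        window = series[t - window_size:t]   # most recent observations only
        individual = [m(window) for m in models]
        forecasts.append(np.mean(individual))  # equal-weight combination
    return np.array(forecasts)
```

Forecast errors for evaluation would then be `series[window_size:] - rolling_combined(series, window_size, [ar1_forecast, mean_forecast])`; re-estimating on each window is what lets the parameters drift with the series.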
6. Conclusion
linear models, and compare the resulting forecasts to assess whether there
is a predictive gain in using nonlinear models in lieu of linear ones. To
this end, we generated forecasts from the SETAR, LSTAR and SLFN models and
compared them to forecasts generated from the AR model.
We came to the conclusion that the nonlinear models, just like the AR
model, fail to reliably forecast stock index returns. In fact, through
extensive testing with the GW test, we could not find evidence of a
significant loss differential between the nonlinear models and the AR
model. Moreover, using the directional accuracy test, we found that
overall none of the models could forecast the sign of future returns, and
the excess profit test led to the conclusion that no profit can be derived
from an investment strategy that buys stocks whose predicted sign is
positive and sells stocks whose predicted sign is negative. Furthermore,
all the models explain only a small fraction of return volatility. We
provided some reasons that could explain these findings and discussed some
potential ways to improve the current results.
Having conducted this research, we are well aware that we did not discuss
several important nonlinear models which could be used to generate
out-of-sample forecasts, such as Markov-switching models and time-varying
smooth transition models, among others. We also refrained from treating
seasonality rigorously; it was only used as an argument for choosing the
number of input units in the SLFN. Finally, multivariate nonlinear models
were not considered, as research on this topic is very recent and not yet
well developed. Constructing multivariate nonlinear models is, however, an
important direction for future research. Other interesting future studies
could be:
so to speak the standard normal distribution, will from time to time
yield values that exceed conventional critical values and lead to
rejection of the zero-mean hypothesis. This data snooping problem can be
circumvented by using the test for superior predictive ability of
Hansen (2005).
• to examine interval forecasts and density forecasts, which could
provide more information than analysing point forecasts alone. This
could be achieved by methods such as the conditional coverage test of
Christoffersen (1998), which concerns the percentage of observations
that fall within the 95% forecast confidence intervals, and the density
forecast likelihood ratio test of Berkowitz (2001), among others.
• and lastly, to track model change in order to deal with model or
parameter instability, for instance by allowing time-varying slope
coefficients in the different models (Teräsvirta 2018, p. 12).
The explosion in the use of nonlinear models in recent years has been
obvious. ANN models constitute a very powerful class of machine learning
models and will probably remain a key forecasting tool in the coming
decades. The threshold principle, for its part, is expected to make
worthwhile contributions to time series analysis over the next years,
especially in nonstationary nonlinear modelling, panel time series
modelling and spatio-temporal series modelling, among others (Tong 2011).
References
Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy.
Journal of Business and Economic Statistics, 13 (3), 253–263.
Eitrheim, Ø., & Teräsvirta, T. (1996). Testing the adequacy of smooth tran-
sition autoregressive models. Journal of Econometrics, 74 (1), 59–75.
Elliott, G., & Timmermann, A. (2016). Forecasting in economics and finance.
Annual Review of Economics, 8 , 81–110.
Ferrara, L., Marcellino, M., & Mogliani, M. (2015). Macroeconomic forecasting
during the great recession: The return of non-linearity? International
Journal of Forecasting, 31 (3), 664–679.
Franses, P. H., & Van Dijk, D. (2000). Non-linear time series models in
empirical finance. Cambridge University Press.
Giacomini, R., & Rossi, B. (2010). Forecast comparisons in unstable environ-
ments. Journal of Applied Econometrics, 25 (4), 595–620.
Giacomini, R., & White, H. (2006). Tests of conditional predictive ability.
Econometrica, 74 (6), 1545–1578.
Hall, P., Horowitz, J. L., & Jing, B.-Y. (1995). On blocking rules for the
bootstrap with dependent data. Biometrika, 82 (3), 561–574.
Hansen, P. R. (2005). A test for superior predictive ability. Journal of Business
& Economic Statistics, 23 (4), 365–380.
Hornik, K., Stinchcombe, M., & White, H. (1990). Universal approximation
of an unknown mapping and its derivatives using multilayer feedforward
networks. Neural Networks, 3 (5), 551–560.
Lim, K.-P., & Hooy, C.-W. (2013). Non-linear predictability in G7 stock index
returns. The Manchester School , 81 (4), 620–637.
Lundbergh, S., & Teräsvirta, T. (2002). Forecasting with smooth transition
autoregressive models. A companion to economic forecasting, 485–509.
Luukkonen, R., Saikkonen, P., & Teräsvirta, T. (1988). Testing linearity
against smooth transition autoregressive models. Biometrika, 75 (3),
491–499.
Mandelbrot, B. (1963). New methods in statistical economics. Journal of
Political Economy, 71 (5), 421–440.
Marcellino, M. (2004). Forecasting EMU macroeconomic variables. Interna-
tional Journal of Forecasting, 20 (2), 359–372.
Newey, W. K., & West, K. D. (1987). A simple, positive semi-definite,
heteroskedasticity and autocorrelation consistent covariance matrix.
Econometrica, 55 (3), 703–708.
Pesaran, M. H., & Timmermann, A. (1992). A simple nonparametric test
of predictive performance. Journal of Business & Economic Statistics,
10 (4), 461–465.
Rapach, D., & Zhou, G. (2013). Forecasting stock returns. In Handbook of
economic forecasting (Vol. 2, pp. 328–383). Elsevier.
Sarantis, N. (1999). Modeling non-linearities in real effective exchange
rates. Journal of International Money and Finance, 18 (1), 27–45.
Teräsvirta, T. (2018). Nonlinear models in macroeconometrics. In Oxford
research encyclopedia of economics and finance.
Teräsvirta, T., & Anderson, H. M. (1992). Characterizing nonlinearities in
business cycles using smooth transition autoregressive models. Journal
of Applied Econometrics, 7 (S1), S119–S136.
Teräsvirta, T., Lin, C.-F., & Granger, C. W. (1993). Power of the neural
network linearity test. Journal of Time Series Analysis, 14 (2), 209–220.
Teräsvirta, T., Tjøstheim, D., & Granger, C. W. J. (2010). Modelling
nonlinear economic time series. Oxford University Press.
Teräsvirta, T., Van Dijk, D., & Medeiros, M. C. (2005). Linear models,
smooth transition autoregressions, and neural networks for forecasting
macroeconomic time series: A re-examination. International Journal of
Forecasting, 21 (4), 755–774.
Timmermann, A. (2018). Forecasting methods in finance. Annual Review of
Financial Economics, 10 , 449–479.
Tong, H. (1990). Non-linear time series: a dynamical system approach. Oxford
University Press.
Tong, H. (2011). Threshold models in time series analysis—30 years on.
Statistics and its Interface, 4 (2), 107–118.
Zivot, E., & Wang, J. (2007). Modeling financial time series with S-PLUS.
Springer.
A. Tables
Table A.1: BDS test on the DJI return time series; the null of linearity is strongly rejected
at the 5% significance level for all combinations of the embedding dimension m and epsilon
Table A.2: BDS test on the NASDAQ return time series; the null of linearity is strongly
rejected at the 5% significance level for all combinations of the embedding dimension m and epsilon
Table A.3: BDS test on the NYSE return time series; the null of linearity is strongly rejected
at the 5% significance level for all combinations of the embedding dimension m and epsilon
Table A.4: BDS test on the S&P return time series; the null of linearity is strongly rejected
at the 5% significance level for all combinations of the embedding dimension m and epsilon
Table A.5: Jarque-Bera test on return; the null of normality is rejected at the 5% significance
level, suggesting that returns are not normally distributed
Table A.6: ADF and PP test p-values. The null of nonstationarity is rejected at the 5%
significance level in both tests, leading to the conclusion that the data are stationary
Table A.7: The null of linearity is rejected at the 5% significance level, suggesting the
presence of SETAR nonlinearity
Table A.8: Threshold grid search in the SETAR model, DJI time series
Table A.9: Threshold grid search in the SETAR model, NASDAQ time series
Table A.10: Threshold grid search in the SETAR model, NYSE time series
Table A.11: Threshold grid search in the SETAR model, S&P time series
Table A.12: SETAR parameter estimates for the DJI time series
Table A.13: SETAR parameter estimates for the NASDAQ time series
Table A.14: SETAR parameter estimates for the NYSE time series
Table A.15: SETAR parameter estimates for the S&P time series
Table A.16: The Jarque-Bera test on SETAR residuals suggests that the residuals are not
normally distributed
Table A.17: The LSTAR nonlinearity test on the DJI time series strongly rejects the null
hypothesis of linearity, implying the presence of LSTAR nonlinearity
Table A.18: The LSTAR nonlinearity test on NASDAQ time series strongly rejects the null
hypothesis of linearity, implying the presence of LSTAR nonlinearity
Table A.19: The LSTAR nonlinearity test on NYSE time series strongly rejects the null
hypothesis of linearity, implying the presence of LSTAR nonlinearity
Table A.20: The LSTAR nonlinearity test on S&P time series strongly rejects the null
hypothesis of linearity, implying the presence of LSTAR nonlinearity
Table A.21: Threshold grid search in the LSTAR model, DJI time series
Table A.22: Threshold grid search in the LSTAR model, NASDAQ time series
Table A.23: Threshold grid search in the LSTAR model, NYSE time series
Table A.24: Threshold grid search in the LSTAR model, S&P time series
Table A.25: LSTAR parameter estimates for the DJI time series
Table A.26: LSTAR parameter estimates for the NASDAQ time series
Table A.27: LSTAR parameter estimates for the NYSE time series
Table A.29: The test of remaining nonlinearity on the LSTAR DJI residuals strongly rejects
the null hypothesis that two regimes are adequate, in favor of a 3-regime LSTAR model,
implying the presence of remaining nonlinearity
Table A.30: The test of remaining nonlinearity on the LSTAR NASDAQ residuals strongly
rejects the null hypothesis that two regimes are adequate, in favor of a 3-regime LSTAR
model, implying the presence of remaining nonlinearity
Table A.31: The test of remaining nonlinearity on the LSTAR NYSE residuals strongly
rejects the null hypothesis that two regimes are adequate, in favor of a 3-regime LSTAR
model, implying the presence of remaining nonlinearity
Table A.32: The test of remaining nonlinearity on the LSTAR S&P residuals strongly
rejects the null hypothesis that two regimes are adequate, in favor of a 3-regime LSTAR
model, implying the presence of remaining nonlinearity
Table A.33: Grid search for SLFN model estimation and selection for DJI return
Table A.34: Grid search for SLFN model estimation and selection for NASDAQ return
Table A.35: Grid search for SLFN model estimation and selection for NYSE return
Table A.36: Grid search for SLFN model estimation and selection for S&P return
Table A.37: Jarque-Bera test on SLFN residuals; normality is strongly rejected at the 5%
significance level
Table A.38: Teräsvirta’s neural network test for neglected nonlinearity suggests that none
of the 100 specifications is correct
Table A.39: AR parameter estimates for DJI return, Ljung-Box test on residuals and AIC
Table A.40: AR parameter estimates for NASDAQ return, Ljung-Box test on residuals and
AIC
Table A.41: AR parameter estimates for NYSE return, Ljung-Box test on residuals and AIC
Table A.42: AR parameter estimates for S&P return, Ljung-Box test on residuals and AIC
Table A.43: SETAR detailed in-sample fit. AR in-sample fit and in-sample fit differentials
are also reported in order to ease comparison
Table A.44: LSTAR detailed in-sample fit. AR in-sample fit and in-sample fit differentials
are also reported in order to ease comparison
Table A.45: ANN detailed in-sample fit. AR in-sample fit and in-sample fit differentials
are also reported in order to ease comparison
Table A.46: SETAR, naïve: detailed out-of-sample fit. RMSFE and MAFE, together with
their values relative to the AR's, are reported to ease comparison
Table A.47: SETAR, bootstrap: detailed out-of-sample fit. RMSFE and MAFE, together
with their values relative to the AR's, are reported to ease comparison
Table A.48: SETAR, block bootstrap: detailed out-of-sample fit. RMSFE and MAFE,
together with their values relative to the AR's, are reported to ease comparison
Table A.49: SETAR, Monte Carlo: detailed out-of-sample fit. RMSFE and MAFE, together
with their values relative to the AR's, are reported to ease comparison
Table A.50: LSTAR, naïve: detailed out-of-sample fit. RMSFE and MAFE, together with
their values relative to the AR's, are reported to ease comparison
Table A.51: LSTAR, bootstrap: detailed out-of-sample fit. RMSFE and MAFE, together
with their values relative to the AR's, are reported to ease comparison
Table A.52: LSTAR, block bootstrap: detailed out-of-sample fit. RMSFE and MAFE,
together with their values relative to the AR's, are reported to ease comparison
Table A.53: LSTAR, Monte Carlo: detailed out-of-sample fit. RMSFE and MAFE, together
with their values relative to the AR's, are reported to ease comparison
Table A.54: SLFN, naïve: detailed out-of-sample fit. RMSFE and MAFE, together with
their values relative to the AR's, are reported to ease comparison
Table A.55: SLFN, bootstrap: detailed out-of-sample fit. RMSFE and MAFE, together
with their values relative to the AR's, are reported to ease comparison
Table A.56: SLFN, block bootstrap: detailed out-of-sample fit. RMSFE and MAFE,
together with their values relative to the AR's, are reported to ease comparison
Table A.57: SLFN, Monte Carlo: detailed out-of-sample fit. RMSFE and MAFE, together
with their values relative to the AR's, are reported to ease comparison
B. Figures
Figure B.3: SETAR grid search. Columnwise: DJI, NASDAQ, NYSE and S&P
Figure B.4: Q-Q plots of SETAR residuals suggest that residuals are not normally distributed
Figure B.5: ACF and PACF of DJI return. The time series displays a smoothly declining
PACF, a property that is well captured by autoregressive models
Figure B.6: ACF and PACF of NASDAQ return. The time series displays a smoothly
declining PACF, a property that is well captured by autoregressive models
Figure B.7: ACF and PACF of NYSE return. The time series displays a smoothly declining
PACF, a property that is well captured by autoregressive models
Figure B.8: ACF and PACF of S&P return. The time series displays a smoothly declining
PACF, a property that is well captured by autoregressive models
Figure B.9: Comparison of DJI return forecasts from SETAR and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.10: Comparison of NASDAQ return forecasts from SETAR and AR to the true
realised values. Neither model provides reliable forecasts
Figure B.11: Comparison of NYSE return forecasts from SETAR and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.12: Comparison of S&P return forecasts from SETAR and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.13: Comparison of DJI return forecasts from LSTAR and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.14: Comparison of NASDAQ return forecasts from LSTAR and AR to the true
realised values. Neither model provides reliable forecasts
Figure B.15: Comparison of NYSE return forecasts from LSTAR and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.16: Comparison of S&P return forecasts from LSTAR and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.17: Comparison of DJI return forecasts from SLFN and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.18: Comparison of NASDAQ return forecasts from SLFN and AR to the true
realised values. Neither model provides reliable forecasts
Figure B.19: Comparison of NYSE return forecasts from SLFN and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.20: Comparison of S&P return forecasts from SLFN and AR to the true realised
values. Neither model provides reliable forecasts
Figure B.22: Plots of SETAR in-sample fit
Figure B.24: Plots of ANN in-sample fit
C. Information on R Codes
Before running the code, it is crucially important to first install the
Rmarkdown library and to set the working directory in the first code
chunk of each R file. Data are provided in a folder called "data" which
contains 8 Excel files: 4 files containing the data used for model
estimation for each stock index return, and 4 files used for
out-of-sample forecast comparison.
• General: contains code for general return time series properties,
including stationarity tests, the BDS test, the ACF, the PACF and so on.
Libraries used here: knitr, quantmod, tseries, psych, stats
For each nonlinear model, the main steps taken after loading the data and
computing returns roughly follow this order:
• estimating the nonlinear model and picking the setting for which the
AIC is minimal,
• predicting returns,
• comparing forecasts.
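The first step, selecting the specification with minimum AIC, can be sketched as follows. The thesis code is in R; this Python sketch with a simple AR-order grid is purely illustrative of the selection logic:

```python
import numpy as np

def gaussian_aic(rss, n, k):
    """AIC for a Gaussian regression with residual sum of squares
    `rss`, `n` observations and `k` estimated parameters."""
    return n * np.log(rss / n) + 2 * k

def select_ar_order(y, max_p=5):
    """Grid-search the AR order, keeping the fit with minimum AIC."""
    y = np.asarray(y, dtype=float)
    best_p, best_aic = None, np.inf
    for p in range(1, max_p + 1):
        target = y[p:]
        # Design matrix: intercept plus lags y[t-1], ..., y[t-p].
        X = np.column_stack([np.ones(y.size - p)]
                            + [y[p - i:y.size - i] for i in range(1, p + 1)])
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        rss = float(np.sum((target - X @ beta) ** 2))
        a = gaussian_aic(rss, target.size, p + 1)
        if a < best_aic:
            best_p, best_aic = p, a
    return best_p, best_aic
```

In the thesis the grid instead ranges over thresholds (SETAR/LSTAR) or network settings (SLFN), but the pattern of fitting each candidate and keeping the minimum-AIC one is the same.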
Table C.1 below provides an example of what the code and the results in
the HTML files look like.
D. Declaration of Authorship
Kiel, 30.09.2019