arima ARIMA, ARMAX, and other dynamic regression models
Syntax
Basic syntax for a regression model with ARMA disturbances
arima depvar indepvars , ar(numlist) ma(numlist)
Full syntax
arima depvar indepvars if in weight , options
options Description
Model
noconstant suppress constant term
arima(#p,#d,#q) specify ARIMA(p, d, q) model for dependent variable
ar(numlist) autoregressive terms of the structural model disturbance
ma(numlist) moving-average terms of the structural model disturbance
constraints(constraints) apply specified linear constraints
collinear keep collinear variables
Model 2
sarima(#P,#D,#Q,#s) specify period-#s multiplicative seasonal ARIMA term
mar(numlist, #s) multiplicative seasonal autoregressive term; may be repeated
mma(numlist, #s) multiplicative seasonal moving-average term; may be repeated
Model 3
condition use conditional MLE instead of full MLE
savespace conserve memory during estimation
diffuse use diffuse prior for starting Kalman filter recursions
p0(# | matname) use alternate prior for starting Kalman recursions; seldom used
state0(# | matname) use alternate state vector for starting Kalman filter recursions
SE/Robust
vce(vcetype) vcetype may be opg, robust, or oim
Reporting
level(#) set confidence level; default is level(95)
detail report list of gaps in time series
nocnsreport do not display constraints
display options control column formats, row spacing, and line width
Maximization
maximize options control the maximization process; seldom used
coeflegend display legend instead of statistics
You must tsset your data before using arima; see [TS] tsset.
depvar and indepvars may contain time-series operators; see [U] 11.4.4 Time-series varlists.
by, fp, rolling, statsby, and xi are allowed; see [U] 11.1.10 Prefix commands.
iweights are allowed; see [U] 11.1.6 weight.
coeflegend does not appear in the dialog box.
See [U] 20 Estimation and postestimation commands for more capabilities of estimation commands.
Menu
Statistics > Time series > ARIMA and ARMAX models
Description
arima fits univariate models with time-dependent disturbances. arima fits a model of depvar on
indepvars where the disturbances are allowed to follow a linear autoregressive moving-average (ARMA)
specification. The dependent and independent variables may be differenced or seasonally differenced
to any degree. When independent variables are included in the specification, such models are often
called ARMAX models; and when independent variables are not specified, they reduce to Box-Jenkins
autoregressive integrated moving-average (ARIMA) models in the dependent variable. Multiplicative
seasonal ARMAX and ARIMA models can also be fit. Missing data are allowed and are handled using
the Kalman filter and methods suggested by Harvey (1989 and 1993); see Methods and formulas.
In the full syntax, depvar is the variable being modeled, and the structural or regression part of
the model is specified in indepvars. ar() and ma() specify the lags of autoregressive and moving-
average terms, respectively; and mar() and mma() specify the multiplicative seasonal autoregressive
and moving-average terms, respectively.
arima allows time-series operators in the dependent variable and independent variable lists, and
making extensive use of these operators is often convenient; see [U] 11.4.4 Time-series varlists and
[U] 13.9 Time-series operators for an extended discussion of time-series operators.
arima typed without arguments redisplays the previous estimates.
Options
Model
noconstant; see [R] estimation options.
arima(#p,#d,#q) is an alternative, shorthand notation for specifying models with ARMA disturbances.
The dependent variable and any independent variables are differenced #d times, and 1 through #p
lags of autocorrelations and 1 through #q lags of moving averages are included in the model. For
example, the specification
. arima D.y, ar(1/2) ma(1/3)
is equivalent to
. arima y, arima(2,1,3)
The latter is easier to write for simple ARMAX and ARIMA models, but if gaps in the AR or MA
lags are to be modeled, or if different operators are to be applied to independent variables, the
first syntax is required.
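The mapping can be sketched in Python (not Stata code; expand_arima_shorthand is a hypothetical helper name used only for illustration):

```python
def expand_arima_shorthand(p, d, q):
    """Expand arima(p,d,q): difference the series d times and include
    AR lags 1..p and MA lags 1..q of the disturbance."""
    return d, list(range(1, p + 1)), list(range(1, q + 1))

# arima y, arima(2,1,3) corresponds to arima D.y, ar(1/2) ma(1/3):
diffs, ar_lags, ma_lags = expand_arima_shorthand(2, 1, 3)
```

Gapped lag lists such as ar(1 4) cannot be produced this way, which is why the explicit syntax is needed for them.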
ar(numlist) specifies the autoregressive terms of the structural model disturbance to be included in
the model. For example, ar(1/3) specifies that lags of 1, 2, and 3 of the structural disturbance
be included in the model; ar(1 4) specifies that lags 1 and 4 be included, perhaps to account for
additive quarterly effects.
If the model does not contain regressors, these terms can also be considered autoregressive terms
for the dependent variable.
ma(numlist) specifies the moving-average terms to be included in the model. These are the terms for
the lagged innovations (white-noise disturbances).
constraints(constraints), collinear; see [R] estimation options.
If constraints are placed between structural model parameters and ARMA terms, the first few
iterations may attempt steps into nonstationary areas. This process can be ignored if the final
solution is well within the bounds of stationary solutions.
Model 2
sarima(#P,#D,#Q,#s) is an alternative, shorthand notation for specifying the multiplicative seasonal
components of models with ARMA disturbances. The dependent variable and any independent
variables are lag-#s seasonally differenced #D times, and 1 through #P seasonal lags of autoregressive
terms and 1 through #Q seasonal lags of moving-average terms are included in the model. For
example, the specification
. arima DS12.y, ar(1/2) ma(1/3) mar(1/2,12) mma(1/2,12)
is equivalent to
. arima y, arima(2,1,3) sarima(2,1,2,12)
mar(numlist, #s) specifies the lag-#s multiplicative seasonal autoregressive terms. For example,
mar(1/2,12) requests that the first two lag-12 multiplicative seasonal autoregressive terms be
included in the model.
mma(numlist, #s) specifies the lag-#s multiplicative seasonal moving-average terms. For example,
mma(1 3,12) requests that the first and third (but not the second) lag-12 multiplicative seasonal
moving-average terms be included in the model.
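The lag arithmetic behind these numlists can be sketched in Python (illustrative only; seasonal_lags is a made-up name, not a Stata routine):

```python
def seasonal_lags(numlist, s):
    """Absolute lags touched by a multiplicative seasonal term:
    entry k in the numlist corresponds to lag k*s."""
    return [k * s for k in numlist]

quarterly_ar = seasonal_lags([1, 2], 12)   # mar(1/2,12) -> lags 12 and 24
sparse_ma = seasonal_lags([1, 3], 12)      # mma(1 3,12) -> lags 12 and 36, skipping 24
```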
Model 3
condition specifies that conditional, rather than full, maximum likelihood estimates be produced.
The presample values for εt and μt are taken to be their expected value of zero, and the estimate
of the variance of εt is taken to be constant over the entire sample; see Hamilton (1994, 132).
This estimation method is not appropriate for nonstationary series but may be preferable for long
series or for models that have one or more long AR or MA lags. diffuse, p0(), and state0()
have no meaning for models fit from the conditional likelihood and may not be specified with
condition.
If the series is long and stationary and the underlying data-generating process does not have a long
memory, estimates will be similar, whether estimated by unconditional maximum likelihood (the
default), conditional maximum likelihood (condition), or maximum likelihood from a diffuse
prior (diffuse).
In small samples, however, results of conditional and unconditional maximum likelihood may
differ substantially; see Ansley and Newbold (1980). Whereas the default unconditional maximum
likelihood estimates make the most use of sample information when all the assumptions of the model
are met, Harvey (1989) and Ansley and Kohn (1985) argue for diffuse priors often, particularly in
ARIMA models corresponding to an underlying structural model.
The condition or diffuse options may also be preferred when the model contains one or more
long AR or MA lags; this avoids inverting potentially large matrices (see diffuse below).
When condition is specified, estimation is performed by the arch command (see [TS] arch),
and more control of the estimation process can be obtained using arch directly.
condition cannot be specified if the model contains any multiplicative seasonal terms.
savespace specifies that memory use be conserved by retaining only those variables required for
estimation. The original dataset is restored after estimation. This option is rarely used and should
be used only if there is not enough space to fit a model without the option. However, arima
requires considerably more temporary storage during estimation than most estimation commands
in Stata.
diffuse specifies that a diffuse prior (see Harvey 1989 or 1993) be used as a starting point for the
Kalman filter recursions. Using diffuse, nonstationary models may be fit with arima (see the
p0() option below; diffuse is equivalent to specifying p0(1e9)).
By default, arima uses the unconditional expected value of the state vector ξt (see Methods and
formulas) and the mean squared error (MSE) of the state vector to initialize the filter. When the
process is stationary, this corresponds to the expected value and expected variance of a random draw
from the state vector and produces unconditional maximum likelihood estimates of the parameters.
When the process is not stationary, however, this default is not appropriate, and the unconditional
MSE cannot be computed. For a nonstationary process, another starting point must be used for the
recursions.
In the absence of nonsample or presample information, diffuse may be specified to start the
recursions from a state vector of zero and a state MSE matrix corresponding to an effectively
infinite variance on this initial state. This method amounts to an uninformative and improper prior
that is updated to a proper MSE as data from the sample become available; see Harvey (1989).
Nonstationary models may also correspond to models with infinite variance given a particular
specification. This and other problems with nonstationary series make convergence difficult and
sometimes impossible.
diffuse can also be useful if a model contains one or more long AR or MA lags. Computation
of the unconditional MSE of the state vector (see Methods and formulas) requires construction
and inversion of a square matrix that is of dimension {max(p, q + 1)}², where p and q are the
maximum AR and MA lags, respectively. If q = 27, for example, we would require a 784-by-784
matrix. Estimation with diffuse does not require this matrix.
For large samples, there is little difference between using the default starting point and the diffuse
starting point. Unless the series has a long memory, the initial conditions affect the likelihood of
only the first few observations.
p0(# | matname) is a rarely specified option that can be used for nonstationary series or when an
alternate prior for starting the Kalman recursions is desired (see diffuse above for a discussion
of the default starting point and Methods and formulas for background).
matname specifies a matrix to be used as the MSE of the state vector for starting the Kalman filter
recursions P1|0 . Instead, one number, #, may be supplied, and the MSE of the initial state vector
P1|0 will have this number on its diagonal and all off-diagonal values set to zero.
This option may be used with nonstationary series to specify a larger or smaller diagonal for P1|0
than that supplied by diffuse. It may also be used with state0() when you believe that you
have a better prior for the initial state vector and its MSE.
state0(# | matname) is a rarely used option that specifies an alternate initial state vector, ξ1|0 (see
Methods and formulas), for starting the Kalman filter recursions. If # is specified, all elements of
the vector are taken to be #. The default initial state vector is state0(0).
SE/Robust
vce(vcetype) specifies the type of standard error reported, which includes types that are robust to
some kinds of misspecification (robust) and that are derived from asymptotic theory (oim, opg);
see [R] vce option.
For state-space models in general and ARMAX and ARIMA models in particular, the robust or
quasimaximum likelihood estimates (QMLEs) of variance are robust to symmetric nonnormality
in the disturbances, including, as a special case, heteroskedasticity. The robust variance estimates
are not generally robust to functional misspecification of the structural or ARMA components of
the model; see Hamilton (1994, 389) for a brief discussion.
Reporting
level(#); see [R] estimation options.
detail specifies that a detailed list of any gaps in the series be reported, including gaps due to
missing observations or missing data for the dependent variable or independent variables.
nocnsreport; see [R] estimation options.
display options: vsquish, cformat(% fmt), pformat(% fmt), sformat(% fmt), and nolstretch;
see [R] estimation options.
Maximization
maximize options: difficult, technique(algorithm_spec), iterate(#), [no]log, trace,
gradient, showstep, hessian, showtolerance, tolerance(#), ltolerance(#),
nrtolerance(#), gtolerance(#), nonrtolerance, and from(init_specs); see [R] maximize
for all options except gtolerance(), and see below for information on gtolerance().
These options are sometimes more important for ARIMA models than most maximum likelihood
models because of potential convergence problems with ARIMA models, particularly if the specified
model and the sample data imply a nonstationary model.
Several alternate optimization methods, such as Berndt-Hall-Hall-Hausman (BHHH) and Broyden-
Fletcher-Goldfarb-Shanno (BFGS), are provided for ARIMA models. Although ARIMA models are
not as difficult to optimize as ARCH models, their likelihoods are nevertheless generally not quadratic
and often pose optimization difficulties; this is particularly true if a model is nonstationary or
nearly nonstationary. Because each method approaches optimization differently, some problems
can be successfully optimized by an alternate method when one method fails.
Setting technique() to something other than the default or BHHH changes the vcetype to vce(oim).
The following options are all related to maximization and are either particularly important in fitting
ARIMA models or not available for most other estimators.
technique(algorithm spec) specifies the optimization technique to use to maximize the
likelihood function.
technique(bhhh) specifies the Berndt-Hall-Hall-Hausman (BHHH) algorithm.
technique(dfp) specifies the Davidon-Fletcher-Powell (DFP) algorithm.
technique(bfgs) specifies the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm.
technique(nr) specifies Stata's modified Newton-Raphson (NR) algorithm.
You can specify multiple optimization methods. For example,
technique(bhhh 10 nr 20)
requests that the optimizer perform 10 BHHH iterations, switch to Newton-Raphson for 20
iterations, switch back to BHHH for 10 more iterations, and so on.
The default for arima is technique(bhhh 5 bfgs 10).
gtolerance(#) specifies the tolerance for the gradient relative to the coefficients. When
|gi bi| ≤ gtolerance() for all parameters bi and the corresponding elements of the
gradient gi, the gradient tolerance criterion is met. The default gradient tolerance for arima
is gtolerance(.05).
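The check can be sketched in Python (a sketch only; the function name is invented, and this is not Stata internals): each gradient element is scaled by its coefficient and the absolute product is compared with the tolerance:

```python
def gradient_criterion_met(gradient, coefs, gtol=0.05):
    """True when |g_i * b_i| <= gtol for every coefficient b_i and the
    corresponding gradient element g_i (a relative-gradient check)."""
    return all(abs(g * b) <= gtol for g, b in zip(gradient, coefs))

# Near a maximum the gradient is tiny, so every product falls under the tolerance.
met = gradient_criterion_met([1e-4, 2e-3], [0.9, -0.4])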
gtolerance(999) may be specified to disable the gradient criterion. If the optimizer becomes
stuck with repeated (backed up) messages, the gradient probably still contains substantial
values, but an uphill direction cannot be found for the likelihood. With this option, results can
often be obtained, but whether the global maximum likelihood has been found is unclear.
When the maximization is not going well, it is also possible to set the maximum number of
iterations (see [R] maximize) to the point where the optimizer appears to be stuck and to inspect
the estimation results at that point.
from(init specs) allows you to set the starting values of the model coefficients; see [R] maximize
for a general discussion and syntax options.
The standard syntax for from() accepts a matrix, a list of values, or coefficient name value
pairs; see [R] maximize. arima also accepts from(armab0), which sets the starting value for
all ARMA parameters in the model to zero prior to optimization.
ARIMA models may be sensitive to initial conditions and may have coefficient values that
correspond to local maximums. The default starting values for arima are generally good,
particularly in large samples for stationary series.
The following option is available with arima but is not shown in the dialog box:
coeflegend; see [R] estimation options.
Introduction
arima fits both standard ARIMA models that are autoregressive in the dependent variable and
structural models with ARMA disturbances. Good introductions to the former models can be found in
Box, Jenkins, and Reinsel (2008); Hamilton (1994); Harvey (1993); Newton (1988); Diggle (1990);
and many others. The latter models are developed fully in Hamilton (1994) and Harvey (1989), both of
which provide extensive treatment of the Kalman filter (Kalman 1960) and the state-space form used
by arima to fit the models. Becketti (2013) discusses ARIMA models and Stata's arima command,
and he devotes an entire chapter to explaining how the principles of ARIMA models are applied to real
datasets in practice.
Consider a first-order autoregressive moving-average process. Then arima estimates all the
parameters in the model
yt = xtβ + μt                           structural equation
μt = ρμt−1 + θεt−1 + εt                 disturbance, ARMA(1, 1)
where
ρ is the first-order autocorrelation parameter
θ is the first-order moving-average parameter
εt ~ i.i.d. N(0, σ²), meaning that εt is a white-noise disturbance
You can combine the two equations and write a general ARMA(p, q) in the disturbances process as
yt = xtβ + ρ1(yt−1 − xt−1β) + ρ2(yt−2 − xt−2β) + · · · + ρp(yt−p − xt−pβ)
       + θ1εt−1 + θ2εt−2 + · · · + θqεt−q + εt
It is also common to write the general form of the ARMA model more succinctly using lag operator
notation as
ρ(L^p)(yt − xtβ) = θ(L^q)εt            ARMA(p, q)
where
ρ(L^p) = 1 − ρ1L − ρ2L² − · · · − ρpL^p
θ(L^q) = 1 + θ1L + θ2L² + · · · + θqL^q
and L^j yt = yt−j.
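The lag-operator notation can be made concrete with a small Python sketch (not Stata code; apply_lag_poly is a hypothetical helper): a lag polynomial is a coefficient list [a0, a1, ...], and applying it to a series sums the scaled lags:

```python
def apply_lag_poly(coefs, y, t):
    """Evaluate (a0 + a1 L + a2 L^2 + ...) y_t = sum_j a_j * y_{t-j},
    skipping terms that would reach before the start of the sample."""
    return sum(a * y[t - j] for j, a in enumerate(coefs) if t - j >= 0)

# rho(L) for an AR(1) with rho1 = 0.5 is 1 - 0.5 L:
y = [1.0, 2.0, 3.0]
val = apply_lag_poly([1.0, -0.5], y, 2)   # y_2 - 0.5 * y_1
```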
For stationary series, full or unconditional maximum likelihood estimates are obtained via the
Kalman filter. For nonstationary series, if some prior information is available, you can specify initial
values for the filter by using state0() and p0() as suggested by Hamilton (1994) or assume an
uninformative prior by using the diffuse option as suggested by Harvey (1989).
ARIMA models
Pure ARIMA models without a structural component do not have regressors and are often written
as autoregressions in the dependent variable, rather than autoregressions in the disturbances from a
structural equation. For example, an ARMA(1, 1) model can be written as
yt = α + ρyt−1 + θεt−1 + εt            (1a)
Other than a scale factor for the constant term α, these models are equivalent to the ARMA in the
disturbances formulation estimated by arima, though the latter are more flexible and allow a wider
class of models.
To see this effect, replace xtβ in the structural equation above with a constant term β0 so that
yt = β0 + μt
   = β0 + ρμt−1 + θεt−1 + εt
   = β0 + ρ(yt−1 − β0) + θεt−1 + εt
   = (1 − ρ)β0 + ρyt−1 + θεt−1 + εt    (1b)
Equations (1a) and (1b) are equivalent, with α = (1 − ρ)β0, so whether we consider an ARIMA model
as autoregressive in the dependent variable or disturbances is immaterial. Our illustration can easily
be extended from the ARMA(1, 1) case to the general ARIMA(p, d, q) case.
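The equivalence of (1a) and (1b) can be verified numerically with a short Python simulation (a sketch under the assumption of zero presample disturbances; the function names are illustrative, not Stata code):

```python
def simulate_structural(beta0, rho, theta, eps):
    """y_t = beta0 + mu_t, with mu_t = rho*mu_{t-1} + theta*e_{t-1} + e_t."""
    mu_prev, e_prev, out = 0.0, 0.0, []
    for e in eps:
        mu = rho * mu_prev + theta * e_prev + e
        out.append(beta0 + mu)
        mu_prev, e_prev = mu, e
    return out

def simulate_autoregressive(alpha, rho, theta, eps):
    """y_t = alpha + rho*y_{t-1} + theta*e_{t-1} + e_t, started at its mean."""
    y_prev, e_prev, out = alpha / (1 - rho), 0.0, []
    for e in eps:
        y = alpha + rho * y_prev + theta * e_prev + e
        out.append(y)
        y_prev, e_prev = y, e
    return out

beta0, rho, theta = 5.0, 0.8, 0.4
eps = [0.3, -0.1, 0.2, 0.05]
a = simulate_structural(beta0, rho, theta, eps)
b = simulate_autoregressive((1 - rho) * beta0, rho, theta, eps)  # alpha = (1-rho)*beta0
```

With α = (1 − ρ)β0 the two paths coincide, step by step.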
. use http://www.stata-press.com/data/r13/wpi1
. arima wpi, arima(1,1,1)
(setting optimization to BHHH)
Iteration 0: log likelihood = -139.80133
Iteration 1: log likelihood = -135.6278
Iteration 2: log likelihood = -135.41838
Iteration 3: log likelihood = -135.36691
Iteration 4: log likelihood = -135.35892
(switching optimization to BFGS)
Iteration 5: log likelihood = -135.35471
Iteration 6: log likelihood = -135.35135
Iteration 7: log likelihood = -135.35132
Iteration 8: log likelihood = -135.35131
ARIMA regression
Sample: 1960q2 - 1990q4 Number of obs = 123
Wald chi2(2) = 310.64
Log likelihood = -135.3513 Prob > chi2 = 0.0000
OPG
D.wpi Coef. Std. Err. z P>|z| [95% Conf. Interval]
wpi
_cons .7498197 .3340968 2.24 0.025 .0950019 1.404637
ARMA
ar
L1. .8742288 .0545435 16.03 0.000 .7673256 .981132
ma
L1. -.4120458 .1000284 -4.12 0.000 -.6080979 -.2159938
Note: The test of the variance against zero is one sided, and the two-sided
confidence interval is truncated at zero.
Examining the estimation results, we see that the AR(1) coefficient is 0.874, the MA(1) coefficient
is −0.412, and both are highly significant. The estimated standard deviation of the white-noise
disturbance is 0.725.
This model also could have been fit by typing
. arima D.wpi, ar(1) ma(1)
The D. placed in front of the dependent variable wpi is the Stata time-series operator for differencing.
Thus we would be modeling the first difference in WPI from the second quarter of 1960 through
the fourth quarter of 1990 because the first observation is lost because of differencing. This second
syntax allows a richer choice of models.
(Graphs of the wpi series and of its log difference omitted.)
On the basis of the autocorrelations, partial autocorrelations (see graphs below), and the results of
preliminary estimations, Enders identified an ARMA model in the log-differenced series.
. ac D.ln_wpi, ylabels(-.4(.2).6)
. pac D.ln_wpi, ylabels(-.4(.2).6)
(Graphs omitted: autocorrelations and partial autocorrelations of D.ln_wpi, with 95% confidence
bands from Bartlett's formula for MA(q) and from se = 1/sqrt(n), respectively.)
In addition to an autoregressive term and an MA(1) term, an MA(4) term is included to account
for a remaining quarterly effect. Thus the model to be fit is
Δ ln(wpit) = β0 + ρ1{Δ ln(wpit−1) − β0} + θ1εt−1 + θ4εt−4 + εt
We can fit this model with arima and Stata's standard difference operator:
. arima D.ln_wpi, ar(1) ma(1 4)
(setting optimization to BHHH)
Iteration 0: log likelihood = 382.67447
Iteration 1: log likelihood = 384.80754
Iteration 2: log likelihood = 384.84749
Iteration 3: log likelihood = 385.39213
Iteration 4: log likelihood = 385.40983
(switching optimization to BFGS)
Iteration 5: log likelihood = 385.9021
Iteration 6: log likelihood = 385.95646
Iteration 7: log likelihood = 386.02979
Iteration 8: log likelihood = 386.03326
Iteration 9: log likelihood = 386.03354
Iteration 10: log likelihood = 386.03357
ARIMA regression
Sample: 1960q2 - 1990q4 Number of obs = 123
Wald chi2(3) = 333.60
Log likelihood = 386.0336 Prob > chi2 = 0.0000
OPG
D.ln_wpi Coef. Std. Err. z P>|z| [95% Conf. Interval]
ln_wpi
_cons .0110493 .0048349 2.29 0.022 .0015731 .0205255
ARMA
ar
L1. .7806991 .0944946 8.26 0.000 .5954931 .965905
ma
L1. -.3990039 .1258753 -3.17 0.002 -.6457149 -.1522928
L4. .3090813 .1200945 2.57 0.010 .0737003 .5444622
Note: The test of the variance against zero is one sided, and the two-sided
confidence interval is truncated at zero.
In this final specification, the log-differenced series is still highly autocorrelated at a level of 0.781,
though innovations have a negative impact in the ensuing quarter (−0.399) and a positive seasonal
impact of 0.309 in the following year.
Technical note
In one way, the results differ from most of Stata's estimation commands: the standard error of
the coefficients is reported as OPG Std. Err. The default standard errors and covariance matrix
for arima estimates are derived from the outer product of gradients (OPG). This is one of three
asymptotically equivalent methods of estimating the covariance matrix of the coefficients (only two of
which are usually tractable to derive). Discussions and derivations of all three estimates can be found
in Davidson and MacKinnon (1993), Greene (2012), and Hamilton (1994). Bollerslev, Engle, and
Nelson (1994) suggest that the OPG estimates are more numerically stable in time-series regressions
when the likelihood and its derivatives depend on recursive computations, which is certainly the case
for the Kalman filter. To date, we have found no numerical instabilities in either estimate of the
covariance matrix, subject to the stability and convergence of the overall model.
Most of Stata's estimation commands provide covariance estimates derived from the Hessian of
the likelihood function. These alternate estimates can also be obtained from arima by specifying the
vce(oim) option.
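As a sketch of the idea (not Stata's internal code), the OPG estimate inverts the sum of outer products of the per-observation score vectors:

```python
import numpy as np

def opg_covariance(scores):
    """OPG covariance estimate: inv(sum_i g_i g_i') for an n x k score matrix."""
    G = np.asarray(scores, dtype=float)
    return np.linalg.inv(G.T @ G)

# Two observations, two parameters (made-up scores for illustration):
V = opg_covariance([[1.0, 0.0], [0.0, 2.0]])
```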
This is an additive seasonal ARIMA model, in the sense that the first- and fourth-order MA terms work
additively: (1 + θ1L + θ4L⁴).
Another way to handle the quarterly effect would be to fit a multiplicative seasonal ARIMA model.
A multiplicative SARIMA model of order (1, 1, 1) × (0, 0, 1)4 for the ln(wpit) series is
(1 − ρ1L)(Δ ln(wpit) − β0) = (1 + θ1L)(1 + θ4,1L⁴)εt
In the notation (1, 1, 1) × (0, 0, 1)4, the (1, 1, 1) means that there is one nonseasonal autoregressive
term (1 − ρ1L) and one nonseasonal moving-average term (1 + θ1L) and that the time series is
first-differenced one time. The (0, 0, 1)4 indicates that there is no lag-4 seasonal autoregressive term,
that there is one lag-4 seasonal moving-average term (1 + θ4,1L⁴), and that the series is seasonally
differenced zero times. This is known as a multiplicative SARIMA model because the nonseasonal
and seasonal factors work multiplicatively: (1 + θ1L)(1 + θ4,1L⁴). Multiplying the terms imposes
nonlinear constraints on the parameters of the fifth-order lagged values; arima imposes these constraints
automatically.
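The constraint can be seen by multiplying the two lag polynomials in Python (a sketch; polymul is a hand-rolled convolution with illustrative parameter values, not a Stata routine):

```python
def polymul(a, b):
    """Multiply lag polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

theta1, theta41 = -0.4, 0.3                    # illustrative values
nonseasonal = [1.0, theta1]                    # 1 + theta1*L
seasonal = [1.0, 0.0, 0.0, 0.0, theta41]       # 1 + theta41*L^4
combined = polymul(nonseasonal, seasonal)
# combined[5], the lag-5 coefficient, is forced to equal theta1*theta41
```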
To further clarify the notation, consider a (2, 1, 1) × (1, 1, 2)4 multiplicative SARIMA model:
(1 − ρ1L − ρ2L²)(1 − ρ4,1L⁴) Δ Δ4 zt = (1 + θ1L)(1 + θ4,1L⁴ + θ4,2L⁸)εt    (3)
where Δ denotes the difference operator Δyt = yt − yt−1 and Δs denotes the lag-s seasonal
difference operator Δs yt = yt − yt−s. Expanding (3), we have
(1 − ρ1L − ρ2L² − ρ4,1L⁴ + ρ1ρ4,1L⁵ + ρ2ρ4,1L⁶) z̃t
    = (1 + θ1L + θ4,1L⁴ + θ1θ4,1L⁵ + θ4,2L⁸ + θ1θ4,2L⁹)εt
where
z̃t = Δ Δ4 zt = Δ(zt − zt−4) = zt − zt−1 − (zt−4 − zt−5)
and zt = yt − xtβ if regressors are included in the model, zt = yt − β0 if just a constant term is
included, and zt = yt otherwise.
More generally, a (p, d, q) × (P, D, Q)s multiplicative SARIMA model is
ρ(L^p) ρs(L^P) Δ^d Δs^D zt = θ(L^q) θs(L^Q) εt
where
ρs(L^P) = (1 − ρs,1L^s − ρs,2L^2s − · · · − ρs,P L^Ps)
θs(L^Q) = (1 + θs,1L^s + θs,2L^2s + · · · + θs,QL^Qs)
ρ(L^p) and θ(L^q) were defined previously, Δ^d means apply the Δ operator d times, and similarly
for Δs^D. Typically, d and D will be 0 or 1; and p, q, P, and Q will seldom be more than 2 or 3. s
will typically be 4 for quarterly data and 12 for monthly data. In fact, the model can be extended to
include both monthly and quarterly seasonal factors, as we explain below.
If a plot of the data suggests that the seasonal effect is proportional to the mean of the series, then
the seasonal effect is probably multiplicative and a multiplicative SARIMA model may be appropriate.
Box, Jenkins, and Reinsel (2008, sec. 9.3.1) suggest starting with a multiplicative SARIMA model with
any data that exhibit seasonal patterns and then exploring nonmultiplicative SARIMA models if the
multiplicative models do not fit the data well. On the other hand, Chatfield (2004, 14) suggests that
taking the logarithm of the series will make the seasonal effect additive, in which case an additive
SARIMA model as fit in the previous example would be appropriate. In short, the analyst should
probably try both additive and multiplicative SARIMA models to see which provides better fits and
forecasts.
Unless diffuse is used, arima must create square matrices of dimension {max(p, q + 1)}², where
p and q are the maximum AR and MA lags, respectively; and the inclusion of long seasonal terms can
make this dimension rather large. For example, with monthly data, you might fit a (0, 1, 1) × (0, 1, 2)12
SARIMA model. The maximum MA lag is 2 × 12 + 1 = 25, requiring a matrix with 26² = 676 rows
and columns.
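A quick Python sketch of this bookkeeping (the function name is invented for illustration):

```python
def state_mse_matrix_dim(ar_lags, ma_lags):
    """Dimension {max(p, q+1)}^2 of the square matrix arima must build and
    invert for the unconditional state MSE (when diffuse is not used)."""
    p = max(ar_lags, default=0)
    q = max(ma_lags, default=0)
    m = max(p, q + 1)
    return m * m

# A (0,1,1)x(0,1,2)12 SARIMA has maximum MA lag 2*12 + 1 = 25:
dim = state_mse_matrix_dim([], [1, 25])
```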
. use http://www.stata-press.com/data/r13/air2
(TIMESLAB: Airline passengers)
. generate lnair = ln(air)
. arima lnair, arima(0,1,1) sarima(0,1,1,12) noconstant
(setting optimization to BHHH)
Iteration 0: log likelihood = 223.8437
Iteration 1: log likelihood = 239.80405
(output omitted )
Iteration 8: log likelihood = 244.69651
ARIMA regression
Sample: 14 - 144 Number of obs = 131
Wald chi2(2) = 84.53
Log likelihood = 244.6965 Prob > chi2 = 0.0000
OPG
DS12.lnair Coef. Std. Err. z P>|z| [95% Conf. Interval]
ARMA
ma
L1. -.4018324 .0730307 -5.50 0.000 -.5449698 -.2586949
ARMA12
ma
L1. -.5569342 .0963129 -5.78 0.000 -.745704 -.3681644
Note: The test of the variance against zero is one sided, and the two-sided
confidence interval is truncated at zero.
In (2), for example, the coefficient on εt−13 is the product of the coefficients on the εt−1 and εt−12
terms (0.224 ≈ −0.402 × −0.557). arima labeled the dependent variable DS12.lnair to indicate
that it has applied the difference operator Δ and the lag-12 seasonal difference operator Δ12 to
lnair; see [U] 11.4.4 Time-series varlists for more information.
We could have fit this model by typing
. arima DS12.lnair, ma(1) mma(1, 12) noconstant
For simple multiplicative models, using the sarima() option is easier, though this second syntax
allows us to incorporate more complicated seasonal terms.
The mar() and mma() options can be repeated, allowing us to control for multiple seasonal
patterns. For example, we may have monthly sales data that exhibit a quarterly pattern as businesses
purchase our product at the beginning of calendar quarters when new funds are budgeted, and our
product is purchased more frequently in a few months of the year than in most others, even after we
control for quarterly fluctuations. Thus we might choose to fit the model
(1 − ρ1L)(1 − ρ4,1L⁴)(1 − ρ12,1L¹²) Δ Δ4 Δ12 salest = (1 + θ1L)(1 + θ4,1L⁴)(1 + θ12,1L¹²)εt
Although this model looks rather complicated, estimating it using arima is straightforward:
. arima DS4S12.sales, ar(1) mar(1, 4) mar(1, 12) ma(1) mma(1, 4) mma(1, 12)
If we instead wanted to include two lags in the lag-4 seasonal AR term and the first and third (but
not the second) term in the lag-12 seasonal MA term, we would type
. arima DS4S12.sales, ar(1) mar(1 2, 4) mar(1, 12) ma(1) mma(1, 4) mma(1 3, 12)
However, models with multiple seasonal terms can be difficult to fit. Usually, one seasonal factor
with just one or two AR or MA terms is adequate.
ARMAX models
Thus far all our examples have been pure ARIMA models in which the dependent variable was
modeled solely as a function of its past values and disturbances. arima can also fit ARMAX models,
which model the dependent variable in terms of a linear combination of independent variables, as
well as an ARMA disturbance process. The prais command (see [TS] prais), for example, allows
you to control for only AR(1) disturbances, whereas arima allows you to control for a much richer
dynamic error structure. arima allows for both nonseasonal and seasonal ARMA components in the
disturbances.
OPG
consump Coef. Std. Err. z P>|z| [95% Conf. Interval]
consump
m2 1.122029 .0363563 30.86 0.000 1.050772 1.193286
_cons -36.09872 56.56703 -0.64 0.523 -146.9681 74.77062
ARMA
ar
L1. .9348486 .0411323 22.73 0.000 .8542308 1.015467
ma
L1. .3090592 .0885883 3.49 0.000 .1354293 .4826891
Note: The test of the variance against zero is one sided, and the two-sided
confidence interval is truncated at zero.
We find a relatively small money velocity with respect to consumption (1.122) over this period,
although consumption is only one facet of the income velocity. We also note a very large first-order
autocorrelation in the disturbances, as well as a statistically significant first-order moving average.
We might be concerned that our specification has led to disturbances that are heteroskedastic or
non-Gaussian. We refit the model by using the vce(robust) option.
Semirobust
consump Coef. Std. Err. z P>|z| [95% Conf. Interval]
consump
m2 1.122029 .0433302 25.89 0.000 1.037103 1.206954
_cons -36.09872 28.10477 -1.28 0.199 -91.18306 18.98561
ARMA
ar
L1. .9348486 .0493428 18.95 0.000 .8381385 1.031559
ma
L1. .3090592 .1605359 1.93 0.054 -.0055854 .6237038
Note: The test of the variance against zero is one sided, and the two-sided
confidence interval is truncated at zero.
We note a substantial increase in the estimated standard errors, and our once clearly significant
moving-average term is now only marginally significant.
Dynamic forecasting
Another feature of the arima command is the ability to use predict afterward to make dynamic
forecasts. Suppose that we wish to fit the regression model
    y_t = β_0 + β_1 x_t + ρ y_{t-1} + ε_t

If we fit this model with regress or prais, then predict computes the period-f forecast as

    ŷ_f = β̂_0 + β̂_1 x_f + ρ̂ y_{f-1}

Most importantly, here predict will use the actual value of y at period f−1 in computing the
forecast for time f. Thus, if we use regress or prais, we cannot make forecasts for any periods
beyond f = T + 1 unless we have observed values of y for those periods.
If we instead fit our model with arima, then predict can produce dynamic forecasts by using
the Kalman filter. If we use the dynamic(f) option, then for period f, predict will compute

    ŷ_f = β̂_0 + β̂_1 x_f + ρ̂ y_{f-1}

by using the observed value of y_{f-1}, just as predict after regress or prais. However, for
period f+1, predict newvar, dynamic(f) will compute

    ŷ_{f+1} = β̂_0 + β̂_1 x_{f+1} + ρ̂ ŷ_f

using the predicted value of y_f instead of the observed value. Similarly, the period f+2 forecast
will be

    ŷ_{f+2} = β̂_0 + β̂_1 x_{f+2} + ρ̂ ŷ_{f+1}
Of course, because our model includes the regressor x_t, we can make forecasts only through periods
for which we have observations on x_t. However, for pure ARIMA models, we can compute dynamic
forecasts as far beyond the final period of our dataset as desired.
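The distinction between the two recursions can be sketched in a few lines of code. The following is a pure-Python illustration, not Stata code; the coefficient values and data are hypothetical, chosen only to show how the dynamic forecast feeds its own predictions back into the lag term.

```python
# Pure-Python sketch (not Stata) of the two forecast recursions for
# y_t = b0 + b1*x_t + rho*y_{t-1}; all numbers below are hypothetical.

def one_step_forecasts(b0, b1, rho, x, y, start):
    """predict after regress/prais: always uses the observed y[t-1],
    so forecasts stop one period past the last observed y."""
    return [b0 + b1 * x[t] + rho * y[t - 1] for t in range(start, len(y))]

def dynamic_forecasts(b0, b1, rho, x, y, f):
    """predict newvar, dynamic(f) after arima: from period f onward,
    predicted values of y are fed back into the lag term."""
    yhat = list(y[:f])                      # observed history through f-1
    for t in range(f, len(x)):
        yhat.append(b0 + b1 * x[t] + rho * yhat[t - 1])
    return yhat[f:]

b0, b1, rho = 1.0, 0.5, 0.8                 # hypothetical fitted coefficients
x = [2.0, 2.0, 2.0, 2.0, 2.0]
y = [4.0, 5.0, 5.5, 6.0, 6.2]

static = one_step_forecasts(b0, b1, rho, x, y, 2)
dynamic = dynamic_forecasts(b0, b1, rho, x, y, 2)
# Both recursions agree at period f; from f+1 onward the dynamic forecast
# compounds its own predictions rather than using observed y.
```

Because the dynamic recursion never consults observed y after period f, it keeps running even when y (and, for a pure ARIMA model, x as well) is unobserved.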
For more information on predict after arima, see [TS] arima postestimation.
Video example
Time series, part 5: Introduction to ARMA/ARIMA models
Stored results
arima stores the following in e():
Scalars
e(N) number of observations
e(N gaps) number of gaps
e(k) number of parameters
e(k eq) number of equations in e(b)
e(k eq model) number of equations in overall model test
e(k dv) number of dependent variables
e(k1) number of variables in first equation
e(df m) model degrees of freedom
e(ll) log likelihood
e(sigma) sigma
e(chi2) χ²
e(p) significance
e(tmin) minimum time
e(tmax) maximum time
e(ar max) maximum AR lag
e(ma max) maximum MA lag
e(rank) rank of e(V)
e(ic) number of iterations
e(rc) return code
e(converged) 1 if converged, 0 otherwise
Macros
e(cmd) arima
e(cmdline) command as typed
e(depvar) name of dependent variable
e(covariates) list of covariates
e(eqnames) names of equations
e(wtype) weight type
e(wexp) weight expression
e(title) title in estimation output
e(tmins) formatted minimum time
e(tmaxs) formatted maximum time
e(chi2type) Wald; type of model χ² test
e(vce) vcetype specified in vce()
e(vcetype) title used to label Std. Err.
e(ma) lags for moving-average terms
e(ar) lags for autoregressive terms
e(mari) multiplicative AR terms and lag i, i = 1, ..., # of seasonal AR terms
e(mmai) multiplicative MA terms and lag i, i = 1, ..., # of seasonal MA terms
e(seasons) seasonal lags in model
e(unsta) unstationary or blank
e(opt) type of optimization
e(ml method) type of ml method
e(user) name of likelihood-evaluator program
e(technique) maximization technique
e(tech steps) number of iterations performed before switching techniques
e(properties) b V
e(estat cmd) program used to implement estat
e(predict) program used to implement predict
e(marginsok) predictions allowed by margins
e(marginsnotok) predictions disallowed by margins
Matrices
e(b) coefficient vector
e(Cns) constraints matrix
e(ilog) iteration log (up to 20 iterations)
e(gradient) gradient vector
e(V) variance-covariance matrix of the estimators
e(V modelbased) model-based variance
Functions
e(sample) marks estimation sample
ARIMA model
The model to be fit is

    y_t = x_t β + μ_t

    μ_t = Σ_{i=1}^{p} ρ_i μ_{t-i} + Σ_{j=1}^{q} θ_j ε_{t-j} + ε_t

Some of the ρ's and θ's may be constrained to zero or, for multiplicative seasonal models, to the
products of other parameters.
The model is cast in state-space form, with state equation

    ξ_t = F ξ_{t-1} + v_t

observation equation

    y_t = x_t β + H′ ξ_t + w_t

and

    ( v_t )           ( Q  0 )
    ( w_t )  ~  N( 0, ( 0  R ) )
We maintain the standard Kalman filter matrix and vector notation, although for univariate models
y_t, w_t, and R are scalars. For the ARMA model above,

        [ ρ_1  ρ_2  ...  ρ_{p-1}  ρ_p ]
        [  1    0   ...    0       0  ]
    F = [  0    1   ...    0       0  ]
        [  :    :          :       :  ]
        [  0    0   ...    1       0  ]

    v_t = ( ε_t, 0, ..., 0 )′        A′ = β

    H′ = [ 1  θ_1  θ_2  ...  θ_q ]        w_t = 0
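As a concrete check on the shapes above, the following pure-Python sketch (an illustration, not part of arima) builds F and H′ for an ARMA(p, q), padding the AR coefficients and the MA row out to the state dimension m = max(p, q + 1):

```python
# Build the companion matrix F and the row vector H' = [1, theta_1, ..., theta_q]
# for the Kalman filter representation of an ARMA(p, q) disturbance.
# Pure-Python sketch; rho and theta are the AR and MA coefficient lists.

def arma_state_space(rho, theta):
    m = max(len(rho), len(theta) + 1)              # state dimension
    ar = rho + [0.0] * (m - len(rho))              # pad AR row to length m
    F = [ar] + [[1.0 if j == i else 0.0 for j in range(m)]
                for i in range(m - 1)]             # shifted identity rows
    H = [1.0] + theta + [0.0] * (m - 1 - len(theta))
    return F, H

# ARMA(1,1) with the point estimates from the example above: m = max(1, 2) = 2
F, H = arma_state_space([0.9348], [0.3091])
# F = [[0.9348, 0.0], [1.0, 0.0]]; H = [1.0, 0.3091]
```

The second row of F simply shifts the state down one period, which is what lets a single first-order recursion carry the full ARMA(p, q) dynamics.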
The Kalman filter representation does not require the moving-average terms to be invertible.
The estimator of y_t is

    ŷ_{t|t-1} = x_t β + H′ ξ̂_{t|t-1}

which implies an innovation or prediction error

    ι̂_t = y_t − ŷ_{t|t-1}

with MSE

    M_t = H′ P_{t|t-1} H + R

The prediction and updating equations are

    ξ̂_{t|t-1} = F ξ̂_{t-1}                                       (4)
    P_{t|t-1} = F P_{t-1} F′ + Q                                  (5)
    ξ̂_t = ξ̂_{t|t-1} + P_{t|t-1} H M_t^{-1} ι̂_t                 (6)
    P_t = P_{t|t-1} − P_{t|t-1} H M_t^{-1} H′ P_{t|t-1}           (7)

Equations (4) through (7) give the full set of Kalman filter recursions.
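To make the recursions concrete, here is one filter step in the simplest case, a scalar state (for example, an AR(1) disturbance) with H = 1 and R = 0, so every matrix product collapses to ordinary multiplication. This is a pure-Python illustration, not arima's implementation; it also evaluates the per-observation Gaussian log likelihood discussed below.

```python
import math

# One pass of the prediction/updating recursions (4)-(7) for a scalar state,
# with H = 1 and R = 0 as in the ARMA representation. Illustration only.

def kalman_step(F, Q, H, R, xi, P, y, xb):
    xi_pred = F * xi                                  # (4) predict state
    P_pred = F * P * F + Q                            # (5) predict state MSE
    iota = y - (xb + H * xi_pred)                     # innovation
    M = H * P_pred * H + R                            # innovation MSE
    xi_new = xi_pred + P_pred * H / M * iota          # (6) update state
    P_new = P_pred - P_pred * H / M * H * P_pred      # (7) update state MSE
    lnL = -0.5 * (math.log(2 * math.pi) + math.log(M) + iota * iota / M)
    return xi_new, P_new, lnL

# AR(1) with rho = 0.5, sigma^2 = 1, started from the stationary distribution
xi, P = 0.0, 1.0 / (1.0 - 0.5 ** 2)                  # P_{1|0} = Q / (1 - rho^2)
xi, P, lnL = kalman_step(0.5, 1.0, 1.0, 0.0, xi, P, y=1.2, xb=0.0)
# With R = 0 and a scalar state, the update pins the state down exactly:
# xi = 1.2 and P = 0 after observing y.
```

With a larger state vector the same four lines run with matrix products in place of the scalar multiplications.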
When the series is stationary, conditional on x_t, the initial conditions for the filter can be
considered a random draw from the stationary distribution of the state equation. The initial values of
the state and the state MSE are the expected values from this stationary distribution. For an ARIMA
model, these can be written as

    ξ_{1|0} = 0

and

    vec(P_{1|0}) = ( I_{r²} − F ⊗ F )^{-1} vec(Q)

where vec() is an operator representing the column vector resulting from stacking each successive
column of the target matrix.
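The vec() expression is a closed-form solution of the stationary fixed point P_{1|0} = F P_{1|0} F′ + Q. The sketch below (pure Python, illustration only) iterates that fixed point instead of inverting I − F ⊗ F, which converges to the same matrix whenever F is stable:

```python
# Iterate P <- F P F' + Q to its stationary fixed point, the same P_{1|0}
# that vec(P_{1|0}) = (I - F kron F)^{-1} vec(Q) gives in closed form.
# Pure-Python sketch for small matrices; illustration only.

def stationary_state_mse(F, Q, iters=500):
    m = len(F)
    P = [row[:] for row in Q]
    for _ in range(iters):
        # FP = F @ P, then P = FP @ F' + Q, written with explicit sums
        FP = [[sum(F[i][k] * P[k][j] for k in range(m)) for j in range(m)]
              for i in range(m)]
        P = [[sum(FP[i][k] * F[j][k] for k in range(m)) + Q[i][j]
              for j in range(m)] for i in range(m)]
    return P

# Companion matrix for rho = 0.5 with sigma^2 = 1 in the (1,1) cell of Q,
# matching the v_t definition above.
F = [[0.5, 0.0], [1.0, 0.0]]
Q = [[1.0, 0.0], [0.0, 0.0]]
P0 = stationary_state_mse(F, Q)
# P0[0][0] converges to 1/(1 - 0.25) = 4/3, the AR(1) stationary variance.
```

For this F, the fixed point is [[4/3, 2/3], [2/3, 4/3]]: the stationary variance of the AR(1) state and its first autocovariance.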
If the series is not stationary, the initial state conditions do not constitute a random draw from a
stationary distribution, and some other values must be chosen. Hamilton (1994) suggests that they be
chosen based on prior expectations, whereas Harvey suggests a diffuse and improper prior having a
state vector of 0 and an infinite variance. This method corresponds to P_{1|0} with diagonal elements
of ∞. Stata allows either approach to be taken for nonstationary series: initial priors may be specified
with state0() and p0(), and a diffuse prior may be specified with diffuse.
Given the outputs from the Kalman filter recursions and assuming that the state and observation
vectors are Gaussian, the likelihood for the state-space model follows directly from the resulting
multivariate normal in the predicted innovations. The log likelihood for observation t is
    ln L_t = −(1/2) [ ln(2π) + ln(|M_t|) + ι̂_t′ M_t^{-1} ι̂_t ]
This command supports the Huber/White/sandwich estimator of the variance using vce(robust).
See [P] robust, particularly Maximum likelihood estimators and Methods and formulas.
Missing data
Missing data, whether a missing dependent variable y_t, one or more missing covariates x_t, or
completely missing observations, are handled by continuing the state-updating equations without any
contribution from the data; see Harvey (1989, 1993). That is, (4) and (5) are iterated for every
missing observation, whereas (6) and (7) are ignored. Thus, for observations with missing data,
ξ̂_t = ξ̂_{t|t-1} and P_t = P_{t|t-1}. Without any information from the sample, this effectively
assumes that the prediction error for the missing observations is 0. Other methods of handling missing
data on the basis of the EM algorithm have been suggested, for example, Shumway (1984, 1988).
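In code, the missing-data rule is simply "predict, but do not update". A minimal pure-Python sketch for a scalar state (illustration only, not arima's implementation):

```python
# For a missing observation, iterate only (4) and (5); the prediction
# becomes the update, with no contribution from the data.

def predict_only(F, Q, xi, P):
    """Missing y_t: xi_t = xi_{t|t-1} and P_t = P_{t|t-1}."""
    xi_pred = F * xi               # (4)
    P_pred = F * P * F + Q         # (5)
    return xi_pred, P_pred         # (6) and (7) are skipped

# Starting from a state of 1.2 known exactly (P = 0, scalar F = 0.5, Q = 1),
# one missing period decays the state toward zero and reinflates its MSE:
xi, P = predict_only(0.5, 1.0, 1.2, 0.0)
# xi = 0.6, P = 1.0
```

Treating the prediction error as 0 for missing periods is exactly what keeps the likelihood well defined without imputing any data.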
George Edward Pelham Box (1919–2013) was born in Kent, England, and earned degrees
in statistics at the University of London. After work in the chemical industry, he taught and
researched at Princeton and the University of Wisconsin. His many major contributions to statistics
include papers and books in Bayesian inference, robustness (a term he introduced to statistics),
modeling strategy, experimental design and response surfaces, time-series analysis, distribution
theory, transformations, and nonlinear estimation.
Gwilym Meirion Jenkins (1933–1982) was a British mathematician and statistician who spent
his career in industry and academia, working for extended periods at Imperial College London
and the University of Lancaster before running his own company. His interests were centered on
time series and he collaborated with G. E. P. Box on what are often called BoxJenkins models.
The last years of Jenkins's life were marked by a slowly losing battle against Hodgkin's disease.
References
Ansley, C. F., and R. J. Kohn. 1985. Estimation, filtering, and smoothing in state space models with incompletely
specified initial conditions. Annals of Statistics 13: 1286–1316.
Ansley, C. F., and P. Newbold. 1980. Finite sample properties of estimators for autoregressive moving average models.
Journal of Econometrics 13: 159–183.
Baum, C. F. 2000. sts15: Tests for stationarity of a time series. Stata Technical Bulletin 57: 36–39. Reprinted in
Stata Technical Bulletin Reprints, vol. 10, pp. 356–360. College Station, TX: Stata Press.
Baum, C. F., and T. Rõõm. 2001. sts18: A test for long-range dependence in a time series. Stata Technical Bulletin
60: 37–39. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 370–373. College Station, TX: Stata Press.
Baum, C. F., and R. I. Sperling. 2000. sts15.1: Tests for stationarity of a time series: Update. Stata Technical Bulletin
58: 35–36. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 360–362. College Station, TX: Stata Press.
Baum, C. F., and V. L. Wiggins. 2000. sts16: Tests for long memory in a time series. Stata Technical Bulletin 57:
39–44. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 362–368. College Station, TX: Stata Press.
Becketti, S. 2013. Introduction to Time Series Using Stata. College Station, TX: Stata Press.
Berndt, E. K., B. H. Hall, R. E. Hall, and J. A. Hausman. 1974. Estimation and inference in nonlinear structural
models. Annals of Economic and Social Measurement 3/4: 653–665.
Bollerslev, T., R. F. Engle, and D. B. Nelson. 1994. ARCH models. In Vol. 4 of Handbook of Econometrics, ed.
R. F. Engle and D. L. McFadden. Amsterdam: Elsevier.
Box, G. E. P. 1983. Obituary: G. M. Jenkins, 1933–1982. Journal of the Royal Statistical Society, Series A 146:
205–206.
Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. 2008. Time Series Analysis: Forecasting and Control. 4th ed.
Hoboken, NJ: Wiley.
Chatfield, C. 2004. The Analysis of Time Series: An Introduction. 6th ed. Boca Raton, FL: Chapman & Hall/CRC.
David, J. S. 1999. sts14: Bivariate Granger causality test. Stata Technical Bulletin 51: 40–41. Reprinted in Stata
Technical Bulletin Reprints, vol. 9, pp. 350–351. College Station, TX: Stata Press.
Davidson, R., and J. G. MacKinnon. 1993. Estimation and Inference in Econometrics. New York: Oxford University
Press.
DeGroot, M. H. 1987. A conversation with George Box. Statistical Science 2: 239–258.
Diggle, P. J. 1990. Time Series: A Biostatistical Introduction. Oxford: Oxford University Press.
Enders, W. 2004. Applied Econometric Time Series. 2nd ed. New York: Wiley.
Friedman, M., and D. Meiselman. 1963. The relative stability of monetary velocity and the investment multiplier in
the United States, 1897–1958. In Stabilization Policies, Commission on Money and Credit, 123–126. Englewood
Cliffs, NJ: Prentice Hall.
Gourieroux, C. S., and A. Monfort. 1997. Time Series and Dynamic Models. Trans. ed. G. M. Gallo. Cambridge:
Cambridge University Press.
Greene, W. H. 2012. Econometric Analysis. 7th ed. Upper Saddle River, NJ: Prentice Hall.
Hamilton, J. D. 1994. Time Series Analysis. Princeton: Princeton University Press.
Harvey, A. C. 1989. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge
University Press.
———. 1993. Time Series Models. 2nd ed. Cambridge, MA: MIT Press.
Hipel, K. W., and A. I. McLeod. 1994. Time Series Modelling of Water Resources and Environmental Systems.
Amsterdam: Elsevier.
Holan, S. H., R. Lund, and G. Davis. 2010. The ARMA alphabet soup: A tour of ARMA model variants. Statistics
Surveys 4: 232–274.
Kalman, R. E. 1960. A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal
of Basic Engineering, Series D 82: 35–45.
McDowell, A. W. 2002. From the help desk: Transfer functions. Stata Journal 2: 71–85.
———. 2004. From the help desk: Polynomial distributed lag models. Stata Journal 4: 180–189.
Newton, H. J. 1988. TIMESLAB: A Time Series Analysis Laboratory. Belmont, CA: Wadsworth.
Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. 2007. Numerical Recipes: The Art of Scientific
Computing. 3rd ed. New York: Cambridge University Press.
Sanchez, G. 2012. Comparing predictions after arima with manual computations. The Stata Blog: Not Elsewhere
Classified. http://blog.stata.com/2012/02/16/comparing-predictions-after-arima-with-manual-computations/.
Shumway, R. H. 1984. Some applications of the EM algorithm to analyzing incomplete time series data. In Time
Series Analysis of Irregularly Observed Data, ed. E. Parzen, 290–324. New York: Springer.
———. 1988. Applied Statistical Time Series Analysis. Upper Saddle River, NJ: Prentice Hall.
Wang, Q., and N. Wu. 2012. Menu-driven X-12-ARIMA seasonal adjustment in Stata. Stata Journal 12: 214–241.
Also see
[TS] arima postestimation Postestimation tools for arima
[TS] tsset Declare data to be time-series data
[TS] arch Autoregressive conditional heteroskedasticity (ARCH) family of estimators
[TS] dfactor Dynamic-factor models
[TS] forecast Econometric model forecasting
[TS] mgarch Multivariate GARCH models
[TS] prais Prais–Winsten and Cochrane–Orcutt regression
[TS] sspace State-space models
[TS] ucm Unobserved-components model
[R] regress Linear regression
[U] 20 Estimation and postestimation commands