

Box-Jenkins Methodology

Introduction

Forecasting Basics: The basic idea behind self-projecting time series forecasting models is to find a mathematical formula that will approximately generate the historical patterns in a time series.

Time Series: A time series is a set of numbers that measures the status of some activity over time. It is the historical record of some activity, with measurements taken at equally spaced intervals (monthly data being a common, approximately equally spaced exception) and with consistency in the activity and the method of measurement.

Approaches to Time Series Forecasting: There are two basic approaches to forecasting time series: the self-projecting time series approach and the cause-and-effect approach. Cause-and-effect methods attempt to forecast based on underlying series that are believed to cause the behavior of the original series. The self-projecting approach uses only the time series data of the activity to be forecast to generate forecasts. This latter approach is typically less expensive to apply, requires far less data, and is useful for short- to medium-term forecasting.

Box-Jenkins Forecasting Method: The univariate version of this methodology is a self-projecting time series forecasting method. The underlying goal is to find an appropriate formula so that the residuals are as small as possible and exhibit no pattern. The model-building process involves a few steps, repeated as necessary, to end up with a specific formula that replicates the patterns in the series as closely as possible and also produces accurate forecasts.

Box-Jenkins Methodology

Box-Jenkins forecasting models are based on statistical concepts and principles and are able to model a wide spectrum of time series behavior. The methodology offers a large class of models to choose from and a systematic approach for identifying the correct model form. There are both statistical tests for verifying model validity and statistical measures of forecast uncertainty. In contrast, traditional forecasting models offer a limited number of models relative to the complex behavior of many time series, with little in the way of guidelines and statistical tests for verifying the validity of the selected model.

Data: The misuse, misunderstanding, and inaccuracy of forecasts are often the result of not appreciating the nature of the data in hand. The consistency of the data must be ensured, and it must be clear what the data represent and how they were gathered or calculated. As a rule of thumb, Box-Jenkins requires at least 40 or 50 equally spaced periods of data. The data must also be edited to deal with extreme or missing values and other distortions, using transformations such as the log or inverse to stabilize the series.

Preliminary Model Identification Procedure: A preliminary Box-Jenkins analysis with a plot of the initial data should be run as the starting point in determining an appropriate model. The input data must be adjusted to form a stationary series, one whose values vary more or less uniformly about a fixed level over time. Apparent trends can be adjusted by applying "regular differencing," a process of computing the difference between every two successive values, producing a differenced series from which the overall trend behavior has been removed. If a single differencing does not achieve stationarity, it may be repeated, although rarely, if ever, are more than two regular differencings required. Where irregularities in the differenced series continue to be displayed, log or inverse transformations can be specified to stabilize the series, such that the remaining residual plot displays values approaching zero and without any pattern. This is the error term, equivalent to pure white noise.
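As a minimal illustration of regular differencing, here is a Python sketch on a hypothetical series x (the values are illustrative only, not data from this text):

import numpy as np

# hypothetical trending series (illustrative values only)
x = np.array([112.0, 118.0, 132.0, 129.0, 135.0, 148.0, 151.0, 160.0, 158.0, 171.0])

d1 = np.diff(x, n=1)            # regular differencing: difference of successive values
d2 = np.diff(x, n=2)            # a second differencing, rarely needed

log_d1 = np.diff(np.log(x))     # log transform first if the variance grows with the level

print(d1)
print(log_d1)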

Pure Random Series: On the other hand, if the initial data series displays neither trend nor seasonality, the residual plot shows essentially zero values within a 95% confidence level, and these residual values display no pattern, then there is no real-world statistical problem to solve and we go on to other things.

Model Identification Background

Basic Model: With a stationary series in place, a basic model can now be identified. Three basic models exist: AR (autoregressive), MA (moving average), and a combined ARMA, in addition to the previously specified RD (regular differencing). These comprise the available tools. When regular differencing is applied together with AR and MA terms, the result is referred to as ARIMA, with the "I" indicating "integrated" and referencing the differencing procedure.
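These tools map directly onto the (p, d, q) orders of an ARIMA model. A minimal sketch, assuming the statsmodels library and a hypothetical series x:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
x = 50 + rng.normal(size=100).cumsum()        # hypothetical series with a drifting level

ar_fit    = ARIMA(x, order=(1, 0, 0)).fit()   # AR: autoregressive term only
ma_fit    = ARIMA(x, order=(0, 0, 1)).fit()   # MA: moving average term only
arma_fit  = ARIMA(x, order=(1, 0, 1)).fit()   # ARMA: both, no differencing
arima_fit = ARIMA(x, order=(1, 1, 1)).fit()   # ARIMA: d = 1 applies regular differencing (the "I")

print(arima_fit.summary())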

Seasonality: In addition to trend, which has now been provided for, stationary series quite commonly display seasonal behavior, where a certain basic pattern tends to be repeated at regular seasonal intervals. The seasonal pattern may also frequently change over time. Just as regular differencing was applied to the overall trending series, seasonal differencing (SD) is applied to seasonal non-stationarity. And just as autoregressive and moving average tools are available for the overall series, they are also available for seasonal phenomena, using seasonal autoregressive parameters (SAR) and seasonal moving average parameters (SMA).

Establishing Seasonality: The need for seasonal autoregressive (SAR) and seasonal moving average (SMA) parameters is established by examining the autocorrelation and partial autocorrelation patterns of a stationary series at lags that are multiples of the number of periods per season. These parameters are required if the values at lags s, 2s, etc. are nonzero and display patterns associated with the theoretical patterns for such models. Seasonal differencing is indicated if the autocorrelations at the seasonal lags do not decrease rapidly.
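A sketch of this check in Python, assuming statsmodels and a hypothetical monthly series x with s = 12 periods per season; significant spikes at lags s, 2s, ... suggest SAR/SMA terms, while slowly decaying seasonal autocorrelations suggest seasonal differencing:

import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(1)
n, s = 120, 12                                   # ten years of hypothetical monthly data
x = 10 * np.sin(2 * np.pi * np.arange(n) / s) + rng.normal(size=n)

r = acf(x, nlags=3 * s)
phi = pacf(x, nlags=3 * s)
ci = 1.96 / np.sqrt(n)                           # approximate 95% limits

for lag in (s, 2 * s, 3 * s):
    flag = "significant" if abs(r[lag]) > ci else "not significant"
    print(f"lag {lag}: acf={r[lag]:+.2f}, pacf={phi[lag]:+.2f} ({flag})")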

B-J Modeling Approach to Forecasting

Referring to the chart above, note that the variance of the errors of the underlying model must be invariant, i.e., constant. This means that the variance for each subgroup of data is the same and does not depend on the level or the point in time. If this is violated, it can be remedied by stabilizing the variance. Make sure that there are no deterministic patterns in the data, and that there are no pulses or one-time unusual values. Additionally, there should be no level or step shifts, and no seasonal pulses should be present.

The reason for all of this is that, if such features do exist, the sample autocorrelation and partial autocorrelation will seem to imply ARIMA structure. The presence of these kinds of components can also obfuscate or hide genuine structure; for example, a single outlier or pulse can create an effect where the structure is masked by the outlier.
Improved Quantitative Identification Method

Relieved Analysis Requirements: A substantially improved procedure is now available for conducting Box-Jenkins ARIMA analysis, which relieves the requirement for a seasoned perspective in evaluating the sometimes ambiguous autocorrelation and partial autocorrelation residual patterns to determine an appropriate Box-Jenkins model for use in developing a forecast model.

ARMA (1, 0): The first model to be tested on the stationary series consists solely of an autoregressive term with lag 1. The autocorrelation and partial autocorrelation patterns are examined for significant autocorrelation in the early terms and to see whether the residual coefficients are uncorrelated; that is, whether the coefficient values are zero within 95% confidence limits and without apparent pattern. When the fitted values are as close as possible to the original series values, the sum of the squared residuals is minimized, a technique called least squares estimation. The residual mean and the mean percent error should not be significantly nonzero. Alternative models are examined by comparing the progress of these factors, favoring models which use as few parameters as possible. Correlation between parameters should not be significantly large, and confidence limits should not include zero. When a satisfactory model has been established, a forecast procedure is applied.
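A sketch of this first step, assuming statsmodels and a hypothetical stationary series x; the acceptance checks are simplified here to the residual mean and the residual autocorrelations:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(2)
n = 120
x = np.empty(n); x[0] = 50.0
for t in range(1, n):
    x[t] = 15.0 + 0.7 * x[t - 1] + rng.normal()   # hypothetical AR(1)-like data around a level of 50

fit = ARIMA(x, order=(1, 0, 0)).fit()             # ARMA(1, 0): a single autoregressive term
resid = fit.resid

ci = 1.96 / np.sqrt(n)                            # approximate 95% limits for residual autocorrelations
r = acf(resid, nlags=12)[1:]

print("residual mean:", round(float(resid.mean()), 3))
print("all residual autocorrelations within limits:", bool(np.all(np.abs(r) < ci)))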

ARMA (2, 1): Absent a satisfactory ARMA (1, 0) condition with residual coefficients approximating zero, the improved model identification procedure proceeds to examine the residual pattern when autoregressive terms of order 1 and 2 are applied together with a moving average term of order 1.

Subsequent Procedure: To the extent that the residual conditions described above remain unsatisfied, the Box-Jenkins analysis is continued with ARMA (n, n-1) until a satisfactory model is reached. In the course of this iteration, when an autoregressive coefficient (phi) approaches zero, the model is reexamined with parameters ARMA (n-1, n-1). In like manner, whenever a moving average coefficient (theta) approaches zero, the model is similarly reduced to ARMA (n, n-2). At some point, either the autoregressive term or the moving average term may fall away completely, and the examination of the stationary series is continued with only the remaining term, until the residual coefficients approach zero within the specified confidence levels.
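A rough sketch of this iteration, assuming statsmodels and a hypothetical stationary series x; the reduction rules are simplified to "stop at the first ARMA(n, n-1) whose residuals look like white noise":

import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import acf

def residuals_clean(resid, nlags=12):
    # residual autocorrelations should stay inside the 95% limits
    ci = 1.96 / np.sqrt(len(resid))
    return bool(np.all(np.abs(acf(resid, nlags=nlags)[1:]) < ci))

def identify(x, max_n=3):
    # try ARMA(1,0), then ARMA(2,1), ARMA(3,2), ... and stop at the first clean fit
    for n in range(1, max_n + 1):
        fit = ARIMA(x, order=(n, 0, n - 1)).fit()
        if residuals_clean(fit.resid):
            return n, fit
    return max_n, fit

rng = np.random.default_rng(3)
x = np.empty(150); x[0] = 0.0
for t in range(1, 150):
    x[t] = 0.6 * x[t - 1] + rng.normal()          # hypothetical stationary series

n, fit = identify(x)
print("selected model: ARMA(%d, %d)" % (n, n - 1))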
Model Selection in B-J Approach to Forecasting

Seasonal Analysis: In parallel with this model development cycle, and in an entirely similar manner, seasonal autoregressive and moving average parameters are added or dropped in response to the presence of a seasonal or cyclical pattern in the residual terms or a parameter coefficient approaching zero.

Model Adequacy: In reviewing the Box-Jenkins output, care should be taken to ensure that the parameters are uncorrelated and significant, and alternative models should be weighed on these conditions, as well as on the overall correlation (R2), the standard error, and near-zero residuals.

Forecasting with the Model: The model should be used for short-term and intermediate-term forecasting. This can be achieved by updating it as new data become available, in order to minimize the number of periods ahead required of the forecast.

Monitor the Accuracy of the Forecasts in Real Time: As time progresses, the accuracy of the forecasts should be closely monitored for increases in the error terms and standard error, and for a decrease in correlation. When the series appears to be changing over time, the model parameters should be recalculated.
Autoregressive Models

The autoregressive model is one of a group of linear prediction formulas that attempt to predict an output of a system based on the previous outputs and inputs, such as:

Y(t) = β1 + β2 Y(t-1) + β3 X(t-1) + ε(t),

where X(t-1) and Y(t-1) are the actual value (input) and the forecast (output), respectively, and ε(t) is an error term. These types of regressions are often referred to as Distributed Lag Autoregressive Models, Geometric Distributed Lags, and Adaptive Models in Expectation, among others.

A model which depends only on the previous outputs of the system is called an autoregressive model (AR), while a model which depends only on the inputs to the system is called a moving average model (MA); and, of course, a model based on both inputs and outputs is an autoregressive-moving-average model (ARMA). Note that, by definition, the AR model has only poles while the MA model has only zeros. Deriving the autoregressive model (AR) involves estimating the coefficients of the model using the method of least squares.

Autoregressive processes, as their name implies, regress on themselves. If X(t) is the observation made at time t, then a p-th order autoregressive model, AR(p), satisfies the equation:

X(t) = φ0 + φ1 X(t-1) + φ2 X(t-2) + φ3 X(t-3) + . . . + φp X(t-p) + ε(t),

where ε(t) is a white-noise series.

The current value of the series is a linear combination of the p most recent past values of itself plus an error term, which incorporates everything new in the series at time t that is not explained by the past values. This is like a multiple regression model, except that the series is regressed not on independent variables but on its own past values; hence the term "autoregressive."
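To make the defining equation concrete, here is a short simulation and least-squares fit of a hypothetical AR(2); the coefficients are chosen purely for illustration:

import numpy as np

# hypothetical AR(2): X(t) = 5 + 0.5 X(t-1) - 0.3 X(t-2) + eps(t)
phi0, phi1, phi2 = 5.0, 0.5, -0.3
rng = np.random.default_rng(4)

n = 200
x = np.zeros(n)
eps = rng.normal(size=n)                          # white-noise series
for t in range(2, n):
    x[t] = phi0 + phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t]

# "regressing on itself": least squares of x(t) on x(t-1) and x(t-2)
X = np.column_stack([np.ones(n - 2), x[1:-1], x[:-2]])
y = x[2:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated (phi0, phi1, phi2):", np.round(coef, 2))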

Autocorrelation: An important guide to the properties of a time series is provided by a series of quantities called sample autocorrelation coefficients, or serial correlation coefficients, which measure the correlation between observations at different distances apart. These coefficients often provide insight into the probability model which generated the data. The sample autocorrelation coefficient is similar to the ordinary correlation coefficient between two variables (x) and (y), except that it is applied to a single time series to see if successive observations are correlated.

Given N observations on a discrete time series, we can form (N - 1) pairs of consecutive observations. Regarding the first observation in each pair as one variable and the second observation as a second variable, the correlation coefficient between them is called the autocorrelation coefficient of order one.
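A sketch of the order-one coefficient computed directly from the (N - 1) consecutive pairs; the short list x below simply reuses the first ten readings from the Aron Company example later in the text:

import numpy as np

def autocorr(x, k=1):
    # sample autocorrelation coefficient at lag k
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    num = np.sum((x[:-k] - xbar) * (x[k:] - xbar))
    den = np.sum((x - xbar) ** 2)
    return num / den

x = [50.8, 50.3, 50.2, 48.7, 48.5, 48.1, 50.1, 48.7, 49.2, 51.1]
print(autocorr(x, k=1))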

Correlogram: A useful aid in interpreting a set of autocorrelation coefficients is a graph called a correlogram, in which the autocorrelation coefficient r(k) at lag k is plotted against the lag k. A correlogram can be used to get a general understanding of the following aspects of a time series (a plotting sketch follows the list below):

1. A random series: if a time series is completely random, then for large N, r(k) will be approximately zero for all non-zero values of k.
2. Short-term correlation: stationary series often exhibit short-term correlation, characterized by fairly large values of the first 2 or 3 autocorrelation coefficients which, while significantly greater than zero, tend to get successively smaller.
3. Non-stationary series: if a time series contains a trend, then the values of r(k) will not come down to zero except for very large values of the lag.
4. Seasonal fluctuations: common autoregressive models with seasonal fluctuations of period s are:

X(t) = a + b X(t-s) + t

and

X(t) = a + b X(t-s) + c X(t-2s) +t

where t is a White-Noise series.

Partial Autocorrelation: A partial autocorrelation coefficient for order k measures the strength of correlation among pairs of entries in the time series while accounting for (i.e., removing the effects of) all autocorrelations below order k. For example, the partial autocorrelation coefficient for order k=5 is computed in such a manner that the effects of the k=1, 2, 3, and 4 partial autocorrelations have been excluded. The partial autocorrelation coefficient of any particular order is the same as the autoregression coefficient of the same order.
Fitting an Autoregressive Model: If an autoregressive model is thought to be
appropriate for modeling a given time series then there are two related
questions to be answered: (1) What is the order of the model? and (2) How can
we estimate the parameters of the model?

The parameters of an autoregressive model can be estimated by minimizing the sum of squared residuals with respect to each parameter, but determining the order of the autoregressive model is not easy, particularly when the system being modeled has a biological interpretation.

One approach is to fit AR models of progressively higher order, to calculate the residual sum of squares for each value of p, and to plot this against p. It may then be possible to see the value of p where the curve "flattens out" and the addition of extra parameters gives little improvement in fit.
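A sketch of this order-selection idea, assuming statsmodels and a hypothetical AR(2) series x; the residual sum of squares should flatten out near p = 2 for this series:

import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(6)
n = 200
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()   # hypothetical AR(2) data

for p in range(1, 7):
    fit = AutoReg(x, lags=p).fit()                # fit AR(p) by least squares
    rss = float(np.sum(fit.resid ** 2))
    print(f"p = {p}: residual sum of squares = {rss:.1f}")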

Selection Criteria: Several criteria may be specified for choosing a model format, given the simple and partial autocorrelation correlograms for a series:

1. If none of the simple autocorrelations is significantly different from zero, the series is essentially a random number or white-noise series, which is not amenable to autoregressive modeling.
2. If the simple autocorrelations decrease linearly, passing through zero to
become negative, or if the simple autocorrelations exhibit a wave-like
cyclical pattern, passing through zero several times, the series is not
stationary; it must be differenced one or more times before it may be
modeled with an autoregressive process.
3. If the simple autocorrelations exhibit seasonality, i.e., there are autocorrelation peaks every twelve or so lags (in monthly data), the series is not stationary; it must be differenced with a gap approximately equal to the seasonal interval before further modeling.
4. If the simple autocorrelations decrease exponentially but approach zero
gradually, while the partial autocorrelations are significantly non-zero
through some small number of lags beyond which they are not
significantly different from zero, the series should be modeled with an
autoregressive process.
5. If the partial autocorrelations decrease exponentially but approach zero
gradually, while the simple autocorrelations are significantly non-zero
through some small number of lags beyond which they are not
significantly different from zero, the series should be modeled with a
moving average process.
6. If the partial and simple autocorrelations both converge upon zero for successively longer lags, but neither actually reaches zero after any particular lag, the series may be modeled by a combination of autoregressive and moving average processes.

The following figures illustrate the behavior of the autocorrelations and the partial autocorrelations for AR(1) and AR(2) models, respectively:

[Figure: AR(1) autocorrelations and partial autocorrelations]
[Figure: AR(2) autocorrelations and partial autocorrelations]

Adjusting the Slope's Estimate for the Length of the Time Series: The regression coefficient is a biased estimate; in the case of AR(1), the bias is -(1 + 3φ1) / n, where n is the number of observations used to estimate the parameters. Clearly, for large data sets this bias is negligible.
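Taking the displayed bias formula at face value, here is a small worked illustration using the slope estimated in the numerical example near the end of this section (φ1 ≈ 0.715 from n = 25 observations):

# bias of the AR(1) slope estimate: -(1 + 3*phi1) / n
phi1_hat = 0.715          # slope from the Aron Company example below
n = 25                    # observations in that example

bias = -(1 + 3 * phi1_hat) / n
phi1_adjusted = phi1_hat - bias       # subtract the (negative) bias, i.e. adjust upward
print(f"approximate bias = {bias:.3f}, adjusted slope = {phi1_adjusted:.3f}")
# approximate bias = -0.126, adjusted slope = 0.841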

Stationarity Condition: Note that an autoregressive process will only be stable if the parameters are within a certain range; for example, in AR(1), the slope must lie within the open interval (-1, 1). Otherwise, past effects would accumulate and the successive values would grow ever larger (or smaller); that is, the series would not be stationary. For higher orders, similar (general) restrictions on the parameter values must be satisfied.

Invertibility Condition: Without going into too much detail, there is a "duality" between a given time series and the autoregressive model representing it; that is, the equivalent time series can be generated by the model. AR models are always invertible. However, analogous to the stationarity condition described above, there are certain conditions the Box-Jenkins MA parameters must satisfy to be invertible.

Forecasting: The estimates of the parameters are used in forecasting to calculate new values of the series beyond those included in the input data set, together with confidence intervals for those predicted values.

An Illustrative Numerical Example: The analyst at Aron Company has a time series of readings for the monthly sales to be forecasted. The data are shown in the following table:

Aron Company Monthly Sales ($1000)

 t   X(t)     t   X(t)     t   X(t)     t   X(t)     t   X(t)
 1   50.8     6   48.1    11   50.8    16   53.1    21   49.7
 2   50.3     7   50.1    12   52.8    17   51.6    22   50.3
 3   50.2     8   48.7    13   53.0    18   50.8    23   49.9
 4   48.7     9   49.2    14   51.8    19   50.6    24   51.8
 5   48.5    10   51.1    15   53.6    20   49.7    25   51.0

By constructing and studying the plot of the data, one notices that the series drifts above and below a mean of about 50.6. Using the Time Series Identification Process JavaScript, a glance at the autocorrelation and the partial autocorrelation confirms that the series is indeed stationary and that a first-order (p = 1) autoregressive model is a good candidate:

X(t) = 0 + 1X(t-1) + t,

where t is a White-Noise series.

Stationarity Condition: The AR(1) model is stable if the slope lies within the open interval (-1, 1), that is:

|φ1| < 1
This condition is expressed as a null hypothesis H0 that must be tested before the forecasting stage. To test this hypothesis, we must replace the t-test used in regression analysis for testing the slope with the τ-test introduced by Dickey and Fuller. This test is coded in the Autoregressive Time Series Modeling JavaScript.
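A sketch of this test using the adfuller function from statsmodels (standing in for the JavaScript tool mentioned above), applied to the Aron Company series:

from statsmodels.tsa.stattools import adfuller

# Aron Company monthly sales from the table above
x = [50.8, 50.3, 50.2, 48.7, 48.5, 48.1, 50.1, 48.7, 49.2, 51.1,
     50.8, 52.8, 53.0, 51.8, 53.6, 53.1, 51.6, 50.8, 50.6, 49.7,
     49.7, 50.3, 49.9, 51.8, 51.0]

tau, pvalue, *_ = adfuller(x, maxlag=1)           # augmented Dickey-Fuller tau test
print(f"tau statistic = {tau:.2f}, p-value = {pvalue:.3f}")
# a small p-value rejects the unit-root null, supporting |phi1| < 1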

The estimated AR(1) model is:

X(t) = 14.44 + 0.715 X(t-1)

The 3-step-ahead forecasts are:

X(26) = 14.44 + 0.715 X(25) = 14.44 + 0.715 (51.0) = 50.91
X(27) = 14.44 + 0.715 X(26) = 14.44 + 0.715 (50.91) = 50.84
X(28) = 14.44 + 0.715 X(27) = 14.44 + 0.715 (50.84) = 50.79
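The same recursion written out in Python, taking the fitted intercept and slope above at face value; each forecast is fed back in as the input for the next step:

intercept, slope = 14.44, 0.715       # fitted AR(1): X(t) = 14.44 + 0.715 X(t-1)
x_prev = 51.0                         # X(25), the last observed value

for step in range(26, 29):
    x_next = intercept + slope * x_prev
    print(f"X({step}) = {x_next:.2f}")
    x_prev = x_next
# prints approximately 50.91, 50.84, 50.79, matching the hand calculation above up to rounding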
