Box-Jenkins Methodology Forecasting Basics
Introduction
Time Series: A time series is a set of numbers that measures the status of some
activity over time. It is the historical record of some activity, with measurements
taken at equally spaced intervals (monthly data, whose intervals are only
approximately equal, are a common exception) and with consistency in both the
activity measured and the method of measurement.
Box-Jenkins Methodology
Pure Random Series: On the other hand, if the initial data series displays
neither trend nor seasonality, and the residual plot shows values that are
essentially zero within the 95% confidence limits and display no pattern,
then there is no real-world statistical problem to solve and we go on to other
things.
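As a rough numerical check of this condition, the sketch below (written in Python with NumPy; the function name and the simulated data are purely illustrative) compares each sample autocorrelation of the residuals against the approximate 95% limits of plus or minus 1.96/sqrt(n):

    import numpy as np

    def looks_like_white_noise(residuals, max_lag=20):
        # Compare each sample autocorrelation with the approximate 95% band +/- 1.96/sqrt(n).
        x = np.asarray(residuals, dtype=float)
        x = x - x.mean()
        n = len(x)
        denom = np.dot(x, x)
        bound = 1.96 / np.sqrt(n)
        flagged = []
        for k in range(1, max_lag + 1):
            r_k = np.dot(x[:-k], x[k:]) / denom   # sample autocorrelation at lag k
            if abs(r_k) > bound:
                flagged.append((k, round(float(r_k), 3)))
        return flagged                            # an empty list suggests a pure random series

    rng = np.random.default_rng(0)
    print(looks_like_white_noise(rng.normal(size=200)))   # typically prints [] or very few lags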
Seasonality: In addition to trend, which has now been provided for, a series
quite commonly displays seasonal behavior, in which a basic pattern tends to be
repeated at regular seasonal intervals. The seasonal pattern itself may also
change steadily over time. Just as regular differencing is applied to an overall
trending series, seasonal differencing (SD) is applied to remove seasonal
non-stationarity. And just as autoregressive and moving average tools are
available for the overall series, so too are they available for seasonal
phenomena, using seasonal autoregressive parameters (SAR) and seasonal moving
average parameters (SMA).
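For instance, regular and seasonal differences can be taken as in the brief NumPy sketch below; the series and the season length of 4 are purely illustrative:

    import numpy as np

    def difference(x, lag=1):
        # lag=1 gives regular differencing; lag equal to the season length gives seasonal differencing.
        x = np.asarray(x, dtype=float)
        return x[lag:] - x[:-lag]

    # Hypothetical series: linear trend plus a repeating period-4 seasonal pattern.
    x = np.arange(48, dtype=float) + 10.0 * np.tile([0.0, 3.0, 1.0, 2.0], 12)
    regular = difference(x, lag=1)    # removes the linear trend
    seasonal = difference(x, lag=4)   # removes the period-4 seasonal pattern
    print(regular[:6])
    print(seasonal[:6])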
Referring to the above chart, note that the variance of the errors of the
underlying model must be invariant, i.e., constant. This means that the variance
for each subgroup of data is the same and does not depend on the level or the
point in time. If this assumption is violated, it can be remedied by stabilizing
the variance. Make sure that there are no deterministic patterns in the data.
There should also be no pulses or one-time unusual values, no level or step
shifts, and no seasonal pulses. The reason for all of this is that, if such
features do exist, the sample autocorrelation and partial autocorrelation will
falsely suggest ARIMA structure. The presence of these kinds of components can
also obfuscate or hide genuine structure; for example, a single outlier or pulse
can mask the structure of the rest of the series.
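As an illustration of these preliminary checks, the sketch below (a hedged example; the function name, threshold, and data are chosen for illustration only) applies a logarithmic transform, one common way to stabilize a variance that grows with the level, and flags candidate pulses with a simple z-score screen:

    import numpy as np

    def stabilize_and_screen(x, z_cut=3.0):
        # Log-transform a positive series to stabilize a variance that grows with the level,
        # then flag one-time unusual values (pulses) whose z-score exceeds z_cut.
        x = np.asarray(x, dtype=float)
        y = np.log(x)                               # requires a strictly positive series
        z = (y - y.mean()) / y.std(ddof=1)
        pulses = np.flatnonzero(np.abs(z) > z_cut)  # indices of candidate outliers
        return y, pulses

    rng = np.random.default_rng(1)
    series = np.exp(rng.normal(loc=4.0, scale=0.2, size=120))
    series[60] *= 5.0                               # inject a single pulse for illustration
    transformed, pulse_index = stabilize_and_screen(series)
    print(pulse_index)                              # typically prints [60]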
Improved Quantitative Identification Method
ARMA (1, 0): The first model to be tested on the stationary series consists
solely of an autoregressive term with lag 1. The autocorrelation and partial
autocorrelation patterns are examined for significant autocorrelations,
typically at early lags, and the residual coefficients are checked for being
uncorrelated; that is, their values should be zero within the 95% confidence
limits and show no apparent pattern. When fitted values are as close as possible to the original
series values, then the sum of the squared residuals will be minimized, a
technique called least squares estimation. The residual mean and the mean
percent error should not be significantly nonzero. Alternative models are
examined comparing the progress of these factors, favoring models which use
as few parameters as possible. Correlation between parameters should not be
significantly large and confidence limits should not include zero. When a
satisfactory model has been established, a forecast procedure is applied.
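A minimal sketch of this estimation and diagnostic step, assuming the Python statsmodels package and an illustrative simulated series, might look as follows:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # Simulate an illustrative stationary AR(1) series with mean 50.6.
    rng = np.random.default_rng(2)
    y = np.full(200, 50.6)
    for t in range(1, 200):
        y[t] = 50.6 + 0.6 * (y[t - 1] - 50.6) + rng.normal()

    model = ARIMA(y, order=(1, 0, 0)).fit()       # ARMA(1, 0): one AR term, no MA terms
    print(model.summary())                        # coefficients, standard errors, confidence limits, AIC
    print("residual mean:", model.resid.mean())   # should not be significantly nonzero

The summary table is where one checks that the confidence limits of each parameter exclude zero and that competing models are compared on as few parameters as possible.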
Forecasting with the Model: The model is then used for short-term and
intermediate-term forecasting. Accuracy is maintained by updating the model as
new data become available, which minimizes the number of periods ahead required
of the forecast.
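One way to carry out such updating, sketched below with statsmodels (the ARMA(1, 0) order and the function name are illustrative, not prescribed by the text), is to refit the model each period and issue only a one-step-ahead forecast:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    def rolling_one_step_forecasts(y, n_test):
        # Refit an ARMA(1, 0) model each period and forecast only one step ahead,
        # so the forecast horizon never grows as new observations arrive.
        y = np.asarray(y, dtype=float)
        forecasts = []
        for t in range(len(y) - n_test, len(y)):
            fitted = ARIMA(y[:t], order=(1, 0, 0)).fit()
            forecasts.append(float(fitted.forecast(steps=1)[0]))
        return np.array(forecasts)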
A related family of updating schemes forms the new forecast from X(t-1) and
Y(t-1), the most recent actual value (input) and the previous forecast (output),
respectively. These types of regressions are often referred to as Distributed Lag
Autoregressive Models, Geometric Distributed Lags, and Adaptive Models in
Expectation, among others.
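A commonly used form of such an adaptive scheme updates the forecast as a weighted combination of the most recent actual value and the previous forecast, Y(t) = αX(t-1) + (1 - α)Y(t-1); the sketch below implements that general form, with the smoothing weight α and the data purely illustrative:

    import numpy as np

    def adaptive_forecast(x, alpha=0.3):
        # Adaptive (geometric distributed lag) scheme: Y(t) = alpha*X(t-1) + (1 - alpha)*Y(t-1).
        x = np.asarray(x, dtype=float)
        y = np.empty_like(x)
        y[0] = x[0]                  # initialize the first forecast at the first observation
        for t in range(1, len(x)):
            y[t] = alpha * x[t - 1] + (1 - alpha) * y[t - 1]
        return y

    print(adaptive_forecast([50.0, 52.0, 49.0, 51.0, 53.0]))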
A model which depends only on the previous outputs of the system is called an
autoregressive model (AR), while a model which depends only on the inputs to
the system is called a moving average model (MA), and of course a model
based on both inputs and outputs is an autoregressive-moving-average model
(ARMA). Note that by definition, the AR model has only poles while the MA
model has only zeros. Deriving the autoregressive model (AR) involves
estimating the coefficients of the model by the method of least squares.
The current value of the series is a linear combination of the p most recent past
values of itself plus an error term, which incorporates everything new in the
series at time t that is not explained by the past values; in symbols,
X(t) = φ1 X(t-1) + φ2 X(t-2) + ... + φp X(t-p) + ε(t). This is like a multiple
regression model, except that the series is regressed not on independent
variables but on its own past values; hence the term "Autoregressive" is used.
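The least-squares idea can be sketched directly in NumPy; in the example below the function name and the simulated AR(1) data are illustrative only:

    import numpy as np

    def fit_ar_least_squares(x, p):
        # Estimate AR(p) coefficients by regressing x(t) on x(t-1), ..., x(t-p) plus an intercept.
        x = np.asarray(x, dtype=float)
        rows = []
        for t in range(p, len(x)):
            past = x[t - p:t][::-1]          # x(t-1), x(t-2), ..., x(t-p)
            rows.append(np.r_[1.0, past])
        X = np.vstack(rows)
        y = x[p:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coef                          # [intercept, phi_1, ..., phi_p]

    rng = np.random.default_rng(3)
    z = np.zeros(300)
    for t in range(1, 300):
        z[t] = 0.7 * z[t - 1] + rng.normal()     # simulated AR(1) data with phi_1 = 0.7
    print(fit_ar_least_squares(z, p=1))          # intercept near 0, phi_1 near 0.7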
The following figures illustrate the behavior of the autocorrelations and the
partial autocorrelations for AR(1) models: the autocorrelations decay
exponentially (alternating in sign when the coefficient is negative), while the
partial autocorrelations cut off after lag 1. Similarly, for AR(2) models, the
autocorrelations decay exponentially or as a damped sine wave, and the partial
autocorrelations cut off after lag 2.
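The same qualitative behavior can be reproduced numerically; the sketch below assumes the statsmodels acf and pacf functions and a simulated AR(2) series with illustrative coefficients:

    import numpy as np
    from statsmodels.tsa.stattools import acf, pacf

    rng = np.random.default_rng(4)
    x = np.zeros(500)
    for t in range(2, 500):
        x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + rng.normal()   # illustrative AR(2) series

    print(np.round(acf(x, nlags=6), 2))    # decays gradually from lag to lag
    print(np.round(pacf(x, nlags=6), 2))   # drops to near zero after lag 2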
By constructing and studying a plot of the data, one notices that the series
drifts above and below a mean of about 50.6. Using the Time Series
Identification Process JavaScript, a glance at the autocorrelation and the partial
autocorrelation confirms that the series is indeed stationary and that a
first-order (p = 1) autoregressive model is a good candidate.
Stationarity Condition: The AR(1) model is stable if the slope coefficient lies
within the open interval (-1, 1), that is:
|φ1| < 1
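As a quick numerical check of this condition, the sketch below (assuming statsmodels; the data are simulated around a mean of about 50.6 to mimic the example above) verifies that the estimated coefficient satisfies |φ1| < 1:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(5)
    y = np.full(250, 50.6)
    for t in range(1, 250):
        y[t] = 50.6 + 0.55 * (y[t - 1] - 50.6) + rng.normal()   # drifts above and below the mean

    fit = ARIMA(y, order=(1, 0, 0)).fit()
    phi1 = fit.arparams[0]
    print("phi_1 =", round(float(phi1), 3), "| stationary:", abs(phi1) < 1)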