Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models
Author(s): G. E. P. Box and David A. Pierce
Source: Journal of the American Statistical Association, Vol. 65, No. 332 (Dec., 1970), pp. 1509-1526
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2284333
Accessed: 28/09/2012 01:07
© Journal of the American Statistical Association
December 1970, Volume 65, Number 332
Theory and Methods Section

Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models
1. INTRODUCTION
An approach to the modeling of stationary and non-stationary time series such as commonly occur in economic situations and control problems is discussed by Box and Jenkins [4, 5], building on the earlier work of several authors beginning with Yule [19] and Wold [17], and involves iterative use of the three-stage process of identification, estimation, and diagnostic checking. Given a discrete time series z_t, z_{t-1}, z_{t-2}, … and using B for the backward shift operator such that Bz_t = z_{t-1}, the general autoregressive-integrated moving average (ARIMA) model of order (p, d, q) discussed in [4, 5] may be written

\phi(B)\nabla^d z_t = \theta(B)a_t \qquad (1.1)

where \phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p and \theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q, {a_t} is a sequence of independent normal deviates with common variance σ_a², to be referred to as "white noise," and where the roots of φ(B) = 0 and θ(B) = 0 lie outside the unit circle.
In other words, if w_t = ∇^d z_t = (1 − B)^d z_t is the dth difference of the series z_t, then w_t is the stationary, invertible, mixed autoregressive (AR)-moving average (MA) process given by

w_t = \sum_{i=1}^{p} \phi_i w_{t-i} - \sum_{j=1}^{q} \theta_j a_{t-j} + a_t,

and permitting d > 0 allows the original series to be (homogeneously) nonstationary.
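The model (1.1) can be sketched numerically. The following is a minimal simulation of the ARMA recursion followed by d-fold integration, assuming Gaussian white noise; the function names are our own illustration, not part of the paper:

```python
import random

def difference(z, d=1):
    """Apply (1 - B)^d: the dth difference of the series z."""
    w = list(z)
    for _ in range(d):
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    return w

def simulate_arima(phi, theta, d, n, seed=0):
    """Simulate w_t = sum_i phi_i w_{t-i} - sum_j theta_j a_{t-j} + a_t,
    then integrate d times to obtain the (nonstationary) series z_t."""
    rng = random.Random(seed)
    w, a = [], []
    for t in range(n):
        a.append(rng.gauss(0.0, 1.0))
        ar = sum(phi[i] * w[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * a[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        w.append(ar - ma + a[t])
    z = w
    for _ in range(d):          # integrate: z_t = z_{t-1} + w_t
        total, out = 0.0, []
        for wt in z:
            total += wt
            out.append(total)
        z = out
    return z
```

Differencing an integrated series recovers (all but the first d values of) the stationary part, so `difference(simulate_arima(...), d)` behaves like the ARMA process w_t itself.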
* G. E. P. Box is professor of statistics, University of Wisconsin. David A. Pierce is on leave from the Department of Statistics, University of Missouri, Columbia, as statistician, Research Department, Federal Reserve Bank of Cleveland. This work was supported jointly by the Air Force Office of Scientific Research under Grant AFOSR-69-1803 and by the U.S. Army Research Office under Grant DA-ARO-D-31-124-G917.
r_k = \frac{\sum_{t=k+1}^{n} a_t a_{t-k}}{\sum_{t=1}^{n} a_t^2} \qquad (1.3)
would for moderate or large n possess a multivariate normal distribution [1]. Also it can readily be shown that the {r_k} are uncorrelated with variances

V(r_k) = \frac{n-k}{n(n+2)} \doteq \frac{1}{n}, \qquad (1.4)

from which it follows in particular that the statistic n(n+2)\sum_{k=1}^{m}(n-k)^{-1} r_k^2 would for large n be distributed as χ² with m degrees of freedom; or, as a further approximation, so would

n \sum_{k=1}^{m} r_k^2. \qquad (1.5)
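As a check on (1.3) and (1.4), the following sketch computes the r_k of simulated white noise and compares their sampling variance with (n − k)/n(n + 2); this is our illustration, not the authors' computation:

```python
import random

def autocorr(a, k):
    """r_k = sum_{t=k+1}^n a_t a_{t-k} / sum_t a_t^2, as in (1.3)."""
    num = sum(a[t] * a[t - k] for t in range(k, len(a)))
    den = sum(x * x for x in a)
    return num / den

def empirical_var_rk(n=100, k=3, reps=2000, seed=2):
    """Monte Carlo variance of r_k over independent white noise series."""
    rng = random.Random(seed)
    rs = []
    for _ in range(reps):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rs.append(autocorr(a, k))
    mean = sum(rs) / reps
    return sum((r - mean) ** 2 for r in rs) / reps

# Theory (1.4): V(r_k) = (n - k) / (n (n + 2)), roughly 1/n.
```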
It is tempting to suppose that these same properties might to a sufficient approximation be enjoyed by the r̂'s from the fitted model; and diagnostic checks based on this supposition were suggested by Box and Jenkins [4] and Box, Jenkins, and Bacon [6]. If this assumption were warranted, approximate standard errors of 1/√n [or more accurate standard errors of √((n−k)/n(n+2))] could be attached to the r̂'s and a quality-control-chart type of approach used, with particular attention being paid to the r̂'s of low order for the indication of possible model inadequacies. Also it might be supposed that Equation (1.5) with r̂'s replacing r's would still be approximately valid, so that large values of this statistic would place the model under suspicion.
It was pointed out by Durbin [10], however, that this approximation is invalid when applied to the residual autocorrelations from a fitted autoregressive model. For example, he showed that r̂_1 calculated from the residuals of a first order autoregressive process could have a much smaller variance than r_1 for white noise.
The present paper therefore considers in some detail the properties of the r̂'s and in particular their covariance matrix, both for AR processes (Sections 2 and 3) and for MA and ARIMA processes (Section 5). This is done with the intention of obtaining a suitable modification to the above diagnostic checking procedures (Sections 4 and 5.3).

The problem of testing fit in time series models has been considered previously by several authors. Quenouille [14]¹ developed a large-sample procedure for AR processes based on their sample partial autocorrelations, which possesses the same degree of accuracy as the present one.² Quenouille's test was subsequently extended [3, 15, 18] to cover MA and mixed models. Whittle [16] proposed tests based on the likelihood ratio and resembling the overfitting method above. The present procedure (a) is a unified method equally applicable to AR, MA, and general ARIMA models, (b) is motivated by the intuitive idea that the residuals from a correct fit should resemble the true errors of the process, and (c) can be used to suggest particular modifications in the model when lack of fit is found [5].
2. DISTRIBUTION OF RESIDUAL AUTOCORRELATIONS FOR THE AUTOREGRESSIVE PROCESS
In this section we obtain the joint large-sample distribution of the residual autocorrelations r̂ = (r̂_1, ⋯, r̂_m)', where r̂_k is given by (1.2), for an autoregressive process. This is done by first setting forth some general properties of AR processes, using these to obtain a set of linear constraints (2.9) satisfied by the {r̂_k}, and then approximating r̂_k by a first order Taylor expansion (2.22) about the white noise autocorrelation r_k. Finally, these results are combined in matrix form to establish a linear relationship (2.27) between r̂ and r analogous to that between the residuals and true errors in a standard regression model, from which the distribution (2.29) of r̂ readily follows. Subsections 2.5-2.7 then discuss examples and applications of this distribution.
¹ See also [11].
² The authors are grateful to a referee for this observation.
2.1 The Autoregressive Process
The general AR process of order p,

\phi(B) y_t = a_t, \qquad (2.1)

where B, φ(B), and {a_t} are as in (1.1), can also be expressed as a moving average of infinite order by writing ψ(B) = φ^{-1}(B) = (1 + ψ_1 B + ψ_2 B² + ⋯) to obtain

y_t = \psi(B) a_t = \sum_{j=0}^{\infty} \psi_j a_{t-j}, \qquad (2.2)

where ψ_0 = 1. By equating coefficients in the relation ψ(B)φ(B) = 1, it is seen that the ψ's and φ's satisfy the relation

\psi_\nu = \phi_1 \psi_{\nu-1} + \cdots + \phi_\nu \psi_0, \quad \nu \le p
\qquad\qquad (2.3)
\psi_\nu = \phi_1 \psi_{\nu-1} + \cdots + \phi_p \psi_{\nu-p}, \quad \nu > p.

Therefore by setting ψ_ν = 0 for ν < 0, we have

\psi_0 = 1; \qquad \phi(B)\psi_\nu = 0, \quad \nu > 0. \qquad (2.4)
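The recursion (2.3)-(2.4) is straightforward to implement; a minimal sketch (the function name is ours):

```python
def psi_weights(phi, m):
    """psi_0 = 1; psi_v = phi_1 psi_{v-1} + ... + phi_{min(v,p)} psi_{v-min(v,p)},
    per (2.3), with psi_v = 0 for v < 0."""
    p = len(phi)
    psi = [1.0]
    for v in range(1, m + 1):
        psi.append(sum(phi[i - 1] * psi[v - i] for i in range(1, min(v, p) + 1)))
    return psi
```

For the first order process, this reproduces the familiar geometric weights ψ_ν = φ^ν.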
Suppose then we have a series {y_t} generated by the model (2.1) or (2.2), where in general y_t = ∇^d z_t can be the dth difference (d = 0, 1, 2, ⋯) of the actual observations. Then for given values φ̂ = (φ̂_1, ⋯, φ̂_p)' of the parameters we can define

\hat{a}_t = \hat{a}_t(\hat{\phi}) = y_t - \hat{\phi}_1 y_{t-1} - \cdots - \hat{\phi}_p y_{t-p} = \hat{\phi}(B) y_t \qquad (2.5)
and the corresponding autocorrelation

\hat{r}_j = \frac{\sum_t \hat{a}_t \hat{a}_{t+j}}{\sum_t \hat{a}_t^2}. \qquad (2.11)
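Computing (2.5) and (2.11) directly, as a sketch (the helper names are ours, not the paper's):

```python
def ar_residuals(y, phi_hat):
    """a_hat_t = y_t - phi_hat_1 y_{t-1} - ... - phi_hat_p y_{t-p}, per (2.5)."""
    p = len(phi_hat)
    return [y[t] - sum(phi_hat[i] * y[t - 1 - i] for i in range(p))
            for t in range(p, len(y))]

def resid_autocorr(a_hat, j):
    """r_hat_j = sum_t a_hat_t a_hat_{t+j} / sum_t a_hat_t^2, per (2.11)."""
    num = sum(a_hat[t] * a_hat[t + j] for t in range(len(a_hat) - j))
    den = sum(x * x for x in a_hat)
    return num / den
```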
Now

\frac{\partial}{\partial \hat{\phi}_j} \Big[\sum_t \hat{a}_t^2\Big] = 0 \quad \text{at the least squares value } \hat{\phi}, \qquad (2.12)

so that

\frac{\partial \hat{r}_k}{\partial \hat{\phi}_j} = \Big[\sum_t \hat{a}_t^2\Big]^{-1} \frac{\partial \hat{c}_k}{\partial \hat{\phi}_j}, \qquad (2.13)

where

\hat{c}_k = \sum_t \hat{a}_t \hat{a}_{t-k} = \sum_t [\hat{\phi}(B)y_t][\hat{\phi}(B)y_{t-k}] = \sum_t \sum_{i=0}^{p} \sum_{l=0}^{p} \hat{\phi}_i \hat{\phi}_l y_{t-i} y_{t-k-l}, \qquad (2.14)

where in (2.14) and below, φ̂_0 = φ_0 = −1. From (2.13) and (2.14) it follows that

\hat{a}_{jk} \equiv \frac{\partial \hat{r}_k}{\partial \hat{\phi}_j} = \frac{\sum_{i=0}^{p} \hat{\phi}_i \big[r(y)_{k-i+j} + r(y)_{k+i-j}\big]}{\sum_{i=0}^{p} \sum_{l=0}^{p} \hat{\phi}_i \hat{\phi}_l \, r(y)_{i-l}} \qquad (2.15)
where

r(y)_\nu = \frac{\sum_t y_t y_{t-\nu}}{\sum_t y_t^2}.

Let us approximate â_jk by replacing φ̂'s and r(y)'s in (2.15) by φ's and ρ's (the theoretical parameters and autocorrelations of the autoregressive process {y_t}) and denote the result by b_jk. That is,

b_{jk} = \frac{\sum_{i=0}^{p} \phi_i \big[\rho_{k-i+j} + \rho_{k+i-j}\big]}{\sum_{i=0}^{p} \sum_{l=0}^{p} \phi_i \phi_l \, \rho_{i-l}}, \qquad (2.16)

which, on using the autocorrelation structure of the AR process, reduces to

b_{jk} = - \frac{\sum_{i=0}^{\infty} \psi_i \rho_{k-j+i}}{\sum_{i=0}^{\infty} \psi_i \rho_i}. \qquad (2.20)
Thus b_jk depends only on (k − j), and we therefore write δ_{k−j} = −b_jk. Then it is straightforward to show that
(a) δ_0 = 1,
(b) δ_ν = 0, ν < 0, and
(c) φ(B)δ_ν = 0, ν > 0.
Comparing (a), (b), and (c) with the corresponding results (2.4) for ψ_ν, we have δ_ν = ψ_ν; that is,

b_{jk} = -\psi_{k-j}, \qquad (2.21)

whence, for k = 1, 2, ⋯, m,

\hat{r}_k \doteq r_k - \sum_{j=1}^{p} \psi_{k-j} (\hat{\phi}_j - \phi_j). \qquad (2.22)
2.4 Representation of r̂ as a Linear Transformation of r

We can now establish a relationship between the residual autocorrelations r̂ and the white noise autocorrelations r. Let
X = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ \psi_1 & 1 & \cdots & 0 \\ \psi_2 & \psi_1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \psi_{m-1} & \psi_{m-2} & \cdots & \psi_{m-p} \end{bmatrix} \qquad (2.23)
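For a first order fit, X in (2.23) is a single column of ψ-weights, so the projection Q = X(X'X)^{-1}X' can be formed directly, and the variance (1/n)(1 − q_kk) of r̂_k reproduces the φ²/n result quoted in Section 3. A minimal sketch under these assumptions (function name ours):

```python
def q_matrix_ar1(phi, m):
    """For p = 1, X = (1, phi, phi^2, ..., phi^{m-1})', so
    q_ij = phi^{i-1} phi^{j-1} / sum_l phi^{2(l-1)}."""
    x = [phi ** i for i in range(m)]
    s = sum(v * v for v in x)
    return [[xi * xj / s for xj in x] for xi in x]

phi, m, n = 0.5, 20, 200
q = q_matrix_ar1(phi, m)
var_r1 = (1.0 - q[0][0]) / n   # close to phi**2 / n = 0.00125 here
```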
X'X = \begin{bmatrix} \sum \psi_j^2 & \sum \psi_j \psi_{j-1} & \cdots & \sum \psi_j \psi_{j-p+1} \\ \sum \psi_j \psi_{j-1} & \sum \psi_j^2 & \cdots & \sum \psi_j \psi_{j-p+2} \\ \vdots & & & \vdots \\ \sum \psi_j \psi_{j-p+1} & \cdots & & \sum \psi_j^2 \end{bmatrix} = \frac{\sigma_y^2}{\sigma_a^2} \begin{bmatrix} 1 & \rho_1 & \cdots & \rho_{p-1} \\ \rho_1 & 1 & \cdots & \rho_{p-2} \\ \vdots & & & \vdots \\ \rho_{p-1} & \rho_{p-2} & \cdots & 1 \end{bmatrix} \qquad (2.30)
which when multiplied by σ_a² is the autocovariance matrix of the process itself. Let c_ij be the (i, j)th element of (X'X)^{-1} (given explicitly in [9]), and similarly q_ij for Q. If ξ_j' = (ψ_{j-1}, ⋯, ψ_{j-p}) denotes the jth row of X, then

q_{ij} = \xi_i'(X'X)^{-1}\xi_j = \sum_{k=1}^{p} \sum_{l=1}^{p} \psi_{i-k} \, c_{kl} \, \psi_{j-l} = -n \operatorname{cov}[\hat{r}_i, \hat{r}_j] \quad \text{if } i \ne j. \qquad (2.31)
Since the elements of each column of X satisfy the recursive relation (2.4), we have φ(B)ξ_j = 0, and hence

\phi(B) q_{ij} = 0, \qquad (2.32)

where in (2.32) B can operate either on i or on j. This establishes an interesting recursive structure in the residual autocorrelation covariance matrix (1/n)(I − Q) and provides an important clue as to how rapidly the covariances die out and the variances approach 1/n. Also, because of this property the entire covariance matrix is determined by specifying the elements q_ij, 1 ≤ i ≤ j ≤ p:

\begin{bmatrix} q_{11} & q_{12} & \cdots & q_{1p} \\ & q_{22} & \cdots & q_{2p} \\ & & \ddots & \vdots \\ & & & q_{pp} \end{bmatrix}
Thus, for example, when p = 2,

q_{11} = 1 - \phi_2^2, \qquad q_{12} = -\phi_1\phi_2(1 + \phi_2), \qquad q_{22} = 1 - \phi_2^2 - \phi_1^2(1 + \phi_2)^2,
The statistic

n \sum_{k=1}^{m} \hat{r}_k^2, \qquad (2.38)

obtained when estimates are substituted for the true parameters φ in the model, will still be distributed as χ², only now with m − p rather than m degrees of freedom. This result is of considerable practical interest because it suggests that an overall test of the type discussed in [4] can in fact be justified when suitable modifications coming from a more careful analysis are applied. Later we consider in more detail the use of this test, along with procedures on individual r̂'s, in diagnostic checking.
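The overall check (2.38) is simple to compute; a sketch follows (the chi-square comparison itself is left to tables, and the helper names are ours). The second function is the refinement suggested by the exact white-noise variances (1.4):

```python
def portmanteau_q(r_hat, n):
    """Q = n * sum_{k=1}^m r_hat_k^2, per (2.38); for an AR(p) fit, compare
    with the chi-square distribution on m - p degrees of freedom."""
    return n * sum(rk * rk for rk in r_hat)

def weighted_q(r_hat, n):
    """n(n+2) * sum_k (n-k)^{-1} r_k^2, the weighted form suggested by (1.4)."""
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r_hat, start=1))
```

Here `r_hat` is the list (r̂_1, …, r̂_m) of residual autocorrelations and n the series length; the weighted form always exceeds the unweighted one slightly.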
and sample correlations

\hat{\rho}[\hat{r}_k, \hat{r}_l] = c_{kl} / \sqrt{c_{kk} c_{ll}}. \qquad (3.3)
The results of this Monte Carlo sampling are set out in detail in [8] and in general confirm the adequacy of the approximations used. As an example of these calculations, Table 1 compares the empirical variances (3.2) of r̂_k and correlations (3.3) of (r̂_j, r̂_k) with their theoretical counterparts obtained from (2.35). Allowing for the sampling error of the Monte Carlo estimates themselves, there is good agreement between the two sets of quantities, a phenomenon which occurred also for the other values of φ considered.
Since the large-sample variance φ²/n of r̂_1 departs the most from the common variance of 1/n for white noise autocorrelations, an examination of the empirical behavior of this quantity is of particular interest. Thus Figure 1 shows the sample variance of r̂_1 for φ = 0, ±.1, ±.3, ±.5, ±.7, ±.9 in relation to the parabola V(r̂_1) = φ²/n, with reasonable agreement between the two. (The coefficient of variation of the sample variance of r̂_k for φ ≠ 0 is approximately √(2/s) ≐ 1/5, independent of k and n; at φ = 0, V(r̂_1) = O(1/n²).)
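The Monte Carlo behavior just described can be reproduced in a few lines; a rough sketch, assuming simple least squares estimation of φ (all names are ours):

```python
import random

def fit_ar1(y):
    """Least squares phi_hat = sum y_t y_{t-1} / sum y_{t-1}^2."""
    return (sum(y[t] * y[t - 1] for t in range(1, len(y)))
            / sum(y[t - 1] ** 2 for t in range(1, len(y))))

def sample_var_r1(phi, n=200, reps=400, seed=3):
    """Sample variance of the lag-one residual autocorrelation r_hat_1
    over Monte Carlo replications of a fitted AR(1)."""
    rng = random.Random(seed)
    r1s = []
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0)]
        for _ in range(n - 1):
            y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
        ph = fit_ar1(y)
        a = [y[t] - ph * y[t - 1] for t in range(1, n)]
        num = sum(a[t] * a[t - 1] for t in range(1, len(a)))
        r1s.append(num / sum(x * x for x in a))
    mean = sum(r1s) / reps
    return sum((r - mean) ** 2 for r in r1s) / reps

# theory: approximately phi**2 / n for phi away from zero
```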
Table 1. THEORETICAL [AS IN (2.35)] AND EMPIRICAL (FROM MONTE CARLO SAMPLING) VARIANCES AND CORRELATIONS OF SAMPLE RESIDUAL AUTOCORRELATIONS FROM FIRST-ORDER AR PROCESS WITH φ = .5
[The body of Table 1 (lags k = 1, …, 6) and Figure 1, which plots the sample variance of r̂_1 against the parabola V(r̂_1) = φ²/n, are not recoverable from this scan.]
autoregressive and after fitting φ̂ = .5. Suppose also that r̂_1 = .10. Now the standard error of r_1 for white noise is 1/√n = .07, so that r̂_1 is well within the limits in Figure 2. Therefore if we erroneously regarded these as limits on r̂_1 we would probably not conclude that this model was inadequate. However, if the true process actually were first order autoregressive (say with φ = .5), the standard error of r̂_1 would be |φ|/√n = .035; since the observed r̂_1 = .10 is almost three times this value, we should be very suspicious of the adequacy of this fit.
The situation is further complicated by the existence of rather high correlations between the r̂'s, especially between those of small lags. For the first order process, the most serious correlation is

\rho[\hat{r}_1, \hat{r}_2] = -\frac{1 - \phi^2}{\big[1 - \phi^2 + \phi^4\big]^{1/2}},

which for small φ > 0 is close to −1.
\hat{a}_t^{AR} = \hat{a}_t^{AR}(\hat{\pi}) = \hat{\pi}(B) x_t = -\sum_{i=0} \hat{\pi}_i x_{t-i}, \qquad (5.4)

where π̂_0 = −1, and now also

\hat{a}_t^* = \hat{a}_t^*(\hat{\phi}, \hat{\theta}) = \hat{\theta}^{-1}(B)\hat{\phi}(B) w_t = \Big[\sum_{j=0} \hat{\theta}_j B^j\Big]^{-1} \Big[\sum_{i=0} \hat{\phi}_i B^i\Big] w_t, \qquad (5.5)

where φ̂_0 = θ̂_0 = −1. We will expand these quantities about the true parameter values and go through a least squares estimation in each case which is analogous to writing the linear regression model y = Xβ + ε as

e = y - X\bar{\beta} = X(\beta - \bar{\beta}) + \epsilon = X\delta + \epsilon, \qquad (5.6)

for fixed β̄, and then performing the regression directly on e rather than on y. The equality of the residuals in the two cases depends heavily on the fact that the derivatives in each expansion involve the same autoregressive variable x_t.
Thus

\frac{\partial \hat{a}_t^{AR}}{\partial \hat{\pi}_j} = -x_{t-j}, \quad 1 \le j \le p + q, \text{ irrespective of } \hat{\pi};

\frac{\partial \hat{a}_t^*}{\partial \hat{\theta}_j} = \hat{\phi}(B)\hat{\theta}^{-2}(B) w_{t-j}, \quad 1 \le j \le q,
a^* = a + X(\theta - \bar{\theta}),

then the sum of squares

a^{*\prime} a^* = \sum_t \hat{a}_t^{*2}

is minimized by the least squares value; that is,

\hat{\theta} - \bar{\theta} = (X'X)^{-1} X' a^*. \qquad (5.16)

Then by setting ā = â in (5.15) and (5.16), we have from (5.14) the important equality

\hat{\theta} - \theta = (X'X)^{-1} X' a = \hat{\pi} - \pi, \qquad (5.17)

and finally by setting "¯" = "^" in (5.12) and (5.13), it follows from (5.17) that, to O_p(1/n),

\hat{a}^{AR} = a + X(\pi - \hat{\pi}) = a + X(\theta - \hat{\theta}) = \hat{a}^*, \qquad (5.18)

and thus (to the same order) r̂^{AR} = r̂*, as we set out to show.
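The agreement illustrated by Table 2 can be sketched numerically: drive an AR(1) and an MA(1) series with the same white noise, fit each (least squares for the AR, a conditional sum-of-squares grid search for the MA — our simplification, not the authors' estimation method), and compare the lag-one residual autocorrelations:

```python
import random

def ma1_residuals(w, theta):
    """Invert w_t = a_t - theta a_{t-1}: a_hat_t = w_t + theta a_hat_{t-1}."""
    a_hat, prev = [], 0.0
    for wt in w:
        prev = wt + theta * prev
        a_hat.append(prev)
    return a_hat

def fit_ar1(y):
    return (sum(y[t] * y[t - 1] for t in range(1, len(y)))
            / sum(y[t - 1] ** 2 for t in range(1, len(y))))

def fit_ma1(w):
    """Grid search for theta minimizing the conditional sum of squares."""
    grid = [t / 100.0 for t in range(-95, 96)]
    return min(grid, key=lambda th: sum(x * x for x in ma1_residuals(w, th)))

def r1(a):
    return sum(a[t] * a[t - 1] for t in range(1, len(a))) / sum(x * x for x in a)

rng = random.Random(5)
n, phi, theta = 300, 0.5, 0.5
a = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
y = [a[1]]
for t in range(2, n + 1):
    y.append(phi * y[-1] + a[t])                           # AR(1) driven by a_t
w = [a[t] - theta * a[t - 1] for t in range(1, n + 1)]     # MA(1) from the same a_t

ph = fit_ar1(y)
res_ar = [y[t] - ph * y[t - 1] for t in range(1, n)]
res_ma = ma1_residuals(w, fit_ma1(w))
# r1(res_ar) and r1(res_ma) nearly coincide, echoing Table 2
```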
Table 2. RESIDUAL CORRELATIONS FROM FIRST ORDER AR AND MA TIME SERIES GENERATED FROM SAME WHITE NOISE (n = 200)