Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models
Author(s): G. E. P. Box and David A. Pierce
Source: Journal of the American Statistical Association, Vol. 65, No. 332 (Dec., 1970), pp. 1509-
Published by: American Statistical Association
© Journal of the American Statistical Association
December 1970, Volume 65, Number 332
Theory and Methods Section



Many statistical models, and in particular autoregressive-moving average time series models, can be regarded as means of transforming the data to white noise, that is, to an uncorrelated sequence of errors. If the parameters are known exactly, this random sequence can be computed directly from the observations; when this calculation is made with estimates substituted for the true parameter values, the resulting sequence is referred to as the "residuals," which can be regarded as estimates of the errors.
If the appropriate model has been chosen, there will be zero autocorrelation in the errors. In checking adequacy of fit it is therefore logical to study the sample autocorrelation function of the residuals. For large samples the residuals from a correctly fitted model resemble very closely the true errors of the process; however, care is needed in interpreting the serial correlations of the residuals. It is shown here that the residual autocorrelations are to a close approximation representable as a singular linear transformation of the autocorrelations of the errors so that they possess a singular normal distribution. Failing to allow for this results in a tendency to overlook evidence of lack of fit. Tests of fit and diagnostic checks are devised which take these facts into account.

An approach to the modeling of stationary and non-stationary time series such as commonly occur in economic situations and control problems is discussed by Box and Jenkins [4, 5], building on the earlier work of several authors beginning with Yule [19] and Wold [17], and involves iterative use of the three-stage process of identification, estimation, and diagnostic checking. Given a discrete time series $z_t, z_{t-1}, z_{t-2}, \ldots$ and using $B$ for the backward shift operator such that $Bz_t = z_{t-1}$, the general autoregressive-integrated moving average (ARIMA) model of order $(p, d, q)$ discussed in [4, 5] may be written
$$\phi(B)\nabla^d z_t = \theta(B)a_t \tag{1.1}$$
where $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, $\{a_t\}$ is a sequence of independent normal deviates with common variance $\sigma_a^2$, to be referred to as "white noise," and where the roots of $\phi(B) = 0$ and $\theta(B) = 0$ lie outside the unit circle. In other words, if $w_t = \nabla^d z_t = (1 - B)^d z_t$ is the $d$th difference of the series $z_t$, then $w_t$ is the stationary, invertible, mixed autoregressive (AR)-moving average (MA) process given by
$$w_t = \sum_{i=1}^{p} \phi_i w_{t-i} - \sum_{j=1}^{q} \theta_j a_{t-j} + a_t,$$
where $d > 0$ allows the original series to be (homogeneously) non-stationary. In some instances the model (1.1) will be appropriate after a suitable transformation is made on $z$; in others $z$ may represent the noise structure after allowing for some systematic model.

* G. E. P. Box is professor of statistics, University of Wisconsin. David A. Pierce is on leave from the Department of Statistics, University of Missouri, Columbia, as statistician, Research Department, Federal Reserve Bank of Cleveland. This work was supported jointly by the Air Force Office of Scientific Research under Grant AFOSR-69-1803 and by the U. S. Army Research Office under Grant DA-ARO-D-31-124-G917.
This general class of models is too rich to allow immediate fitting to a particular sample series $\{z_t\} = z_1, z_2, \ldots, z_n$, and the following strategy is therefore used:
1. A process of identification is used to find a smaller subclass of models worth considering to represent the stochastic process.
2. A model in this subclass is fitted by efficient statistical methods.
3. An examination of the adequacy of the fit is made.
The object of the third or diagnostic checking stage is not merely to determine whether there is evidence of lack of fit but also to suggest ways in which the model may be modified when this is necessary. Two basic methods for doing this are suggested:
Overfitting. The model may be deliberately overparameterized in a way it is feared may be needed and in a manner such that the entertained model is obtained by setting certain parameters in the more general model at fixed values, usually zero. One can then check the adequacy of the original model by fitting the more general model and considering whether or not the additional parameters could reasonably take on the specified values appropriate to the simpler model.
Diagnostic checks applied to the residuals. The method of overfitting is most useful where the nature of the alternative feared model is known. Unfortunately, this information may not always be available, and less powerful but more general techniques are needed to indicate the way in which a particular model might be wrong. It is natural to consider the stochastic properties of the residuals $\hat a = (\hat a_1, \hat a_2, \ldots, \hat a_n)'$ calculated from the sample series using the model (1.1) with estimates $\hat\phi_1, \hat\phi_2, \ldots, \hat\phi_p$; $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_q$ substituted for the parameters. In particular their autocorrelation function
$$\hat r_k = \frac{\sum \hat a_t \hat a_{t-k}}{\sum \hat a_t^2} \tag{1.2}$$
may be studied.
Now if the model were appropriate and the $a$'s for the particular sample series were calculated using the true parameter values, then these $a$'s would be uncorrelated random deviates, and their first $m$ sample autocorrelations $r = (r_1, r_2, \ldots, r_m)'$, where $m$ is small relative to $n$ and
$$r_k = \frac{\sum a_t a_{t-k}}{\sum a_t^2}, \tag{1.3}$$
would for moderate or large $n$ possess a multivariate normal distribution [1]. Also it can readily be shown that the $\{r_k\}$ are uncorrelated with variances
$$V(r_k) = \frac{n-k}{n(n+2)} \approx \frac{1}{n}, \tag{1.4}$$
from which it follows in particular that the statistic $n(n+2)\sum_{k=1}^{m}(n-k)^{-1} r_k^2$ would for large $n$ be distributed as $\chi^2$ with $m$ degrees of freedom; or as a further approximation,
$$n\sum_{k=1}^{m} r_k^2 \sim \chi^2_m. \tag{1.5}$$
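The quantities in (1.3)-(1.5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the seed, and the use of `random.gauss` to produce white noise are choices made here, not part of the paper.

```python
import random

def autocorrelations(a, m):
    """Return [r_1, ..., r_m] with r_k = sum_t a_t a_{t-k} / sum_t a_t^2, as in (1.3)."""
    denom = sum(x * x for x in a)
    return [sum(a[t] * a[t - k] for t in range(k, len(a))) / denom
            for k in range(1, m + 1)]

def variance_rk(n, k):
    """Variance (n - k) / (n (n + 2)) of r_k for white noise, as in (1.4)."""
    return (n - k) / (n * (n + 2))

def white_noise_statistics(a, m):
    """Return the weighted statistic n(n+2) sum r_k^2 / (n-k) and the cruder n sum r_k^2."""
    n = len(a)
    r = autocorrelations(a, m)
    weighted = n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))
    crude = n * sum(rk * rk for rk in r)
    return weighted, crude

rng = random.Random(0)  # arbitrary seed, assumed only for reproducibility
a = [rng.gauss(0.0, 1.0) for _ in range(200)]
weighted, crude = white_noise_statistics(a, 20)
print(weighted, crude)  # both are referred to chi-square with m = 20 degrees of freedom
```

Because $n(n+2)/(n-k) > n$ for every lag, the weighted statistic always exceeds the cruder one.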
It is tempting to suppose that these same properties might to a sufficient approximation be enjoyed by the $\hat r$'s from the fitted model; and diagnostic checks based on this supposition were suggested by Box and Jenkins [4] and Box, Jenkins, and Bacon [6]. If this assumption were warranted, approximate standard errors of $1/\sqrt{n}$ [or more accurate standard errors of $\sqrt{(n-k)/n(n+2)}$] could be attached to the $\hat r$'s and a quality-control-chart type of approach used, with particular attention being paid to the $\hat r$'s of low order for the indication of possible model inadequacies. Also it might be supposed that Equation (1.5) with $\hat r$'s replacing $r$'s would still be approximately valid, so that large values of this statistic would place the model under suspicion.
It was pointed out by Durbin [10], however, that this approximation is invalid when applied to the residual autocorrelations from a fitted autoregressive model. For example, he showed that $\hat r_1$ calculated from the residuals of a first order autoregressive process could have a much smaller variance than $r_1$ for white noise.
The present paper therefore considers in some detail the properties of the $\hat r$'s and in particular their covariance matrix, both for AR processes (Sections 2 and 3) and for MA and ARIMA processes (Section 5). This is done with the intention of obtaining a suitable modification to the above diagnostic checking procedures (Sections 4 and 5.3).
The problem of testing fit in time series models has been considered previously by several authors. Quenouille [14]¹ developed a large-sample procedure for AR processes based on their sample partial autocorrelations, which possesses the same degree of accuracy as the present one.² Quenouille's test was subsequently extended [3, 15, 18] to cover MA and mixed models. Whittle [16] proposed tests based on the likelihood ratio and resembling the overfitting method above. The present procedure (a) is a unified method equally applicable to AR, MA, and general ARIMA models, (b) is motivated by the intuitive idea that the residuals from a correct fit should resemble the true errors of the process, and (c) can be used to suggest particular modifications in the model when lack of fit is found [5].

In this section we obtain the joint large-sample distribution of the residual autocorrelations $\hat r = (\hat r_1, \ldots, \hat r_m)'$, where $\hat r_k$ is given by (1.2), for an autoregressive process. This is done by first setting forth some general properties of AR processes, using these to obtain a set of linear constraints (2.9) satisfied by the $\{\hat r_k\}$, and then approximating $\hat r_k$ by a first order Taylor expansion (2.22) about the white noise autocorrelation $r_k$. Finally, these results are combined in matrix form to establish a linear relationship (2.27) between $\hat r$ and $r$ analogous to that between the residuals and true errors in a standard regression model, from which the distribution (2.29) of $\hat r$ readily follows. Subsections 2.5-2.7 then discuss examples and applications of this distribution.
¹ See also [11].
² The authors are grateful to a referee for this observation.
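2.1 The Autoregressive Process
The general AR process of order $p$,
$$\phi(B)y_t = a_t, \tag{2.1}$$
where $B$, $\phi(B)$, and $\{a_t\}$ are as in (1.1), can also be expressed as a moving average of infinite order by writing $\psi(B) = \phi^{-1}(B) = (1 + \psi_1 B + \psi_2 B^2 + \cdots)$ to obtain
$$y_t = \psi(B)a_t = \sum_{j=0}^{\infty} \psi_j a_{t-j}, \tag{2.2}$$
where $\psi_0 = 1$. By equating coefficients in the relation $\psi(B)\phi(B) = 1$, it is seen that the $\psi$'s and $\phi$'s satisfy the relation
$$\psi_\nu = \begin{cases} \phi_1\psi_{\nu-1} + \cdots + \phi_{\nu-1}\psi_1 + \phi_\nu, & \nu \le p \\ \phi_1\psi_{\nu-1} + \cdots + \phi_p\psi_{\nu-p}, & \nu > p. \end{cases} \tag{2.3}$$
Therefore by setting $\psi_\nu = 0$ for $\nu < 0$, we have
$$\psi_0 = 1; \qquad \phi(B)\psi_\nu = 0, \quad \nu \ne 0. \tag{2.4}$$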
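The recursion (2.3)-(2.4) translates directly into code. The sketch below is illustrative only; the function name `psi_weights` and its argument layout are assumptions made here.

```python
def psi_weights(phi, m):
    """Return [psi_0, ..., psi_{m-1}] for phi(B) = 1 - phi_1 B - ... - phi_p B^p.

    Uses psi_0 = 1 and psi_v = phi_1 psi_{v-1} + ... + phi_p psi_{v-p},
    with psi_u taken as zero for u < 0, as in (2.3)-(2.4).
    """
    p = len(phi)
    psi = [1.0]
    for v in range(1, m):
        psi.append(sum(phi[i] * psi[v - 1 - i] for i in range(min(p, v))))
    return psi

print(psi_weights([0.5], 4))        # first order process: powers of phi
print(psi_weights([0.5, 0.25], 3))  # second order process
```

For a first order process the weights reduce to $\psi_j = \phi^j$, which is the form used later in (2.34).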
Suppose then we have a series $\{y_t\}$ generated by the model (2.1) or (2.2), where in general $y_t = \nabla^d z_t$ can be the $d$th difference ($d = 0, 1, 2, \ldots$) of the actual observations. Then for given values $\dot\phi = (\dot\phi_1, \ldots, \dot\phi_p)'$ of the parameters we can define
$$\dot a_t = a_t(\dot\phi) = \dot\phi(B)y_t = y_t - \dot\phi_1 y_{t-1} - \cdots - \dot\phi_p y_{t-p} \tag{2.5}$$
and the corresponding autocorrelation
$$\dot r_k = r_k(\dot\phi) = \frac{\sum \dot a_t \dot a_{t-k}}{\sum \dot a_t^2}. \tag{2.6}$$
Thus, in particular,
1. $a_t(\phi) = a_t$ as in (2.1), (2.2);
2. $a_t(\hat\phi) = \hat a_t$ are the residuals when (2.1) is fitted and least squares estimates $\hat\phi$ obtained; and
3. $r_k(\hat\phi)$ and $r_k(\phi)$ are respectively the residual and white noise autocorrelations (1.2) and (1.3).
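2.2 Linear Constraints on the $\hat r$'s
It is known that the residuals $\{\hat a_t\}$ above satisfy the orthogonality conditions
$$\sum_{t=p+1}^{n} \hat a_t y_{t-j} = 0, \qquad 1 \le j \le p. \tag{2.7}$$
Therefore if we let
$$\hat\psi(B) = \hat\phi^{-1}(B) = (1 - \hat\phi_1 B - \cdots - \hat\phi_p B^p)^{-1}, \tag{2.8}$$
then $y_t = \hat\psi(B)\hat a_t$, and from (2.7) we have, after dividing through by $\sum \hat a_t^2$,
$$\begin{aligned} 0 &= \sum_t \sum_k \hat\psi_k \hat a_t \hat a_{t-k-j} \\ &= \sum_k \hat\psi_k \hat r_{k+j} \\ &= \sum_k \psi_k \hat r_{k+j} + O_p(1/n), \end{aligned} \tag{2.9}$$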
where the symbol $O_p$ introduced in (2.9) denotes "order in probability" as defined in [13].
In leading up to (2.9) we have presumably summed an infinite number of autocorrelations from a finite series. However since $\{y_t\}$ is stationary we have $\psi_k \to 0$ as $k$ becomes large; and unless $\phi$ is extremely close to the boundary of the stationarity region, this dying off of $\{\psi_k\}$ is fast so that the summation can generally be stopped at a value of $k$ much less than $n$. More precisely, we are assuming that $n$ is larger than a fixed number $N$ and for such $n$ there exists a sequence of numbers $m_n$ such that
(a) all $\psi_j$ where $j > m_n - p$ are of order $1/\sqrt{n}$;
(b) the ratio $m_n/n$ is itself of order $1/\sqrt{n}$.
Then in (2.9) and in all following discussion the error in stopping the summations at $k = m$ (we write $m$ for $m_n$ in the sequel) can to the present degree of approximation be ignored; and (b) also ensures that "end effects" (such as there being only $n - k$ terms summed in the numerator of $r_k$ compared with $n$ terms in the denominator) can also be neglected.
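The orthogonality condition (2.7) and the resulting constraint (2.9) can be checked numerically for a first order process, where $\hat\psi_k = \hat\phi^k$. The following is a sketch under assumptions: the simulation scheme, the seed, the burn-in length, and the helper names are illustrative choices and do not come from the paper.

```python
import random

def simulate_ar1(phi, n, rng, burn=200):
    """Generate n values of y_t = phi * y_{t-1} + a_t after discarding a burn-in."""
    y, out = 0.0, []
    for t in range(n + burn):
        y = phi * y + rng.gauss(0.0, 1.0)
        if t >= burn:
            out.append(y)
    return out

def fit_ar1(y):
    """Least squares estimate of phi from regressing y_t on y_{t-1}."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

rng = random.Random(1)  # arbitrary seed
y = simulate_ar1(0.5, 200, rng)
phi_hat = fit_ar1(y)
resid = [y[t] - phi_hat * y[t - 1] for t in range(1, len(y))]

# (2.7): the residuals are orthogonal to the lagged series
orth = sum(resid[t - 1] * y[t - 1] for t in range(1, len(y)))

# (2.9) with j = 1: sum_k psi_hat_k * r_hat_{k+1}, truncated at m terms
m = 20
denom = sum(e * e for e in resid)
r_hat = [sum(resid[t] * resid[t - k] for t in range(k, len(resid))) / denom
         for k in range(1, m + 1)]
constraint = sum(phi_hat ** k * r_hat[k] for k in range(m))
print(orth, constraint)
```

The first quantity vanishes up to rounding error by construction of the least squares estimate; the second is small but not exactly zero because of end effects and truncation at $m$.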
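2.3 Linear Expansion of $\hat r_k$ about $r_k$
The root mean square error of $\hat\phi_j$, $1 \le j \le p$, defined by $\sqrt{E(\hat\phi_j - \phi_j)^2}$, is of order $1/\sqrt{n}$, and we can therefore approximate $\hat r_k$ by a first order Taylor expansion about $\dot\phi = \phi$ (evaluating the derivatives, however, at $\hat\phi$ rather than $\phi$ in order to obtain the simplification (2.12) below). Thus
$$\hat r_k = r_k + \sum_{j=1}^{p} (\phi_j - \hat\phi_j)\hat\delta_{jk} + O_p(1/n), \tag{2.10}$$
where
$$\hat\delta_{jk} = -\left.\frac{\partial \dot r_k}{\partial \dot\phi_j}\right|_{\dot\phi = \hat\phi}. \tag{2.11}$$
Now
$$\frac{\partial}{\partial \dot\phi_j}\Big[\sum \dot a_t^2\Big] = 0 \quad \text{at } \dot\phi = \hat\phi, \tag{2.12}$$
so that
$$\hat\delta_{jk} = -\Big[\sum \hat a_t^2\Big]^{-1} \left.\frac{\partial \dot c_k}{\partial \dot\phi_j}\right|_{\dot\phi = \hat\phi}, \tag{2.13}$$
where
$$\dot c_k = \sum \dot a_t \dot a_{t-k} = \sum [\dot\phi(B)y_t][\dot\phi(B)y_{t-k}] = \sum_t \sum_{i=0}^{p}\sum_{j=0}^{p} \dot\phi_i \dot\phi_j y_{t-i} y_{t-k-j}, \tag{2.14}$$
where in (2.14) and below, $\dot\phi_0 = \hat\phi_0 = \phi_0 = -1$. From (2.13) and (2.14) it follows that
$$\hat\delta_{jk} = -\frac{\sum_{i=0}^{p} \hat\phi_i \big[r^{(y)}_{k-i+j} + r^{(y)}_{k+i-j}\big]}{\sum_{i=0}^{p}\sum_{l=0}^{p} \hat\phi_i \hat\phi_l\, r^{(y)}_{i-l}}, \tag{2.15}$$
where
$$r^{(y)}_\nu = \frac{\sum y_t y_{t-\nu}}{\sum y_t^2}.$$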

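Let us approximate $\hat\delta_{jk}$ by replacing $\hat\phi$'s and $r^{(y)}$'s in (2.15) by $\phi$'s and $\rho$'s (the theoretical parameters and autocorrelations of the autoregressive process $\{y_t\}$), and denote the result by $\delta_{jk}$. That is,
$$\delta_{jk} = -\frac{\sum_{i=0}^{p} \phi_i \big[\rho_{k-i+j} + \rho_{k+i-j}\big]}{\sum_{i=0}^{p}\sum_{l=0}^{p} \phi_i \phi_l\, \rho_{i-l}}. \tag{2.16}$$
Now from Bartlett's formula [2, Equation (7)] we have
$$r^{(y)}_k = \rho_k + O_p(1/\sqrt{n}), \tag{2.17}$$
and as in the discussion preceding (2.10), $\hat\phi_j = \phi_j + O_p(1/\sqrt{n})$; thus
$$\hat\delta_{jk} = \delta_{jk} + O_p(1/\sqrt{n}), \tag{2.18}$$
so that equation (2.10) holds when $\hat\delta_{jk}$ is replaced by $\delta_{jk}$.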
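By making use of the recursive relation which is satisfied by the autocorrelations of an autoregressive process, namely
$$\rho_\nu - \phi_1\rho_{\nu-1} - \cdots - \phi_p\rho_{\nu-p} = \phi(B)\rho_\nu = 0, \qquad \nu \ge 1, \tag{2.19}$$
expression (2.16) can be simplified to yield
$$\delta_{jk} = \frac{\sum_{i=0}^{p} \phi_i \rho_{k-j+i}}{\sum_{i=0}^{p} \phi_i \rho_i}. \tag{2.20}$$
Thus $\delta_{jk}$ depends only on $(k - j)$, and we therefore write $\delta_{k-j} = \delta_{jk}$. Then it is straightforward to show that
(a) $\delta_0 = 1$
(b) $\delta_\nu = 0$, $\nu < 0$, and thus
(c) $\phi(B)\delta_\nu = 0$, $\nu \ge 1$.
Comparing (a), (b), and (c) with the corresponding results (2.4) for $\psi_\nu$, we have $\delta_\nu = \psi_\nu$, that is
$$\delta_{jk} = \psi_{k-j}, \tag{2.21}$$
whence, for $k = 1, 2, \ldots, m$,
$$\hat r_k = r_k + \sum_{j=1}^{p} (\phi_j - \hat\phi_j)\psi_{k-j} + O_p(1/n). \tag{2.22}$$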
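The identification $\delta_\nu = \psi_\nu$ in (2.21) can be verified numerically from (2.20) for a second order process, using the theoretical autocorrelations. This is a sketch under assumptions: it is restricted to $p = 2$, where $\rho_1 = \phi_1/(1 - \phi_2)$ is the standard starting value, and the function names are illustrative.

```python
def ar2_rho(phi1, phi2, kmax):
    """Theoretical autocorrelations rho_0, ..., rho_kmax of a stationary AR(2) process."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for k in range(2, kmax + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho

def delta(nu, phi1, phi2):
    """delta_nu from (2.20), with phi_0 = -1 and rho symmetric in its lag."""
    coef = [-1.0, phi1, phi2]
    rho = ar2_rho(phi1, phi2, abs(nu) + 2)
    num = sum(c * rho[abs(nu + i)] for i, c in enumerate(coef))
    den = sum(c * rho[i] for i, c in enumerate(coef))
    return num / den

def psi(nu, phi1, phi2):
    """psi_nu from the recursion (2.3), taken as zero for nu < 0."""
    if nu < 0:
        return 0.0
    w = [1.0, phi1]
    for v in range(2, nu + 1):
        w.append(phi1 * w[v - 1] + phi2 * w[v - 2])
    return w[nu]

for nu in range(-3, 7):
    print(nu, delta(nu, 0.5, 0.25), psi(nu, 0.5, 0.25))
```

The two columns agree to rounding error for every lag shown, including the zeros at negative lags.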

2.4 Representation of $\hat r$ as a Linear Transformation of $r$
We can now establish a relationship between the residual autocorrelations $\hat r$ and the white noise autocorrelations $r$. Let
$$X = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ \psi_1 & 1 & \cdots & 0 \\ \psi_2 & \psi_1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \psi_{m-1} & \psi_{m-2} & \cdots & \psi_{m-p} \end{bmatrix} = [X_1 \mid X_2 \mid \cdots \mid X_p]. \tag{2.23}$$
Then to $O_p(1/n)$ we can write (2.22) in matrix form as
$$\hat r = r + X(\phi - \hat\phi), \tag{2.24}$$
where from (2.9)
$$\hat r' X = 0. \tag{2.25}$$
If we now multiply (2.24) on both sides by
$$Q = X(X'X)^{-1}X', \tag{2.26}$$
then using (2.25) we obtain
$$\hat r = (I - Q)r. \tag{2.27}$$

It is known [1] that $r$ is very nearly normal for $n$ moderately large. The vector of residual autocorrelations is thus approximately a linear transformation of a multi-normal variable and is therefore itself normally distributed. Specifically,
$$r \sim N(0, (1/n)I), \tag{2.28}$$
and hence
$$\hat r \sim N(0, (1/n)[I - Q]). \tag{2.29}$$
Note that the matrix $I - Q$ is idempotent of rank $m - p$, so that the distribution of $\hat r$ has a $p$-dimensional singularity.
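To make (2.23)-(2.29) concrete, the matrix $Q$ can be built from the $\psi$ weights and its projection properties checked. The sketch below uses plain Python lists; the helper names and the small Gauss-Jordan inverse are assumptions introduced for illustration, not a method taken from the paper.

```python
def psi_weights(phi, m):
    """[psi_0, ..., psi_{m-1}] from the recursion (2.3)."""
    psi = [1.0]
    for v in range(1, m):
        psi.append(sum(phi[i] * psi[v - 1 - i] for i in range(min(len(phi), v))))
    return psi

def build_X(phi, m):
    """m x p matrix of (2.23): element (k, j) is psi_{k-j}, zero above the diagonal."""
    psi = psi_weights(phi, m)
    return [[psi[k - j] if k >= j else 0.0 for j in range(len(phi))] for k in range(m)]

def invert(a):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def projection_Q(phi, m):
    """Q = X (X'X)^{-1} X' of (2.26)."""
    X, p = build_X(phi, m), len(phi)
    xtx = [[sum(X[k][i] * X[k][j] for k in range(m)) for j in range(p)] for i in range(p)]
    c = invert(xtx)
    xc = [[sum(X[k][i] * c[i][j] for i in range(p)) for j in range(p)] for k in range(m)]
    return [[sum(xc[k][j] * X[l][j] for j in range(p)) for l in range(m)] for k in range(m)]

Q = projection_Q([0.5, 0.25], 60)
trace = sum(Q[k][k] for k in range(60))
QQ = [[sum(Q[i][k] * Q[k][j] for k in range(60)) for j in range(60)] for i in range(60)]
idem_err = max(abs(QQ[i][j] - Q[i][j]) for i in range(60) for j in range(60))
print(trace, idem_err)  # trace equals p; Q Q - Q vanishes up to rounding
```

The trace of $Q$ equals $p$, so $I - Q$ has rank $m - p$, which is the source of the reduced degrees of freedom discussed later.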
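2.5 Further Consideration of the Covariance Structure of the $\hat r$'s
It is illuminating to examine in greater detail the covariance matrix of $\hat r$, or equivalently the matrix $Q$. The latter matrix is idempotent of rank $p$, and its non-null latent vectors are the columns of $X$. Also,
$$X'X = \begin{bmatrix} \sum\psi_j^2 & \sum\psi_j\psi_{j-1} & \cdots & \sum\psi_j\psi_{j-p+1} \\ \sum\psi_j\psi_{j-1} & \sum\psi_j^2 & \cdots & \sum\psi_j\psi_{j-p+2} \\ \vdots & \vdots & & \vdots \\ \sum\psi_j\psi_{j-p+1} & \sum\psi_j\psi_{j-p+2} & \cdots & \sum\psi_j^2 \end{bmatrix} \approx \frac{\sigma_y^2}{\sigma_a^2}\begin{bmatrix} 1 & \rho_1 & \cdots & \rho_{p-1} \\ \rho_1 & 1 & \cdots & \rho_{p-2} \\ \vdots & \vdots & & \vdots \\ \rho_{p-1} & \rho_{p-2} & \cdots & 1 \end{bmatrix}, \tag{2.30}$$
which when multiplied by $\sigma_a^2$ is the autocovariance matrix of the process itself. Let $c^{ij}$ be the $(ij)$th element of $(X'X)^{-1}$ (given explicitly in [9]), and similarly $q_{ij}$ for $Q$. If $\xi_j = (\psi_{j-1}, \ldots, \psi_{j-p})$ denotes the $j$th row of $X$, then
$$q_{ij} = \xi_i (X'X)^{-1} \xi_j' = \sum_{k=1}^{p}\sum_{l=1}^{p} \psi_{i-k}\, c^{kl}\, \psi_{j-l} = \begin{cases} 1 - nV[\hat r_i] & \text{if } i = j \\ -n\,\mathrm{cov}[\hat r_i, \hat r_j] & \text{if } i \ne j. \end{cases} \tag{2.31}$$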
Since the elements of each column of $X$ satisfy the recursive relation (2.4), we have $\phi(B)\xi_j = 0$, and hence
$$\phi(B)q_{ij} = 0, \tag{2.32}$$
where in (2.32) $B$ can operate either on $i$ or on $j$. This establishes an interesting recursive structure in the residual autocorrelation covariance matrix $(1/n)(I - Q)$ and provides an important clue as to how rapidly the covariances die out and the variances approach $1/n$. Also, because of this property the entire covariance matrix is determined by specifying the elements
$$\begin{matrix} q_{11} & q_{12} & \cdots & q_{1p} \\ & q_{22} & \cdots & q_{2p} \\ & & \ddots & \vdots \\ & & & q_{pp} \end{matrix} \tag{2.33}$$
of $Q$, which are readily obtained by inverting the $X'X$ matrix (2.30).

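2.6 Covariance Matrix of $\hat r$ for first and second order processes
Consider, for example, the first order autoregressive process $y_t = \phi y_{t-1} + a_t$, which in accordance with (2.2) we can write as
$$y_t = (1 - \phi B)^{-1} a_t = \sum_{j=0}^{\infty} \phi^j a_{t-j}. \tag{2.34}$$
For this process, $\psi_j = \phi^j$ and $(X'X)^{-1} = 1 - \phi^2$. From (2.31) the $(ij)$th element of $Q$ is therefore $\phi^{i+j-2}(1 - \phi^2)$, so that approximately the covariance matrix of $\hat r$ is
$$\frac{1}{n}(I - Q) = \frac{1}{n}\begin{bmatrix} \phi^2 & -\phi + \phi^3 & -\phi^2 + \phi^4 & \cdots \\ -\phi + \phi^3 & 1 - \phi^2 + \phi^4 & -\phi^3 + \phi^5 & \cdots \\ -\phi^2 + \phi^4 & -\phi^3 + \phi^5 & 1 - \phi^4 + \phi^6 & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix}. \tag{2.35}$$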

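For the second order process
$$y_t = (1 - \phi_1 B - \phi_2 B^2)^{-1} a_t = \psi(B)a_t, \tag{2.36}$$
we have
$$(X'X)^{-1} = \frac{\sigma_a^2}{\sigma_y^2(1 - \rho_1^2)}\begin{bmatrix} 1 & -\rho_1 \\ -\rho_1 & 1 \end{bmatrix} = \begin{bmatrix} 1 - \phi_2^2 & -\phi_1(1 + \phi_2) \\ -\phi_1(1 + \phi_2) & 1 - \phi_2^2 \end{bmatrix},$$
where $\rho_1 = \phi_1/(1 - \phi_2)$ and $\sigma_a^2/\sigma_y^2 = (1 + \phi_2)[(1 - \phi_2)^2 - \phi_1^2]/(1 - \phi_2)$, so that
$$q_{11} = 1 - \phi_2^2, \qquad q_{12} = -\phi_1\phi_2(1 + \phi_2), \qquad q_{22} = 1 - \phi_2^2 - \phi_1^2(1 + \phi_2)^2,$$
from which $Q$ and the covariance matrix $(1/n)(I - Q)$ of $\hat r$ may be determined using (2.32). In particular,
$$\begin{aligned} V(\hat r_1) &= (1/n)\,\phi_2^2, \\ V(\hat r_2) &= (1/n)\,[\phi_2^2 + \phi_1^2(1 + \phi_2)^2], \text{ and} \\ V(\hat r_k) &= (1/n)\,[1 - \phi_1 q_{k,k-1} - \phi_2 q_{k,k-2}], \qquad k \ge 3. \end{aligned} \tag{2.37}$$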

From these examples we can see a general pattern emerging. As in (2.33) the first $p$ variances and corresponding covariances will be heavily dependent on the parameters $\phi_1, \ldots, \phi_p$ and in general can depart sharply from the corresponding values for white noise autocorrelations, whereas for $k \ge p + 1$ a "1" is introduced into the expression for variances (as in (2.35) and (2.37)), and the recursion (2.32) ensures that as $k$ increases the $\{\hat r_k\}$ behave increasingly like the corresponding $\{r_k\}$ with respect to both their variances and covariances.
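The closed forms (2.35) and (2.37) are easy to evaluate directly. The sketch below, with function names chosen here for illustration, computes $n$ times the variances and covariances for a first order process and the first two variances for a second order process.

```python
import math

def ar1_cov_times_n(i, j, phi):
    """n times the (i, j) element of (1/n)(I - Q) in (2.35), for lags i, j >= 1."""
    q_ij = phi ** (i + j - 2) * (1.0 - phi * phi)
    return (1.0 if i == j else 0.0) - q_ij

def ar1_corr(i, j, phi):
    """Correlation between the residual autocorrelations at lags i and j."""
    return ar1_cov_times_n(i, j, phi) / math.sqrt(
        ar1_cov_times_n(i, i, phi) * ar1_cov_times_n(j, j, phi))

def ar2_first_two_variances_times_n(phi1, phi2):
    """n V(r_hat_1) and n V(r_hat_2) from (2.37)."""
    v1 = phi2 ** 2
    v2 = phi2 ** 2 + phi1 ** 2 * (1.0 + phi2) ** 2
    return v1, v2

for k in range(1, 6):
    print(k, ar1_cov_times_n(k, k, 0.5), ar1_corr(1, k, 0.5))
print(ar2_first_two_variances_times_n(0.5, 0.25))
```

With $\phi = .5$ the first order values are .250, .8125, .953, ... for the variances and $-.832$, $-.384$, ... for the correlations with $\hat r_1$, showing how quickly the white noise values are approached as the lag grows.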
2.7 The distribution of $n\sum \hat r_k^2$
We have remarked earlier that if the fitted model is appropriate and the parameters $\phi$ are exactly known, then the calculated $a_t$'s would be uncorrelated normal deviates, their serial correlations $r$ would be approximately $N(0, (1/n)I)$, and thus $n\sum_1^m r_k^2$ would possess a $\chi^2$ distribution with $m$ degrees of freedom. We now see that if $m$ is taken sufficiently large so that the elements after the $m$th in the latent vectors of $Q$ are essentially zero, then we should expect that to the order of approximation we are here employing, the statistic
$$n\sum_{k=1}^{m} \hat r_k^2 \tag{2.38}$$
obtained when estimates are substituted for the true parameters $\phi$ in the model, will still be distributed as $\chi^2$, only now with $m - p$ rather than $m$ degrees of freedom. This result is of considerable practical interest because it suggests that an overall test of the type discussed in [4] can in fact be justified when suitable modifications coming from a more careful analysis are applied. Later we consider in more detail the use of this test, along with procedures on individual $\hat r$'s, in diagnostic checking.


We have made certain approximations in deriving the distribution of the residual autocorrelations, and it is therefore of interest to investigate this distribution empirically through repeated sampling and to compare the results with (2.29). This was done for the first order AR process for $\phi = 0, \pm.1, \pm.3, \pm.5, \pm.7, \pm.9$. For given $\phi$, $s = 50$ sets of $n = 200$ random normal deviates were generated on the computer using a method described in [7], with separate aggregates of deviates obtained for each parameter value. For the $j$th set a series $\{y_t^{(j)}\}$ was generated using formula (2.34), $\hat\phi^{(j)}$ was estimated, $\{\hat a_t^{(j)}\}$ determined, and the quantities
$$\hat r_k^{(j)} = \frac{\sum \hat a_t^{(j)} \hat a_{t-k}^{(j)}}{\sum \big(\hat a_t^{(j)}\big)^2} \tag{3.1}$$
computed for $1 \le k \le m = 20$, $1 \le j \le s = 50$. This yielded sample variances and covariances
$$C_{kl} = \frac{1}{50}\sum_{j=1}^{50} \hat r_k^{(j)} \hat r_l^{(j)} \tag{3.2}$$
and sample correlations
$$\hat\rho^*_{kl} = C_{kl}/\sqrt{C_{kk}C_{ll}}. \tag{3.3}$$
The results of this Monte Carlo sampling are set out in detail in [8] and in general confirm the adequacy of the approximations used. As an example of these calculations, Table 1 compares the empirical variances (3.2) of $\hat r_k$ and correlations (3.3) of $(\hat r_1, \hat r_k)$ with their theoretical counterparts obtained from (2.35). Allowing for the sampling error of the Monte Carlo estimates themselves, there is good agreement between the two sets of quantities, a phenomenon which occurred also for the other values of $\phi$ considered.
Since the large-sample variance $\phi^2/n$ of $\hat r_1$ departs the most from the common variance of $1/n$ for white noise autocorrelations, an examination of the empirical behavior of this quantity is of particular interest. Thus Figure 1 shows the sample variance of $\hat r_1$ for $\phi = 0, \pm.1, \pm.3, \pm.5, \pm.7, \pm.9$ in relation to the parabola $V(\hat r_1) = \phi^2/n$, with reasonable agreement between the two. (The coefficient of variation of the sample variance of $\hat r_k$ for $\phi \ne 0$ is approximately $\sqrt{2/s} = 1/5$, independent of $k$ and $n$; at $\phi = 0$, $V(\hat r_1) = O(1/n^2)$.)
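A sampling experiment of this kind can be sketched as follows. This is not a reproduction of the original computations: the normal deviates here come from Python's `random.gauss` rather than the method of [7], and the seed, burn-in length, and function names are assumptions made for the illustration.

```python
import random

def residual_autocorr_ar1(phi, n, m, rng, burn=100):
    """Simulate an AR(1) series, fit it by least squares, return [r_hat_1..r_hat_m]."""
    y, prev = [], 0.0
    for t in range(n + burn):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        if t >= burn:
            y.append(prev)
    phi_hat = (sum(y[t] * y[t - 1] for t in range(1, n))
               / sum(y[t - 1] ** 2 for t in range(1, n)))
    a = [y[t] - phi_hat * y[t - 1] for t in range(1, n)]
    denom = sum(e * e for e in a)
    return [sum(a[t] * a[t - k] for t in range(k, len(a))) / denom
            for k in range(1, m + 1)]

def monte_carlo(phi, s=50, n=200, m=20, seed=0):
    """Return the matrix C_kl of (3.2) from s replications."""
    rng = random.Random(seed)
    reps = [residual_autocorr_ar1(phi, n, m, rng) for _ in range(s)]
    return [[sum(r[k] * r[l] for r in reps) / s for l in range(m)] for k in range(m)]

C = monte_carlo(0.5)
n_var_r1 = 200 * C[0][0]  # theory from (2.35): phi^2 = 0.25
n_var_r5 = 200 * C[4][4]  # theory from (2.35): close to 1
print(n_var_r1, n_var_r5)
```

With only $s = 50$ replications each sample variance carries a coefficient of variation of roughly 1/5, so agreement with the theoretical values should be judged accordingly.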


Table 1

        Variance of $\hat r_k$          Correlation between
        (multiplied by n)               $\hat r_1$ and $\hat r_k$
 k    Theoretical   Empirical       Theoretical   Empirical
 1       .250         .244            1.000         1.000
 2       .813         .676            -.832         -.812
 3       .953         .741            -.384         -.301
 4       .988         .864            -.189         -.186
 5       .997        1.240            -.094         -.366
 6       .999         .967            -.047         -.221
 7      1.000         .870            -.023          .083
 8      1.000        1.203            -.012         -.148
 9      1.000         .982            -.006         -.009
10      1.000         .881            -.003         -.080

[Figure 1. Variance of $\hat r_1$: sample variance of $\hat r_1$ plotted against $\phi$ (horizontal axis from -1.0 to 1.0), together with the parabola $V(\hat r_1) = \phi^2/n$.]

There are several additional comparisons which can be made based on certain functions of the $\hat r$'s. Thus we have seen that
$$\sum_{k=1}^{m} \hat\phi^{k-1}\hat r_k = 0, \tag{3.4}$$
and in the course of our derivations we have had to make the approximation
$$\sum_{k=1}^{m} \phi^{k-1}\hat r_k \approx 0. \tag{3.5}$$
Some indication of the validity of this approximation is gained by examining the actual values of $\sum \phi^{k-1}\hat r_k$ from the sampling experiment, which were found to be distributed about zero with a variance of about one-hundredth that which would have been expected from the same linear form in white noise autocorrelations.
Of considerable importance because of its role in diagnostic checking is an examination of the quantity
$$n\sum_{k=1}^{m} \hat r_k^2 = 200\sum_{k=1}^{20} \hat r_k^2, \tag{3.6}$$
which as in (2.38) should possess a $\chi^2$-distribution with $\nu = m - 1 = 19$ degrees of freedom. Such a distribution has a mean and variance of 19 and 38, respectively, with which the Monte Carlo values can be compared. When this was done, the overall or pooled empirical mean was found to be significantly different from 19. This difference is plausible, however, when it is realized that the statistic $n\sum_1^m \hat r_k^2$ possesses a $\chi^2_{m-p}$ distribution only insofar as the white noise autocorrelations $r = (r_1, \ldots, r_m)'$ have a common variance of $1/n$; and from (1.4) it is seen that this approximation overestimates the true variance of a given $r_k$ by a factor of $(n+2)/(n-k)$. In particular, for $n = 200$, $m = 20$, and a typical value of $k = 10$, the actual variance $V(r_k)$ is $190/202 = 94$ percent of the $1/n$ approximation. Since the residual autocorrelations $\hat r$ are by (2.27) a linear transformation of $r$, it is reasonable to expect that a comparable depression of the variances of $\{\hat r_k\}$ would occur, and this would account for the discrepancy between the theoretical and empirical means of the statistic $200\sum_1^{20} \hat r_k^2$ encountered above. (This phenomenon would also explain the tendency for the empirical variances themselves, such as those in Table 1, to take on values averaging about 5 percent lower than those based on the matrix $(1/n)(I - Q)$ of (2.29).)
We have obtained the large sample distribution of the residual autocorrelations $\hat r$ from fitting the correct model to a time series, and we have discussed the ways in which this distribution departs significantly from that of the white noise autocorrelations $r$. It is desirable now to consider the practical implications of these results in examining the adequacy of fit of a model.
First of all it appears that even though the $\hat r$'s have a variance/covariance matrix which can differ very considerably from that of the $r$'s, the statistic $n\sum_{k=1}^{m} \hat r_k^2$ will (since the matrix $I - Q$ is idempotent) still possess a $\chi^2$-distribution, only now with $m - p$ rather than $m$ degrees of freedom. Thus the overall $\chi^2$-test discussed in Section 1 may be justified to the same degree of approximation as before when the number of degrees of freedom is appropriately modified.
However, regarding the "quality-control-chart" procedure, that is the comparison of the $\{\hat r_k\}$ with their standard errors, some modification is clearly needed.
Figure 2 shows the straight-line standard error bands of width $1/\sqrt{n}$ associated with any set of white noise autocorrelations $\{r_k\}$. These stand in marked contrast to the corresponding bands for the residual autocorrelations $\{\hat r_k\}$, derived from their covariance matrix $(1/n)(I - Q)$ and shown in Figure 3 for selected first and second order AR processes. Since it is primarily the $\hat r$'s of small lags that are most useful in revealing model inadequacies, we see that the consequence of treating $\hat r$'s as $r$'s in the diagnostic checking procedure can be a serious underestimation of significance, that is, a failure to detect lack of fit in the model when it exists. Of course, if the model would have been judged inadequate anyway, our conviction in this regard is now strengthened.
[Figure 2. Standard error limits for white noise autocorrelations $r_k$, $k = 1, \ldots, 6$.]

Suppose, for example, that we identify a series of length 200 as first order autoregressive and after fitting obtain $\hat\phi = .5$. Suppose also that $\hat r_1 = .10$. Now the standard error of $r_1$ for white noise is $1/\sqrt{n} = .07$, so that $\hat r_1$ is well within the limits in Figure 2. Therefore if we erroneously regarded these as limits on $\hat r_1$ we would probably not conclude that this model was inadequate. However, if the true process actually were first order autoregressive (say with $\phi = .5$), the standard error of $\hat r_1$ would be $|\phi|/\sqrt{n} = .035$; since the observed $\hat r_1 = .10$ is almost three times this value, we should be very suspicious of the adequacy of this fit.
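The arithmetic of this example can be reproduced in a few lines. The variable names are illustrative; the numbers are those quoted in the text.

```python
import math

n, phi_hat, r1_hat = 200, 0.5, 0.10

se_white = 1.0 / math.sqrt(n)           # standard error if r_hat_1 behaved like white noise r_1
se_resid = abs(phi_hat) / math.sqrt(n)  # large-sample standard error of r_hat_1 for a fitted AR(1)

ratio_white = r1_hat / se_white
ratio_resid = r1_hat / se_resid

print(round(se_white, 4), round(se_resid, 4))        # 0.0707 0.0354
print(round(ratio_white, 2), round(ratio_resid, 2))  # 1.41 2.83
```

Against the white noise limits the observed value is about 1.4 standard errors from zero, whereas against the appropriate limits it is about 2.8.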
The situation is further complicated by the existence of rather high correlations between the $\hat r$'s, especially between those of small lags. For the first order process, the most serious correlation is
$$\rho[\hat r_1, \hat r_2] = -\frac{\phi}{|\phi|}\cdot\frac{1 - \phi^2}{\sqrt{1 - \phi^2 + \phi^4}},$$
which, for example, approaches $-1$ as $\phi \to 0+$ and is still as large as $-.6$ for $\phi = .7$. Correlation among the $\hat r$'s is even more prevalent in second and higher-order processes, where (as for variances) those involving lags up to $k = p$ can be particularly serious. From then on their magnitude is controlled by the recursive relationship (2.32); in particular, the closer $\phi$ is to the boundary of the stationarity region, the slower will be the dying out of $\mathrm{cov}(\hat r_k, \hat r_l)$ or $\rho(\hat r_k, \hat r_l)$, although often in these situations the less serious will the initial correlations $\rho(\hat r_1, \hat r_2)$, $\rho(\hat r_2, \hat r_3)$, $\rho(\hat r_1, \hat r_3)$, etc., tend to be.
We have thus seen that the departure of the distribution of the residual autocorrelations $\hat r$ from that of white noise autocorrelations $r$ is serious enough to warrant some modifications in their use in diagnostic checking. The residual autocorrelation function, however, remains a powerful device for this purpose.

[Figure 3. Standard error limits for residual autocorrelations $\hat r_k$, $k = 1, \ldots, 6$, in four panels: (a) AR(1); (b) AR(1); (c) AR(2); (d) AR(2).]
In obtaining the distribution of $\hat r = (\hat r_1, \ldots, \hat r_m)'$ for the pure autoregressive process in Section 2, considerable use was made of the recursive relation $\phi(B)\rho_k = 0$, which is not satisfied by moving average models $y_t = \theta(B)a_t$, or more generally by mixed models of the form (1.1) with $w_t = \nabla^d z_t$ denoting the stationary $d$th difference.
It is fortunate, therefore, that these models have in common with the pure AR models (2.1) an important property (derived in Section 5.1) because of which the distribution of their residual autocorrelations can be found as an immediate consequence of the autoregressive solution (2.29). This property is that if two time series, (a) the mixed autoregressive-moving average series (1.1), and (b) an autoregressive series
$$\pi(B)x_t = (1 - \pi_1 B - \cdots - \pi_{p+q}B^{p+q})x_t = a_t \tag{5.1}$$
are both generated from the same set of deviates $\{a_t\}$, and moreover
$$\pi(B) = \phi(B)\theta(B), \tag{5.2}$$
then when these models are each fitted by least squares, their residuals, and hence also their residual autocorrelations, will be very nearly the same. Therefore if a mixed model of order $(p, d, q)$ is correctly identified and fitted, its residual autocorrelations for $n$ sufficiently large will be distributed as though the model had been of order $(p + q, d, 0)$ with the relations between the two sets of parameters given by (5.2). In particular the $\psi$'s comprising the $X$-matrix (2.23) for the model (1.1) are the coefficients in $\psi(B) = [\phi(B)\theta(B)]^{-1}$.
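The relation (5.2) amounts to multiplying two polynomials in $B$. A minimal sketch, with function names assumed here for illustration, recovers the coefficients $\pi_1, \ldots, \pi_{p+q}$ of the equivalent autoregressive operator.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists in increasing powers of B."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def equivalent_ar(phi, theta):
    """Return [pi_1, ..., pi_{p+q}] such that pi(B) = phi(B) theta(B), as in (5.2)."""
    phi_poly = [1.0] + [-c for c in phi]      # 1 - phi_1 B - ... - phi_p B^p
    theta_poly = [1.0] + [-c for c in theta]  # 1 - theta_1 B - ... - theta_q B^q
    prod = poly_mul(phi_poly, theta_poly)
    return [-c for c in prod[1:]]             # pi(B) = 1 - pi_1 B - ...

print(equivalent_ar([0.5], [0.3]))  # p = q = 1: pi_1 = phi + theta, pi_2 = -phi * theta
print(equivalent_ar([], [0.5]))     # pure MA(1): pi_1 = theta
```

For $p = q = 1$ this gives $\pi_1 = \phi + \theta$ and $\pi_2 = -\phi\theta$; for a pure first order moving average the equivalent autoregression simply has $\pi_1 = \theta$.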
5.1 Equality of Residuals in AR and ARIMA Models
Let $w_t$ and $x_t$ be as in (1.1) and (5.1); (5.2) then implies
$$w_t = \theta^2(B)x_t. \tag{5.3}$$
As in (2.5), define
$$\dot a_t^{AR} = a_t^{AR}(\dot\pi) = \dot\pi(B)x_t = -\sum_{j=0}^{p+q} \dot\pi_j x_{t-j}, \tag{5.4}$$
where $\dot\pi_0 = -1$, and now also
$$\dot a_t^* = a_t^*(\dot\phi, \dot\theta) = \dot\phi(B)\dot\theta^{-1}(B)w_t = \Big[\sum_{i=0}^{p} \dot\phi_i B^i\Big]\Big[\sum_{j=0}^{q} \dot\theta_j B^j\Big]^{-1} w_t, \tag{5.5}$$
where $\dot\phi_0 = \dot\theta_0 = -1$. We will expand these quantities about the true parameter values and go through a least squares estimation in each case which is analogous to writing the linear regression model $y = X\beta + \epsilon$ as
$$e = y - X\dot\beta = X(\beta - \dot\beta) + \epsilon = X\delta + \epsilon, \tag{5.6}$$
for fixed $\dot\beta$, and then performing the regression directly on $e$ rather than on $y$. The equality of the residuals in the two cases depends heavily on the fact that the derivatives in each expansion involve the same autoregressive variable $x_t$.
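Thus
$$\frac{\partial \dot a_t^{AR}}{\partial \dot\pi_j} = -x_{t-j}, \qquad 1 \le j \le p + q, \text{ irrespective of } \dot\pi;$$
$$\frac{\partial \dot a_t^*}{\partial \dot\phi_j} = -\dot\theta^{-1}(B)w_{t-j}, \quad 1 \le j \le p, \qquad = -\theta(B)x_{t-j} \text{ at } (\dot\phi, \dot\theta) = (\phi, \theta); \text{ and}$$
$$\frac{\partial \dot a_t^*}{\partial \dot\theta_j} = \dot\phi(B)\dot\theta^{-2}(B)w_{t-j}, \quad 1 \le j \le q, \qquad = \phi(B)x_{t-j} \text{ at } (\dot\phi, \dot\theta) = (\phi, \theta).$$
Therefore
$$\dot a_t^{AR} = a_t^{AR} + \sum_{j=1}^{p+q} (\pi_j - \dot\pi_j)x_{t-j} \tag{5.7}$$
and approximately
$$\dot a_t^* = a_t^* + \sum_{i=1}^{p} (\phi_i - \dot\phi_i)\theta(B)x_{t-i} - \sum_{j=1}^{q} (\theta_j - \dot\theta_j)\phi(B)x_{t-j} \tag{5.8}$$
$$= a_t^* + \sum_{i=1}^{p} (\phi_i - \dot\phi_i)x_{t-i} - \sum_{j=1}^{q} (\theta_j - \dot\theta_j)x_{t-j} + \sum_{i=1}^{p}\sum_{j=1}^{q} \big[\phi_i(\theta_j - \dot\theta_j) - \theta_j(\phi_i - \dot\phi_i)\big]x_{t-i-j}$$
$$\approx a_t^* + \sum_{i=1}^{p} (\phi_i - \dot\phi_i)x_{t-i} - \sum_{j=1}^{q} (\theta_j - \dot\theta_j)x_{t-j} + \sum_{i=1}^{p}\sum_{j=1}^{q} \big[\dot\phi_i(\theta_j - \dot\theta_j) - \dot\theta_j(\phi_i - \dot\phi_i)\big]x_{t-i-j} \tag{5.9}$$
$$= a_t^* + \sum_{j=1}^{p+q} (\beta_j - \dot\beta_j)x_{t-j}.$$
Thus letting $\beta = (\beta_1, \ldots, \beta_{p+q})'$ and $\zeta = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)'$, we see that
$$\beta - \dot\beta = A(\zeta - \dot\zeta), \tag{5.10}$$
where $A$ is a $(p+q)$-square matrix whose elements involve $\dot\zeta$ but not the true parameter values $\zeta$. For example, if $p = q = 1$, we would have from (5.9)
$$A = \begin{bmatrix} 1 & -1 \\ -\dot\theta_1 & \dot\phi_1 \end{bmatrix}. \tag{5.11}$$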

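Now equations (5.7) and (5.9) can be written as
$$\dot a^{AR} = a + X(\pi - \dot\pi) \tag{5.12}$$
$$\dot a^* = a + X(\beta - \dot\beta) \tag{5.13}$$
(with $X$ here denoting the matrix whose $(t, j)$th element is $x_{t-j}$), where the error in (5.13) is $O(|\beta - \dot\beta|^2)$, and where we have made use of the fact that, at $\dot\pi = \pi$, $\dot\theta = \theta$, and $\dot\phi = \phi$,
$$a_t^{AR} = a_t^* = a_t. \tag{5.14}$$
Thus in (5.12) the sum of squares
$$\dot a'\dot a = \sum \dot a_t^2 = \sum [a_t^{AR}(\dot\pi)]^2$$
is minimized as a function of $\dot\pi$ when
$$\pi - \hat\pi = -(X'X)^{-1}X'a^{AR}, \tag{5.15}$$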
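while in (5.13) if we write
$$\dot a^* = a + X[A(\zeta - \dot\zeta)] = a + Z(\zeta - \dot\zeta),$$
where $\zeta = (\phi', \theta')'$ as in (5.10) and $Z = XA$, then the sum of squares
$$\dot a'\dot a = \sum \dot a_t^2 = \sum [a_t^*(\dot\zeta)]^2$$
is minimized as a function of $\dot\zeta$ when
$$A^{-1}(\beta - \hat\beta) = \zeta - \hat\zeta = -(Z'Z)^{-1}Z'a^*,$$
that is,
$$\beta - \hat\beta = -(X'X)^{-1}X'a^*. \tag{5.16}$$
Then by setting $a^{AR} = a^* = a$ in (5.15) and (5.16), we have from (5.14) the important result that
$$\pi - \hat\pi = -(X'X)^{-1}X'a = \beta - \hat\beta, \tag{5.17}$$
and finally by setting "$\cdot$" = "$\wedge$" in (5.12) and (5.13), it follows from (5.17) that to $O_p(1/n)$
$$\hat a^{AR} = a + X(\pi - \hat\pi) = a + X(\beta - \hat\beta) = \hat a^*, \tag{5.18}$$
and thus (to the same order) $\hat r^{AR} = \hat r^*$, as we set out to show.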
5.2 Monte Carlo Experiment

The equality (5.18) between the residuals from the autoregressive and mixed models depends on the accuracy of the expansion (5.8), that is, on the extent of linearity in the moving average model, between the true and estimated values $\theta$ and $\hat\theta$. It is therefore worthwhile to confirm this model-duality by generating and fitting pairs of series of the form (1.1) and (5.1) and comparing their residuals, or more to our purpose, their residual autocorrelations. This was done for $p + q = 1$ and $p + q = 2$ for series of length 200. Some indication of the closeness of the agreement is obtained from the few results for first order AR and MA processes shown in Table 2, where it is seen that the residual autocorrelations $\hat r_k^{AR}$ and $\hat r_k^{MA}$ are equal or nearly equal to the second decimal place.

Table 2

                        $\phi = \theta = .1$       $\phi = \theta = .3$       $\phi = \theta = .9$
 k                         AR        MA               AR        MA               AR        MA
 1                       -.029     -.010             .003     -.005            -.048     -.057
 2                        .164      .169             .044      .045             .157      .151
 3                        .096      .099            -.098     -.096             .008      .009
 4                       -.050     -.049             .014      .021            -.126     -.127
 5                       -.003     -.006             .057      .058             .034      .035
 6                       -.143     -.144             .010      .012            -.091     -.090
 7                       -.023     -.026            -.004      .001            -.001     -.000
 8                       -.040     -.041            -.054     -.046            -.038     -.035
 9                        .010      .009             .052      .052            -.004      .000
10                       -.049     -.049            -.065     -.067             .113      .116
$\hat\phi$ or $\hat\theta$    .159      .057             .543      .451             .922      .870

A sampling experiment of the type described in Section 3 was also performed for the first order MA process. The results were very similar, which is to be expected in view of (5.18).
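A comparison of this kind can be sketched for the first order case. The code below is an illustration under assumptions, not the authors' procedure: the moving average model is fitted by a simple grid search on the conditional sum of squares (with the starting shock taken as zero), and the seed, grid spacing, and function names are choices made here.

```python
import random

def autocorr(a, m):
    """Return [r_1, ..., r_m] of the sequence a."""
    denom = sum(x * x for x in a)
    return [sum(a[t] * a[t - k] for t in range(k, len(a))) / denom
            for k in range(1, m + 1)]

def ma1_residuals(w, theta):
    """Conditional residuals a_t = w_t + theta * a_{t-1}, started from a_0 = 0."""
    out, prev = [], 0.0
    for x in w:
        prev = x + theta * prev
        out.append(prev)
    return out

def compare(theta=0.5, n=200, m=10, burn=100, seed=0):
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + burn + 1)]
    # Both series are built from the same shocks:
    # MA(1): w_t = a_t - theta a_{t-1};  AR(1): x_t = theta x_{t-1} + a_t
    w = [shocks[t] - theta * shocks[t - 1] for t in range(1, len(shocks))]
    x, prev = [], 0.0
    for t in range(1, len(shocks)):
        prev = theta * prev + shocks[t]
        x.append(prev)
    w, x = w[burn:], x[burn:]
    # AR(1) fitted by least squares
    pi_hat = (sum(x[t] * x[t - 1] for t in range(1, n))
              / sum(x[t - 1] ** 2 for t in range(1, n)))
    res_ar = [x[t] - pi_hat * x[t - 1] for t in range(1, n)]
    # MA(1) fitted by grid search on the conditional sum of squares
    grid = [i / 1000.0 for i in range(-990, 991)]
    theta_hat = min(grid, key=lambda th: sum(e * e for e in ma1_residuals(w, th)))
    res_ma = ma1_residuals(w, theta_hat)
    return pi_hat, theta_hat, autocorr(res_ar, m), autocorr(res_ma, m)

pi_hat, theta_hat, r_ar, r_ma = compare()
for k, (u, v) in enumerate(zip(r_ar, r_ma), start=1):
    print(k, round(u, 3), round(v, 3))
```

The two columns of residual autocorrelations should track each other closely, in the manner of Table 2, even though the two parameter estimates themselves need not coincide.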
5.3 Conclusions
We have shown above that to a close approximation the residuals from any moving average or mixed autoregressive-moving average process will be the same as those from a suitably chosen autoregressive process. We have further confirmed the adequacy of this approximation by empirical calculation. It follows from this that we need not consider separately these two classes of processes; more precisely,
1. We can immediately use the AR result to write down the variance/covariance matrix of $\hat r$ for any autoregressive-integrated moving average process (1.1) by considering the corresponding variance/covariance matrix of $\hat r$ from the pure AR process
$$\pi(B)x_t = \theta(B)\phi(B)x_t = a_t. \tag{5.19}$$
2. All considerations regarding the use of residual autocorrelations in tests of fit and diagnostic checking discussed in Section 4 for the autoregressive model therefore apply equally to moving average and mixed models.
3. In particular it follows from the above that a "portmanteau" test for the adequacy of any ARIMA process is obtained by referring $n\sum_{k=1}^{m} \hat r_k^2$ to a $\chi^2$ distribution with $\nu$ degrees of freedom, where $\nu = m - p - q$.
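The portmanteau test of item 3 can be sketched as follows. The function names are illustrative, the number of residuals supplied is used for $n$ (an assumption made here), and the chi-square tail probability is computed from the standard recurrence $Q(x; \nu + 2) = Q(x; \nu) + (x/2)^{\nu/2} e^{-x/2}/\Gamma(\nu/2 + 1)$ rather than from any routine named in the paper.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        q, k = math.exp(-x / 2.0), 2              # exact tail for df = 2
    else:
        q, k = math.erfc(math.sqrt(x / 2.0)), 1   # exact tail for df = 1
    while k < df:
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def portmanteau(resid, m, p, q):
    """Return (statistic, degrees of freedom, tail probability) for n * sum r_hat_k^2."""
    n = len(resid)
    denom = sum(e * e for e in resid)
    r = [sum(resid[t] * resid[t - k] for t in range(k, n)) / denom
         for k in range(1, m + 1)]
    stat = n * sum(v * v for v in r)
    df = m - p - q
    return stat, df, chi2_sf(stat, df)

# A deliberately non-random residual sequence, chosen so the result is easy to check by hand
stat, df, pval = portmanteau([1, -1, 1, -1, 1, -1, 1, -1, 1, -1], m=3, p=1, q=0)
print(stat, df, pval)
```

A small tail probability, as in this artificial alternating sequence, places the fitted model under suspicion; the degrees of freedom are reduced by one for each estimated autoregressive or moving average parameter.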
