
Statistics 910, #9

Covariances of ARMA Processes

Overview
1. Review ARMA models: causality and invertibility

2. AR covariance functions

3. MA and ARMA covariance functions

4. Partial autocorrelation function

5. Discussion

Review of ARMA processes


ARMA process A stationary solution {Xt } (or if its mean is not zero,
{Xt − µ}) of the linear difference equation

    Xt − φ1 Xt−1 − · · · − φp Xt−p = wt + θ1 wt−1 + · · · + θq wt−q ,

or, in backshift notation,

    φ(B)Xt = θ(B)wt ,                                    (1)

where wt denotes white noise, wt ∼ WN(0, σ²). Definition 3.5 adds the

• identifiability condition that the polynomials φ(z) and θ(z) have no zeros in common, and the
• normalization condition that φ(0) = θ(0) = 1.

Causal process A stationary process {Xt } is said to be causal if there exists a summable sequence {ψj } (some require square summability (ℓ²), others want more and require absolute summability (ℓ¹)) such that {Xt } has the one-sided moving average representation

    Xt = ∑_{j=0}^∞ ψj wt−j = ψ(B)wt .                    (2)

Proposition 3.1 states that a stationary ARMA process {Xt } is causal if and only if (iff) the zeros of the autoregressive polynomial φ(z) lie outside the unit circle (i.e., φ(z) ≠ 0 for |z| ≤ 1). Since φ(0) = 1 and φ(z) has no zeros in the closed unit disc, |φ(z)| > 0 for |z| ≤ 1. (The unit circle in the complex plane consists of those z ∈ C for which |z| = 1; the closed unit disc includes the interior of the unit circle as well as the circle itself; the open unit disc consists of |z| < 1.)
If the zeros of φ(z) lie outside the unit circle, then we can invert each of the factors (1 − B/zj ) that make up φ(B) = ∏_{j=1}^p (1 − B/zj ), one at a time (as when back-substituting in the derivation of the AR(1) representation). Owing to the geometric decay in 1/zj , the coefficients in the resulting expression are summable. (Complex zeros are more interesting.)
Suppose, on the other hand, that the process is causal. Then, taking expectations after substituting Xt = ψ(B)wt in the definition of the ARMA process, it follows for all k = 0, 1, . . . that

    E[φ(B)ψ(B)wt wt−k ] = E[θ(B)wt wt−k ]  ⇒  φ(z) ψ(z) = θ(z) .

By assumption, φ(z) and θ(z) share no zeros, and |ψ(z)| < ∞ for |z| ≤ 1 (since the ψj are summable). The zeros of φ(z) must lie outside the unit circle; otherwise a zero z̃ with |z̃| ≤ 1 would give θ(z̃) = φ(z̃)ψ(z̃) = 0, so that φ(z) and θ(z) would share the zero z̃, a contradiction. (For |z| > 1, ψ(z) can grow arbitrarily large, balancing a zero of φ(z).)

Covariance generating function The covariances of the ARMA process {Xt } are

    γ(h) = σ² ∑_{j=0}^∞ ψ_{j+|h|} ψj .                   (3)

Equivalently, the covariance γ(h) is the coefficient of z^{|h|} in the series

    G(z) = σ² ψ(z)ψ(z⁻¹) = σ² θ(z)θ(z⁻¹) / (φ(z)φ(z⁻¹)) ,   (4)

which is called the covariance generating function of the process. The constant term in z^{−h} G(z) is γ(h). Note the ambivalence about the direction of time.
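
As a numerical check of (3), here is a minimal R sketch that truncates the infinite sum; the ARMA(1,1) parameters ar = 0.9, ma = 0.5 and the noise variance sigma2 = 2 are illustrative choices, not from the text.

    ## Approximate gamma(h) from (3) by truncating the psi-weights.
    ## ARMAtoMA() (base R) returns psi_1, psi_2, ...; psi_0 = 1.
    sigma2 <- 2
    psi <- c(1, ARMAtoMA(ar = 0.9, ma = 0.5, lag.max = 200))
    gamma.h <- function(h) {
      n <- length(psi)
      sigma2 * sum(psi[1:(n - h)] * psi[(1 + h):n])
    }
    c(gamma.h(0), gamma.h(1), gamma.h(2))        # approximate autocovariances
    c(gamma.h(1), gamma.h(2)) / gamma.h(0)       # implied autocorrelations ...
    ARMAacf(ar = 0.9, ma = 0.5, lag.max = 2)[-1] # ... agree with ARMAacf()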

Connection to spectrum Recall that the spectral density function also “generates” the covariances in the sense that

    γ(h) = ∫_{−1/2}^{1/2} e^{−i2πωh} f(ω) dω .

Compare this expression to the following, which uses the covariance generating function. Replace z in (4) by e^{−i2πω} and integrate over −1/2 ≤ ω ≤ 1/2 to get

    γ(h) = ∫_{−1/2}^{1/2} e^{−i2πωh} σ² |θ(e^{−i2πω})|² / |φ(e^{−i2πω})|² dω ,

where the ratio σ² |θ(e^{−i2πω})|² / |φ(e^{−i2πω})|² is the spectral density f(ω). The spectral density of an ARMA process is this ratio of polynomials. Since polynomials are dense in the space of continuous functions on bounded intervals, ARMA processes can approximate any stationary process with continuous spectrum.
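
This ratio is easy to evaluate numerically. A minimal R sketch for an ARMA(1,1), with illustrative values ar = 0.9, ma = 0.5, sigma2 = 1:

    ## Spectral density f(omega) = sigma2 |theta(e^{-i 2 pi omega})|^2 / |phi(e^{-i 2 pi omega})|^2
    arma.spec <- function(omega, ar = 0.9, ma = 0.5, sigma2 = 1) {
      z <- exp(-1i * 2 * pi * omega)
      theta <- 1 + ma * z                 # theta(z) for the MA(1) part
      phi   <- 1 - ar * z                 # phi(z) for the AR(1) part
      sigma2 * Mod(theta)^2 / Mod(phi)^2
    }
    omega <- seq(-0.5, 0.5, length.out = 201)
    plot(omega, arma.spec(omega), type = "l", ylab = "f(omega)")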

Invertible The ARMA process {Xt } is invertible if there exists an ℓ² sequence {πj } such that

    wt = ∑_{j=0}^∞ πj Xt−j .                             (5)

Proposition 3.2 states that the process {Xt } is invertible if and only if the zeros of the moving average polynomial θ(z) lie outside the unit circle.

Common assumption: invertible and causal In general, when dealing with covariances of ARMA processes, we assume that the process is causal and invertible so that we can move between the two one-sided representations (5) and (2).

Example 3.6 shows what happens with common zeros in φ(z) and θ(z). The process is

    Xt = 0.4Xt−1 + 0.45Xt−2 + wt + wt−1 + 0.25wt−2 ,

for which

    φ(z) = (1 + 0.5z)(1 − 0.9z),   θ(z) = (1 + 0.5z)² .

Hence, the two share the common factor (1 + 0.5z). The initial ARMA(2,2) reduces to a causal, invertible ARMA(1,1) model.

Calculations in R R includes the function polyroot for finding the zeros of polynomials. Other symbolic software (e.g., Mathematica) does this much better, giving you a formula for the roots in general (when possible). S&S supply some additional utilities, particularly ARMAtoMA, which finds the moving average coefficients for an arbitrary ARMA process, and ARMAacf and ARMApacf.
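
For instance, a minimal sketch that checks Example 3.6 with polyroot and the base-R functions ARMAtoMA and ARMAacf (polyroot takes coefficients in increasing powers of z):

    ## phi(z) = 1 - 0.4 z - 0.45 z^2 and theta(z) = 1 + z + 0.25 z^2 from Example 3.6
    polyroot(c(1, -0.4, -0.45))   # zeros -2 and 1/0.9, from (1 + 0.5z)(1 - 0.9z)
    polyroot(c(1, 1, 0.25))       # double zero at -2, from (1 + 0.5z)^2
    ## After cancelling the common factor (1 + 0.5z), the model is ARMA(1,1):
    ARMAtoMA(ar = 0.9, ma = 0.5, lag.max = 10)   # psi-weights of the reduced model
    ARMAacf(ar = 0.9, ma = 0.5, lag.max = 10)    # its autocorrelation function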

AR covariance functions

Estimation Given the assumption of stationarity, in most cases we can easily obtain consistent estimates of the process covariances, such as

    γ̂(h) = (1/n) ∑_{t=1}^{n−h} (xt+h − x̄)(xt − x̄) .       (6)

What should such a covariance function resemble? Does the “shape” of the covariance function or of the estimated correlation function

    ρ̂(h) = γ̂(h) / γ̂(0)

distinguish the underlying process? (There are problems with this approach, suggested by our discussion of the dependence in covariance estimates.)
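
As an illustration of what such estimates look like, a minimal R sketch on a simulated AR(2) series; the coefficients 1.5 and −0.75 and the sample size are illustrative. Note that acf() divides by the sample size, as in (6).

    set.seed(910)
    x <- arima.sim(model = list(ar = c(1.5, -0.75)), n = 200)   # simulated AR(2)
    ## type = "covariance" gives gamma-hat(h); the default gives rho-hat(h).
    gamma.hat <- drop(acf(x, lag.max = 10, type = "covariance", plot = FALSE)$acf)
    rho.hat   <- drop(acf(x, lag.max = 10, plot = FALSE)$acf)
    cbind(estimate = rho.hat,
          theoretical = ARMAacf(ar = c(1.5, -0.75), lag.max = 10))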

Yule-Walker equations These equations expose the relationship between the covariances and coefficients of an autoregression: the covariances satisfy the difference equation that defines the autoregression. (This comment helps to explain the huge dependence in estimates of the covariances.) The one-sided moving-average representation of the process (2) implies that wt is uncorrelated with prior observations Xs , s < t. Hence, for lags k = 1, 2, . . . (assuming E Xt = 0),

    E[Xt−k (Xt − φ1 Xt−1 − · · · − φp Xt−p )] = E[Xt−k wt ] = 0 ,

so that

    γ(k) − φ1 γ(k − 1) − · · · − φp γ(k − p) = φ(B)γ(k) = 0 .

Rearranging the terms, we obtain the Yule-Walker equations (where δk is the Kronecker delta: δ0 = 1 and δk = 0 for k ≠ 0),

    δk σ² = γ(k) − ∑_{j=1}^p γ(k − j)φj ,   k = 0, 1, 2, . . . .   (7)

Written out, these give an equation that provides σ² (for k = 0),

    σ² = γ(0) − γ(1)φ1 − γ(2)φ2 − · · · − γ(p)φp ,

and p additional equations that give the coefficients:


    
    [ γ(1) ]   [ γ(0)     γ(1)     γ(2)     ...  γ(p−1) ] [ φ1 ]
    [ γ(2) ]   [ γ(1)     γ(0)     γ(1)     ...  γ(p−2) ] [ φ2 ]
    [ γ(3) ] = [ γ(2)     γ(1)     γ(0)     ...  γ(p−3) ] [ φ3 ]
    [  ...  ]   [  ...      ...      ...           ...   ] [ ... ]
    [ γ(p) ]   [ γ(p−1)   γ(p−2)   γ(p−3)   ...  γ(0)   ] [ φp ]

Define the vectors γ = (γ(1), . . . , γ(p))′ , φ = (φ1 , . . . , φp )′ and the matrix Γp = [γ(j − k)]j,k=1,...,p . In matrix form, these last p Yule-Walker equations are

    γ = Γp φ .

(The Yule-Walker equations are analogous to the normal equations from least-squares regression, with Y = Xt and the explanatory variables Xt−1 , . . . , Xt−p .) The sample version of these equations is used in some methods for estimation, most often to obtain initial estimates required for an iterative estimator.
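
A minimal R sketch of this use, with an illustrative simulated AR(2); ar.yw() in base R solves the sample Yule-Walker equations, and the last two lines redo the calculation by hand.

    set.seed(910)
    x <- arima.sim(model = list(ar = c(1.5, -0.75)), n = 500)
    fit <- ar.yw(x, order.max = 2, aic = FALSE)
    fit$ar        # Yule-Walker estimates of phi_1, phi_2
    fit$var.pred  # corresponding estimate of sigma^2
    ## By hand: solve gamma-hat = Gamma-hat %*% phi using the sample covariances.
    g <- drop(acf(x, lag.max = 2, type = "covariance", plot = FALSE)$acf)
    solve(toeplitz(g[1:2]), g[2:3])   # essentially the same coefficient estimates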

Difference equations Backshift notation is handy in appreciating these characteristics of the covariances. When we multiply both sides of φ(B)Xt = wt by Xt−h , h > 0, and take expectations, we obtain

    φ(B)γ(h) = 0  ⇒  γ(h) solves the difference equation defined by φ(B).

A solution of this homogeneous difference equation is a sequence {ch } for which

    ch − φ1 ch−1 − φ2 ch−2 − · · · − φp ch−p = φ(B)ch = 0 .

As a solution, try (as in finding the eigenvectors of the circulant matrix) ch = zj^{−h} , where φ(zj ) = 0. Then observe that

    zj^{−h} (1 − φ1 zj − φ2 zj² − · · · − φp zj^p ) = zj^{−h} φ(zj ) = 0 .

In general, any linear combination of these solutions, one for each zero of φ(z), is again a solution. To find the particular solution, one must take into account boundary conditions implied by the first few values of γ(h) given by the Yule-Walker equations. The solution is of the form (ignoring duplicated zeros)

    γ(k) = ∑_j aj zj^{−k} ,   |zj | > 1 ,

where the aj are constants (determined by boundary conditions). Since |1/zj | < 1, these covariances decay geometrically.

Boundary conditions The Yule-Walker equations can be solved for γ(0), γ(1), . . . , γ(p − 1) given φ1 , . . . , φp . This use of the equations (i.e., to solve for the covariances from the coefficients rather than expressing the coefficients in terms of the covariances) is the “inverse” of how they are used in estimation.
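
A minimal R sketch of this inverse use: given AR coefficients φ and σ², solve the linear system (7) for γ(0), . . . , γ(p). The AR(2) coefficients below are illustrative.

    ## Solve the Yule-Walker equations (7) for gamma(0), ..., gamma(p).
    ar.autocov <- function(phi, sigma2 = 1) {
      p <- length(phi)
      A <- diag(p + 1)                  # coefficient of gamma(m) in equation k
      for (k in 0:p) {
        for (j in 1:p) {
          m <- abs(k - j)               # gamma(k - j) = gamma(|k - j|)
          A[k + 1, m + 1] <- A[k + 1, m + 1] - phi[j]
        }
      }
      solve(A, c(sigma2, rep(0, p)))    # right-hand side is delta_k * sigma^2
    }
    g <- ar.autocov(phi = c(1.5, -0.75), sigma2 = 1)
    g            # gamma(0), gamma(1), gamma(2)
    g[-1] / g[1] # agrees with ARMAacf(ar = c(1.5, -0.75), lag.max = 2)[-1]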

MA and ARMA covariance functions

Moving average case For an MA(q) process, we have (with θ0 = 1)

    γ(h) = σ² ∑_j θ_{j+|h|} θj ,

where θj = 0 for j < 0 and j > q. In contrast to the geometric decay of an autoregression, the covariances of a moving average “cut off” abruptly: γ(h) = 0 for |h| > q. Such covariance functions are necessary and sufficient to identify a moving average process.
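
A quick numerical illustration of the cut-off for an MA(2) with illustrative coefficients θ1 = 0.8, θ2 = −0.4 and σ² = 1:

    theta <- c(1, 0.8, -0.4)            # theta_0, theta_1, theta_2
    ## gamma(h) = sum_j theta_j theta_{j+h}, zero once h exceeds q = 2
    gam <- sapply(0:3, function(h) sum(theta[seq_len(3 - h)] * theta[seq_len(3 - h) + h]))
    gam / gam[1]                              # rho(0), ..., rho(3)
    ARMAacf(ma = c(0.8, -0.4), lag.max = 3)   # same values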

Calculation of the covariances via the infinite MA representation and equation (3) proceeds by solving a system of equations defined by the relation

    ψ(z) = θ(z)/φ(z)  ⇒  ψ(z)φ(z) = θ(z) .

The idea is to match the coefficients of like powers of z in

    (1 + ψ1 z + ψ2 z² + · · ·)(1 − φ1 z − · · · − φp z^p ) = (1 + θ1 z + · · · + θq z^q ) .

For example, equating the coefficients of z implies that

    ψ1 − φ1 = θ1 ,   so ψ1 = φ1 + θ1 .

But this only leads to the collection of ψ’s, not the covariances. The covariances require summing the resulting expressions.
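
A minimal sketch of this coefficient matching for a general causal ARMA (an illustrative ARMA(1,1) is used in the check); base R's ARMAtoMA() performs the same recursion.

    ## psi-weights from psi(z) phi(z) = theta(z):
    ## psi_j = theta_j + phi_1 psi_{j-1} + ... + phi_p psi_{j-p}, with psi_0 = 1.
    psi.weights <- function(phi, theta, lag.max = 10) {
      psi <- numeric(lag.max + 1); psi[1] <- 1        # psi_0 = 1
      th  <- c(theta, rep(0, lag.max))                # theta_j = 0 for j > q
      for (j in 1:lag.max) {
        k <- seq_len(min(j, length(phi)))
        psi[j + 1] <- th[j] + sum(phi[k] * psi[j + 1 - k])
      }
      psi
    }
    psi.weights(phi = 0.9, theta = 0.5, lag.max = 5)
    ARMAtoMA(ar = 0.9, ma = 0.5, lag.max = 5)         # agrees with psi_1, psi_2, ...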

Mixed models Observe that the covariances satisfy the convolution expression (multiply both sides of (1) by the lag Xt−j and take expectations)

    γ(j) − ∑_{k=1}^p φk γ(j − k) = 0 ,   j ≥ max(p, q + 1),

which is again a homogeneous linear difference equation. Thus, for high enough lag, the covariances again decay as a sum of geometric series. The mixed ARMA(1,1) example from the text (Example 3.11, p. 105) illustrates these calculations. For the initial values, we find

    γ(j) − ∑_{k=1}^p φk γ(j − k) = σ² ∑_{j≤k≤q} θk ψ_{k−j} ,

a generalization of the Yule-Walker equations. The extra summand arises from the observation that E[Xt wt−k ] = σ² ψk . Essentially, the initial q covariances of an ARMA(p, q) process deviate from the recursion that defines the covariances of the AR(p) component of the process.
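
A short numerical illustration for an ARMA(1,1) with illustrative parameters φ = 0.9, θ = 0.5: beyond lag q = 1 the autocorrelations satisfy ρ(h) = φ ρ(h − 1).

    rho <- ARMAacf(ar = 0.9, ma = 0.5, lag.max = 6)
    rho[-1] / rho[-length(rho)]   # ratios settle at phi = 0.9 from lag 2 onward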

Partial autocorrelation function

Definition The partial autocorrelation function φhh is the partial correlation between Xt+h and Xt conditioning upon the intervening variables,

    φhh = Corr(Xt+h , Xt | Xt+1 , . . . , Xt+h−1 ) .

Consequently, φ11 = ρ(1), the usual autocorrelation. The partial autocorrelations are often called reflection coefficients, particularly in signal processing.

Partial regression The reasoning behind the use of the partial correlations resembles the motivation for partial regression residual plots, which show the impact of a variable in a regression. If we have the OLS fit of the two-predictor linear model

    Y = b0 + b1 X1 + b2 X2 + residual

and we form two “partial” regressions by regressing out the effects of X1 from X2 and Y ,

    r2 = X2 − a0 − a1 X1 ,   ry = Y − c0 − c1 X1 ,

then the regression coefficient of the residual ry on the other residual r2 is b2 , the multiple regression coefficient.
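
A minimal R illustration of this fact with simulated data (the variable names and coefficients are made up for the example):

    set.seed(1)
    x1 <- rnorm(100); x2 <- rnorm(100) + 0.5 * x1
    y  <- 1 + 2 * x1 - 3 * x2 + rnorm(100)
    coef(lm(y ~ x1 + x2))["x2"]          # multiple regression coefficient b2
    r2 <- resid(lm(x2 ~ x1))             # x2 with x1 regressed out
    ry <- resid(lm(y ~ x1))              # y with x1 regressed out
    coef(lm(ry ~ r2))["r2"]              # same value as b2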

Defining equation Since the partial autocorrelation φhh is the coefficient of the last lag in the regression of Xt on Xt−1 , Xt−2 , . . . , Xt−h , we obtain an equation for φhh (assuming that the mean of {Xt } is zero) by noting that the normal equations imply that

    E[Xt−j (Xt − φh1 Xt−1 − φh2 Xt−2 − · · · − φhh Xt−h )] = 0 ,   j = 1, 2, . . . , h .

Key property For an AR(p) process, φhh = 0 for h > p, so that the partial correlation function cuts off after the order p of the autoregression. Also notice that φpp = φp . Since an invertible moving average can be represented as an infinite autoregression, the partial autocorrelations of a moving average process decay geometrically. Hence, we have the following table of behaviors (Table 3.1):

             AR(p)             ARMA(p, q)          MA(q)
    γ(h)     geometric decay   geometric after q   cuts off at q
    φhh      cuts off at p     geometric after p   geometric decay

Once upon a time, before the introduction of model selection criteria such as AIC (discussed in S&S), this table was the key to choosing the order of an ARMA(p, q) process.
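
The table is easy to reproduce numerically with ARMAacf(); the coefficients below are illustrative.

    round(ARMAacf(ar = c(1.5, -0.75), lag.max = 8, pacf = TRUE), 3)  # AR(2): pacf cuts off after lag 2
    round(ARMAacf(ma = c(0.8, -0.4), lag.max = 8), 3)                # MA(2): acf cuts off after lag 2
    round(ARMAacf(ar = 0.9, ma = 0.5, lag.max = 8, pacf = TRUE), 3)  # ARMA(1,1): pacf decays geometrically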

Estimates Estimates of the partial autocorrelations arise from solving the Yule-Walker equations (7), using a recursive method known as the Levinson recursion. This algorithm is discussed later.
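
As a preview, a minimal sketch that applies the standard Durbin-Levinson updates to the autocorrelations (an assumption here, since the algorithm is only named above); base R's ARMAacf(..., pacf = TRUE) returns the same values.

    ## Partial autocorrelations phi_hh from rho(1), rho(2), ... by the Durbin-Levinson recursion.
    pacf.from.acf <- function(rho) {
      h.max <- length(rho)
      phi <- matrix(0, h.max, h.max)          # phi[h, j] stores phi_{h,j}
      phi[1, 1] <- rho[1]
      if (h.max > 1) for (h in 2:h.max) {
        j <- 1:(h - 1)
        phi[h, h] <- (rho[h] - sum(phi[h - 1, j] * rho[h - j])) /
                     (1 - sum(phi[h - 1, j] * rho[j]))
        phi[h, j] <- phi[h - 1, j] - phi[h, h] * phi[h - 1, h - j]
      }
      diag(phi)                               # phi_11, phi_22, ...
    }
    rho <- ARMAacf(ar = 0.9, ma = 0.5, lag.max = 5)[-1]      # rho(1), ..., rho(5)
    pacf.from.acf(rho)
    ARMAacf(ar = 0.9, ma = 0.5, lag.max = 5, pacf = TRUE)    # same values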

Discussion
Weak spots We have left some problems only partially solved, such as the meaning of infinite sums of random variables. How does one manipulate these expressions? When are such manipulations valid?

Correlation everywhere Much of time series analysis is complicated because the observable terms {Xt } are correlated. In a sense, time series analysis is a lot like regression with collinearity. Just as regression is simplified by moving to uncorrelated predictors, time series analysis benefits from using a representation in terms of uncorrelated terms.
