Rayleigh Quotients and Inverse Iteration: Restriction To Real Symmetric Matrices


LECTURE 27

Rayleigh Quotients and Inverse Iteration
Restriction to real symmetric matrices
Most algorithmic ideas in numerical linear algebra are
usually applicable to general matrices, or simplify for
symmetric matrices.
For eigenvalue problems the differences are substantial.
We simplify matters by assuming (for now) that A is real
and symmetric.
We assume $\|\cdot\| = \|\cdot\|_2$ until further notice.
A is real and symmetric:
- it has real eigenvalues
- it has a complete set of orthogonal eigenvectors
Notation:
- real eigenvalues: $\lambda_1, \lambda_2, \ldots, \lambda_m$
- orthonormal eigenvectors: $q_1, q_2, \ldots, q_m$
Most of what we describe now pertains to Phase 2 of the
eigenvalue calculation.
Think of A as real, symmetric, and tridiagonal.
Rayleigh quotient
Rayleigh quotient of a vector $x \in \mathbb{R}^m$:
$$r(x) = \frac{x^T A x}{x^T x}$$
Note
If x is an eigenvector, then $r(x) = \lambda$, the corresponding
eigenvalue.
One way to motivate this formula:
Given x, what scalar $\alpha$ minimizes $\|Ax - \alpha x\|$?
i.e., What scalar $\alpha$ acts most like an eigenvalue?
This is an $m \times 1$ least-squares problem:
$$x\alpha \approx Ax$$
Here, x is the matrix, $\alpha$ is the unknown, and Ax is the
right-hand side.
- m equations for 1 unknown
Use the normal equations to derive
$$\alpha = r(x).$$
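The least-squares motivation above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `rayleigh_quotient`, the test matrix, and the test vector are illustrative choices, not part of the lecture. The normal equations for this problem read $x^T x \, \alpha = x^T A x$, so the least-squares solution should match $r(x)$.

```python
import numpy as np

def rayleigh_quotient(A, x):
    """Return r(x) = (x^T A x) / (x^T x) for a nonzero vector x."""
    return (x @ A @ x) / (x @ x)

# Illustrative symmetric matrix and test vector (arbitrary choices).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
x = np.array([1.0, 2.0, 3.0])

# Least-squares problem x * alpha ~ A x: the "matrix" is x as an m-by-1 column,
# the unknown is the single scalar alpha, and the right-hand side is A x.
alpha_ls, *_ = np.linalg.lstsq(x.reshape(-1, 1), A @ x, rcond=None)

r = rayleigh_quotient(A, x)
print(r, alpha_ls[0])  # the two values agree up to rounding
```

For an exact eigenvector the same function returns the corresponding eigenvalue, which is the property noted above.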
Thus r(x) is a natural eigenvalue estimate.
Expand x as a linear combination of $q_1, q_2, \ldots, q_m$:
$$x = \sum_{j=1}^{m} a_j q_j$$
Then
$$r(x) = \frac{\sum_{j=1}^{m} a_j^2 \lambda_j}{\sum_{j=1}^{m} a_j^2}$$
i.e., r(x) is a weighted mean of the eigenvalues of A.
If $|a_j / a_J| < \epsilon$ for all $j \neq J$ (i.e., x is close to $q_J$) then
$$r(x) - r(q_J) = O(\epsilon^2)$$
Use this fact:
$$r(q_J) = \lambda_J$$
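The quadratic accuracy of the Rayleigh quotient can be observed numerically. This is a minimal sketch under stated assumptions: the matrix, the target index `J`, and the perturbation direction are arbitrary illustrative choices, and `np.linalg.eigh` is used only to obtain reference eigenpairs. Shrinking $\epsilon$ by a factor of 10 should shrink the error in $r(x)$ by roughly a factor of 100.

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
eigenvalues, Q = np.linalg.eigh(A)   # columns of Q are orthonormal eigenvectors
J = 2                                # index of the target eigenpair (arbitrary choice)

def rayleigh_quotient(A, x):
    return (x @ A @ x) / (x @ x)

errors = {}
for eps in (1e-1, 1e-2, 1e-3):
    # Perturb q_J so that |a_j / a_J| = eps for every j != J.
    x = Q[:, J] + eps * (Q[:, 0] + Q[:, 1])
    errors[eps] = abs(rayleigh_quotient(A, x) - eigenvalues[J])
    print(f"eps = {eps:.0e}   |r(x) - lambda_J| = {errors[eps]:.3e}")
```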
Power iteration
Let $v^{(0)}$ be unit-norm: $\|v^{(0)}\| = 1$.
Power iteration produces a sequence $v^{(i)} \to q_1$ (up to sign).
ALGORITHM 27.1: POWER ITERATION
Initial $v^{(0)}$ with $\|v^{(0)}\| = 1$
for k = 1, 2, . . . do
    $w = A v^{(k-1)}$                           % apply A
    $v^{(k)} = w / \|w\|$                       % normalize
    $\lambda^{(k)} = (v^{(k)})^T A v^{(k)}$     % compute Rayleigh quotient
end for
Note
Good termination criteria are vital in practice.
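Algorithm 27.1 can be sketched in Python as follows, assuming NumPy. The function name `power_iteration`, the parameters `tol` and `max_iter`, and the stopping rule (a small eigen-residual $\|Av - \lambda v\|$ relative to $|\lambda|$) are assumptions made for this illustration; the lecture does not prescribe a particular termination criterion.

```python
import numpy as np

def power_iteration(A, v0, tol=1e-10, max_iter=1000):
    """Power iteration; returns (eigenvalue estimate, unit vector, iterations used)."""
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    for k in range(1, max_iter + 1):
        w = A @ v                      # apply A
        v = w / np.linalg.norm(w)      # normalize
        lam = v @ A @ v                # compute Rayleigh quotient
        # Assumed stopping rule: eigen-residual small relative to |lam|.
        if np.linalg.norm(A @ v - lam * v) <= tol * abs(lam):
            return lam, v, k
    return lam, v, max_iter

# Illustrative symmetric matrix and starting vector (arbitrary choices).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
lam, v, iters = power_iteration(A, np.ones(3))
print(lam, iters)
```

The residual test is only one possible criterion; monitoring the change in successive Rayleigh quotients is another common choice.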
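To analyze, write
$$v^{(0)} = a_1 q_1 + a_2 q_2 + \cdots + a_m q_m$$
Since $v^{(k)}$ is a multiple of $A^k v^{(0)}$, there exist
constants $c_k$ such that
$$\begin{aligned}
v^{(k)} &= c_k A^k v^{(0)} \\
&= c_k A^k (a_1 q_1 + a_2 q_2 + \cdots + a_m q_m) \\
&= c_k (a_1 \lambda_1^k q_1 + a_2 \lambda_2^k q_2 + \cdots + a_m \lambda_m^k q_m) \\
&= c_k \lambda_1^k \left( a_1 q_1 + a_2 \left( \frac{\lambda_2}{\lambda_1} \right)^k q_2 + \cdots + a_m \left( \frac{\lambda_m}{\lambda_1} \right)^k q_m \right)
\end{aligned}$$
We used
$$A^k q_j = \lambda_j^k q_j.$$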
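Theorem
Let $|\lambda_1| > |\lambda_2| \geq \cdots \geq |\lambda_m| \geq 0$ and $q_1^T v^{(0)} \neq 0$.
Then the iterates of Algorithm 27.1 satisfy
$$\|v^{(k)} - (\pm q_1)\| = O\left( \left| \frac{\lambda_2}{\lambda_1} \right|^k \right)$$
and
$$|\lambda^{(k)} - \lambda_1| = O\left( \left| \frac{\lambda_2}{\lambda_1} \right|^{2k} \right)$$
as $k \to \infty$.
If $\lambda_1 > 0$, the $\pm$ signs are all $+$ or all $-$.
If $\lambda_1 < 0$, the signs alternate.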
Here are 3 shortcomings:
1. It only finds the eigenvector corresponding to the
largest eigenvalue (in magnitude).
2. Convergence is linear, with the error reduced by a
constant factor $\approx |\lambda_2 / \lambda_1|$ at each iteration.
3. If $|\lambda_2| \approx |\lambda_1|$, convergence can be very slow!
Note
Use deflation to find the next eigenvector:
$$A - \lambda_1 q_1 q_1^T = \sum_{i=2}^{m} \lambda_i q_i q_i^T$$
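The deflation identity can be illustrated with a short sketch, assuming NumPy. Here the dominant pair $(\lambda_1, q_1)$ is taken from `np.linalg.eigh` purely for convenience; in practice it would come from a converged power iteration. The matrix, starting vector, and fixed iteration count are illustrative assumptions.

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
eigenvalues, Q = np.linalg.eigh(A)        # reference eigenpairs, ascending order
lam1, q1 = eigenvalues[-1], Q[:, -1]      # dominant pair (taken from eigh here)

# Deflated matrix: same eigenvectors as A, but lambda_1 is replaced by 0.
A_deflated = A - lam1 * np.outer(q1, q1)

# Power iteration on the deflated matrix converges to the next eigenvector.
v = np.ones(3) / np.sqrt(3.0)
for _ in range(200):                      # fixed count, chosen generously
    w = A_deflated @ v
    v = w / np.linalg.norm(w)
lam2 = v @ A @ v                          # Rayleigh quotient with the original A
print(lam2, eigenvalues[-2])
```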
Inverse iteration
Goal: amplify differences between eigenvalues and accelerate
convergence.
Observation: for any $\mu \in \mathbb{R}$ that is not an eigenvalue of A:
- the eigenvectors of $(A - \mu I)^{-1}$ are the same as the
eigenvectors of A
- the corresponding eigenvalues are $\left\{ \frac{1}{\lambda_j - \mu} \right\}_{j=1}^{m}$.
Suppose $\mu$ is close to an eigenvalue $\lambda_J$ of A.
Then $\frac{1}{|\mu - \lambda_J|}$ may be much larger than $\frac{1}{|\mu - \lambda_j|}$ for all $j \neq J$.
Power iteration on $(A - \mu I)^{-1}$ should converge rapidly
to $q_J$.
This idea is called inverse iteration.
ALGORITHM 27.2: INVERSE ITERATION
Initial $v^{(0)}$ with $\|v^{(0)}\| = 1$
Set $\mu$ to some value near $\lambda_J$
for k = 1, 2, . . . do
    Solve $(A - \mu I) w = v^{(k-1)}$           % apply $(A - \mu I)^{-1}$
    $v^{(k)} = w / \|w\|$                       % normalize
    $\lambda^{(k)} = (v^{(k)})^T A v^{(k)}$     % compute Rayleigh quotient
end for
What if $\mu$ is chosen as, or very close to, an eigenvalue of A? Then
- $A - \mu I$ is singular (or nearly so)
- $(A - \mu I) w = v^{(k-1)}$ is highly ill-conditioned
It turns out these are not problems:
It can be shown that if the system is solved by a backward
stable algorithm to produce a solution $\tilde{w}$, then $\tilde{w} / \|\tilde{w}\|$
will be close to $w / \|w\|$ even though $\tilde{w}$ and $w$ may not
be.
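Algorithm 27.2 can be sketched in Python as follows, assuming NumPy. The function name `inverse_iteration`, the fixed iteration count `num_iter`, and the example shift `mu = 2.5` are illustrative assumptions. For simplicity this sketch calls `np.linalg.solve` at every step, which refactors the shifted matrix each time, and it does not handle a shift that makes the matrix exactly singular in floating point.

```python
import numpy as np

def inverse_iteration(A, mu, v0, num_iter=20):
    """Inverse iteration with a fixed shift mu; returns (eigenvalue estimate, vector)."""
    m = A.shape[0]
    shifted = A - mu * np.eye(m)
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    for _ in range(num_iter):
        w = np.linalg.solve(shifted, v)   # apply (A - mu I)^{-1}
        v = w / np.linalg.norm(w)         # normalize
        lam = v @ A @ v                   # compute Rayleigh quotient
    return lam, v

# Illustrative matrix; the shift 2.5 is an assumed guess near an interior eigenvalue.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
lam, v = inverse_iteration(A, 2.5, np.ones(3))
print(lam)
```

Unlike power iteration, the shift lets the iteration target an eigenvalue other than the largest one.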
Properties:
- convergence is still linear
- we can control the rate of the linear convergence by
improving the quality of $\mu$.
If $\mu$ is much closer to one eigenvalue of A than to the
others, the largest eigenvalue of $(A - \mu I)^{-1}$ will be much
larger than the rest.
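Theorem
Suppose $\lambda_J$ is the closest eigenvalue to $\mu$ and $\lambda_K$ is the
second closest, i.e.,
$$|\mu - \lambda_J| < |\mu - \lambda_K| \leq |\mu - \lambda_j|$$
for all $j \neq J$.
Suppose that $q_J^T v^{(0)} \neq 0$.
Then the iterates of Algorithm 27.2 satisfy
$$\|v^{(k)} - (\pm q_J)\| = O\left( \left| \frac{\mu - \lambda_J}{\mu - \lambda_K} \right|^k \right)$$
and
$$|\lambda^{(k)} - \lambda_J| = O\left( \left| \frac{\mu - \lambda_J}{\mu - \lambda_K} \right|^{2k} \right)$$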
Inverse iteration is used in practice to calculate eigenvectors
of a matrix when eigenvalues are already known.
Rayleigh quotient iteration
We have seen:
- e-value estimate from e-vector estimate (Rayleigh
quotient)
- e-vector estimate from e-value estimate (inverse
iteration)
We now combine these. The idea is to continually improve
the eigenvalue estimates to increase the rate of convergence
of inverse iteration at every step.
This algorithm is called Rayleigh quotient iteration.
RAYLEIGH QUOTIENT ITERATION
Initial $v^{(0)}$ with $\|v^{(0)}\| = 1$
$\lambda^{(0)} = (v^{(0)})^T A v^{(0)}$
for k = 1, 2, . . . do
    Solve $(A - \lambda^{(k-1)} I) w = v^{(k-1)}$   % apply $(A - \lambda^{(k-1)} I)^{-1}$
    $v^{(k)} = w / \|w\|$                           % normalize
    $\lambda^{(k)} = (v^{(k)})^T A v^{(k)}$         % update Rayleigh quotient
end for
Each iteration triples the number of digits of accuracy
(cubic convergence).
Theorem
Rayleigh quotient iteration converges to an e-value/e-vector
pair for (almost all) starting vectors $v^{(0)}$.
When it converges, the convergence is cubic,
i.e., if $\lambda_J$ is an eigenvalue of A and $v^{(0)}$ is sufficiently
close to the eigenvector $q_J$, then as $k \to \infty$
$$\|v^{(k+1)} - (\pm q_J)\| = O\left( \|v^{(k)} - (\pm q_J)\|^3 \right)$$
and
$$|\lambda^{(k+1)} - \lambda_J| = O\left( |\lambda^{(k)} - \lambda_J|^3 \right).$$
Note
The signs are not necessarily the same on both sides!
Example 27.1
$$A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 4 \end{bmatrix}$$
Let
$$v^{(0)} = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
Applying Rayleigh quotient iteration to A, we obtain the
following eigenvalue estimates:
$\lambda^{(0)} = 5$
$\lambda^{(1)} = 5.2131\ldots$
$\lambda^{(2)} = 5.214319743184\ldots$
The actual eigenvalue is $\lambda = 5.214319743377$.
Only 3 iterations have produced 10 digits of accuracy!
Three more iterations should increase this to 270 digits of
accuracy, if we had a computer capable of storing this!
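The estimates in Example 27.1 can be reproduced with a short sketch of Rayleigh quotient iteration, assuming NumPy. The function name `rayleigh_quotient_iteration`, the parameter `num_iter`, and the handling of an exactly singular shifted matrix (stop and keep the current vector) are assumptions of this illustration. In double precision only about 16 digits are available, so the cubic convergence saturates after the third step.

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, num_iter=3):
    """Return the list of eigenvalue estimates lambda^(0..k) and the final vector."""
    m = A.shape[0]
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    estimates = [lam]
    for _ in range(num_iter):
        try:
            # Solve with the current Rayleigh quotient as the shift.
            w = np.linalg.solve(A - lam * np.eye(m), v)
        except np.linalg.LinAlgError:
            break  # shifted matrix is exactly singular; keep the current v
        v = w / np.linalg.norm(w)          # normalize
        lam = v @ A @ v                    # update Rayleigh quotient
        estimates.append(lam)
    return estimates, v

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
v0 = np.ones(3) / np.sqrt(3.0)

estimates, v = rayleigh_quotient_iteration(A, v0)
for k, lam in enumerate(estimates):
    print(f"lambda^({k}) = {lam:.12f}")
```

The first estimate is exactly the sum of all entries of A divided by 3, that is 15/3 = 5, which matches $\lambda^{(0)}$ in the example.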
Operation counts
We conclude by tabulating the amount of work required
at each step of the three algorithms we have described.
Suppose A is dense.
Each step of power iteration requires a matrix-vector
multiplication: $O(m^2)$ flops.
Each step of inverse iteration requires the solution of a
linear system: $O(m^3)$ flops.
Note: we actually have the same coefficient matrix and
different right-hand sides.
If we store the LU or QR factorization, each step only
requires triangular substitutions with the stored factors, and hence costs only
$O(m^2)$ flops (after the one-time factorization of $A - \mu I$).
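The factor-once idea can be sketched as follows, assuming NumPy and using QR rather than LU. The helper names `back_substitution` and `inverse_iteration_factored`, the shift, and the iteration count are illustrative assumptions. The factorization costs $O(m^3)$ once; each later step multiplies by $Q^T$ and performs one back substitution, both $O(m^2)$.

```python
import numpy as np

def back_substitution(R, b):
    """Solve R x = b for upper-triangular R in O(m^2) flops."""
    m = R.shape[0]
    x = np.zeros(m)
    for i in range(m - 1, -1, -1):
        x[i] = (b[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

def inverse_iteration_factored(A, mu, v0, num_iter=20):
    """Inverse iteration that reuses one QR factorization of A - mu I."""
    m = A.shape[0]
    Q, R = np.linalg.qr(A - mu * np.eye(m))    # O(m^3), done once
    v = v0 / np.linalg.norm(v0)
    for _ in range(num_iter):
        w = back_substitution(R, Q.T @ v)      # O(m^2) per step
        v = w / np.linalg.norm(w)
    return v @ A @ v, v

# Illustrative matrix and shift (assumed values).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
lam, v = inverse_iteration_factored(A, 2.5, np.ones(3))
print(lam)
```

The explicit Python loop in `back_substitution` is written for clarity; a library triangular solver would normally be used instead.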
For Rayleigh quotient iteration, the matrix to be inverted at each
step changes, so it is hard to beat $O(m^3)$ flops per step.
The saving grace of course is that very few steps may be
sufficient.
These figures improve dramatically if A is tridiagonal:
All three algorithms only take $O(m)$ flops per step in this case.
Similarly, if A is non-symmetric and we have to deal with
Hessenberg matrices, the flop count increases to $O(m^2)$ per step
for each algorithm.
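The $O(m)$ cost in the tridiagonal case comes from solving the shifted system with a banded elimination. The following is a minimal sketch of such a solve (the Thomas algorithm), assuming NumPy. It uses no pivoting, so it assumes no zero pivot is encountered; a production code would use a pivoted banded solver, especially when the shift is close to an eigenvalue. The function name, the test matrix, and the shift are illustrative assumptions.

```python
import numpy as np

def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(m) flops (Thomas algorithm, no pivoting).

    lower: subdiagonal (length m-1), diag: main diagonal (length m),
    upper: superdiagonal (length m-1), rhs: right-hand side (length m).
    """
    m = len(diag)
    d = np.array(diag, dtype=float)   # working copies; inputs are not modified
    b = np.array(rhs, dtype=float)
    # Forward elimination of the subdiagonal.
    for i in range(1, m):
        factor = lower[i - 1] / d[i - 1]
        d[i] -= factor * upper[i - 1]
        b[i] -= factor * b[i - 1]
    # Back substitution.
    x = np.empty(m)
    x[-1] = b[-1] / d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x

# One inverse-iteration step on a symmetric tridiagonal matrix (assumed example).
m = 6
main = 4.0 * np.ones(m)
off = np.ones(m - 1)
mu = 0.5                                            # assumed shift
v = np.ones(m) / np.sqrt(m)
w = solve_tridiagonal(off, main - mu, off, v)       # apply (A - mu I)^{-1}
v_next = w / np.linalg.norm(w)                      # normalize
print(v_next)
```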