On Complex Extrapolated Successive Overrelaxation (Esor) : Some Theoretical Results
Nicholas J. Daras
Abstract
In this paper we discuss the complex theory of the extrapolated successive
overrelaxation (ESOR) method for the numerical solution of large sparse linear
systems A·x = b of complex algebraic equations. Some subsets of convergence
for this method are obtained through an application of conformal mapping
techniques. We also study the choice of the involved complex parameters
giving an arbitrarily “good” convergence behavior for the method. Among
other results, it is shown that in general there is no value of the complex
parameters maximizing the asymptotic rate of convergence, and we investigate
the conditions under which the complex extrapolated Gauss-Seidel (EGS)
method converges as fast as desired.
1 General formulation
Let us consider a complex system of linear equations
A·x=b (1.1)
where A is a consistently ordered complex m×m matrix with non-vanishing diagonal
elements and b is a given complex m-vector. By splitting A into A = D − CL − CU ,
where D is a diagonal matrix possessing the same diagonal elements as A and
−CL , −CU are the strictly lower and upper triangular parts of A respectively, we
define the general extrapolated successive overrelaxation (ESOR) by
x(n+1) = (1−τ )·x(n)+ω·Lx(n+1) +(τ −ω)·Lx(n) +τ ·Ux(n) +τ ·c (n = 0, 1, 2, . . .) (1.2)
Received by the editors September 1999.
Communicated by A. Bultheel.
1991 Mathematics Subject Classification : 65F10.
Key words and phrases : Extrapolated successive overrelaxation, Conformal mappings.
where L = D−1·CL, U = D−1·CU, c = D−1·b and ω, τ (≠ 0) are complex parameters.
By putting
Lτ,ω = (I −ω·L)−1 ·[(1−τ )·I +(τ −ω)·L+τ ·U] = I −τ ·(I −ω·L)−1·D−1 ·A = I −τ ·Λω
with
Λω = (I − ω · L)−1 · D−1 · A
we can write the ESOR method as
x(n+1) = Lτ,ω·x(n) + τ·(I − ω·L)−1·c (n = 0, 1, 2, . . .). (1.3)
PROBLEM 2. Determine the values for ω and τ , if they exist, which are optimum,
in the sense of minimising the spectral radius ρ(Lτ,ω ).
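For concreteness, the componentwise form of the iteration (1.2) can be sketched as follows (a minimal NumPy sketch; the function name esor, the dense-matrix setup and the stopping rule are illustrative choices, not part of the method itself):

```python
import numpy as np

def esor(A, b, omega, tau, x0=None, tol=1e-10, maxiter=500):
    """Minimal dense-matrix sketch of the ESOR iteration (1.2).

    A = D - C_L - C_U with D = diag(A); L = D^{-1} C_L, U = D^{-1} C_U,
    c = D^{-1} b.  The sweep is componentwise: since L is strictly lower
    triangular, x_new[:i] is already available when row i is updated.
    """
    A = np.asarray(A, dtype=complex)
    b = np.asarray(b, dtype=complex)
    m = len(b)
    Dinv = np.diag(1.0 / np.diag(A))
    L = Dinv @ (-np.tril(A, -1))      # C_L = -(strictly lower part of A)
    U = Dinv @ (-np.triu(A, 1))       # C_U = -(strictly upper part of A)
    c = Dinv @ b
    x = np.zeros(m, dtype=complex) if x0 is None else np.asarray(x0, dtype=complex)
    for _ in range(maxiter):
        x_new = np.empty(m, dtype=complex)
        for i in range(m):
            # (1.2): x^(n+1) = (1-tau) x^(n) + omega L x^(n+1)
            #                  + (tau-omega) L x^(n) + tau U x^(n) + tau c
            x_new[i] = ((1 - tau) * x[i]
                        + omega * (L[i, :i] @ x_new[:i])
                        + (tau - omega) * (L[i] @ x)
                        + tau * (U[i] @ x)
                        + tau * c[i])
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

With ω = τ the scheme reduces to the classical SOR method, and with ω = τ = 1 to the Gauss-Seidel method.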
In the real case, that is when the iteration matrix B possesses only real eigen-
values and the parameters ω and τ are in R, many authors independently presented
interesting results ([1], [2], [3], [4], [7], [8], [9]). However, the detailed analysis was
not presented since a tremendous number of cases had to be examined.
The purpose of this paper is to study the complex case, that is when B possesses
complex eigenvalues and the parameters ω and τ are in C. In Section 2, we follow
step-by-step the analysis in [6] and [5] and we give some answers to the first problem
by showing that if ω ∈ C, τ ∈ C and if |ω − 1| < 1 and
∆((1/ω); (1/|ω|)) ⊂ ∆((1/τ); (1/|τ|)) or τ ∈ ω − ∆(1; 1)·∆((ω² + 1)/ω; 1/|ω|) (ω ≠ 1),
then the ESOR method converges. In Section 3, we study the choice of the parameters minimizing the spectral
radius ρ(Lτ,ω) and we investigate the conditions under which for ε > 0 there exists
a (ω, τ ) ∈ C2 such that ρ(Lτ,ω ) < ε. Among other results, it is shown that if the
spectrum of B is contained in the open interval (−1, 1) and if 0 < ε < 2, then for
ω = 1 and τ = x + i·y, with 0 < x < 1 and 0 < y < √(ε·(2 − ε)), we have ρ(Lτ,1) < ε
(Corollary 3.8). However, if the Jacobi matrix B has a critical eigenvalue-pair ±µ̃,
that is a pair which corresponds to the dominant absolute value of the eigenvalues
of the Lτ,ω -matrix whenever (ω, τ ) ∈ C2, then for
ω = 2/(1 ∓ √(1 − µ̃²)) and τ = 1/(∓√(1 − µ̃²)) (µ̃ ≠ 0, ±1)
there holds ρ(Lτ,ω) = 0 (see also [7] for the real case). Here, √A (with A ∈ C − {0})
denotes the principal value of A^(1/2), that is √A = exp((1/2)·log A) = exp((1/2)·[ln |A| +
i·arg A]), where log A is the principal logarithmic value of A and arg A is the
principal argument of A.
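This principal value can be evaluated with Python's standard cmath module (a small sketch; the helper name principal_sqrt is ours, and it agrees with cmath.sqrt):

```python
import cmath

def principal_sqrt(A):
    """Principal value of A**(1/2) for A in C - {0}:
    exp((1/2) * [ln|A| + i*arg A]), with arg A in (-pi, pi]."""
    return cmath.exp(0.5 * (cmath.log(abs(A)) + 1j * cmath.phase(A)))
```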
σ(B) = {µ : µ is an eigenvalue of B},
σ(Λω) = {λ : λ is an eigenvalue of Λω = (I − ω·L)−1·D−1·A},
σ(Lτ,ω) = {ζ : ζ is an eigenvalue of Lτ,ω = I − τ·(I − ω·L)−1·D−1·A}.
First, observe that the identity Lτ,ω = I − τ · Λω implies a linear relation between
σ(Λω ) and σ(Lτ,ω ) : if ζ is an eigenvalue of Lτ,ω , then
ζ = 1 − τ·λ. (2.1)
Moreover, by Theorem 2.1, λ is an eigenvalue of Λω if and only if there is a µ ∈ σ(B) satisfying
(1 − λ)² = µ²·(1 − λ·ω). (2.2)
The above Theorem describes a mapping between the complex µ- and λ-planes and
is studied by means of successive elementary transformations. Evidently, (2.2) is
equivalent to
µ·ω/√(ω − 1) = √((ω − 1)/(1 − λ·ω)) + √((1 − λ·ω)/(ω − 1)).
With
2·z = (ω/√(ω − 1))·µ and ξ = √((ω − 1)/(1 − λ·ω)), (2.3)
this becomes
z = (1/2)·(ξ + (1/ξ)) and ξ = z + √(z² − 1). (2.4)
The map defined by (2.4) is the well known Joukowski function. Putting a := √(ω − 1), (2.3) gives
z = (1/2)·(a + (1/a))·µ. (2.5)
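A quick numerical sanity check of the inverse pair (2.4) is possible (the helper names are ours; the identity ξ·(2z − ξ) = 1 expresses that 1/ξ = z − √(z² − 1)):

```python
import cmath

def joukowski(xi):
    # z = (1/2) * (xi + 1/xi), first half of (2.4)
    return 0.5 * (xi + 1 / xi)

def joukowski_inverse(z):
    # xi = z + sqrt(z^2 - 1), second half of (2.4), principal branch
    return z + cmath.sqrt(z * z - 1)
```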
Now, let ±µ be an eigenvalue-pair of the B-matrix. By (2.5), these correspond
to the points z+ and z− in Figure 1 below. There is then an ellipse E|a| such that z+
and z− are two points interior to E|a|. By (2.4), this ellipse is mapped onto the two circles
|ξ| = |a| and |ξ| = 1/|a|
in the ξ-plane, and its interior is mapped onto the annulus |a| < |ξ| < 1/|a| in the ξ-plane.
Figure 1
(|ω − 1|²/|ω|) < |λ − (1/ω)| < (1/|ω|) (2.7)
in the λ-plane.
We may now formulate the main results of this Section.
In what follows, for any subset U of C − {0}, we will denote by U −1 the set
{z ∈ C − {0} : z −1 ∈ U}. Further, if A and B are two subsets of C, then we will
denote by A · B the set {a · b : a ∈ A and b ∈ B}.
the ESOR method converges (under the assumption that all eigenvalues of B belong
to the interior of the ellipse E(√(ω − 1))). In particular, if
It is easy to verify that (2.9) holds for any (a + ib) ∈ σ(Λω ) if and only if y ∈ Sω
and that (2.10) is fulfilled for every (a + ib) ∈ σ(Λω ) if and only if y ∈ Fω and
max_{λ∈σ(Λω)} [Reλ − √(|λ|² − (|λ|²·y + Imλ)²)]/|λ|² < x < min_{λ∈σ(Λω)} [Reλ + √(|λ|² − (|λ|²·y + Imλ)²)]/|λ|².
The assumptions of the above Theorem seem to be very technical, but, on the
other hand, its proof generalizes to the context of the problem of optimum values
(see Theorem 3.4). For instance let us give a direct consequence of this Theorem.
Corollary 2.6. If ω ∈ C is chosen so that Reλ > 0 for any λ ∈ σ(Λω ) and if
τ = x + iy satisfies
0 < y < min_{λ∈σ(Λω)} (|λ| − Imλ)/|λ|² and 0 < x < min_{λ∈σ(Λω)} [Reλ + √(|λ|² − (|λ|²·y + Imλ)²)]/|λ|²,
then (ω, τ ) ∈ Ω.
Let us finally turn to the special cases ω = 0 and ω = 1.
If ω = 0, then (1.2) yields the JOR method:
x(n+1) = Lτ,0 x(n) + τ · c = [I − τ · Λ0 ] · x(n) + τ · c. (2.11)
From (2.2), it follows that if µ ∈ σ(B) then λ = (1 ± µ) ∈ σ(Λ0) and, conversely,
if λ ∈ σ(Λ0) then µ = ±(1 − λ) ∈ σ(B). By (2.1), we therefore have: ρ(Lτ,0) < 1
iff |1 − τ·(1 ± µ)|² < 1 for all µ ∈ σ(B). Since
|1 − τ·(1 ± µ)|² = 1 − 2·Re[τ·(1 ± µ)] + {Re[τ·(1 ± µ)]}² + {Im[τ·(1 ± µ)]}²
= {Re[τ·(1 ± µ)] − 1}² + {Im[τ·(1 ± µ)]}²,
this is equivalent to {Re[τ·(1 ± µ)] − 1}² + {Im[τ·(1 ± µ)]}² < 1 for all µ ∈ σ(B). Hence
Theorem 2.7. If ±1 6∈ σ(B), then a necessary and sufficient condition for the JOR
to converge is the validity of the following inequality
{Re[τ·(1 ± µ)] − 1}² < 1 − {Im[τ·(1 ± µ)]}² for any µ ∈ σ(B).
Corollary 2.8. ([11]) Suppose σ(B) ⊂ (−1, 1). If
0 < τ < 2/(1 ± µ) for any µ ∈ σ(B),
the JOR method converges.
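The criterion of Theorem 2.7 is straightforward to check numerically; the sketch below (the helper name jor_converges is ours) tests it for a given τ and a sample spectrum, and is equivalent to requiring |1 − τ·(1 ± µ)| < 1:

```python
def jor_converges(tau, spectrum):
    """Convergence test for the JOR method via Theorem 2.7.

    spectrum is sigma(B), assumed not to contain +1 or -1; the criterion
    {Re[tau(1 +- mu)] - 1}^2 < 1 - {Im[tau(1 +- mu)]}^2 is equivalent to
    |1 - tau(1 +- mu)| < 1 for every mu.
    """
    for mu in spectrum:
        for lam in (1 + mu, 1 - mu):
            w = complex(tau) * complex(lam)
            if (w.real - 1) ** 2 >= 1 - w.imag ** 2:
                return False
    return True
```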
If ω = 1, then (1.2) gives the EGS method:
x(n+1) = Lτ,1x(n) + τ · (I − L)−1 · c = [I − τ · Λ1 ] · x(n) + τ · (I − L)−1 · c. (2.12)
If µ is any eigenvalue of B, then λ = (1 − µ2 ) ∈ σ(Λ1), because of (2.2). Conversely,
if λ ∈ σ(Λ1), then, by Theorem 2.1, there exists a µ ∈ σ(B) such that λ = 1 − µ2 .
From (2.1), it follows that the inequality ρ(Lτ,1 ) < 1 holds iff |1 − τ · (1 − µ2 )|2 < 1
for all µ ∈ σ(B). Since
|1 − τ·(1 − µ²)|² = 1 − 2·Re[τ·(1 − µ²)] + {Re[τ·(1 − µ²)]}² + {Im[τ·(1 − µ²)]}²
= {Re[τ·(1 − µ²)] − 1}² + {Im[τ·(1 − µ²)]}²,
we immediately establish the
Theorem 2.9. A necessary and sufficient condition for EGS to converge is the
validity of the following inequality
{Re[τ·(1 − µ²)] − 1}² < 1 − {Im[τ·(1 − µ²)]}² for any µ ∈ σ(B).
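The criterion of Theorem 2.9 can likewise be checked numerically (a small sketch; the helper name egs_converges is ours):

```python
def egs_converges(tau, spectrum):
    """Convergence test for the EGS method via Theorem 2.9:
    {Re[tau(1 - mu^2)] - 1}^2 < 1 - {Im[tau(1 - mu^2)]}^2 for every
    mu in sigma(B), equivalently |1 - tau(1 - mu^2)| < 1."""
    for mu in spectrum:
        w = complex(tau) * (1 - complex(mu) ** 2)
        if (w.real - 1) ** 2 >= 1 - w.imag ** 2:
            return False
    return True
```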
is minimized is characterized by the fact that the absolute values of the roots are
equal.
More precisely, we have the following
Theorem 3.1. If there is a t0 ∈ U fulfilling
f(t0) ≤ f(t)
|hα(ua(t0), νa(t0))| < |hα(ua(t), νa(t))| (3.4)
for all t ∈ Vt0. By (3.2), hα(ua(t0), νa(t0)) ≠ 0, and, by (3.4), hα(ua(t), νa(t)) ≠ 0
for any t ∈ Vt0. From the minimum principle for holomorphic functions and from
(3.4), it follows that the function hα(ua(t), νa(t)) is constant in Vt0. By the identity
theorem, the holomorphic function hα(ua(t), νa(t)) must be constant in the open
connected set U. This is an absurdity. Consequently, there exists a point t̃ ∈ Vt0
satisfying
|hα(ua(t̃), νa(t̃))| < |hα(ua(t0), νa(t0))|. (3.5)
By (3.3) and (3.5), we thus obtain the inequalities
|hβ(ub(t̃), νb(t̃))| < |hα(ua(t̃), νa(t̃))| < |hα(ua(t0), νa(t0))|,
which contradict (3.1). Hence, the point t0 ∈ U must be such that |hβ(ub(t0), νb(t0))|
= |hα(ua(t0), νa(t0))| for any α, β = 1, 2, . . . , 2m and b ∈ {(β/2), ((β + 1)/2)}, a ∈
{(α/2), ((α + 1)/2)}. The proof is complete.
We shall now study the problem of determination of optimum values for the
parameters ω and τ . Assume that (ω, τ ) is a fixed point in the convergence domain
Ω for (1.2) (or (1.3)), i.e. ρ(Lτ,ω ) < 1. Further, suppose
ω ∉ {0, 2} ∪ {(1/(1 − µ²))·(−∞, −4] : µ ∈ σ(B) − {±1}}.
(1 − λ)² = µ²·(1 − λ·ω) (3.7)
Thus, if {µ1 , . . . , µm } is the set σ(B) of all eigenvalues of B, then for any j =
1, 2, . . . , m there holds
with
uj(ω, τ) = τ·ω·µj² − 2τ + 2 and νj(ω, τ) = τ·ω·µj² − 2τ + 1 + τ² − τ²·µj²,
and
ζ′j(uj(ω, τ), νj(ω, τ)) = 1 − τ·[2 − ω·µj² + µj·√(ω²·µj² − 4ω + 4)]/2.
Suppose (ω, τ ) is an optimum value for the ESOR method. According to Theorem
3.2, it must hold
|1 − τ·λα(ω)| / |1 − τ·λβ(ω)| = 1 ⇔ |(1/λα(ω)) − τ| / |(1/λβ(ω)) − τ| = |λβ(ω)| / |λα(ω)| (3.10)
and
τ ∈ ⋂_{α,β=1,...,2m} C(Hα,β(ω); Rα,β(ω)).
Notice that the explicit algebraic form of the equation (3.10) for the circle
C (Hα,β (ω); Rα,β (ω)) is
τ1² − 2·τ1·{Re Hα,β(ω)} + τ2² − 2·τ2·{Im Hα,β(ω)} + |Hα,β(ω)|² = Rα,β²(ω) (τ = τ1 + i·τ2). (3.11)
Following Theorem 3.3, the investigation of the optimum values (ω, τ ) for the ESOR
method requires the knowledge of the conditions on ω which guarantee that the
common intersection of all the circles C(Hα,β (ω); Rα,β (ω)) is not empty.
The last two Theorems lead us to suspect that the existence of an optimum
value depends upon how the eigenvalues are located in the µ-plane, and that for a
general distribution such a value may not exist. The next Theorem shows how
the complex parameters involved can give an arbitrarily good convergence behavior
for the ESOR method; its proof is completely analogous to that of Theorem 2.5.
Theorem 3.4. Let ε > 0. For any
ω ∈ C − {(1/(1 − µ²))·(−∞, −4] : µ ∈ σ(B) − {±1}} − {0, 2},
put
λω(ε) := max{|λ|−2·(−Imλ − |λ|·√(ε·(2 − ε))) : λ ∈ σ(Λω)}
and
λ̃ω(ε) := min{|λ|−2·(−Imλ + |λ|·√(ε·(2 − ε))) : λ ∈ σ(Λω)},
and let Fω(ε) := (λω(ε), λ̃ω(ε)) whenever this interval is non-empty.
If
G(ε) := {ω ∈ C − {(1/(1 − µ²))·(−∞, −4] : µ ∈ σ(B) − {±1}} − {0, 2} : Fω(ε) ≠ ∅},
then, for any
(ω, τ) ∈ G(ε) × {x + iy ∈ C : y ∈ Fω(ε), ω ∈ G(ε) and
max_{λ∈σ(Λω)} [Reλ − √(|λ|² − [|λ|²·y + Imλ]² − |λ|²·(1 − ε)²)]/|λ|² < x <
min_{λ∈σ(Λω)} [Reλ + √(|λ|² − [|λ|²·y + Imλ]² − |λ|²·(1 − ε)²)]/|λ|²},
we have
ρ(Lτ,ω ) < ε.
Corollary 3.5. Let 0 < ε < 2. If ω ∈ C is chosen so that either
σ(Λω) ⊂ R+ × R+ and [1 + (Reλ/Imλ)²]·ε·(2 − ε) > 1 (λ ∈ σ(Λω)),
or
σ(Λω) ⊂ R+,
and if τ = x + iy ∈ C satisfies
0 < x < min_{λ∈σ(Λω)} (1/λ) and 0 < y < min_{λ∈σ(Λω)} (√(ε·(2 − ε))/λ),
then
ρ(Lτ,ω) < ε.
and
min_{λ∈σ(Λω)} (1/λ) = min_{µ∈σ(B)} (1/(1 − µ²)).
Letting
µ = min{µ : µ ∈ σ(B) ⊂ (−1, 1)},
we immediately have the following:
Corollary 3.7. Suppose σ(B) ⊂ (−1, 1) and let 0 < ε < 2. If ω = 1 and τ = x + i·y satisfies
0 < x < 1/(1 − µ²) and 0 < y < √(ε·(2 − ε))/(1 − µ²),
then
ρ(Lτ,1) < ε.
Corollary 3.8. Suppose σ(B) ⊂ (−1, 1) and let 0 < ε < 2. If ω = 1 and τ = x + i·y satisfies
0 < x < 1 and 0 < y < √(ε·(2 − ε)),
then
ρ(Lτ,1) < ε.
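As a numerical spot-check (not a proof) of this behavior, one may evaluate ρ(Lτ,1) = max_{µ∈σ(B)} |1 − τ·(1 − µ²)| directly for a sample spectrum and one admissible τ; the helper below is illustrative:

```python
def egs_spectral_radius(tau, spectrum):
    # For omega = 1, sigma(Lambda_1) = {1 - mu^2 : mu in sigma(B)} and the
    # eigenvalues of L_{tau,1} are 1 - tau (1 - mu^2), by (2.1) and Theorem 2.1.
    return max(abs(1 - tau * (1 - mu * mu)) for mu in spectrum)
```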
According to Corollary 3.8 (or 3.7), the complex EGS method may have an
arbitrarily “good” convergence behavior.
The difficulty of verifying in practice the assumptions of Theorem
3.4 and of Corollaries 3.5 and 3.6 forces us to seek another approach to the
problem.
In what follows, we will assume that the B-matrix has a critical eigenvalue-pair
±µ̃. By definition, the critical eigenvalue-pair ±µ̃ is the pair which corresponds to
the eigenvalue of dominant absolute value of the Lτ,ω-matrix whenever (ω, τ) ∈ C².
Under this strong condition we have
min_{(ω,τ)∈C²} ρ(Lτ,ω) = min_{(ω,τ)∈C²} max{|ζ̃|, |ζ̃′|},
where ζ̃ and ζ̃′ are the roots of the quadratic equation
ζ² − (τ·ω·µ̃² − 2τ + 2)·ζ + (τ·ω·µ̃² − 2τ + 1 + τ² − τ²·µ̃²) = 0. (3.12)
By Theorem 3.1, the value (ω0, τ0) of (ω, τ) for which max{|ζ̃|, |ζ̃′|} is minimized is
characterized by the fact that |ζ̃| = |ζ̃′|. Setting
ω0 = (2 ± 2·√(1 − µ̃²))/µ̃² and τ0 = 1/(∓√(1 − µ̃²)),
it is readily seen that (τ0·ω0·µ̃² − 2τ0 + 2) = (τ0·ω0·µ̃² − 2τ0 + 1 + τ0² − τ0²·µ̃²) = 0 and,
therefore, in such a case ζ̃ = ζ̃′ = 0, which implies that min_{(ω,τ)∈C²} ρ(Lτ,ω) = ρ(Lτ0,ω0) = 0.
We have thus proved the following
Theorem 3.9. Assume that the B-matrix has a critical eigenvalue-pair ±µ̃ ≠
0, ±1. The optimum values of (ω, τ) that minimize the spectral radius of the Lτ,ω-
matrix and therefore maximize the asymptotic rate of convergence of the ESOR
method are
ω0 = 2/(1 − √(1 − µ̃²)), τ0 = −1/√(1 − µ̃²)
and
ω0 = 2/(1 + √(1 − µ̃²)), τ0 = 1/√(1 − µ̃²).
As an illustration, consider the complex linear system
2x1 − x2 = i
−x1 + 2x2 − x3 = 0
−x2 + 2x3 = i.
The associated Jacobi matrix has the critical eigenvalue-pair µ̃ = ±√2/2, and Theorem 3.9 gives
ω0 = 2/(1 + √(1 − (±√2/2)²)) = 1.1715728 and τ0 = 1/√(1 − (±√2/2)²) = 1.4142135.
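A direct numerical check of these values is possible by forming Lτ0,ω0 = I − τ0·(I − ω0·L)−1·D−1·A explicitly (a NumPy sketch under the stated data; note that σ(B) = {±√2/2, 0} here, so while the eigenvalue pair stemming from µ̃ = ±√2/2 is annihilated, Lτ0,ω0 retains the eigenvalue 1 − τ0 = 1 − √2 coming from µ = 0):

```python
import numpy as np

# The tridiagonal system of the example: A x = b with b = (i, 0, i)^T
A = np.array([[2, -1, 0], [-1, 2, -1], [0, -1, 2]], dtype=complex)
Dinv = np.diag(1.0 / np.diag(A))
L = Dinv @ (-np.tril(A, -1))              # L = D^{-1} C_L

mu = np.sqrt(2) / 2                       # critical pair of the Jacobi matrix
omega0 = 2 / (1 + np.sqrt(1 - mu**2))     # ~ 1.1715728
tau0 = 1 / np.sqrt(1 - mu**2)             # ~ 1.4142135 = sqrt(2)

I = np.eye(3, dtype=complex)
L_tau_omega = I - tau0 * np.linalg.inv(I - omega0 * L) @ Dinv @ A
eigs = np.sort(np.abs(np.linalg.eigvals(L_tau_omega)))
# Two eigenvalues (from mu = +-sqrt(2)/2) vanish; the third has
# modulus |1 - sqrt(2)| = sqrt(2) - 1 and comes from mu = 0.
```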
References
[1] Avdelas, G. and Hadjidimos, A., Optimum accelerated method in a special case,
Math. Comp. 36 (1981), 183–187.
[2] Hadjidimos, A., Accelerated overrelaxation method, Math. Comp. 32 (1978),
149–157.
[3] Hadjidimos, A., The optimal solution to the problem of complex extrapolation
of a first order scheme, Linear Algebra Appl., 62 (1984), 241–261.
[4] Hadjidimos, A., A survey of the iterative methods for the solution of linear
systems by extrapolation, relaxation and other techniques, J. Comput. Appl.
Math. 20 (1987), 37–51.
[5] Kjellberg, G., On the convergence of successive overrelaxation applied to a class
of linear systems of equations with complex eigen-values, Ericsson Technics 2
(1958), 245–258.
[6] Kredell, B., On complex successive overrelaxation, BIT 2 (1962), 143–152.
[7] Missirlis, N.M., Convergence theory of extrapolated iterative methods for a class
of non-symmetric linear systems, Numer. Math. 45 (1984), 447–458.
[8] Missirlis, N.M. and Evans, D.J., On the convergence of some generalised pre-
conditioned iterative methods, SIAM J. Numer. Anal. 18 (1981), 591–596.
[9] Missirlis, N.M. and Evans, D.J., The extrapolated successive overrelaxation
(ESOR) method for consistently ordered matrices, Intern. J. Math. & Math.
Sci. 7, No 2 (1984), 361–370.
[10] Opfer, G. and Schober, G., Richardson’s iterations for nonsymmetric matrices,
Linear Algebra Appl. 58 (1984), 343–361.
[11] Young, D., Iterative methods for solving partial differential equations of elliptic
type, Trans. Amer. Math. Soc. 76 (1954), 92–111.
[12] Young, D., Iterative solution of large linear systems, Academic Press, New York,
1971.