
The Maximal Lyapunov Exponent of a Time Series

Mark Goldsmith

A Thesis

in

The Department

of

Computer Science

Presented in Partial Fulfillment of the Requirements


for the Degree of Master of Computer Science at
Concordia University
Montreal, Quebec, Canada

December 2009

© Mark Goldsmith, 2009


Abstract

The Maximal Lyapunov Exponent of a Time Series


Mark Goldsmith

Techniques from dynamical systems have been applied to the problem of predicting epileptic seizures
since the early 90’s. In particular, the computation of Lyapunov exponents from a series of electrical
brain activity has been claimed to have great success. We survey the relevant topics from pure
dynamical systems theory and explain how Wolf et al. adapted these ideas to the practical situation
of trying to extract information from a time series. In doing so, we consider instances of time
series where we may visually extract properties of the maximal Lyapunov exponent in an attempt
to cultivate some intuition for more complicated and realistic situations.

Contents

Introduction

1 Discrete Dynamical Systems
1.1 Introduction
1.2 One-dimensional discrete dynamical systems
1.2.1 Examples
1.2.2 Cobwebbing
1.2.3 Stable and attracting fixed points
1.2.4 Periodic trajectories
1.2.5 The Lyapunov exponent of a one-dimensional map
1.2.6 The Lyapunov exponent of a one-dimensional differentiable map
1.2.7 Conjugacy
1.2.8 Computing the global Lyapunov exponent of a map
1.3 Multi-dimensional discrete dynamical systems
1.3.1 Examples
1.3.2 Stability of fixed points
1.3.3 Lyapunov exponents in general
1.3.4 Lyapunov exponents of differentiable maps
1.3.5 Computing global Lyapunov exponents
1.3.6 The spectrum of Lyapunov exponents
1.3.7 Avoiding Oseledets’ Theorem
    Constant, symmetric Jacobians
    Constant Jacobians
    A slight improvement
1.3.8 From a trajectory to its maximal Lyapunov exponent
    How to choose s(t)
    Distance biased operators
    A seminal operator
    The finite case
1.3.9 Trajectory-like sequences
1.3.10 Primitive trajectories

2 Time Series and Lyapunov Exponents
2.1 The maximal Lyapunov exponent of a time series
2.2 The maximal Lyapunov exponent of strictly monotonic time series

Concluding Remarks and Further Work

Bibliography

Appendix A: Lyapunov Exponents and Epilepsy
    Bibliography for predicting epileptic seizures

Appendix B: Linear Algebra
    Norms
    Jordan Canonical Form
    The Spectral Theorem
    Other Lemmas
Introduction

This thesis is first and foremost a primer on dynamical systems and Lyapunov exponents. The
literature is littered with inconsistencies and vague explanations, which will be pointed out and
hopefully clarified throughout this thesis. The goal we set out to achieve is not simply to rigorously
introduce dynamical systems, but to explain how Lyapunov exponents can be used to extract
information from a given time series. The motivation for this stems from trying to understand how
Lyapunov exponents are currently being used in an attempt to predict epileptic seizures (see [19],
[18] and [24]). On the whole, we expand on the notes found in [6], which in turn are a clarification
of the celebrated paper by Wolf et al. ([40]).
Chapter 1 begins with a treatment of dynamical systems as pure mathematical objects. We
introduce typical notions such as fixed points, periodic trajectories and stability, and then proceed
to define the Lyapunov exponent (of a point under a map). We will then explore the counterparts
of these ideas in the multi-dimensional case. Throughout this treatment we will illustrate the topics
at hand with demonstrative examples, some of which have become classical staples in the literature
of dynamical systems. Chapter 1 ends with a rigorous treatment of the algorithm provided in [40]
for estimating the maximal Lyapunov exponent of a trajectory.
In Chapter 2 we explain how [40] deals with the problem of estimating Lyapunov exponents
in a minimal setting, where only a time series is provided. Finally, we present various results
about strictly monotonic time series that are also strictly convex or strictly concave. These results
will demonstrate how to use the definitions and the theory that has been built up along the way.
Furthermore, through the exploration of such simple time series, we hope to build some intuition as
to when the Lyapunov exponent can be easily deduced and therefore understood, so that clarity in
more complex situations may ultimately be achieved.

Chapter 1

Discrete Dynamical Systems

1.1 Introduction.

A discrete-time dynamical system on a set X is just a function Φ : X → X. This function, often


called a map, may describe the deterministic evolution of some physical system: if the system is in state
x at time t, then it will be in state Φ(x) at time t + 1. The study of discrete-time dynamical systems is
concerned with iterates of the map: the sequence

x, Φ(x), Φ2 (x), . . .

is called the trajectory of x and the set of its points is called the orbit of x. These two terms are
often used interchangeably, although we will remain consistent in their usage. The set X in which
the states of the system exist is referred to as the phase space or state space. We will restrict our
attention to maps Φ : X → X such that X is a subset of Rd .

We shall begin by studying one-dimensional maps.

1.2 One-dimensional discrete dynamical systems.

1.2.1 Examples

The following examples of one-dimensional maps will be used throughout this thesis to illustrate
new ideas and techniques as they are introduced.

Example 1.2.1:
The tent map with parameter r (ranging from 0 to 2) is a one-dimensional map Tr : [0, 1] → [0, 1]
defined as follows:

Tr(x) = rx         if 0 ≤ x ≤ 1/2,
Tr(x) = r(1 − x)   if 1/2 ≤ x ≤ 1.

The reason for its name is obvious once we plot it.

Figure 1.1: The tent map with r = 1.5.

The parameter is sometimes locked to r = 2 (for example in [8] and [10]) in order to show its simi-
larities with our next example. 

Example 1.2.2:
The logistic map with parameter µ (ranging from 0 to 4) is a one-dimensional map Gµ : [0, 1] → [0, 1]
defined as

Gµ (x) = µx(1 − x). (1.1)

Figure 1.2: Logistic map with µ = 3.

The logistic map is likely the most celebrated of all one-dimensional maps. It was originally intro-
duced by Verhulst in 1845 ([39]) in continuous form, and later unearthed as a discrete map by the
biologist Robert May in [23], where he introduces many of its classical features. 
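Both maps are easy to experiment with numerically. The following is a minimal Python sketch (ours, not from the thesis; the function and parameter names are our own) defining the two maps and computing a short trajectory:

def tent(x, r):
    """The tent map T_r on [0, 1] with parameter 0 <= r <= 2."""
    return r * x if x <= 0.5 else r * (1 - x)

def logistic(x, mu):
    """The logistic map G_mu on [0, 1] with parameter 0 <= mu <= 4."""
    return mu * x * (1 - x)

def trajectory(f, x0, n, **params):
    """Return the first n points x0, f(x0), f^2(x0), ... of the trajectory of x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(f(xs[-1], **params))
    return xs

print(trajectory(tent, 0.6, 5, r=0.5))        # [0.6, 0.2, 0.1, 0.05, 0.025]
print(trajectory(logistic, 0.2, 5, mu=3.0))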

1.2.2 Cobwebbing.

It is often cumbersome and uninformative to work out the trajectory of a point by hand. Fortunately,
for one-dimensional maps there is a technique known as cobwebbing (or less commonly known as a
Verhulst diagram, see [28]) that allows us to visually work out the trajectory of a point. It works as
follows:
1: plot the map Φ(x) vs. x
2: plot the diagonal line y = x, we will call this line L
3: begin at initial condition x0 , draw a vertical line (either up or down) until the plot for Φ(x) is
met
4: while you feel like iterating do
5: draw a horizontal line left or right until L is met at some point (p, p)
6: (p corresponds to the next point on the trajectory, output it if you wish)
7: draw vertically up or down until Φ(x) is met
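For readers who prefer code to hand drawing, the procedure above can be rendered as follows (a Python sketch of ours; it returns the segments that a cobweb plot would draw rather than drawing them):

def cobweb_segments(f, x0, n):
    """Follow the cobwebbing procedure for n iterations, returning the
    endpoints of each vertical and horizontal segment as ((x, y), (x, y)) pairs."""
    segments = []
    x, y = x0, 0.0                            # start on the x-axis at x0
    for _ in range(n):
        segments.append(((x, y), (x, f(x))))  # vertical line to the plot of f
        y = f(x)
        segments.append(((x, y), (y, y)))     # horizontal line to L, meeting it at (p, p)
        x = y                                 # p is the next point of the trajectory
    return segments

# The tent map with r = 0.5, starting at x0 = 0.6 (as in Example 1.2.3 below).
segs = cobweb_segments(lambda x: 0.5 * x if x <= 0.5 else 0.5 * (1 - x), 0.6, 4)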

Example 1.2.3:
Recall the tent map from Example 1.2.1 with r = 0.5, x = 0.6. We can use cobwebbing to visually
verify the trajectory
0.6, 0.2, 0.1, 0.05, 0.025, . . . .
as follows:

Figure 1.3: Tent map cobwebbing with r = 0.5, x0 = 0.6.

In the above figure, we start the system off at x = 0.6. Following the cobwebbing algorithm, we move
vertically upwards until we hit the map, arriving at the point whose height is T0.5(x). Shifting
horizontally to the line y = x then sets the new x to the old T0.5(x), and we repeat.

1.2.3 Stable and attracting fixed points.

A fixed point of a map Φ : X → X is any point x∗ such that x∗ ∈ X and Φ(x∗ ) = x∗ . Fixed points
x∗ of a map Φ are special in that once a trajectory lands on x∗ , it will stay there forever. Note that
not all maps have fixed points. For example, the map Φ : R → R defined by Φ(x) = x + 1 has no
fixed points.

Example 1.2.4:
Recall the tent map, which was defined as
Tr(x) = rx         if 0 ≤ x ≤ 1/2,
Tr(x) = r(1 − x)   if 1/2 ≤ x ≤ 1.

When r = 0, the only fixed point of T0 is 0 and indeed the map is quite boring. When r = 1, every
point in [0, 1/2] is a fixed point of T1 , and there are no others.

When 0 < r < 1, consider the equation x = r(1 − x), which any fixed point in [1/2, 1] would have to
satisfy. Solving for x yields x = r/(1 + r), and then r < 1 yields x < 1/2, a contradiction. Thus, fixed
points of the system must lie in [0, 1/2). In this case, we note that every fixed point must be a
solution to x = rx, and thus x = 0 is the only fixed point of the system when 0 < r < 1.

Finally, in the case where 1 < r ≤ 2 the map has two fixed points: 0 and r/(r + 1) corresponding to
the solutions of x = rx and x = r(1 − x), respectively. 

Example 1.2.5:
The fixed points of the logistic map, Gµ(x) = µx(1 − x), are trivially 0 when µ = 0. In order to find
the fixed points when 0 < µ ≤ 4, we solve for x in x = µx(1 − x). Thus, the fixed points are the
roots of µx² + (1 − µ)x. The roots are given by

x = (µ − 1 ± (1 − µ)) / (2µ),

which are 0 and (µ − 1)/µ. Note that (µ − 1)/µ ∉ [0, 1] when 0 < µ < 1. Therefore, 0 is a fixed
point of Gµ when 0 < µ ≤ 4, and (µ − 1)/µ is a fixed point of Gµ when 1 ≤ µ ≤ 4.

A trajectory that lands on a fixed point becomes rather boring. However, numerous questions may
be asked about the points close to a fixed point x∗ . Do these nearby points get sucked in to x∗ ? Are
they repelled by it? Is it possible that neither of these two cases occur? These questions are dealt
with by what is loosely known as stability analysis. The analysis depends heavily on the definitions
being used, and unfortunately there does not appear to be much of a convention in place. We will
begin by defining what we mean for a fixed point x∗ to be stable or unstable.

Definition 1.2.1:
A fixed point x∗ of a map F is stable if for all positive ε, there exists a positive δ such that for
all positive integers t, we have that for all points x,

if x ∈ X and |x − x∗ | < δ then |F t (x) − x∗ | < ε.

Definition 1.2.2:
A fixed point x∗ of a map F is unstable if it is not stable.
More explicitly, this means that there exists a positive ε such that for all positive δ there exist
a positive integer t and a point x in X for which

|x − x∗ | < δ and |F t (x) − x∗ | ≥ ε.

Example 1.2.6:
Let us return to the fixed points of the tent map. We have seen in example 1.2.4 that when r = 0,
the only fixed point is 0. We will now show that 0 is stable. Let a positive ε be given, and for
simplicity let us set δ = 1. If x is any point in [0, 1] such that |x − 0| < δ, then for every positive
integer t we have T0t (x) = 0, so that |T0t (x) − 0| < ε. Thus, 0 is a stable fixed point.

Let us move to case where r = 1. The fixed points of T1 are all the points in [0, 1/2], and they are
all stable as well. First we will show that if x∗ is any point in [0, 1/2), then x∗ is stable. To this
end, let any positive ε be given. Set
δ = min{ ε, (1/2 − x∗)/2, x∗/2 }.
2 2
Now every point x in (x∗ − δ, x∗ + δ) remains fixed under T1 . Thus, for every positive integer t, we
have
|T1t (x) − x∗ | = |x − x∗ | < δ ≤ ε
and we have stability. To see that the point x∗ = 1/2 is stable, we let ε be given and set δ = ε.
Then every point in (1/2 − δ, 1/2] is fixed, and every point x such that x ∈ (1/2, 1/2 + δ) is sent to
(1/2 − δ, 1/2) and remains fixed there. Thus, for all positive integers t we have
|T1t (x) − 1/2| < δ = ε
and x∗ is stable.

Figure 1.4: Tent map cobwebbing with r = 1, x0 = 0.9.

Next, suppose r is such that 0 < r < 1. We have seen that 0 is the only fixed point in this case.

Once again, it is stable. Let a positive ε be given and let us set δ = min{ε, 1/2}. Let x be any point
such that |x − 0| < δ. In this case, note that T t (x) < 1/2 for all positive integers t, and therefore
|Trt (x) − 0| = Trt (x) = rt x < x < δ < ε,
for all positive integers t, so 0 is stable.

Finally, when 1 < r ≤ 2 there are two fixed points: 0 and r/(r + 1). Let us show that each of
these points is unstable. We will deal with 0 first. Consider ε = 1/2 and let a positive δ be given.
Consider any point x such that 0 < x < min{1/2, δ}. Since r > 1, there must be some t for which
Tr^t(x) = r^t x > 1/2 = ε.
Lastly, let us show that x∗ = r/(r + 1) is unstable. Let us choose ε = x∗ − 1/2 and let us be given
a positive δ. Take any x in (x∗ , x∗ + min{δ, 1 − x∗ }) and let us denote the distance between x and
x∗ by d. Note that
Tr(x) = Tr(x∗ + d)
      = r(1 − x∗ − d)
      = r(1 − r/(r + 1) − d)
      = (r(r + 1) − r² − rd(r + 1))/(r + 1)
      = (r − rd − r²d)/(r + 1),

and therefore

|Tr(x) − x∗| = |(r − rd − r²d)/(r + 1) − r/(r + 1)| = dr.    (1.2)
Suppose that there exists no integer t for which |Trt (x) − x∗ | > ε. This implies that the trajectory
x, Tr (x), Tr2 (x), . . .
never enters [0, 1/2). Since equation (1.2) holds for any x such that x ∈ [1/2, 1], we may iterate
it to yield |Trt (x) − x∗ | = drt . However, note that r > 1 implies drt → ∞ as t → ∞, which is a
contradiction. We conclude that there must be some positive integer t for which |Trt (x) − x∗ | > ε,
and therefore x∗ is unstable. 

The definition of stability requires that points within a certain neighborhood of a fixed point x∗
cannot drift too far away from x∗ . Our next notion is that of an attracting fixed point.

Definition 1.2.3:
A fixed point x∗ of a map F is attracting if there exists a positive δ such that for all points x,

if x ∈ X and |x − x∗| < δ, then lim_{t→∞} F^t(x) = x∗.

Definition 1.2.4:
A fixed point x∗ of a map F is non-attracting if it is not attracting.
Let us be explicit and note that this means that for all positive δ there exists a positive ε such
that for all positive integers t0 there exist an integer t and a point x in X such that t > t0 and

|x − x∗ | < δ and |F t (x) − x∗ | ≥ ε.

Example 1.2.7:
Let us return to the tent map. When r = 0, the only fixed point is 0 and we will show that it is
attracting. Setting δ = 1 and taking any point x such that |x − 0| < δ, we have T0t (x) = 0 for all
positive integers t, and therefore limt→∞ T0t (x) = 0.

Next, we will consider r such that 0 < r < 1 and show that 0, which is the only fixed point, is
attracting. Choose δ = 1/2. Given any x such that |x − 0| < δ, we have Trt (x) = rt x and since
0 < r < 1 we have limt→∞ Trt (x) = 0. So 0 is attracting.

When r = 1, we have seen that the fixed points are all the points in [0, 1/2]. Let us pick any
fixed point x∗ in [0, 1/2] and show that it is non-attracting. Let a positive δ be given. If x∗ ≠ 1/2,
set

ε = min{ (1/2 − x∗)/3, δ/3 }

and choose x = x∗ + 2ε. If x∗ = 1/2, set

ε = min{ x∗/3, δ/3 }

and choose x = x∗ − 2ε. Let a positive integer t0 be given and set t = t0 + 1. Then |T1^t(x) − x∗| =
|x − x∗| > ε. Therefore every fixed point x∗ in [0, 1/2] is non-attracting. Note, however, that they are all
stable as we have seen in example 1.2.6.

When 1 < r ≤ 2 we have two fixed points to deal with: 0 and r/(1 + r). We will show that
both of these points are non-attracting. Let us deal with 0 first. Let a positive δ be given and
choose ε = 1/2. Let t0 be given and let us choose
x = min{ δ, 1/(2r^{t0}) }.
Note that Tr^t(x) ≤ 1/2 for all positive integers t such that t ≤ t0. Since the only preimages of 0 under
Tr are 0 and 1, we can note for later that Tr^{t0}(x) ≠ 0. Now suppose that |Tr^t(x) − 0| < ε = 1/2 for all positive integers t such that
t > t0 . Then by how the tent map is defined, we know that

Tr^t(x) = r^{t−t0} Tr^{t0}(x)

for all integers t such that t > t0. Since r > 1 and Tr^{t0}(x) ≠ 0, we have Tr^t(x) → ∞ as t → ∞ and
this is a contradiction. Therefore, there must be some t for which t > t0 and |Tr^t(x) − 0| ≥ ε.

Finally, let us show that the fixed point r/(1 + r) is non-attracting. We will repeat the proof
from example 1.2.6 with some slight modifications. Let us be given a positive δ. Let us choose
ε = x∗ − 1/2. Let a positive integer t0 be given. Let us choose a point x such that

x = x∗ + min{ δ, (x∗ − 1/2)/r^{t0} }.

Let us denote the distance between x and x∗ by d. Note that d = |x − x∗| ≤ (x∗ − 1/2)/r^{t0}. We
have

Tr(x) = Tr(x∗ + d)
      = r(1 − x∗ − d)
      = r(1 − r/(r + 1) − d)
      = (r(r + 1) − r² − rd(r + 1))/(r + 1)
      = (r − rd − r²d)/(r + 1),

and therefore

|Tr(x) − x∗| = |(r − rd − r²d)/(r + 1) − r/(r + 1)| = dr.    (1.3)
Note that equation (1.3) holds for every x ∈ [1/2, 1].

We claim that Tr^t(x) ≥ 1/2 for all positive integers t such that t ≤ t0. Suppose this is not the
case. Let t′ be the smallest positive integer t such that Tr^t(x) < 1/2. Then Tr^t(x) ≥ 1/2 for all
positive integers t such that t < t′, and we may iterate equation (1.3) to get

|Tr^{t′}(x) − x∗| = d r^{t′} ≤ ((x∗ − 1/2)/r^{t0}) · r^{t′} ≤ x∗ − 1/2,
which is a contradiction.

Now suppose that there does not exist a positive integer t for which |Trt (x) − x∗ | > ε. This implies
that the trajectory
x, Tr (x), Tr2 (x), . . .
never enters [0, 1/2). Again, since equation (1.3) holds for every x ∈ [1/2, 1], we may iterate it to
yield |Tr^t(x) − x∗| = d r^t. Since r > 1 implies d r^t → ∞ as t → ∞, we conclude that there must be
some integer t (which is greater than t0) for which |Tr^t(x) − x∗| > ε, and therefore x∗ is non-attracting.

Example 1.2.8:
The case where r = 1 in the preceding example showed us that it is possible to have a fixed point
that is stable and non-attracting. Is it possible to have a fixed point that is attracting and unstable?
Let us try to construct one. For simplicity we will try to construct a map with a fixed point at 0 that
is unstable and attracting. Bearing in mind what the definitions mean, we need a map that
forces every point near 0 away, and eventually maps it back to 0. Consider the map M : R → R
defined by

M(x) = 2x + 4    if −2 ≤ x ≤ −1,
       −2x       if −1 ≤ x ≤ 0,
       2x        if 0 ≤ x ≤ 1,
       4 − 2x    if 1 ≤ x ≤ 2,
       0         otherwise.

Figure 1.5: M(x) vs. x.

Indeed 0 is an unstable fixed point of M, as we will now show. Let us choose ε = 1 and let a positive
δ be given. Let us take t to be any integer such that t > 1 − lg δ. Let us also choose x = min{δ/2, 1}.
Note that x ∈ [0, 1], so that we have M^t(x) = 2^t x > 2^{1−lg δ}(δ/2) = 1. Therefore 0 is unstable.

Thus we have a map that sends points near 0 away from it. Must they come back to 0? Un-
fortunately, if we take any point in [−2, 2], its trajectory will forever remain in that interval. Let us

consider only the positive domain of the map and note that 2x maps [0, 1] onto [0, 2], and 4 − 2x
maps [1, 2] onto [0, 2].

We may try to remedy this situation by increasing the slope of the lines in [−2, −1] and [1, 2]:

M(x) = 20x + 22    if −22/20 ≤ x ≤ −1,
       −2x         if −1 ≤ x ≤ 0,
       2x          if 0 ≤ x ≤ 1,
       22 − 20x    if 1 ≤ x ≤ 22/20,
       0           otherwise.

Figure 1.6: M(x) cobweb.

Points near 0 may now leave [−2, 2] and return. However, this is still not good enough, since we
need this to hold for all points. Consider, say, the point 11/84.

Figure 1.7: Unfortunate fixed point.

Its trajectory is

11/84, 11/42, 11/21, 22/21, 22/21, 22/21, . . . .
There is an unfortunate fixed point at 22/21 on the line 22 − 20x. This squashes the possibility of
0 being attracting, since we need all points near it to get sucked back in.

We may try to remedy this situation by creating a discontinuity at 22/21 as follows:

Let the map M : R → R be defined by



M(x) = 20x + 22    if −22/20 ≤ x ≤ −1,
       −2x         if −1 ≤ x ≤ 0,
       2x          if 0 ≤ x ≤ 1,
       22 − 20x    if x ∈ [1, 22/20] \ {22/21},
       0           otherwise.

Our proof that 0 is an unstable fixed point still holds, and the problematic trajectory of the point
11/84 is now
11/84, 11/42, 11/21, 0, 0, 0, . . . .

Unfortunately we are not off the hook yet. Consider, say, the point 11/82.

Figure 1.8: Unfortunate periodic points.

Its trajectory is

11/82, 11/41, 22/41, 44/41, 22/41, 44/41, 22/41, 44/41, . . .
and thus we have a point that leaves the neighborhood of 0 and never gets mapped back to 0.

At this point, rather than trying to patch each hole as it becomes apparent, let us simplify our
map to



M(x) = 2x     if 0 ≤ x ≤ 1,
       −2x    if −1 ≤ x ≤ 0,
       0      otherwise.

Figure 1.9: Problem points removed.

Our fixed point 0 is still unstable and we are left to show that it is attracting. Let us set δ = 1 and
let x be any point in (0, 1).

If 1/2 ≤ x < 1 then M t (x) = 0 for all integers t such that t > 1, which allows us to conclude
that limt→∞ M t (x) = 0.

If 0 < x < 1/2, then we claim that 1/2 ≤ M t (x) < 1 for some positive integer t. Supposing
this is not the case, we note that M t (x) = 2t x and this yields a contradiction since 2t x → ∞ as
t → ∞. Thus there exists some positive integer t for which 1/2 ≤ M t (x) < 1 and limt→∞ M t (x) = 0
by the preceding paragraph. 

Although our attempts at constructing a continuous map with an unstable attracting fixed point
were fruitless, we have seen that such points may exist for discontinuous maps. In fact, [5] contains
a proof of the following Theorem, which we will not prove here.

Theorem 1.2.1:
Let x∗ be an attracting fixed point of a continuous map f : I → R, where I is an interval. Then x∗
is stable.
Proof:
See [5] or [10]. 

We are now ready to state a theorem that allows us to easily deduce the stability of a fixed point,
in most cases.

Theorem 1.2.2:
Let F be a one-dimensional differentiable map (F : X → X and X is an interval of R) and let x∗
be a fixed point of F such that F′ is continuous at x∗. We have the following:

1. If |F′(x∗)| < 1, then x∗ is stable and attracting.

2. If |F′(x∗)| > 1, then x∗ is unstable.

Proof (of 1.):


Let a positive ε be given and let us set

ε′ = min{ ε, (1 − |F′(x∗)|)/2 },

so that |F′(x∗)| + ε′ < 1.

Since F′ is continuous at x∗, we know that there is a positive δ such that

F′(x∗) − ε′ < F′(y) < F′(x∗) + ε′    (1.4)

for all y such that

x∗ − δ < y < x∗ + δ.    (1.5)

Now consider any trajectory starting at x, where x is such that 0 < |x − x∗| < min{δ, ε′}.

By the Mean Value Theorem, there exists a z between x and x∗ such that

|F(x) − x∗| = |F(x) − F(x∗)| = |F′(z)||x − x∗|.    (1.6)

Since z is between x and x∗, and x is less than δ away from x∗, we know that z is less than δ
away from x∗. Thus, (1.5) holds with z in place of y and we can apply (1.4) to get that

−1 < F′(x∗) − ε′ < F′(z) < F′(x∗) + ε′ < 1,

so that |F′(z)| < M < 1, where M = |F′(x∗)| + ε′. Thus, we have

|F(x) − x∗| < M|x − x∗|.

In particular, |F(x) − x∗| < |x − x∗|, and we may iterate equation (1.6) (applying the Mean Value
Theorem to F(x) and x∗ instead of x and x∗) to get

|F²(x) − x∗| < M²|x − x∗|.

Proceeding in this fashion allows us to conclude that for every positive integer t, we have

|F^t(x) − x∗| < M^t|x − x∗|.
Since M < 1 we can conclude that |F t (x) − x∗ | < |x − x∗ | < ε for all positive integers t, so that x∗
is stable. Furthermore, since M < 1 we may conclude that |F t (x) − x∗ | → 0 as t → ∞. Thus, x∗ is
attracting. 

Proof (of 2.):


Suppose that |F′(x∗)| > 1. Let us set εc = (|F′(x∗)| − 1)/2. Since F′ is continuous at x∗, there is a
positive δc such that F′(x∗) − εc < F′(y) < F′(x∗) + εc for all y such that x∗ − δc < y < x∗ + δc;
in particular, |F′(y)| > |F′(x∗)| − εc for all such y. With this in mind, we will set ε = δc and show
that for all positive δ, there is a point x and a positive integer t such that 0 < |x − x∗| < δ and
|F^t(x) − x∗| ≥ ε.

To this end, let a positive δ be given. Without loss of generality, we may assume that δ < δc.
Let x be any point such that 0 < |x − x∗| < δ. By the Mean Value Theorem, there is a point z
between x and x∗ such that

F(x) − F(x∗) = F′(z)(x − x∗).

Setting M = |F′(x∗)| − εc, we have

|F′(z)| > |F′(x∗)| − εc = M > 1

by the above continuity argument. We now have

|F(x) − F(x∗)| > M|x − x∗|.    (1.7)

If |F(x) − F(x∗)| ≥ ε then we are done. If not, then we may iterate (1.7) with F(x) in place of x
(we are allowed to do this since in this case we must have |F(x) − F(x∗)| < ε = δc). Proceeding in
this manner we get

|F^t(x) − F^t(x∗)| ≥ M^t|x − x∗|,

and therefore there must be some t such that |F^t(x) − x∗| ≥ ε, since M > 1.

Note that Theorem 1.2.2 does not tell us how to deal with a fixed point x∗ of a map F for which
|F 0 (x∗ )| = 1. Such a fixed point is called non-hyperbolic. The stability of non-hyperbolic fixed
points can be studied by looking at higher derivatives (and Schwarzian1 derivatives in particular).
We will not explore such techniques here. See [10] for a detailed account.

Example 1.2.9:
Let us apply Theorem 1.2.2 to the fixed points of the tent map, which we have already studied. Note
that |Tr′(x)| = r for every fixed point x∗ (and indeed every point x in [0, 1] other than 1/2, where
the derivative does not exist), so that fixed points are stable and attracting when r < 1 and unstable
when r > 1. This confirms our previous remarks.
¹The Schwarzian derivative of a function f is f′′′(x)/f′(x) − (3/2)(f′′(x)/f′(x))².

Example 1.2.10:
Recall that the fixed points of the logistic map, Gµ (x) = µx(1 − x), are 0 when 0 ≤ µ ≤ 4 and
(µ − 1)/µ when 1 ≤ µ ≤ 4. The derivative of the map at a point x is µ − 2µx, so that 0 is stable
and attracting for 0 ≤ µ < 1 and unstable for 1 < µ ≤ 4. Furthermore, (µ − 1)/µ is stable and
attracting for 1 < µ < 3 and unstable when 3 < µ ≤ 4 . We will study the case 3 < µ ≤ 4 in more
detail in the next section. 
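Theorem 1.2.2 is easy to apply mechanically. The following Python sketch (ours, not from the thesis; the derivative Gµ′(x) = µ − 2µx is hard-coded) classifies the fixed points of the logistic map for a few values of µ:

def classify(derivative_at_fixed_point):
    """Apply Theorem 1.2.2 to |F'(x*)|; |F'(x*)| = 1 is the inconclusive case."""
    m = abs(derivative_at_fixed_point)
    if m < 1:
        return "stable and attracting"
    if m > 1:
        return "unstable"
    return "non-hyperbolic (the theorem does not apply)"

for mu in (0.5, 2.0, 3.5):
    # G_mu'(x) = mu - 2*mu*x, evaluated at the fixed points 0 and (mu - 1)/mu.
    print(mu, "at 0:", classify(mu))
    if mu >= 1:
        xs = (mu - 1) / mu
        print(mu, "at (mu-1)/mu:", classify(mu - 2 * mu * xs))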

Now that we have an understanding of what the definitions in this section mean, here are some com-
ments on some of the variations that may be found in the literature. Our definitions are borrowed
from Elaydi (reference [10]), although he adds the term asymptotically stable to refer to fixed points
that are both stable and attracting. Our definitions are also similar to those found in [9].

In reference [1], Alligood et al. define an attracting fixed point (or sink) in the same way we did, and
use source or repelling fixed point to refer to our unstable fixed points.

The text by Devaney ([8]) states that a fixed point x∗ of a map f for which |f 0 (x∗ )| < 1 is an
attracting point or a sink. Thus, his attracting fixed point (or sink) is a stable and attracting fixed
point for us. Furthermore, he defines a repellor or source to be a fixed point for which |f 0 (x∗ )| > 1.
This corresponds to our unstable fixed points.

Similarly, Lynch ([22]) calls a fixed point stable if |f 0 (x∗ )| < 1, and an unstable fixed point is
one for which |f 0 (x∗ )| > 1. This corresponds to our stable and attracting fixed points and unstable
fixed points, respectively.

Schuster ([34]) calls our attracting fixed point a locally stable fixed point, and uses unstable to
refer to points that we call non-attracting.

Scheinerman ([33]) calls our stable and attracting fixed points stable fixed points, and uses marginally
stable to denote fixed points that, in our language, are both stable and non-attracting. His unstable
fixed points are those that are neither stable nor marginally stable, which makes his notion equivalent
to our notion of an unstable fixed point.

As we can see, the literature is riddled with inconsistencies. Stability and attraction are two separate
notions, and we choose to keep them as such.
Similar proofs of Theorem 1.2.2 can be found in most textbooks on dynamical systems, for example
[1], [10] and [33].

1.2.4 Periodic trajectories.

Let us now consider trajectories with more than one repeating point.

Definition 1.2.5:
A trajectory
x, Φ(x), Φ2 (x), . . .
under a map Φ : X → X is eventually periodic if there exist a nonnegative integer k and a
positive integer p such that for all integers t such that t ≥ k

Φp+t (x) = Φt (x).

The smallest positive integer p for which this holds is the period of the trajectory, and the set
of points
{xk , Φ(xk ), Φ2 (xk ), . . . , Φp−1 (xk )},
where xk = Φk (x), is called the periodic orbit of the trajectory.

Note that our fixed points from the previous section are simply periodic orbits with period 1. Let
us also note here the allowance for k to be nonzero in the definition. It would be tempting to simply
require that Φp (xi ) = xi for some positive integer p and all nonnegative integers i. However, this
would exclude trajectories such as
0, 1, 2, 3, 2, 3, 2, 3, . . .
from being eventually periodic, which is not what we want. Thus, letting k be positive gives
us the eventually part of the definition. Such trajectories are possible since the maps under our
consideration need not be one-to-one.
We should note that
2, 3, 2, 3, 2, 3, . . .
is eventually periodic as well.

What would it mean for us to ask if a periodic orbit is stable or attracting? Let

{x0 , Φ(x0 ), Φ2 (x0 ), . . . , Φp−1 (x0 )}

be the periodic orbit in question. We want to know what happens to a nearby point after p iterates
of the map Φ. Thus, in order to determine if the periodic orbit is stable, attracting, unstable or
non-attracting, we can view Φp as a new map Ψ and ask ourselves about x0 as a fixed point under
Ψ. Let us write xi = Φi (x0 ) for every nonnegative integer i. If Φ is a differentiable map, then by
the chain rule we have
Ψ′(x0) = Φ′(Φ^{p−1}(x0)) × (Φ^{p−1})′(x0)
       = Φ′(x_{p−1}) × Φ′(Φ^{p−2}(x0)) × (Φ^{p−2})′(x0)
       = · · ·
       = Φ′(x_{p−1}) × Φ′(x_{p−2}) × · · · × Φ′(x0),

so that x0 is stable and attracting (under Ψ) if ∏_{i=0}^{p−1} |Φ′(xi)| < 1, by Theorem 1.2.2. Thus,

P = {x0, Φ(x0), Φ²(x0), . . . , Φ^{p−1}(x0)}

is a stable and attracting periodic orbit of Φ if ∏_{i=0}^{p−1} |Φ′(xi)| < 1. If ∏_{i=0}^{p−1} |Φ′(xi)| > 1, then P is
an unstable periodic orbit.
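In code, deciding the stability of a periodic orbit therefore reduces to a single product of local derivatives, as in this Python sketch (ours), applied to the period 2 orbit {2/5, 4/5} of T2 that appears in Example 1.2.11 below:

def orbit_multiplier(dphi, orbit):
    """Product of |Phi'(x_i)| over one period: the orbit is stable and
    attracting if this is < 1, and unstable if it is > 1."""
    product = 1.0
    for x in orbit:
        product *= abs(dphi(x))
    return product

# |T_2'(x)| = 2 away from x = 1/2, so the orbit {2/5, 4/5} has multiplier
# 2 * 2 = 4 > 1 and is therefore unstable.
print(orbit_multiplier(lambda x: 2.0 if x < 0.5 else -2.0, [2 / 5, 4 / 5]))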

Definition 1.2.6:
A trajectory
x, Φ(x), Φ2 (x), . . .
under a map Φ : X → X is called asymptotically periodic if there exists a positive integer s and
an eventually periodic trajectory
y0 , y1 , y2 , . . .
of points in X with periodic orbit

{yt , yt+1 , . . . , yt+p−1 }

such that
lim |Φk (xs ) − yt+k | = 0.
k→∞

In words: an asymptotically periodic trajectory gets arbitrarily close to an eventually periodic


trajectory. Note that an eventually periodic trajectory is also asymptotically periodic.

Example 1.2.11:
The tent map with r = 2 has many eventually periodic orbits. For example,

1/10, 1/5, 2/5, 4/5, 2/5, 4/5, . . .

is eventually periodic. The periodic orbit is unstable since

|T2′(2/5)| × |T2′(4/5)| = 2 × 2 = 4 > 1.

We may also note that when r = 1/2, the trajectory

0.25, 0.1250, 0.0625, 0.0313, . . .

is asymptotically periodic (with period 1) since it approaches 0 but never actually lands on it. 

Example 1.2.12:
We have seen that the logistic map, Gµ (x) = µx(1 − x), has two fixed points: 0 and (µ − 1)/µ.
Furthermore, we know from example 1.2.10 that 0 is asymptotically stable for 0 ≤ µ < 1, and that
(µ − 1)/µ is asymptotically stable for 1 < µ < 3. Once µ becomes slightly larger than 3, the fixed
points are both unstable since
Gµ′(0) = µ > 3 > 1

and

Gµ′((µ − 1)/µ) = µ − 2µ(µ − 1)/µ = 2 − µ < −1.

Where do trajectories go? When µ = 3, a periodic trajectory of period 2 that is stable and attracting
is born. In order to study the period 2 trajectories, we need to consider the second iterate of the
map, which is
Gµ²(x) = −µ²x(x − 1)(1 − µx + µx²).

In order to find the periodic trajectory (of period 2) of Gµ, we must find the fixed points of Gµ², and
to do that we need to find the roots of −µ²x(x − 1)(1 − µx + µx²) − x. This can be rewritten as

−x(µx + 1 − µ)(µ²x² − (µ² + µ)x + µ + 1),

which immediately yields two of the four roots, namely 0 and (µ − 1)/µ, which are simply the fixed
points of Gµ. We are left to consider the roots of µ²x² − (µ² + µ)x + µ + 1, which are

x₂⁺ = ( µ/2 + 1/2 + (1/2)√(µ² − 2µ − 3) ) / µ,

x₂⁻ = ( µ/2 + 1/2 − (1/2)√(µ² − 2µ − 3) ) / µ.
Note that these numbers are complex if µ < 3, which confirms that the period 2 trajectory only
comes into play once µ = 3. To check stability, we first note that the derivative of Gµ² is
−4µ³x³ + 6µ³x² − (2µ³ + 2µ²)x + µ², which yields

(Gµ²)′(x₂⁺) = −µ² + 2µ + 4,
(Gµ²)′(x₂⁻) = −µ² + 2µ + 4.

At µ = 3, we have (Gµ²)′(x₂⁺) = (Gµ²)′(x₂⁻) = 1. A small increase in µ causes this number to drop below
1, making it a stable and attracting orbit, until µ is large enough so that (Gµ²)′(x₂⁺) = (Gµ²)′(x₂⁻) = −1.
Without going into the details of the analysis (which are similar to the preceding case, see [23] or
[37] for details), the value of µ at which the period 2 trajectory becomes unstable is µ = 1 + √6.

Figure 1.10: Gµ² with µ = 3.

Figure 1.11: Gµ² with µ = 3.2.


What happens for µ > 1 + √6? The story seems to repeat itself. Looking back to when µ = 3, the
stable and attracting fixed point became unstable and gave rise to a period 2 trajectory that has a
stable and attracting orbit. Similarly, at µ = 1 + √6, the period 2 orbit becomes unstable, with each
of the two points giving rise to two new points, creating a period 4 trajectory, which has a stable
and attracting orbit.

This process continues. As µ increases, the system gives rise to a stable and attracting periodic
orbit of period 2^k at some critical value µk. This goes on until a critical value µ∞ is reached
where Gµ∞ has no orbits that are stable and attracting.
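The cascade of critical values is easy to observe numerically. The following Python sketch (ours; the transient length and rounding precision are arbitrary choices) iterates Gµ past a transient and prints the distinct values the trajectory settles onto:

def logistic(x, mu):
    return mu * x * (1 - x)

def settled_orbit(mu, x0=0.2, transient=10_000, sample=64, digits=6):
    """Iterate G_mu past a transient, then collect the rounded values visited."""
    x = x0
    for _ in range(transient):
        x = logistic(x, mu)
    seen = set()
    for _ in range(sample):
        x = logistic(x, mu)
        seen.add(round(x, digits))
    return sorted(seen)

for mu in (2.8, 3.2, 3.5):    # expect 1, 2 and 4 distinct values, respectively
    print(mu, settled_orbit(mu))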

The previous example is merely a small sample of the elegant properties of the logistic map. As we
have mentioned in the example, May goes into further details in [23]. The definitions in this section
are modified from the notions found in [1] and [3], but are common enough to be found in most
texts.

1.2.5 The Lyapunov exponent of a one-dimensional map.

Let us return to the general setting where Φ : X → X is any map (and not necessarily differentiable),
and X ⊆ R. For every point x in the interior of X, the local Lyapunov exponent of Φ at x is defined

as the limit (if it exists)

lim_{δ→0} ln( |Φ(x + δ) − Φ(x)| / δ );

the global Lyapunov exponent of Φ at x is defined as the limit (if it exists)

lim_{t→∞} (1/t) lim_{δ→0} ln( |Φ^t(x + δ) − Φ^t(x)| / δ ).

If the global Lyapunov exponent of Φ at x equals λ then, for all sufficiently large values of t and all
sufficiently small (with respect to the value of t) values of δ, we have

|Φ^t(x + δ) − Φ^t(x)| / δ ≈ e^{λt}.
For this reason, emphasis is usually placed on whether the Lyapunov exponent is positive or negative.

1.2.6 The Lyapunov exponent of a one-dimensional differentiable map.

Let F : X → X be a differentiable map and let x be any point in X, where X ⊆ R. Since

F(x + δ) − F(x) = F′(x)δ + o(|δ|) as δ → 0,

the local Lyapunov exponent of a differentiable map Φ at a point x equals

ln |Φ′(x)|,

and the global Lyapunov exponent of Φ at x, which we will denote by λ(x), equals

lim_{n→∞} (1/n) ln |(Φ^n)′(x)|,    (1.8)

if the limit exists. Let us note that

(1/n) ln |(Φ^n)′(x)| = (1/n) ln |Φ′(x) Φ′(Φ(x)) Φ′(Φ²(x)) · · · Φ′(Φ^{n−1}(x))|
                     = (1/n) Σ_{t=0}^{n−1} ln |Φ′(Φ^t(x))|.

Therefore,

λ(x) = lim_{n→∞} (1/n) Σ_{t=0}^{n−1} ln |Φ′(Φ^t(x))|,    (1.9)

if the limit exists. This means that the global Lyapunov exponent of Φ at the point x is the average
of the local Lyapunov exponents of Φ at the points x, Φ(x), Φ²(x), . . . of the trajectory of x.
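Equation (1.9) translates directly into code. The Python sketch below (ours, not from the thesis) averages ln |Φ′| along a trajectory; for the tent map the estimate should be approximately ln r, as Example 1.2.13 below shows analytically:

import math

def lyapunov_exponent(phi, dphi, x0, n=100_000):
    """Estimate lambda(x0) via equation (1.9): the average of the local
    Lyapunov exponents ln|Phi'(x_t)| along the trajectory of x0."""
    x, total = x0, 0.0
    for _ in range(n):
        total += math.log(abs(dphi(x)))
        x = phi(x)
    return total / n

r = 1.9
tent = lambda x: r * x if x <= 0.5 else r * (1 - x)
dtent = lambda x: r if x < 0.5 else -r
print(lyapunov_exponent(tent, dtent, 0.3), math.log(r))   # both approximately 0.642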

Note that the Lyapunov exponent of the trajectory x0 , x1 , x2 , . . . does not exist if there is some
xn such that Φ′(xn) = 0. We may also note that the limit in (1.9) is guaranteed to exist under
certain mild restrictions by the Birkhoff Ergodic Theorem (see [4] and [7]), which we will not cover

here. For future reference, the Birkhoff Ergodic Theorem is the one-dimensional version of a more
general theorem known as Oseledets’ Multiplicative Ergodic Theorem, which we will mention when
we arrive at Section 1.3.6.

Example 1.2.13:
For the tent map, |Tr′(x)| = r everywhere except when x = 1/2, where the derivative does not
exist. Let us consider a point x such that the trajectory x, Tr (x), Tr2 (x), . . . never hits 1/2 (letting
x be an irrational number on the unit interval when r is rational is one such way to ensure this).
In this case, equation (1.9) tells us that the Lyapunov exponent of the trajectory is simply ln r, so
that the Lyapunov exponent is positive when ln(r) > 0, or simply when r > 1.

Figure 1.12: Tent map cobwebbing with r = 1.9, x0 = 0.3.

Figure 1.13: Separation of initially nearby points when r = 1.8, x0 = 0.83, 0.85.

These two figures show the effect of a positive Lyapunov exponent. In particular, the second figure
displays what is commonly referred to as sensitive dependence on initial conditions or the butterfly
effect: “Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?”. This quote is
often misattributed to Edward Lorenz who initially had a similar quote in [21]: “one flap of a sea
gull’s wings would be enough to alter the course of the weather forever.” In fact, a similar quote can
be found much earlier involving grasshoppers. See [14] for a more detailed account of the history of
the butterfly effect. 

Note that in general the calculation of the Lyapunov exponent of the trajectory of a map is not so
simple. The tent map provided for a particularly simple calculation since the derivative of the map
is the same at every point in the phase space (with the exception of 1/2).

Example 1.2.14:
It would be wrong to think that fixed points of a map must have a Lyapunov exponent equal to 0.
This is not the case. Consider the tent map T2 (x) and the fixed point 2/3. The (fixed) trajectory
2/3, 2/3, 2/3, . . . has a Lyapunov exponent of ln 2. Similarly, a periodic trajectory need not have a
Lyapunov exponent equal to 0. The periodic trajectory 2/5, 4/5, 2/5, 4/5, . . . of the tent map T2 (x)
has a Lyapunov exponent of ln 2. We must not be fooled by the fact that nearby points on that
same periodic trajectory will not be separating or contracting. Nearby points not on the trajectory
must be taken into consideration. 

1.2.7 Conjugacy.

Definition 1.2.7:
A function f : A → B is a homeomorphism if it is continuous, bijective, and its inverse is
continuous.

We may now introduce the notion of conjugacy, which will allow us to relate two dynamical systems
to each other.

Definition 1.2.8:
Let A ⊆ R and B ⊆ R, and let α : A → A and β : B → B be two maps. We say that α and β are
conjugate if there exists a homeomorphism f : B → A such that α(f(x)) = f(β(x)) for all x in B.
We call f a conjugation between α and β and write α ∼f β.

Theorem 1.2.3:
The conjugacy relation is an equivalence relation between maps. Namely, if α : A → A and β : B →
B and γ : C → C are maps then

1. α ∼ α,
2. α ∼ β ⇒ β ∼ α,
3. α ∼ β and β ∼ γ ⇒ α ∼ γ.
Proof:
There are three things to prove:

1. We may take the homeomorphism f : A → A defined by f (x) = x. Then f (α(x)) = α(f (x)).
2. Let f : B → A be a homeomorphism such that α(f (x)) = f (β(x)) for all x such that x ∈
B. Let g : A → B be defined by g(x) = f −1 (x), which is a homeomorphism since f is a
homeomorphism. Then g(α(f (x))) = g(f (β(x))) = β(x) for all x ∈ B. In particular, for every
y such that y ∈ A there is an x such that x ∈ B and g(y) = x, and therefore β(g(y)) = g(α(y)).
3. Let f : B → A and g : C → B be homeomorphisms such that α(f (x)) = f (β(x)) and
β(g(y)) = g(γ(y)), for all x such that x ∈ B and all y such that y ∈ C. For every y such that
y ∈ C, write y = g(x). We have
α(f (g(x))) = f (β(g(x))) = f (g(γ(x)))
so that g ◦ f : C → A (we write g ◦ f to denote the function f (g(x))) is a homeomorphism
g◦f
such that α ∼ γ. 

Example 1.2.15:
Let α : A → A be the identity map, defined by α(x) = x for all x in A. Then by the previous
Theorem, we know that α is conjugate to itself. Suppose α ∼f β for some map β : B → B and some
homeomorphism f : B → A. Then for all x in B we must have

f (β(x)) = α(f (x)) = f (x).

Since f is one-to-one, this means that β(x) = x for all x in B. Thus, the only map that is conjugate
to an identity map is an identity map. 

Example 1.2.16:
Let α : R+ → R+ and β : R+ → R+ be defined by α(x) = ax and β(x) = bx, where a > 0 and
b > 0. Since we have just seen that identity maps can only be conjugate to other identity maps (or
themselves), let us also assume that a ≠ 1 and b ≠ 1. We will show that α ∼f β, where

f(x) = x^{log_b a}.

We have

f(β(x)) = f(bx) = (bx)^{log_b a} = a · x^{log_b a},

and

α(f(x)) = a · x^{log_b a}.

We will now work towards relating the Lyapunov exponents of two conjugate maps.

Lemma 1.2.1:
If α : A → A and β : B → B are one-dimensional maps and f : B → A is a homeomorphism such
that α ∼f β, then for all positive integers k we have α^k ∼f β^k.
Proof:
Let f : B → A be a homeomorphism such that α(f (x)) = f (β(x)) for all x such that x ∈ B.
We will prove the claim by induction on k. First we note that α(α(f (x))) = α(f (β(x))), so that
α2 (f (x)) = α(f (β(x))) = f (β 2 (x)). Now we suppose the claim is true for all integers n such that
2 ≤ n ≤ k − 1. Then αn (f (x)) = f (β n (x)) so that αn+1 (f (x)) = α(f (β n (x))) = f (β n+1 (x)). 

Corollary 1.2.1:
Let α : A → A and β : B → B be one-dimensional maps and let f : B → A be a homeomorphism
such that α ∼f β. If
x0 , x1 , x2 , . . .
is a trajectory of β, then
f (x0 ), f (x1 ), f (x2 ), . . .
is a trajectory of α.

Proof:
For every positive integer i we have xi = β i (x0 ) so that f (xi ) = f (β i (x0 )) = αi (f (x0 )), by the
previous Lemma. 

Theorem 1.2.4:
Let α : X → X and β : Y → Y be one-dimensional differentiable maps, where X ⊆ R and Y ⊆ R,
and let f : Y → X be a homeomorphism such that α ∼f β. If x0, x1, x2, . . . is a trajectory of β such
that

• f′(xi) ≠ 0 for all nonnegative integers i, and

• lim_{n→∞} (1/n) ln |f′(xn)| = 0,

then the Lyapunov exponent of the trajectory of x0 under β is the same as that of the trajectory of
f (x0 ) under α.
Proof:
By the chain rule, f′(β(x)) × β′(x) = α′(f(x)) × f′(x). Therefore, for a trajectory x0, x1, . . . of β,
we have

β′(x0)β′(x1) · · · β′(xn) = [α′(f(x0)) f′(x0)/f′(x1)] · [α′(f(x1)) f′(x1)/f′(x2)] · · · [α′(f(xn)) f′(xn)/f′(x_{n+1})]
                        = α′(f(x0)) α′(f(x1)) · · · α′(f(xn)) × f′(x0)/f′(x_{n+1}),

as long as f′(xi) is never 0, which is guaranteed by our hypothesis.

Next, we note that

Σ_{i=0}^{n} ln |β′(xi)| = ln |f′(x0)| − ln |f′(x_{n+1})| + Σ_{i=0}^{n} ln |α′(f(xi))|,

and thus

lim_{n→∞} (1/n) Σ_{i=0}^{n} ln |β′(xi)| = lim_{n→∞} (1/n) ( ln |f′(x0)| − ln |f′(x_{n+1})| + Σ_{i=0}^{n} ln |α′(f(xi))| ).

Let us note that lim_{n→∞} (1/n) ln |f′(x0)| = 0. Furthermore, the trajectories

f(x0), f(β(x0)), f(β²(x0)), . . .

and

f(x0), α(f(x0)), α²(f(x0)), . . .

are the same by Lemma 1.2.1. Thus, we have shown that the Lyapunov exponent of β starting at
x0 is the same as that of α starting at f(x0), as long as lim_{n→∞} (1/n) ln |f′(x_{n+1})| = 0.

With this Theorem under our belt, we can use it to calculate the Lyapunov exponent of the logistic
map when µ = 4.

Example 1.2.17:
Recall the logistic map, Gµ (x) = µx(1 − x), and let us focus on the case when µ = 4. Fixing a
starting point x, we are interested in calculating the value of

λ(x) = lim_{n→∞} (1/n) Σ_{t=0}^{n−1} ln |G4′(G4^t(x))|
     = lim_{n→∞} (1/n) Σ_{t=0}^{n−1} ln |4 − 8 G4^t(x)|.

Unfortunately, this reduces to being able to say something about the trajectory

x, G(x), G2 (x), . . . .

Instead, we will calculate the Lyapunov exponent of a trajectory starting at x by showing that G4
is conjugate to the tent map T2 .

Let

f(x) = (1 − cos πx)/2,

which is a homeomorphism on [0, 1]. We will first show that G4(f(x)) = f(T2(x)) for all x ∈ [0, 1].
We begin by noting that

G4(f(x)) = 4f(x)(1 − f(x))
         = 4 · ((1 − cos πx)/2) · ((1 + cos πx)/2)
         = 1 − cos² πx
         = sin² πx.

Next, we note that f(T2(x)) = (1 − cos(πT2(x)))/2, so that if x ∈ [0, 1/2), then

f(T2(x)) = (1 − cos(2πx))/2
         = (1 − (1 − 2 sin²(πx)))/2
         = sin²(πx).

If x ∈ [1/2, 1], then

f(T2(x)) = (1 − cos(π(2 − 2x)))/2
         = (1 − cos(2π − 2πx))/2
         = (1 − (cos(2π) cos(2πx) + sin(2π) sin(2πx)))/2
         = (1 − cos(2πx))/2
         = (1 − (1 − 2 sin²(πx)))/2
         = sin²(πx).

Figure 1.14: G4(f(x)) = f(T2(x)).

Thus, if x0, x1, x2, . . . is a trajectory of T2 that doesn’t hit 1/2, and

lim_{n→∞} (1/n) ln |f′(xn)| = 0,    (1.10)

then the Lyapunov exponent of the trajectory f(x0), f(x1), . . . of G4 is ln 2.

In particular, if x0, x1, . . . is a trajectory of T2 that never hits 0, then the trajectory never hits
1/2, since T2²(1/2) = 0. Furthermore, f′(x) = (π/2) sin πx vanishes only at x = 0 and x = 1, and a
trajectory that never hits 0 never hits 1 either (since T2(1) = 0), so (1.10) is satisfied. Since
f(0) = 0, we may conclude by saying that every trajectory of G4 that does not hit 0 has a Lyapunov
exponent of ln 2.

It is worth noting that the trajectory starting at x = 0 can be calculated directly, since 0 is a
fixed point. We have
λ(0) = lim_{n→∞} (1/n) Σ_{t=0}^{n−1} ln |G4′(G4^t(0))|
     = lim_{n→∞} (1/n) Σ_{t=0}^{n−1} ln |4 − 8 G4^t(0)|
     = ln 4,

which shows that not all trajectories have the same Lyapunov exponent under a given map. 
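Both conclusions are easy to check numerically, bearing in mind that a floating-point orbit is only an approximation of a true trajectory. A Python sketch (ours, not from the thesis):

import math

def lyapunov_logistic4(x0, n=100_000):
    """Average ln|G4'(x_t)| = ln|4 - 8 x_t| along the trajectory of x0."""
    x, total = x0, 0.0
    for _ in range(n):
        total += math.log(abs(4 - 8 * x))
        x = 4 * x * (1 - x)
    return total / n

# x0 = 0.3 avoids the fixed point at 0, so the estimate should be near ln 2 = 0.693...
print(lyapunov_logistic4(0.3), math.log(2), math.log(4))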

The definition of conjugacy we provided is the same as that used in [8], [10], [1], and many other
texts on dynamical systems, although this notion is referred to as topological conjugacy in some texts.
Our sole purpose in introducing Theorem 1.2.4, which was adopted from [1], was the calculation
of the Lyapunov exponent of G4, which can be found in that reference as well. Conjugacy between
G4 and T2 seems to be the simplest way to find the exact value of the Lyapunov exponent of G4.
Alternative methods use ergodic theory to achieve the same results (see [29]), and many references
calculate the Lyapunov exponent of the logistic map numerically (see [36] for example). Finally, we
note that all of the results from this section are general enough to hold in higher dimensions, with
the exception of Theorem 1.2.4.

1.2.8 Computing the global Lyapunov exponent of a map.

How can we compute the global Lyapunov exponent of a prescribed differentiable map Φ : X → X
(such that X ⊆ R) at a prescribed point x0 in the interior of X? Answers to this question depend
on the way Φ is prescribed; let us assume that it is prescribed by an oracle that, given any x in X,
returns Φ(x).

In this situation, we can compute iteratively

xt = Φ(xt−1 ),

and take δt, a number small enough to ensure that

Φ(xt + δt) − Φ(xt) ≈ Φ′(xt)δt,

until the sequence of averages

(1/n) Σ_{t=0}^{n−1} ln( |Φ(xt + δt) − Φ(xt)| / δt )

shows signs of convergence, at which time we return an estimate of its limit as an estimate of the
Lyapunov exponent.
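A Python sketch of this procedure (ours, not from the thesis; the fixed step δt and the fixed number of terms are crude stand-ins for choosing δt adaptively and watching the averages for convergence):

import math

def global_lyapunov(oracle, x0, n=10_000, delta=1e-8):
    """Estimate the global Lyapunov exponent of the map behind `oracle` at x0
    by averaging ln(|oracle(x_t + delta) - oracle(x_t)| / delta)."""
    x, total = x0, 0.0
    for _ in range(n):
        total += math.log(abs(oracle(x + delta) - oracle(x)) / delta)
        x = oracle(x)
    return total / n

# With the tent map as the oracle, the estimate is close to ln r.
r = 1.9
print(global_lyapunov(lambda x: r * x if x <= 0.5 else r * (1 - x), 0.3))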

1.3 Multi-dimensional discrete dynamical systems.

We may now proceed to multi-dimensional discrete dynamical systems, which, as we will see, have
many similarities to the one-dimensional systems we have just studied. Let us begin by presenting
a few examples.

1.3.1 Examples.

Example 1.3.1:
The delayed logistic map is a two-dimensional map Dµ : R2 → R2 defined as

Dµ (x1 , x2 ) = (µx1 (1 − x2 ), x1 ),

where µ is any nonzero value in R. 

Example 1.3.2:
The map Φ : R+ × [0, 2π) → R+ × [0, 2π) defined by Φ(r, θ) = (√r, √(2πθ)) will be particularly useful
when studying stability. In this case we view (r, θ) as a point in the plane in polar coordinates.

Example 1.3.3:
Arnold’s cat map 2 is a two-dimensional map Φ : [0, 1)2 → [0, 1)2 defined as

Φ(x1 , x2 ) = (x1 + x2 mod 1, x1 + 2x2 mod 1). 

Example 1.3.4:
The Kaplan-Yorke map 3 is another two-dimensional map Φ : [0, 1)2 → [0, 1)2 . It is defined as

Φ(x1 , x2 ) = (ax1 mod 1, bx2 + cos(4πx1 ) mod 1). 
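These examples translate directly into code; in the following Python sketch (ours, not from the thesis), each map is a function on pairs:

import math

def delayed_logistic(x1, x2, mu):
    return (mu * x1 * (1 - x2), x1)

def polar_map(r, theta):
    # (r, theta) is a point of the plane in polar coordinates
    return (math.sqrt(r), math.sqrt(2 * math.pi * theta))

def arnold_cat(x1, x2):
    return ((x1 + x2) % 1, (x1 + 2 * x2) % 1)

def kaplan_yorke(x1, x2, a, b):
    return ((a * x1) % 1, (b * x2 + math.cos(4 * math.pi * x1)) % 1)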

1.3.2 Stability of fixed points.

Fixed points of multi-dimensional systems share some similarities with the fixed points of one-
dimensional systems. They are defined in the same way, namely if Φ : X → X and x∗ is some point
in X such that Φ(x∗ ) = x∗ , then x∗ is a fixed point.

2 Named after Vladimir Igorevich Arnold who displayed the effects of the map on the image of a cat, see [2].
3 Introduced by Kaplan and Yorke in 1979, see [20].

Example 1.3.5:
Let us find the fixed points of the delayed logistic map, Dµ (x1 , x2 ) = (µx1 (1 − x2 ), x1 ). In order to
do so we must solve the equation x = µx(1 − x), which is simply the fixed point equation of our
one-dimensional logistic map from before. Thus, solutions for x are 0 and (µ − 1)/µ, so that the
fixed points of the delayed logistic map are
(0, 0) and ((µ − 1)/µ, (µ − 1)/µ).

Example 1.3.6:
The only fixed points of Φ(r, θ) = (√r, √(2πθ)) are easily seen to be (0, 0) and (1, 0).

The definitions of section 1.2.3 are easily extended to multi-dimensional maps. Let Φ : X → X be
a multi-dimensional map, where X ⊆ Rd . We have:

Definition 1.3.1:
A fixed point x∗ of a map Φ is stable if for all positive ε, there exists a positive δ such that for
all positive integers t and for all points x in X we have

if ||x − x∗|| < δ, then ||Φ^t(x) − x∗|| < ε.

Definition 1.3.2:
A fixed point x∗ of a map Φ is unstable if it is not stable.

Definition 1.3.3:
A fixed point x∗ of a map Φ is attracting if there exists a positive δ such that for all points x
such that x ∈ X we have

if ||x − x∗|| < δ, then lim_{t→∞} Φ^t(x) = x∗.

Example 1.3.7:
Let us apply these definitions to the polar coordinate map from above, Φ : R+ × [0, 2π) → R+ × [0, 2π),
defined by Φ(r, θ) = (√r, √(2πθ)).

Before proceeding, let us note that for every positive integer t, we have

Φ^t(r, θ) = ( r^{2^{−t}}, 2π θ^{2^{−t}} / (2π)^{2^{−t}} ).    (1.11)

Let us begin with the fixed point (0, 0). We will first show that it is attracting. Let δ = 1 and let
x be any point such that ||x − (0, 0)|| < δ. Since we have x = (r, θ) for some r such that r < 1,
equation (1.11) allows us to conclude that limt→∞ Φt (x) = (0, 0).

Next, let us show that (0, 0) is stable. Let a positive ε be given, and set δ = min{ε, 1}. For
every point x such that ||x − (0, 0)|| < δ, we necessarily have x = (r, θ), where r < δ and θ is some
angle. Since for every positive integer t we have ||Φ^t(x) − (0, 0)|| = r^{2^{−t}} < r < δ < ε, we are done.

Let us now consider the fixed point x∗ = (1, 0).

To see that (1, 0) is attracting, take δ = 1/2. Now for every x such that ||x − x∗|| < δ (note
that (0, 0) is excluded), by equation (1.11) we have

lim_{t→∞} Φ^t(x) = lim_{t→∞} ( r^{2^{−t}}, 2π θ^{2^{−t}} / (2π)^{2^{−t}} ) = (1, 2π) = (1, 0).

Finally, we will now show that (1, 0) is unstable. Let us begin by choosing ε = 1/2. Now suppose
we are given a positive δ. Without loss of generality, we can assume δ < 1/2. Now we consider the
point

x = (1, π · 2^{1−2^n}).

Since this point approaches (1, 0) as n gets larger, we may assume there is some positive integer n(δ)
for which

||(1, π · 2^{1−2^{n(δ)}}) − (1, 0)|| < δ.

Let us call this point xδ. Now we may note that

Φ^{n(δ)}(xδ) = (1, π)

by equation (1.11). Thus

||Φ^{n(δ)}(xδ) − x∗|| = 2 > ε.
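A few iterations make this behaviour concrete. In the Python sketch below (ours, not from the thesis), a point starting near (1, 0) swings all the way out to angle π before the angle creeps up towards 2π, that is, back towards the fixed point:

import math

def polar_map(r, theta):
    return (math.sqrt(r), math.sqrt(2 * math.pi * theta))

# Start close to the fixed point (1, 0): theta = pi * 2^(1 - 2^4), as in the text.
r, theta = 1.0, math.pi * 2 ** (1 - 2 ** 4)
for t in range(8):
    r, theta = polar_map(r, theta)
    print(t + 1, round(r, 6), round(theta, 6))    # theta reaches pi at step 4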

We have just seen a rather simple system in which there is a fixed point that is both attracting and
unstable. This provides an interesting contrast against example 1.2.8 and Theorem 1.2.1, where we
saw that an attracting fixed point of a continuous one-dimensional map must also be stable. On the
one hand, this indicates that we may not simply take what we knew in the one-dimensional case
and assume it to be true in higher dimensions. On the other hand, we will now prove a theorem
that can be seen as the multi-dimensional case of Theorem 1.2.2. Before doing so, let us introduce
some notation.

We will use J(F, x) to denote the Jacobian matrix of F evaluated at x: the entry in the ith row and
the jth column of J(F, x) is the value of ∂yi /∂xj at x, where (y1 , y2 , . . . , yd ) = F (x1 , x2 , . . . , xd ).
The following Lemma also makes use of matrix norms. For their precise definition see section 2.2 of
the Appendix.

Lemma 1.3.1:
Let X ⊆ Rd and let F : X → X be a differentiable map with a fixed point x∗ . If λ1 , λ2 , . . . , λd are
the eigenvalues of J(F, x∗ ), then

1. If |λi | < 1 for all integers i such that 1 ≤ i ≤ d, then limt→∞ ||J(F t , x∗ )|| = 0.
2. If |λi | > 1 for any integer i such that 1 ≤ i ≤ d, then limt→∞ ||J(F t , x∗ )|| = ∞.
Proof:
By the chain rule,

J(F n , x∗ ) = J(F, F n−1 (x∗ ))J(F n−1 , x∗ )


= J(F, F n−1 (x∗ ))J(F, F n−2 (x∗ ))J(F n−2 , x∗ )
.
= ..
= J(F, F n−1 (x∗ ))J(F, F n−2 (x∗ )) · · · J(F, F (x∗ ))J(F, x∗ ).

Since x∗ is a fixed point of F , we have J(F n , x∗ ) = J(F, x∗ )n .

Let λ1, λ2, . . . , λm be the distinct (and possibly complex) eigenvalues of J(F, x∗), where m ≤ d.


Since J(F, x∗ ) is a square matrix with real entries, we can put it into Jordan normal form. In
particular, J(F, x∗ ) = SJS −1 where J is a block diagonal matrix consisting of Jordan blocks
J1 , J2 , . . . , Jm , and S is some d × d invertible matrix. Note that since J(F, x∗ ) = SJS −1 , we
have J(F t , x∗ ) = J(F, x∗ )t = SJ t S −1 . By Lemma .0.7 in the Appendix, we know that
J^t = diag( J1^t, J2^t, . . . , Jm^t ),    (1.12)

and each block satisfies

         | λi^t   a1,2   a1,3   a1,4   · · ·   a1,p    |
         | 0      λi^t   a2,3   a2,4   · · ·   a2,p    |
Ji^t =   | 0      0      λi^t   a3,4   · · ·   a3,p    |    (1.13)
         | 0      0      0      · · ·  λi^t    ap−1,p  |
         | 0      0      0      0      0       λi^t    |

for i = 1, . . . , m, where p is the multiplicity of λi. The precise values of the entries of Ji^t are

a_{r,s} = (t choose s − r) · λi^{t−(s−r)},

for 1 ≤ r < s ≤ p, by Lemma .0.6 in the Appendix. The point of all this is that since |λi | < 1,
we have limt→∞ λti = 0 and limt→∞ ar,s = 0, and therefore limt→∞ J t = 0M , where 0M is the zero
matrix. The first statement of the Lemma follows. The proof of the second statement is similar,
except we now have that |λi |t → ∞ as t → ∞ for any λi that has magnitude strictly greater than
1. 

Theorem 1.3.1:
Let X ⊆ Rd and let F : X → X be a differentiable map with a fixed point x∗ . If λ1 , λ2 , . . . , λd are
the eigenvalues of J(F, x∗ ), then

1. If |λi | < 1 for all integers i such that 1 ≤ i ≤ d, then x∗ is stable and attracting.
2. If |λi | > 1 for any integer i such that 1 ≤ i ≤ d, then x∗ is unstable.

Proof (of 1.):


Let a positive ε be given, and let us assume that ε < 1 with no loss of generality. By the previous
Lemma, there exists a positive integer k such that for all integers t such that t ≥ k, we have
||J(F^t, x∗)|| < ε/2.

We begin by proceeding backwards from k, using the continuity of F to argue as follows.

There exists a positive δ_{k−1} such that

||F^{k−1}(x) − F^{k−1}(x∗)|| < δ_{k−1} ⇒ ||F^k(x) − F^k(x∗)|| < ε.

There exists a positive δ_{k−2} such that

||F^{k−2}(x) − F^{k−2}(x∗)|| < δ_{k−2} ⇒ ||F^{k−1}(x) − F^{k−1}(x∗)|| < min{ε, δ_{k−1}}.

. . .

There exists a positive δ1 such that

||F(x) − F(x∗)|| < δ1 ⇒ ||F²(x) − F²(x∗)|| < min{ε, δ2}.

There exists a positive δ0 such that

||x − x∗|| < δ0 ⇒ ||F(x) − F(x∗)|| < min{ε, δ1}.

Thus, setting δ = min{1, δ0, δ1, . . . , δ_{k−1}}, we get that

||x − x∗|| < δ ⇒ ||F^t(x) − x∗|| < ε

for all integers t such that 1 ≤ t ≤ k.


We now proceed forward from k as follows. Since F^k is differentiable at x∗, there exists a positive
δk such that if ||x − x∗|| < δk then

||F^k(x) − F^k(x∗) − J(F^k, x∗)(x − x∗)|| < (ε/2) ||x − x∗||.    (1.14)

Therefore, we have

||F^k(x) − F^k(x∗)|| = ||F^k(x) − F^k(x∗) − J(F^k, x∗)(x − x∗) + J(F^k, x∗)(x − x∗)||
                    ≤ ||F^k(x) − F^k(x∗) − J(F^k, x∗)(x − x∗)|| + ||J(F^k, x∗)(x − x∗)||
                    ≤ (ε/2) ||x − x∗|| + (ε/2) ||x − x∗||
                    = ε ||x − x∗||.    (1.15)

Note that F k (x) is closer to x∗ than x was. Thus, (1.14) holds with F k (x) in place of x and may
iterate (1.15) to get that

F 2k (x) − F 2k (x∗ ) < ε F k x − x∗ < ε2 ||x − x∗ || .

Proceeding in this fashion yields

F mk (x) − F mk (x∗ ) < εm ||x − x∗ ||

for every positive integer m. Since ε < 1 we get

lim F tk (x) = x∗ .
t→∞

To complete the proof, we note that we can apply this argument to every integer r such that 1 ≤ r ≤ k − 1 to get a δ_{k+r} for which

||F^{mk+r}(x) − F^{mk+r}(x*)|| ≤ ε^m ||F^r(x) − x*||

for every point x such that ||F^r(x) − x*|| < min{δ, δ_{k+r}}.

We conclude that for every point x such that ||x − x*|| < min{δ, δ_k, δ_{k+1}, ..., δ_{2k−1}} we have lim_{t→∞} F^{kt+r}(x) = x* for every integer r such that 0 < r < k, and thus lim_{t→∞} F^t(x) = x*. This shows that x* is attracting, and in fact we have also shown that ||F^t(x) − x*|| < ε for all positive integers t, so that x* is stable as well. 

Proof (of 2.):


By the previous Lemma, we know that ||J(F^t, x*)|| is unbounded. Thus, for a fixed M > 1, we may find a positive integer t0 such that ||J(F^{t0}, x*)|| > 2M. Let us note that by the definition of our matrix norm, there exists a vector v such that ||v|| ≤ 1 and ||J(F^{t0}, x*)|| = ||J(F^{t0}, x*)v||.

Since F^{t0} is differentiable at x*, we know that there exists a positive δ_c such that if x is any point such that

||x − x*|| < δ_c,    (1.16)

then

||F^{t0}(x) − F^{t0}(x*) − J(F^{t0}, x*)(x − x*)|| / ||x − x*|| < M / (4 ||v||).    (1.17)

Let us now set ε = δ_c and let a positive δ be given. Without loss of generality, we will assume δ < δ_c.

Consider the point

x = x* + (δ / (2 ||v||)) v.

Note that ||x − x*|| = δ/2 < δ, so we now have

||F^{t0}(x) − x*|| ≥ ||J(F^{t0}, x*)(x − x*)|| − ||F^{t0}(x) − F^{t0}(x*) − J(F^{t0}, x*)(x − x*)||
                  = (δ / (2 ||v||)) ||J(F^{t0}, x*)v|| − ||F^{t0}(x) − F^{t0}(x*) − J(F^{t0}, x*)(x − x*)||
                  > (δ / (2 ||v||)) ||J(F^{t0}, x*)|| − (M / (4 ||v||)) ||x − x*||
                  = (||x − x*|| / ||v||) ||J(F^{t0}, x*)|| − (M / (4 ||v||)) ||x − x*||
                  > (2M / ||v||) ||x − x*|| − (M / (4 ||v||)) ||x − x*||
                  = (7M / (4 ||v||)) ||x − x*||
                  ≥ M ||x − x*||,

where the final inequality uses ||v|| ≤ 1.

If ||F^{t0}(x) − x*|| ≥ ε then we are done. Otherwise, we have ||F^{t0}(x) − x*|| < ε = δ_c and thus condition (1.16) holds with F^{t0}(x) in place of x, which allows us to iterate to produce

||F^{2t0}(x) − x*|| > M ||F^{t0}(x) − x*|| > M² ||x − x*||.

Repeating these arguments yields that ||F^{kt0}(x) − x*|| > M^k ||x − x*|| for all positive integers k. Since M > 1, we must have ||F^{kt0}(x) − x*|| > ε for some positive integer k, and we are done. 

Example 1.3.8:
Recall from example 1.3.5 that the fixed points of the delayed logistic map, Dµ (x1 , x2 ) = (µx1 (1 −
x2 ), x1 ), are
(0, 0)   and   ((µ − 1)/µ, (µ − 1)/µ).
The Jacobian J(Dµ, (x1, x2)) of Dµ at a point (x1, x2) is given by

[ µ − µx2   −µx1 ]
[ 1          0   ].

Its eigenvalues are the roots of

det [ µ − µx2 − λ   −µx1 ] = (µ − µx2 − λ)(−λ) + µx1
    [ 1             −λ   ]
  = λ² − λ(µ − µx2) + µx1.    (1.18)

Let us focus on the fixed point (0, 0) first. In this case the eigenvalues of J(Dµ, (0, 0)) are the roots of λ² − λµ, which are 0 and µ. Thus (0, 0) is stable and attracting when |µ| < 1 and unstable when |µ| > 1.

Now let us look at the other fixed point, ((µ − 1)/µ, (µ − 1)/µ), which we will denote by x*. Equation (1.18) with x1 = x2 = (µ − 1)/µ becomes

λ² − λ + µ − 1,

and thus the eigenvalues of J(Dµ, x*) are

(1 ± √(5 − 4µ)) / 2.

When µ < 1, we may note that

(1 + √(5 − 4µ)) / 2 > 1,

so that x* is unstable.

When µ = 1, the eigenvalues are 0 and 1, so that Theorem 1.3.1 is inconclusive.

When µ = 5/4, we have only one eigenvalue, which is 1/2. Since |1/2| < 1, we know that x* is stable and attracting.

Let us now suppose that 1 < µ < 5/4. We note that

1 < µ < 5/4 ⇒ 4 < 4µ < 5
            ⇒ −4 > −4µ > −5
            ⇒ 1 > 5 − 4µ > 0
            ⇒ 1 > √(5 − 4µ) > 0    (since 5 − 4µ > 0)
            ⇒ 1/2 > √(5 − 4µ)/2 > 0,

so that

0 < (1 ± √(5 − 4µ)) / 2 < 1,

and thus x* is stable and attracting.

Finally, let us now suppose that µ > 5/4. In this case the eigenvalues are complex, and therefore

| (1 ± √(5 − 4µ)) / 2 | = | (1 ± i√(4µ − 5)) / 2 | = √( 1/4 + (4µ − 5)/4 ) = √(µ − 1).

Therefore, in this case the two eigenvalues form a complex-conjugate pair of the same magnitude, and this magnitude is less than 1 only when µ < 2.

In summary, the fixed point (0, 0) is stable and attracting when |µ| < 1 and unstable when |µ| > 1.
When |µ| = 1 our analysis is inconclusive. As for the other fixed point, x∗ = ((µ − 1)/µ, (µ − 1)/µ),
we have that x∗ is unstable when µ < 1, stable and attracting when 1 < µ < 2, and unstable again
when µ > 2. Our analysis about x∗ is inconclusive when µ = 1 or µ = 2. 
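The stability regimes derived above are easy to probe numerically. The following is a minimal Python sketch (ours, not part of the original analysis); the number of iterations and the size of the initial offset are arbitrary choices.

import math

# Iterate the delayed logistic map D_mu(x1, x2) = (mu*x1*(1 - x2), x1)
# starting near the fixed point x* = ((mu - 1)/mu, (mu - 1)/mu).
def delayed_logistic(mu, x):
    x1, x2 = x
    return (mu * x1 * (1.0 - x2), x1)

for mu in (1.1, 1.5, 1.9, 2.1):
    star = (mu - 1.0) / mu
    x = (star + 0.01, star + 0.01)      # start near x*
    for _ in range(5000):
        x = delayed_logistic(mu, x)
    dist = math.hypot(x[0] - star, x[1] - star)
    print(f"mu = {mu}: distance from x* after 5000 steps is {dist:.2e}")

# Consistent with the eigenvalue analysis, the printed distance collapses to
# zero for 1 < mu < 2 and remains bounded away from zero for mu > 2.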

The examples from this section can also be found in [11], [31] and [10]. The definitions in this section are similar to those found in [10]. Theorem 1.3.1 can be found in most texts, and in fact it is taken as the definition of stability in most of them ([8], for example). Reference [10], from where we borrowed our definitions, only proves the theorem in the two-dimensional case.

1.3.3 Lyapunov exponents in general.

Let us return to the general setting where Φ : X → X is any map (and not necessarily differentiable),
and X ⊆ Rd . For every point x in the interior of X, and for every y in Rd , the local Lyapunov
exponent of Φ at x with respect to direction y is defined as the limit (if it exists)

lim_{δ→0} ln( ||Φ(x + δy) − Φ(x)|| / ||δy|| );

the global Lyapunov exponent of Φ at x with respect to y is defined as the limit (if it exists)

lim_{t→∞} (1/t) lim_{δ→0} ln( ||Φ^t(x + δy) − Φ^t(x)|| / ||δy|| ).

If the global Lyapunov exponent of Φ at x with respect to y equals λ then, for all sufficiently large values of t and for all sufficiently small (with respect to the value of t) values of δ, we have

||Φ^t(x + δy) − Φ^t(x)|| / ||δy|| ≈ e^{λt}.

1.3.4 Lyapunov exponents of differentiable maps.

Let X ⊆ Rd and let F : X → X be a differentiable map. Recall that J(F, x) is the Jacobian matrix
of F evaluated at x. Once again, since

||F (x + δy) − F (x)|| = ||J(F, x)δy|| + o(||δy||) as δ → 0,

the local Lyapunov exponent of a differentiable map Φ at a point x with respect to a direction y equals

ln( ||J(Φ, x)y|| / ||y|| ),

and the global Lyapunov exponent of Φ at x with respect to y, which we will denote by λ(x, y), equals

lim_{n→∞} (1/n) ln( ||J(Φ^n, x)y|| / ||y|| ).    (1.19)

For future reference, let us note that

(1/n) ln( ||J(Φ^n, x)y|| / ||y|| ) = (1/n) Σ_{t=0}^{n−1} ln( ||J(Φ^{t+1}, x)y|| / ||J(Φ^t, x)y|| )
                                  = (1/n) Σ_{t=0}^{n−1} ln( ||J(Φ, Φ^t(x)) J(Φ^t, x)y|| / ||J(Φ^t, x)y|| ).    (1.20)

This means that the global Lyapunov exponent of Φ at point x with respect to direction y is the average of the local Lyapunov exponents of Φ at the points Φ^0(x), Φ^1(x), Φ^2(x), ... of the trajectory of x, taken with respect to direction J(Φ^t, x)y at each point Φ^t(x).

1.3.5 Computing global Lyapunov exponents.

How can we compute the global Lyapunov exponent of a prescribed differentiable map Φ : X → X
(such that X ⊆ Rd ) at a prescribed point x0 in the interior of X and with respect to a prescribed
direction y 0 ? Answers to this question depend on the way Φ is prescribed; let us assume that it is
prescribed by an oracle that, given any x in X, returns Φ(x).

In this situation, we can compute iteratively

x^t = Φ(x^{t−1}),    y^t = δ_t ( Φ(x^{t−1} + y^{t−1}) − Φ(x^{t−1}) ),

with each δ_t a (possibly negative) number small enough to ensure that

Φ(x^t + y^t) − Φ(x^t) ≈ J(Φ, x^t) y^t,

until the sequence of averages

(1/n) Σ_{t=0}^{n−1} ln( ||Φ(x^t + y^t) − Φ(x^t)|| / ||y^t|| )

shows signs of convergence, at which time we return an estimate of its limit as an estimate of the Lyapunov exponent.

To justify this policy, we use induction on t to show that

y^t ≈ δ_1 δ_2 ⋯ δ_t J(Φ^t, x^0) y^0 :    (1.21)

in the induction step, we argue that

y^{t+1} = δ_{t+1} ( Φ(x^t + y^t) − Φ(x^t) )
        ≈ δ_{t+1} J(Φ, x^t) y^t
        ≈ δ_{t+1} J(Φ, x^t) δ_1 δ_2 ⋯ δ_t J(Φ^t, x^0) y^0
        = δ_1 δ_2 ⋯ δ_{t+1} J(Φ, Φ^t(x^0)) J(Φ^t, x^0) y^0
        = δ_1 δ_2 ⋯ δ_{t+1} J(Φ^{t+1}, x^0) y^0.

From (1.21), it follows that

||Φ(x^t + y^t) − Φ(x^t)|| / ||y^t|| = ||y^{t+1}|| / ( |δ_{t+1}| ||y^t|| ) ≈ ||J(Φ^{t+1}, x^0) y^0|| / ||J(Φ^t, x^0) y^0||,

and so

(1/n) Σ_{t=0}^{n−1} ln( ||Φ(x^t + y^t) − Φ(x^t)|| / ||y^t|| ) ≈ (1/n) Σ_{t=0}^{n−1} ln( ||J(Φ^{t+1}, x^0) y^0|| / ||J(Φ^t, x^0) y^0|| );    (1.22)

as n tends to infinity, the right-hand side of (1.22) converges to the global Lyapunov exponent of Φ at point x^0 with respect to direction y^0.
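For concreteness, here is a small Python sketch of this policy (ours, not from the thesis). Instead of choosing each δ_t adaptively, it simply renormalizes the offset to a fixed length δ at every step, and it uses the Hénon map (introduced in Example 2.1.1 below) as a stand-in for the oracle.

import math
import numpy as np

def global_lyapunov_estimate(phi, x0, y0, n=10000, delta=1e-8):
    # Iterate x^t = phi(x^{t-1}) while carrying a small offset y^t that is
    # rescaled to length delta after every step (a fixed-size stand-in for
    # the delta_t's), and average the logarithmic growth rates.
    x = np.asarray(x0, dtype=float)
    y = delta * np.asarray(y0, dtype=float) / np.linalg.norm(y0)
    total = 0.0
    for _ in range(n):
        fx, fxy = phi(x), phi(x + y)
        stretch = np.linalg.norm(fxy - fx) / np.linalg.norm(y)
        total += math.log(stretch)
        y = delta * (fxy - fx) / np.linalg.norm(fxy - fx)   # renormalize
        x = fx
    return total / n

henon = lambda p: np.array([p[1] + 1.0 - 1.4 * p[0] ** 2, 0.3 * p[0]])
print(global_lyapunov_estimate(henon, [0.1, 0.1], [1.0, 0.0]))
# prints a value near 0.42, the commonly reported maximal exponent of the
# Henon map at a = 1.4, b = 0.3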

1.3.6 The spectrum of Lyapunov exponents.

We will now consider a specific differentiable map Φ : X → X with X ⊆ R^d. In Section 1.3.4, we have noted that for every x ∈ X and an arbitrary direction y ∈ R^d,

λ(x, y) = lim_{n→∞} (1/n) ln( ||J(Φ^n, x)y|| / ||y|| ).

Note that for every constant, non-zero c we have

λ(x, cy) = lim_{n→∞} (1/n) ln( ||J(Φ^n, x) cy|| / ||cy|| ) = lim_{n→∞} (1/n) ln( ||J(Φ^n, x)y|| / ||y|| ) = λ(x, y),

so that we may assume ||y|| = 1 without any loss of generality. Thus,

λ(x, y) = lim_{n→∞} (1/n) ln ||J(Φ^n, x)y|| = lim_{n→∞} (1/2n) ln( y^T J(Φ^n, x)^T J(Φ^n, x) y ).

Oseledets ([25]; see also [26] and Section 9.1 of [30]) proved that under certain conditions on Φ, which are only mildly restrictive, there is a d × d real symmetric matrix A such that

lim_{n→∞} (1/2n) ln( y^T J(Φ^n, x)^T J(Φ^n, x) y ) = lim_{n→∞} (1/2n) ln( y^T A^{2n} y )   for all y in R^d.    (1.23)

We should note here that in the one-dimensional case where d = 1, this reduces to saying that there exists a number a such that

lim_{n→∞} (1/n) ln |(Φ^n)′(x)| = a,

which is simply a consequence of the Birkhoff Ergodic Theorem we referred to in Section 1.3.4.

The Spectral Theorem (see Appendix 2.2) guarantees that there are real numbers λ_1, ..., λ_d and an orthonormal basis y_1, ..., y_d of R^d such that A y_j = λ_j y_j for all j. Writing y as Σ_{i=1}^d c_i y_i (with c_i = y^T y_i), we find that

y^T A^{2n} y = ( Σ_{i=1}^d c_i y_i )^T A^{2n} ( Σ_{j=1}^d c_j y_j ) = Σ_{i=1}^d c_i y_i^T ( Σ_{j=1}^d c_j λ_j^{2n} y_j ) = Σ_{i=1}^d c_i² λ_i^{2n},

and so

λ(x, y) = ln max{ |λ_i| : c_i ≠ 0 }.

Suppose we reorder the λ_i's so that |λ_1| ≥ |λ_2| ≥ ⋯ ≥ |λ_d|, with corresponding (orthonormal) eigenvectors y_1, ..., y_d. Then the ith Lyapunov exponent of Φ at x is λ(x, y_i) = ln |λ_i|. In particular, λ(x, y_1) = ln |λ_1| is the maximal Lyapunov exponent of Φ at x. Furthermore, for any direction y in R^d, if

i = min_{i=1,...,d} { i : y^T y_i ≠ 0 },

then λ(x, y) = λ(x, y_i) = ln |λ_i|.

For future reference, let us make note that for a given point x, a randomly chosen direction y in R^d will almost surely have λ(x, y) equal to the maximal Lyapunov exponent: all exceptions lie in the subspace, of dimension at most d − 1, of vectors orthogonal to every vector y_k for which the eigenvalue λ_k of A has the largest absolute value.

1.3.7 Avoiding Oseledets’ Theorem.

The use of Oseledets’ Theorem in the previous section is not always necessary. Since the statement
of the theorem guarantees the existence of a real symmetric matrix A in (1.23), but gives no way of
finding it, we will now consider situations where A can be found directly.

Constant, symmetric Jacobians.

Theorem 1.3.2:
Suppose Φ : X → X (X ⊆ Rd ) is a differentiable map such that for a certain point x ∈ X, we have

1. J(Φ, Φt (x)) = J(Φ, x) for all positive integers t, and


2. J(Φ, x) is symmetric.

Furthermore, let |λ1 | ≥ |λ2 | ≥ · · · ≥ |λd | be the (not necessarily distinct) eigenvalues of J(Φ, x), and
let y 1 , y 2 , . . . , y d be their corresponding eigenvectors. For every direction y in Rd we have

λ(x, y) = ln |λ_k|,

where

k = min_{i=1,...,d} { i : y^T y_i ≠ 0 }.

Proof:
In this case we may immediately set A = J(Φ, x) and see that

J(Φ^n, x) = J(Φ, Φ^{n−1}(x)) ⋯ J(Φ, Φ(x)) J(Φ, x) = A^n,

so that

J(Φ^n, x)^T J(Φ^n, x) = (J(Φ, x)^T)^n J(Φ, x)^n = A^{2n}.

Now equation (1.23) trivially holds. As before, the Spectral Theorem is applied, and we conclude that the spectrum of Lyapunov exponents is given by the natural logarithms of the absolute values of the eigenvalues of J(Φ, x) (which are all real in this case, since J(Φ, x) is symmetric). 

Example 1.3.9:
Let us recall Arnold's cat map:

Φ(x1, x2) = (x1 + x2 mod 1, x1 + 2x2 mod 1),

in which case

J(Φ, (x1, x2)) = [ 1  1 ]
                 [ 1  2 ]

as long as x1 + x2 ∉ Z and x1 + 2x2 ∉ Z, in order to avoid discontinuities. Let us assume that x is a point in R² such that the trajectory x, Φ(x), Φ²(x), ... never hits any of these discontinuities. Note that J(Φ, x) is constant and symmetric along any such trajectory. Therefore, the Lyapunov exponents are given by ln |λ_1|, ln |λ_2|, where λ_1, λ_2 are the eigenvalues of J(Φ, x).

The eigenvalues of J(Φ, x) are the roots of

det [ 1 − λ   1     ] = (1 − λ)(2 − λ) − 1
    [ 1       2 − λ ]
  = 2 − 2λ − λ + λ² − 1
  = λ² − 3λ + 1.

Thus, the eigenvalues are λ_1 = (3 + √5)/2, λ_2 = (3 − √5)/2, with respective eigenvectors e_1 = (2/(1 + √5), 1), e_2 = (2/(1 − √5), 1), which form an orthogonal basis of R². Therefore, if y is any direction not parallel to e_2, then λ(x, y) = ln((3 + √5)/2). If, however, y′ is parallel to e_2, then λ(x, y′) = ln((3 − √5)/2). 
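A quick numerical sanity check of this example (our sketch, not part of the thesis): iterating the cat map with a tiny separation vector and renormalizing at each step, as in Section 1.3.5, should reproduce ln((3 + √5)/2) ≈ 0.9624 for almost every starting direction.

import math
import numpy as np

def cat(p):
    x1, x2 = p
    return np.array([(x1 + x2) % 1.0, (x1 + 2.0 * x2) % 1.0])

rng = np.random.default_rng(1)
x = rng.random(2)
v = rng.random(2)
v /= np.linalg.norm(v)
delta, total, n = 1e-9, 0.0, 2000
for _ in range(n):
    fx = cat(x)
    w = cat(x + delta * v) - fx
    w = (w + 0.5) % 1.0 - 0.5          # undo the mod-1 wraparound
    total += math.log(np.linalg.norm(w) / delta)
    v = w / np.linalg.norm(w)
    x = fx
print(total / n, math.log((3 + math.sqrt(5)) / 2))   # both close to 0.9624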

Constant Jacobians.

In fact, we may drop the symmetry condition for a slightly weaker condition.

Theorem 1.3.3:
Suppose Φ : X → X (X ⊆ Rd ) is a differentiable map such that for a certain point x ∈ X, we have

1. J(Φ, Φt (x)) = J(Φ, x) for all positive integers t, and


2. J(Φ, x) has d distinct, nonzero eigenvalues λ1 , . . . , λd .

Then the spectrum of Lyapunov exponents of the trajectory of x is given by

ln |λ1 |, ln |λ2 |, . . . , ln |λd |.

Proof:
Let λ_1, ..., λ_d be the eigenvalues of J(Φ, x), and let v_1, ..., v_d be the corresponding eigenvectors. We will assume |λ_1| ≥ |λ_2| ≥ ⋯ ≥ |λ_d|.

Let us begin with an arbitrary direction y ∈ R^d. Since the d eigenvalues are distinct, the d eigenvectors are linearly independent (see Lemma .0.11 in the Appendix), and we can decompose y as y = Σ_{i=1}^d c_i v_i for some constants c_i. Then

J(Φ^n, x) y = J(Φ, x)^n y = Σ_{i=1}^d c_i λ_i^n v_i.    (1.24)

The next step is to consider the action of left-multiplying (1.24) by J(Φ^n, x)^T. Since J(Φ, x) is not necessarily symmetric, we will do this by decomposing each of the eigenvectors of J(Φ, x) in terms of the eigenvectors of J(Φ, x)^T. By Lemma .0.8 in the Appendix we know that J(Φ, x) and J(Φ, x)^T have the same eigenvalues. Let w_1, ..., w_d be the eigenvectors of J(Φ, x)^T that correspond to the eigenvalues λ_1, ..., λ_d, respectively. For each v_i we may write v_i = Σ_{j=1}^d b_{ij} w_j, for some constants b_{ij}. We now have

y^T J(Φ^n, x)^T J(Φ^n, x) y = y^T J(Φ^n, x)^T Σ_{i=1}^d c_i λ_i^n v_i
                            = y^T J(Φ^n, x)^T Σ_{i=1}^d c_i λ_i^n ( Σ_{j=1}^d b_{ij} w_j )
                            = y^T J(Φ^n, x)^T Σ_{i=1}^d Σ_{j=1}^d c_i b_{ij} λ_i^n w_j
                            = y^T Σ_{i=1}^d Σ_{j=1}^d c_i b_{ij} λ_i^n λ_j^n w_j
                            = ( Σ_{i=1}^d c_i v_i )^T Σ_{i=1}^d Σ_{j=1}^d c_i b_{ij} λ_i^n λ_j^n w_j
                            = Σ_{i=1}^d Σ_{j=1}^d Σ_{k=1}^d c_i c_k b_{ij} λ_i^n λ_j^n v_k^T w_j.

By Lemma .0.9, v_k^T w_j = 0 if k ≠ j, and so we may proceed with

y^T J(Φ^n, x)^T J(Φ^n, x) y = Σ_{i=1}^d Σ_{j=1}^d c_i c_j b_{ij} λ_i^n λ_j^n v_j^T w_j
                            = Σ_{i=1}^d c_i² b_{ii} λ_i^{2n} v_i^T w_i + Σ_{i=1}^d Σ_{j=1,...,d, j≠i} c_i c_j b_{ij} λ_i^n λ_j^n v_j^T w_j.

By Lemma .0.10, b_{ii} ≠ 0 and v_i^T w_i ≠ 0 for all i. Therefore

lim_{n→∞} (1/2n) ln( y^T J(Φ^n, x)^T J(Φ^n, x) y ) = ln |λ_1|,

unless c_1 = 0. Similarly, if c_1 = 0, then

lim_{n→∞} (1/2n) ln( y^T J(Φ^n, x)^T J(Φ^n, x) y ) = ln |λ_2|,

unless c_2 = 0, and so on. Thus, the spectrum of Lyapunov exponents of Φ at x is ln |λ_1|, ln |λ_2|, ..., ln |λ_d|. 

Remark:
The coefficients c_i correspond to the directions of the eigenvectors of J(Φ, x), which need not be orthogonal. How does this relate to the matrix A in equation (1.23)? We want to know what the d orthogonal directions at x are that would yield the d different Lyapunov exponents of Φ at x. What we are looking for is a matrix A such that the eigenvalues of A are the same as the eigenvalues of J(Φ, x) and whose corresponding eigenvectors y_1, ..., y_d are such that

y_1 = v_1,
y_2 is orthogonal to y_1 and span{y_1, y_2} = span{v_1, v_2},
y_3 is orthogonal to y_1 and y_2, and span{y_1, y_2, y_3} = span{v_1, v_2, v_3},
⋮
y_d is orthogonal to y_1, y_2, ..., y_{d−1}, and span{y_1, y_2, ..., y_d} = span{v_1, v_2, ..., v_d}.

This can be done via Gram–Schmidt orthonormalization. We now seek a matrix A such that
a) the eigenvalues of A are the eigenvalues of J(Φ, x), namely λ_1, ..., λ_d, and
b) the eigenvectors of A are the y_i specified above.

To this end, let S be the matrix whose columns are the orthonormal vectors y_1, ..., y_d and let D be the diagonal matrix whose diagonal entries are λ_1, ..., λ_d. Then A = S D S^T is the matrix we are seeking. An easy check shows that A y_i = S D S^T y_i = λ_i y_i.

In fact, the requirement that J(Φ, x) be constant along a trajectory can be replaced by the slightly weaker assumption that the eigenvalues and eigenvectors be the same for all J(Φ, Φ^t(x)). All of the above arguments still hold and allow us to reach the same conclusion.

Example 1.3.10:
Consider a modified version of Arnold's cat map:

Φ(x1, x2) = (2x1 + 2x2 mod 1, x1 + 2x2 mod 1),

in which case

J(Φ, (x1, x2)) = [ 2  2 ]
                 [ 1  2 ].

Note that J(Φ, (x1, x2)) is not symmetric, but that it is the same throughout the trajectory beginning at any point (x1, x2) (as long as the trajectory never hits a point with an integer coordinate). Thus, Theorem 1.3.3 may be applied. The eigenvalues are given by the roots of

det [ 2 − λ   2     ] = (2 − λ)² − 2,
    [ 1       2 − λ ]

so that the eigenvalues are 2 ± √2. Therefore, the spectrum of Lyapunov exponents is given by ln |2 ± √2|. 

A slight improvement.

We may improve slightly on the previous setting as follows. We consider a differentiable map and
a trajectory such that the Jacobians have d eigenvalues and d − 1 eigenvectors in common. More
precisely, we have the following.

Theorem 1.3.4:
Consider a differentiable map Φ : X → X (X ⊆ Rd ) such that there is a specific x ∈ X for which
the following requirements are met:

1. J(Φ, x), J(Φ, Φ(x)), J(Φ, Φ²(x)), ... each have the same d distinct, non-zero eigenvalues λ_1, ..., λ_d, and

2. there are vectors v_1, v_2, ..., v_{d−1} in R^d such that for all positive integers t we have

J(Φ, Φ^t(x)) v_i = λ_i v_i.

Then the spectrum of Lyapunov exponents of the trajectory of x is given by

ln |λ1 |, ln |λ2 |, . . . , ln |λd |.


Proof:
The proof proceeds similarly to that of the previous section. To start, let

λ_1, λ_2, ..., λ_d

and

v_1, v_2, ..., v_{d−1}

be the d eigenvalues and d − 1 eigenvectors, respectively, of J(Φ, x) (which are the same as those of J(Φ, Φ^t(x)) for all t). Let v_d^t be the dth eigenvector of J(Φ, Φ^{t−1}(x)). The main idea is that the eigenvector v_d^t can be decomposed in terms of the eigenvectors of J(Φ, Φ^t(x)) as

v_d^t = Σ_{i=1}^{d−1} c_i^{t+1} v_i + c_d^{t+1} v_d^{t+1},

for some constants c_1^{t+1}, c_2^{t+1}, ..., c_d^{t+1}.

We begin with a direction y and decompose it as y = Σ_{i=1}^{d−1} c_i^1 v_i + c_d^1 v_d^1. For simplicity, when n > m, let us write

J(n, m) = J(Φ, Φ^{n−1}(x)) J(Φ, Φ^{n−2}(x)) ⋯ J(Φ, Φ^{m−1}(x)).

We now have

J(n, 1) y = J(n, 1) ( Σ_{i=1}^{d−1} c_i^1 v_i + c_d^1 v_d^1 )
          = J(n, 2) ( Σ_{i=1}^{d−1} c_i^1 λ_i v_i + c_d^1 λ_d v_d^1 )
          = J(n, 2) ( Σ_{i=1}^{d−1} c_i^1 λ_i v_i + c_d^1 λ_d ( Σ_{i=1}^{d−1} c_i^2 v_i + c_d^2 v_d^2 ) )
          = J(n, 2) ( Σ_{i=1}^{d−1} ( c_i^1 λ_i + c_d^1 c_i^2 λ_d ) v_i + c_d^1 c_d^2 λ_d v_d^2 )
          = J(n, 3) ( Σ_{i=1}^{d−1} ( c_i^1 λ_i² + c_d^1 c_i^2 λ_d λ_i ) v_i + c_d^1 c_d^2 λ_d² v_d^2 )
          = J(n, 3) ( Σ_{i=1}^{d−1} ( c_i^1 λ_i² + c_d^1 c_i^2 λ_d λ_i + c_d^1 c_d^2 c_i^3 λ_d² ) v_i + c_d^1 c_d^2 c_d^3 λ_d² v_d^3 )
          = J(n, 4) ( Σ_{i=1}^{d−1} ( c_i^1 λ_i³ + c_d^1 c_i^2 λ_d λ_i² + c_d^1 c_d^2 c_i^3 λ_d² λ_i ) v_i + c_d^1 c_d^2 c_d^3 λ_d³ v_d^3 )
          = ⋮
          = Σ_{i=1}^{d−1} ( c_i^1 λ_i^n + c_d^1 c_i^2 λ_d λ_i^{n−1} + c_d^1 c_d^2 c_i^3 λ_d² λ_i^{n−2} + ⋯ + c_d^1 c_d^2 ⋯ c_d^{n−1} c_i^n λ_d^{n−1} λ_i ) v_i
            + c_d^1 c_d^2 c_d^3 ⋯ c_d^n λ_d^n v_d^n.

Similarly, we will let w_1, w_2, ..., w_{d−1}, w_d^t be the eigenvectors of J(Φ, Φ^{t−1}(x))^T. By Lemma .0.9, we may write v_i = b_i^n w_i for i = 1, ..., d − 1, and v_d^n = b_d^n w_d^n, for some constants b_i^n. Furthermore, for t = 2, ..., n there are constants b_i^{t−1} such that

w_d^t = Σ_{i=1}^{d−1} b_i^{t−1} w_i + b_d^{t−1} w_d^{t−1}.

Before we continue, let us simplify the expressions by writing

C = c_d^1 c_d^2 ⋯ c_d^n

and

β_i = c_i^1 λ_i^n + c_d^1 c_i^2 λ_d λ_i^{n−1} + c_d^1 c_d^2 c_i^3 λ_d² λ_i^{n−2} + ⋯ + c_d^1 c_d^2 ⋯ c_d^{n−1} c_i^n λ_d^{n−1} λ_i.    (1.25)

We may now proceed with

y^T J(n, 1)^T J(n, 1) y
  = y^T J(n, 1)^T [ Σ_{i=1}^{d−1} β_i v_i + C λ_d^n v_d^n ]
  = y^T J(n, 1)^T [ Σ_{i=1}^{d−1} β_i b_i^n w_i + C λ_d^n b_d^n w_d^n ]
  = y^T J(n−1, 1)^T [ Σ_{i=1}^{d−1} β_i b_i^n λ_i w_i + C b_d^n λ_d^{n+1} w_d^n ]
  = y^T J(n−1, 1)^T [ Σ_{i=1}^{d−1} β_i b_i^n λ_i w_i + C b_d^n λ_d^{n+1} ( Σ_{i=1}^{d−1} b_i^{n−1} w_i + b_d^{n−1} w_d^{n−1} ) ]
  = y^T J(n−1, 1)^T [ Σ_{i=1}^{d−1} ( β_i b_i^n λ_i + C b_d^n b_i^{n−1} λ_d^{n+1} ) w_i + C b_d^n b_d^{n−1} λ_d^{n+1} w_d^{n−1} ]
  = y^T J(n−2, 1)^T [ Σ_{i=1}^{d−1} ( β_i b_i^n λ_i² + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i ) w_i + C b_d^n b_d^{n−1} λ_d^{n+2} w_d^{n−1} ]
  = y^T J(n−2, 1)^T [ Σ_{i=1}^{d−1} ( β_i b_i^n λ_i² + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i ) w_i
                      + C b_d^n b_d^{n−1} λ_d^{n+2} ( Σ_{i=1}^{d−1} b_i^{n−2} w_i + b_d^{n−2} w_d^{n−2} ) ]
  = y^T J(n−2, 1)^T [ Σ_{i=1}^{d−1} ( β_i b_i^n λ_i² + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i + C b_d^n b_d^{n−1} b_i^{n−2} λ_d^{n+2} ) w_i
                      + C b_d^n b_d^{n−1} b_d^{n−2} λ_d^{n+2} w_d^{n−2} ]
  = y^T J(n−3, 1)^T [ Σ_{i=1}^{d−1} ( β_i b_i^n λ_i³ + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i² + C b_d^n b_d^{n−1} b_i^{n−2} λ_d^{n+2} λ_i ) w_i
                      + C b_d^n b_d^{n−1} b_d^{n−2} λ_d^{n+3} w_d^{n−2} ]
  = ⋮

so that

y^T J(n, 1)^T J(n, 1) y = y^T [ Σ_{i=1}^{d−1} ( β_i b_i^n λ_i^n + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i^{n−1} + C b_d^n b_d^{n−1} b_i^{n−2} λ_d^{n+2} λ_i^{n−2}
                          + ⋯ + C b_d^n b_d^{n−1} ⋯ b_d^2 b_i^1 λ_d^{2n−1} λ_i ) w_i + C b_d^n b_d^{n−1} b_d^{n−2} ⋯ b_d^1 λ_d^{2n} w_d^1 ].

Now, with y^T = Σ_{i=1}^{d−1} c_i^1 v_i^T + c_d^1 (v_d^1)^T, and applying Lemma .0.9, we get

y^T J(n, 1)^T J(n, 1) y = Σ_{i=1}^{d−1} ( β_i b_i^n λ_i^n + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i^{n−1} + C b_d^n b_d^{n−1} b_i^{n−2} λ_d^{n+2} λ_i^{n−2}
                          + ⋯ + C b_d^n b_d^{n−1} ⋯ b_d^2 b_i^1 λ_d^{2n−1} λ_i ) c_i^1 v_i^T w_i
                          + C b_d^n b_d^{n−1} b_d^{n−2} ⋯ b_d^1 λ_d^{2n} c_d^1 (v_d^1)^T w_d^1.

Finally, plugging expression (1.25) in for β_i, we get

y^T J(n, 1)^T J(n, 1) y = Σ_{i=1}^{d−1} [ ( c_i^1 λ_i^{2n} + c_d^1 c_i^2 λ_d λ_i^{2n−1} + c_d^1 c_d^2 c_i^3 λ_d² λ_i^{2n−2}
                          + ⋯ + c_d^1 c_d^2 ⋯ c_d^{n−1} c_i^n λ_d^{n−1} λ_i^{n+1} ) b_i^n
                          + C b_d^n b_i^{n−1} λ_d^{n+1} λ_i^{n−1} + C b_d^n b_d^{n−1} b_i^{n−2} λ_d^{n+2} λ_i^{n−2}
                          + ⋯ + C b_d^n b_d^{n−1} ⋯ b_d^2 b_i^1 λ_d^{2n−1} λ_i ] c_i^1 v_i^T w_i
                          + C b_d^n b_d^{n−1} b_d^{n−2} ⋯ b_d^1 λ_d^{2n} c_d^1 (v_d^1)^T w_d^1.

The expression C b_d^n b_d^{n−1} b_d^{n−2} ⋯ b_d^1 c_d^1 (v_d^1)^T w_d^1 can only be 0 if c_d^1 = 0: b_d^t cannot be 0 for any t ∈ {2, 3, ..., n}, since this would cause w_1, w_2, ..., w_{d−1}, w_d^{t+1} to be linearly dependent, which they are not. Similarly, c_d^t ≠ 0 for any t ∈ {2, 3, ..., n}. Furthermore, (v_d^1)^T w_d^1 ≠ 0 by Lemma .0.10.

Since λ_d always appears with a coefficient c_d^1, this allows us to reach the same conclusion as in the previous section. Namely,

lim_{n→∞} (1/2n) ln( y^T J(n, 1)^T J(n, 1) y ) = ln max{ |λ_i| : c_i^1 ≠ 0 }. 

Example 1.3.11:
Recall the Kaplan-Yorke map:

Φ(x1 , x2 ) = (ax1 mod 1, bx2 + cos(4πx1 ) mod 1).

As long as ax1 ∉ Z and bx2 + cos(4πx1) ∉ Z, we have

J(Φ, (x1, x2)) = [ a                0 ]
                 [ −4π sin(4πx1)    b ].

The eigenvalues of J(Φ, (x1, x2)) are the roots of

det [ a − λ            0     ] = (a − λ)(b − λ)
    [ −4π sin(4πx1)    b − λ ]
  = ab − (a + b)λ + λ²,

which are a and b. The corresponding eigenvectors are

( (a − b) / (−4π sin(4πx1)), 1 )

and

(0, 1),

respectively. Thus, we are in a situation where the eigenvalues remain constant along a trajectory, and all but one of the corresponding eigenvectors remain the same as well. We can apply Theorem 1.3.4 to conclude that the Lyapunov exponents of Φ at x are ln |a| and ln |b|, as long as the trajectory starting at x never hits any of the discontinuities at integer values. 
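As a numerical cross-check of this example, one can accumulate the Jacobians along a trajectory and keep the product well-conditioned with repeated QR factorizations; the logarithms of the diagonal entries of R then average to the spectrum. This standard QR-based technique is not described in the thesis; the sketch below is ours, with a = 3 and b = 0.4 chosen arbitrarily.

import numpy as np

a, b = 3.0, 0.4

def ky(p):
    x1, x2 = p
    return np.array([(a * x1) % 1.0, (b * x2 + np.cos(4 * np.pi * x1)) % 1.0])

def jac(p):
    return np.array([[a, 0.0],
                     [-4 * np.pi * np.sin(4 * np.pi * p[0]), b]])

x = np.array([0.12345, 0.6789])
Q = np.eye(2)
sums = np.zeros(2)
n = 20000
for _ in range(n):
    Q, R = np.linalg.qr(jac(x) @ Q)    # re-orthonormalize the tangent frame
    sums += np.log(np.abs(np.diag(R)))
    x = ky(x)
print(sums / n)    # approximately [ln 3, ln 0.4] = [1.099, -0.916]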

Example 1.3.9 can be found in [16]. The exact Lyapunov exponents calculated in Example 1.3.11
are stated in [36], but without any explanations. Our justifications through the manipulation of
eigenvectors as we have presented them in Theorems 1.3.3 and 1.3.4 seem to be absent from the
literature.

1.3.8 From a trajectory to its maximal Lyapunov exponent.

Given only the trajectory x^0, x^1, x^2, ... of a point in R^d under some otherwise unknown differentiable map Φ : R^d → R^d, we can estimate the maximal Lyapunov exponent of Φ at x^0 by choosing iteratively suitable nonnegative integers s(0), s(1), s(2), ... until the sequence of averages

(1/n) Σ_{t=0}^{n−1} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| )    (1.26)

shows signs of convergence, at which time we return an estimate of its limit as an estimate of
the maximal Lyapunov exponent of the trajectory. (What we are actually estimating here is the
global Lyapunov exponent of Φ at x0 with respect to direction xs(0) − x0 ; as noted at the end of Sec-
tion 1.3.6, this Lyapunov exponent is likely to be the maximal Lyapunov exponent of the trajectory.)

This procedure is a close relative of the procedure described in Section 1.3.5: with y^t standing for x^{s(t)} − x^t, we have

||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| = ||Φ(x^t + y^t) − Φ(x^t)|| / ||y^t||.
With this in mind, we must establish how to choose s(t) in a suitable fashion.

How to choose s(t).

Given a nonnegative threshold z⁻ and a positive threshold z⁺, representing distances that are considered to be too close (possibly due to noise) and too far, respectively, we proceed as follows. Choose s(0) to make ||x^{s(0)} − x^0|| as small as possible but larger than z⁻. The policy of Section 1.3.5 would guide us to choose each subsequent s(t) so that it is different from t and so that

x^{s(t)} ≈ x^t + δ_t ( x^{s(t−1)+1} − x^t ),

with each δ_t a (possibly negative) number small enough in magnitude to ensure that

Φ(x^{s(t)}) − Φ(x^t) ≈ J(Φ, x^t)(x^{s(t)} − x^t),    (1.27)

but big enough to ensure that the distance between x^{s(t)} and x^t is above the prescribed level of noise, z⁻. Let us write

C_t = { x^k : z⁻ < ||x^k − x^t|| < z⁺ }.

If x^{s(t−1)+1} ∈ C_t then we declare δ_t = 1 to be small enough to satisfy (1.27), and we simply set s(t) = s(t−1) + 1. Otherwise we consider all candidates in the set {x^{s(t−1)+1}} ∪ C_t. From this set we choose x^{s(t)} so that x^{s(t)} − x^t approximates a multiple of x^{s(t−1)+1} − x^t that is small in magnitude. In order to do this, we note the following.

All multiples of x^{s(t−1)+1} − x^t lie on a line passing through the origin; all multiples of x^k − x^t lie on another line passing through the origin; the cosine of the angle between these two lines is

( (x^{s(t−1)+1} − x^t) · (x^k − x^t) ) / ( ||x^{s(t−1)+1} − x^t|| ||x^k − x^t|| ).    (1.28)

This quantity lies between −1 and 1. Choices of k for which x^k − x^t is a good approximation of a multiple of x^{s(t−1)+1} − x^t will have the magnitude of this quantity lying closer to 1. Consequently, candidates for s(t) are ranked by a partial order ⪯: with c_t(k) standing for the magnitude of the value in (1.28), we have

||x^i − x^t|| ≤ ||x^j − x^t|| and c_t(i) ≥ c_t(j) ⇒ i ⪰ j.    (1.29)

The selected s(t) is a maximal element in this partial order.

The fact that ⪯ is only a partial ordering leads to some ambiguity: different rules for breaking
ties may lead to different outputs from the algorithm. In order to remove this ambiguity we will
assume that we are given a linear order that satisfies (1.29). We will refer to this input as an operator
(we can imagine someone manually running the algorithm and hand-picking a partner for each xt ).
The following two definitions will make this idea precise.

Definition 1.3.4:
We call a linear order B on [0, 1] × (0, ∞) monotone if for all c1 , c2 , d1 , d2 such that c1 , c2 ∈ [0, 1]
and d1 , d2 ∈ (0, ∞) we have

c1 ≥ c2 and d1 ≤ d2 ⇒ (c1 , d1 ) B (c2 , d2 ).

Definition 1.3.5:
By an operator we mean a triple (z − , z + , B) such that

• z − ≥ 0,
• z + > z − , and
• B is a monotone order on [0, 1] × (0, ∞).

Given a sequence of points x^0, x^1, x^2, ... in R^d and an index t, an operator (z⁻, z⁺, B) selects s(t) as follows:

1: if t = 0 then
2:     s(t) = argmin_k { ||x^k − x^0|| : z⁻ < ||x^k − x^0|| }
3: else if t > 0 then
4:     if z⁻ < ||x^{s(t−1)+1} − x^t|| < z⁺ then
5:         s(t) = s(t−1) + 1
6:     else
7:         let C = { k : z⁻ < ||x^k − x^t|| < z⁺ }
8:         let D = { ( |(x^k − x^t) · (x^{s(t−1)+1} − x^t)| / ( ||x^k − x^t|| ||x^{s(t−1)+1} − x^t|| ), ||x^k − x^t|| ) : k ∈ {s(t−1)+1} ∪ C }
9:         s(t) = the k whose pair is the maximum of D with respect to B

Algorithm 1.3.1: Selecting s(t).

Note that in the case where the set C is empty (there are no valid candidates) we resort to setting s(t) = s(t−1) + 1, even if ||x^{s(t−1)+1} − x^t|| ≤ z⁻ or ||x^{s(t−1)+1} − x^t|| ≥ z⁺. We are not left with any other choice.

We will now use λ(x0 , (z − , z + , B)) to denote the maximal Lyapunov exponent of the sequence
x0 , x1 , . . . with respect to the operator (z − , z + , B), which can be computed via the sum in (1.26).
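For concreteness, here is our Python sketch of the whole procedure for a finite stretch of points: it selects s(t) essentially as above, using one concrete distance biased order B (smaller distance wins, larger magnitude of cosine breaks ties; see Definition 1.3.6 below), and returns the average (1.26). Degenerate situations, such as duplicate points or an empty candidate set, are elided.

import numpy as np

def mle_estimate(xs, z_minus=0.0, z_plus=float("inf")):
    xs = np.asarray(xs, dtype=float)
    n = len(xs) - 1                  # number of evolution steps
    kmax = n - 1                     # s(t) <= kmax, so that x^{s(t)+1} exists
    dist = lambda i, j: np.linalg.norm(xs[i] - xs[j])
    total, s = 0.0, None
    for t in range(n):
        if t == 0:
            cand = [k for k in range(1, kmax + 1) if dist(k, 0) > z_minus]
            s = min(cand, key=lambda k: dist(k, 0))
        elif s + 1 <= kmax and z_minus < dist(s + 1, t) < z_plus:
            s = s + 1                # the default choice
        else:
            ref = xs[s + 1] - xs[t]
            cand = {k for k in range(kmax + 1)
                    if k != t and z_minus < dist(k, t) < z_plus}
            if s + 1 <= kmax:
                cand.add(s + 1)      # the default is always a candidate
            def rank(k):             # a concrete distance biased order B
                v = xs[k] - xs[t]
                c = abs(v @ ref) / (np.linalg.norm(v) * np.linalg.norm(ref))
                return (dist(k, t), -c)
            s = min(cand, key=rank)
        total += np.log(dist(s + 1, t + 1) / dist(s, t))
    return total / n

Applied to a finite trajectory-like sequence, the returned value plays the role of λ(x^0, (z⁻, z⁺, B)).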

Distance biased operators.

One particularly simple class of operators uses angles only to break ties.

Definition 1.3.6:
A linear order B is distance biased if for all c_1, c_2, d_1, d_2 such that c_1, c_2 ∈ [0, 1] and d_1, d_2 ∈ (0, ∞) we have

d_1 < d_2 ⇒ (c_1, d_1) B (c_2, d_2)

and

d_1 = d_2 and c_1 ≥ c_2 ⇒ (c_1, d_1) B (c_2, d_2).

We will call (z − , z + , B) a distance biased operator if B is distance biased.

We will consider distance biased operators in more detail in Section 2.2.

A seminal operator.

The operator described in a seminal paper of Wolf et al. [40] is as follows.


The choice of s(0) is such that xs(0) − x0 is as small as possible, but above the prescribed level
of noise, z − .
When t > 0, there are four values α1 , α2 , α3 , α4 such that

1 ≥ α1 > α2 > α3 > α4 ≥ 0

and four distances z1 , z2 , z3 , z4 such that

z1 < z2 < z3 < z4 .

Letting α_t(k) stand for the magnitude of the cosine

( (x^{s(t−1)+1} − x^t) · (x^k − x^t) ) / ( ||x^{s(t−1)+1} − x^t|| ||x^k − x^t|| ),

if there is no k with α_t(k) ≥ α_4 and z⁻ < ||x^k − x^t|| ≤ z_4, then we set

s(t) = s(t−1) + 1.    (1.30)

Otherwise we find

• first the smallest m for which there is a k with α_t(k) ≥ α_m and z⁻ < ||x^k − x^t|| ≤ z_4,

• then the smallest n for which there is a k with α_t(k) ≥ α_m and ||x^k − x^t|| ≤ z_n,

• and finally a k that maximizes α_t(k) subject to z⁻ < ||x^k − x^t|| ≤ z_n;

then we set s(t) = k. (The “fixed evolution time” program of Wolf et al. [40] resorts to this policy
only when t is a positive integer multiple of a prescribed parameter evolv and uses the default
(1.30) for all other values of t.)
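The three-stage search can be implemented as a nested filter. The following is our Python sketch of one selection step; the α's and z's are free parameters of the operator, and the specific filtering below is our reading of the description, not code from [40].

import numpy as np

def wolf_select(xs, t, s_prev, z_minus, alphas, zs):
    # alphas = (a1, a2, a3, a4) with 1 >= a1 > a2 > a3 > a4 >= 0, and
    # zs = (z1, z2, z3, z4) with z1 < z2 < z3 < z4.
    ref = xs[s_prev + 1] - xs[t]
    ref = ref / np.linalg.norm(ref)

    def alpha(k):                    # magnitude of the cosine at x^t
        v = xs[k] - xs[t]
        return abs(v @ ref) / np.linalg.norm(v)

    cand = [k for k in range(len(xs)) if k != t
            and z_minus < np.linalg.norm(xs[k] - xs[t]) <= zs[3]
            and alpha(k) >= alphas[3]]
    if not cand:
        return s_prev + 1            # the default (1.30)
    m = min(i for i in range(4) if any(alpha(k) >= alphas[i] for k in cand))
    good = [k for k in cand if alpha(k) >= alphas[m]]
    ns = min(i for i in range(4)
             if any(np.linalg.norm(xs[k] - xs[t]) <= zs[i] for k in good))
    inner = [k for k in good if np.linalg.norm(xs[k] - xs[t]) <= zs[ns]]
    return max(inner, key=alpha)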

Figure 1.15: The ordering given by Wolf et al. The middle point is the current reference
point, xt . The four outer circles correspond to the four distances z1 , z2 , z3 ,
z4 and the innermost circle corresponds to the noise threshold, z − . Dotted
lines correspond to the angles associated with α1 , α2 , α3 , α4 . Points from
regions with smaller numbers are preferred.

The finite case.

In the case of a finite sequence of points x^0, x^1, ..., x^N in R^d coming from some unknown differentiable map Φ, the procedure we just described remains the same, except that instead of waiting for equation (1.26) to show signs of convergence, we simply take the finite sum

(1/N) Σ_{t=0}^{N−1} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| )

as an approximation of the maximal Lyapunov exponent of Φ at x0 .

Let us note here that in this case there is an extra condition imposed on the choice of s(t): namely, that s(t) < N, so that x^{s(t)+1} is a point we have access to. Accordingly, we modify the algorithm for selecting s(t) to the following.

1: if t = 0 then
2:     s(t) = argmin_{0<k<N} { ||x^k − x^0|| : z⁻ < ||x^k − x^0|| }
3: else if t > 0 and s(t−1) + 1 < N then
4:     if z⁻ < ||x^{s(t−1)+1} − x^t|| < z⁺ then
5:         s(t) = s(t−1) + 1
6:     else
7:         let C = { k : z⁻ < ||x^k − x^t|| < z⁺, 0 ≤ k < N }
8:         let D = { ( |(x^k − x^t) · (x^{s(t−1)+1} − x^t)| / ( ||x^k − x^t|| ||x^{s(t−1)+1} − x^t|| ), ||x^k − x^t|| ) : k ∈ {s(t−1)+1} ∪ C }
9:         s(t) = the k (with 0 ≤ k < N) whose pair is the maximum of D with respect to B

Algorithm 1.3.2: Selecting s(t) in the finite case.

1.3.9 Trajectory-like sequences.

We began Section 1.3.8 by supposing we were given a sequence of points x0 , x1 , x2 , . . . in Rd , and


that the sequence came from some unknown differentiable map. In practical situations, however,
it is likely that we will be given only a sequence of points, with no knowledge of what may have
been the generating force behind it. In such instances the method for approximating the maximal
Lyapunov exponent of a sequence of points that we have described may still be applied, as long as
the sequence obeys the deterministic nature of a dynamical system. The following is really what we
are after.

Fact:
Given any sequence of points x0 , x1 , x2 , . . . in Rd , the following are logically equivalent:

1. There exists a map Φ : R^d → R^d such that for all nonnegative integers i, we have Φ(x^i) = x^{i+1}.

2. For all nonnegative integers i, j, k we have that

x^i = x^j ⇒ x^{i+k} = x^{j+k}.

Let us give a name to the sequences for which these statements hold.

Definition 1.3.7:
The sequence of points in R^d given by x^0, x^1, x^2, ... is called trajectory-like if for all nonnegative integers i, j, k we have that

x^i = x^j ⇒ x^{i+k} = x^{j+k}.

We will also refer to a finite sequence of points as trajectory-like if it is the truncated sequence of some infinite trajectory-like sequence.
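For a finite sequence the defining implication can be checked directly. Here is our sketch of such a check; a finite sequence passing it can always be extended to an infinite trajectory-like sequence (for instance, periodically), so the check matches the definition above.

def is_trajectory_like(points):
    # Whenever x^i = x^j, the subsequent points must agree for as long as
    # both runs exist within the finite sequence.
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if points[i] == points[j]:
                for k in range(1, n - j):
                    if points[i + k] != points[j + k]:
                        return False
    return True

print(is_trajectory_like([1, 2, 3, 1, 2]))   # True
print(is_trajectory_like([1, 2, 3, 1, 4]))   # False: 1 is followed by 2 and by 4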

Example 1.3.12:
We should note that a sequence can be trajectory-like without there necessarily existing a continuous map that generates it. Consider, for example, the sequence x^1, x^2, x^3, ... of points in R given by

x^i = 0 + 1/i      if i ≡ 1 (mod 4),
x^i = 1/3 + 1/i    if i ≡ 0 (mod 2),
x^i = 2/3 + 1/i    if i ≡ 3 (mod 4).

No number appears more than once in the sequence, so it is trivially trajectory-like. However, there are points arbitrarily close to 1/3 that get mapped arbitrarily close to 0, and points arbitrarily close to 1/3 that get mapped arbitrarily close to 2/3, so no map generating this sequence can be continuous at 1/3. Of course, this means that trajectory-like sequences need not have a differentiable map generating them. 

With these observations out of the way, how does the algorithm behave on specific trajectory-like sequences? Our first result is about trajectory-like sequences that end where they begin. We start with the following Lemma.

Lemma 1.3.2:
For every trajectory-like sequence x^0, x^1, x^2, ..., x^N of points in R^d, for every operator (z⁻, z⁺, B), and for all integers t such that 0 < t < N we have that ||x^{s(t)} − x^t|| ≤ ||x^{s(t−1)+1} − x^t||.
Proof:
The proof proceeds by examining Algorithm 1.3.2. Let t be any integer such that 0 < t < N. We see that if z⁻ < ||x^{s(t−1)+1} − x^t|| < z⁺, then s(t) = s(t−1) + 1 and the statement holds trivially. If this is not the case, then we note that s(t) is chosen so that it yields a maximal element of the set D. This means that with v standing for x^{s(t)} − x^t, and w standing for x^{s(t−1)+1} − x^t, with respect to B we must have

( |v · w| / (||v|| ||w||), ||v|| ) B ( (w · w) / (||w|| ||w||), ||w|| ) = (1, ||w||).

Since

|v · w| / (||v|| ||w||) ≤ 1,

if we had ||v|| > ||w||, then the monotonicity of B would also give (1, ||w||) B ( |v · w| / (||v|| ||w||), ||v|| ); since B is a linear order, the two pairs would then have to be equal, which is impossible when ||v|| > ||w||. Thus we must have ||v|| ≤ ||w||. 

Theorem 1.3.5:
If x^0, x^1, x^2, ..., x^N is a trajectory-like sequence of points in R^d such that x^0 = x^N, then for every operator (z⁻, z⁺, B) such that z⁻ = 0 we have λ(x^0, (z⁻, z⁺, B)) ≥ 0.
Proof:
Recall that the maximal Lyapunov exponent of the trajectory is

(1/N) Σ_{t=0}^{N−1} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| ),

which is equal to

(1/N) Σ_{t=0}^{N−1} [ ln ||x^{s(t)+1} − x^{t+1}|| − ln ||x^{s(t)} − x^t|| ].

By the previous Lemma, we must have

||x^{s(t)+1} − x^{t+1}|| ≥ ||x^{s(t+1)} − x^{t+1}||

for all integers t such that 0 ≤ t ≤ N − 2. Thus, we have

Σ_{t=0}^{N−1} [ ln ||x^{s(t)+1} − x^{t+1}|| − ln ||x^{s(t)} − x^t|| ]
  = Σ_{t=0}^{N−2} [ ln ||x^{s(t)+1} − x^{t+1}|| − ln ||x^{s(t+1)} − x^{t+1}|| ] + ln ||x^{s(N−1)+1} − x^N|| − ln ||x^{s(0)} − x^0||
  ≥ − ln ||x^{s(0)} − x^0|| + ln ||x^{s(N−1)+1} − x^N||.

Since z⁻ = 0 and we know that s(0) = argmin_k { ||x^k − x^0|| : z⁻ < ||x^k − x^0|| }, we must have

||x^{s(N−1)+1} − x^0|| ≥ ||x^{s(0)} − x^0||,    (1.31)

unless ||x^{s(N−1)+1} − x^0|| ≤ z⁻ = 0. This cannot be the case, as we will now show. Suppose first that x^0 appears exactly twice in the sequence (as x^0 and x^N). In order to have ||x^{s(N−1)+1} − x^0|| = 0, we must have x^{s(N−1)+1} = x^0, and therefore either s(N−1) + 1 = 0, which is impossible, or s(N−1) + 1 = N, in which case x^{s(N−1)} = x^{N−1}, which is not allowed. Now suppose that x^0 appears more than twice. Let p be the smallest positive integer for which x^0 = x^p. Noting that

||x^{s(N−1)+1} − x^0|| = 0 ⇔ x^{s(N−1)+1} = x^0,

let k be the positive integer such that s(N−1) + 1 = kp. Since the sequence in question is trajectory-like, we have

x^0 = x^p = x^{2p} = ⋯ = x^{kp} = ⋯ = x^N

and therefore

x^{p−1} = x^{2p−1} = ⋯ = x^{kp−1} = ⋯ = x^{N−1}.

We cannot have x^{kp−1} = x^{s(N−1)} = x^{N−1}, because we require ||x^{s(N−1)} − x^{N−1}|| > z⁻.

We conclude that the maximal Lyapunov exponent of x0 , x1 , x2 , . . . , xN −1 , x0 is nonnegative, with


respect to the given operator. 

Note that the proof shows that the statement of the Theorem actually holds for every z⁻ such that z⁻ < ||x^{s(N−1)+1} − x^0||.
This immediately lends itself to the following observation.

Corollary 1.3.1:
Let x0 , x1 , x2 , . . . , xN be a trajectory-like sequence of points in Rd and let (z − , z + , B) be an operator
such that z − = 0. If the sequence of points x0 , x1 , x2 , . . . , xN , x0 is trajectory-like, then its maximal
Lyapunov exponent is nonnegative with respect to (z − , z + , B).

This Corollary is particularly interesting if we apply it to a trajectory-like sequence of points that


has a negative maximal Lyapunov exponent, which we will encounter in Section 2.2.
Note also that if
x0 , x1 , x2 , . . . , xN
has no repeating points and is trajectory-like, then

x0 , x1 , x2 , . . . , xN , x0

is trajectory-like.

1.3.10 Primitive trajectories.

There is a special case in which the sum in equation 1.26 is trivial. Before explicitly stating that
case, we will introduce the notion of a forward-primitive trajectory.

Definition 1.3.8:
A trajectory-like sequence of points x0 , x1 , . . . , xN in Rd is forward-primitive with respect to an
operator (z − , z + , B) if for every integer t such that 0 ≤ t ≤ N − 2 we have s(t) = t + 1.

Here is why these trajectories are special.

Theorem 1.3.6:
If the sequence of trajectory-like points in R^d given by x^0, x^1, ..., x^N is forward-primitive with respect to an operator (z⁻, z⁺, B), then

λ(x^0, (z⁻, z⁺, B)) = (1/N) ( − ln ||x^1 − x^0|| + ln ||x^N − x^{N−1}|| + ln( ||x^{s(N−1)+1} − x^N|| / ||x^{s(N−1)} − x^{N−1}|| ) ).

Proof:
In this case, equation (1.26) simply telescopes and becomes

(1/N) Σ_{t=0}^{N−1} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| )
  = (1/N) ( ln( ||x^{s(0)+1} − x^1|| / ||x^{s(0)} − x^0|| ) + Σ_{t=1}^{N−2} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| )
            + ln( ||x^{s(N−1)+1} − x^N|| / ||x^{s(N−1)} − x^{N−1}|| ) )
  = (1/N) ( ln( ||x^2 − x^1|| / ||x^1 − x^0|| ) + Σ_{t=1}^{N−2} [ ln ||x^{t+2} − x^{t+1}|| − ln ||x^{t+1} − x^t|| ]
            + ln( ||x^{s(N−1)+1} − x^N|| / ||x^{s(N−1)} − x^{N−1}|| ) )
  = (1/N) ( − ln ||x^1 − x^0|| + ln ||x^N − x^{N−1}|| + ln( ||x^{s(N−1)+1} − x^N|| / ||x^{s(N−1)} − x^{N−1}|| ) ). 

Similarly, a trajectory may be backward-primitive.

Definition 1.3.9:
A trajectory-like sequence of points x^0, x^1, ..., x^N in R^d is backward-primitive with respect to an operator (z⁻, z⁺, B) if for every integer t such that 1 ≤ t ≤ N − 1 we have s(t) = t − 1.

Theorem 1.3.7:
If the sequence of trajectory-like points in R^d given by x^0, x^1, ..., x^N is backward-primitive with respect to an operator (z⁻, z⁺, B), then

λ(x^0, (z⁻, z⁺, B)) = (1/N) ( ln( ||x^{s(0)+1} − x^1|| / ||x^{s(0)} − x^0|| ) + ln ||x^{N−1} − x^N|| − ln ||x^0 − x^1|| ).

Proof:
Once again, equation (1.26) telescopes and we get

λ = (1/N) Σ_{t=0}^{N−1} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| )
  = (1/N) ( ln( ||x^{s(0)+1} − x^1|| / ||x^{s(0)} − x^0|| ) + Σ_{t=1}^{N−1} ln( ||x^{s(t)+1} − x^{t+1}|| / ||x^{s(t)} − x^t|| ) )
  = (1/N) ( ln( ||x^{s(0)+1} − x^1|| / ||x^{s(0)} − x^0|| ) + Σ_{t=1}^{N−1} [ ln ||x^t − x^{t+1}|| − ln ||x^{t−1} − x^t|| ] )
  = (1/N) ( ln( ||x^{s(0)+1} − x^1|| / ||x^{s(0)} − x^0|| ) + ln ||x^{N−1} − x^N|| − ln ||x^0 − x^1|| ). 

Chapter 2

Time Series and Lyapunov Exponents

2.1 The maximal Lyapunov exponent of a time series.

Given positive integers d, τ and a sequence T = ξ_0, ξ_1, ξ_2, ... of real numbers, let us write

E(T, τ, d) = x^0, x^1, x^2, ...,

where

x^t = (ξ_t, ξ_{t+τ}, ξ_{t+2τ}, ..., ξ_{t+(d−1)τ})
for all nonnegative integers t. This process is usually found in the literature under the term phase
space reconstruction or method of delays.
If E(T, τ, d) is trajectory-like then we refer to its maximal Lyapunov exponent (computed as in Sec-
tion 1.3.8) as the maximal Lyapunov exponent of T in dimension d and with time delay τ with respect
to the operator (z − , z + , B), and we will denote this by λ(E(T, τ, d), (z − , z + , B)). This notation is
becoming cumbersome, but we will leave it this way as a reminder that there is no such thing as
“the maximal Lyapunov exponent of a time series”: many parameters are required.
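In code, the method of delays is short; here is our sketch of E(T, τ, d):

import numpy as np

def delay_embed(series, tau, d):
    # E(T, tau, d): send xi_0, xi_1, ... to the points
    # x^t = (xi_t, xi_{t+tau}, ..., xi_{t+(d-1)tau}).
    series = np.asarray(series, dtype=float)
    n = len(series) - (d - 1) * tau          # number of embedded points
    return np.stack([series[t: t + (d - 1) * tau + 1: tau] for t in range(n)])

print(delay_embed([0.1, 0.4, 0.9, 0.2, 0.7, 0.3], tau=1, d=2))
# [[0.1 0.4] [0.4 0.9] [0.9 0.2] [0.2 0.7] [0.7 0.3]]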

Takens’ Theorem.

Here are a few comments on the notion we have just described. When a map Φ : X → X describes the deterministic evolution of some physical system, measurements in this system define functions π : X → R. The trajectory x^0, x^1, x^2, ... of a state x^0 under Φ can often be reconstructed from the measured values π(x^0), π(x^1), π(x^2), .... The celebrated and often-cited Theorem of Takens (proved in 1981; see [38] for the proof or [32] for an in-depth discussion) states that under certain conditions (for instance, Φ and π must be smooth, that is, have derivatives of all orders), the mapping

x^t ↦ ( π(x^t), π(Φ(x^t)), π(Φ²(x^t)), ..., π(Φ^{d−1}(x^t)) )

is a diffeomorphism (a bijective map between manifolds that is differentiable and has a differentiable inverse), provided d is large enough. Thus, with the notation

x̃^t = ( π(x^t), π(x^{t+τ}), π(x^{t+2τ}), ..., π(x^{t+(d−1)τ}) ),

the asymptotic behaviour of the trajectory

x̃^0, x̃^1, x̃^2, ...

can be thought of as a model of the trajectory

x^0, x^1, x^2, ....

The maximal Lyapunov exponent of the trajectory x^0, x^1, x^2, ... is a quantity that has a physical meaning; the maximal Lyapunov exponent of the trajectory x̃^0, x̃^1, x̃^2, ... (also known as the maximal Lyapunov exponent of the time series π(x^0), π(x^1), π(x^2), ...) is a quantity that we can compute.

Example 2.1.1:
The Hénon map H : R2 → R2 defined by

H(x, y) = (y + 1 − ax2 , bx)

was introduced by Michel Hénon in 1976 (see [13]) and will help clarify the meaning of Takens’
Theorem. In particular, the values of a and b that make the map interesting are a = 1.4 and b = 0.3.
A typical trajectory results in the following set of points in the plane.

Figure 2.1: Hénon attractor with initial condition (0.13, 0.24).

If we ignore the second coordinate of each point along the trajectory, we get a one-dimensional
trajectory. More precisely, we are taking the measurement function to be π(x, y) = x, for every
point (x, y) on the trajectory. This results in a sequence of real numbers, which we may view as a
time series.

Figure 2.2: First coordinates of a trajectory of the Hénon map as a time series.

We may now embed the time series in R² by taking x^t = ( π(H^t(x, y)), π(H^{t+1}(x, y)) ) for all nonnegative integers t.

Figure 2.3: Embedded trajectory.

What results is the familiar shape of the Hénon attractor, although oriented in a different manner.
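The data behind Figures 2.1 through 2.3 can be regenerated in a few lines; here is our sketch, with the initial condition of Figure 2.1:

import numpy as np

def henon(p, a=1.4, b=0.3):
    x, y = p
    return (y + 1.0 - a * x * x, b * x)

traj = [(0.13, 0.24)]
for _ in range(5000):
    traj.append(henon(traj[-1]))
traj = np.array(traj)

xs = traj[:, 0]                            # the time series pi(x, y) = x
emb = np.column_stack([xs[:-1], xs[1:]])   # the embedded points (x_t, x_{t+1})
# e.g. plot emb with matplotlib: plt.scatter(emb[:, 0], emb[:, 1], s=1)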

Finite case.

Before proceeding, let us make note that in the case of a finite time series T = ξ0 , ξ1 , . . . , ξN and
embedding dimension d, the remark about the restriction on s(t) at the end of Section 1.3.8 now
becomes that s(t) cannot be larger than N − d.

2.2 The maximal Lyapunov exponent of strictly monotonic time series.

In order to help build some intuition about the connection between what a time series looks like
and what its maximal Lyapunov exponent is, we should begin by taking a look at some simple
time series. If we can find situations in which the procedure for choosing s(t) from Section 1.3.8 is
straightforward, then the calculation is vastly simplified.

We shall begin by dealing with strictly monotonic time series.

Definition 2.2.1:
A time series T = ξ_0, ξ_1, ξ_2, ... is strictly monotonic if it is strictly increasing or strictly decreasing.

Note that if T is strictly monotonic then E(T, τ, d) is trivially trajectory-like.


For the rest of this section we will assume that we are given a distance biased operator (operators
that only consider angles in order to break distance ties). Furthermore, since we are in a pure setting
we will assume that our time series are noiseless, so that we may take z − = 0. For this same reason,
it will be with no loss of generality that we may set τ = 1.

Lemma 2.2.1:
If T = ξ0 , ξ1 , ξ2 , . . . , ξN is a strictly monotonic time series then for every distance biased operator
(0, z + , B) we have s(0) = 1.

Proof:
Recall that

s(0) = argmin_{0<k<N−d+1} { ||x^k − x^0|| : z⁻ < ||x^k − x^0|| }.

Suppose that T is strictly increasing. Then for every positive integer k such that k < N − d + 1 we have

ξ_0 < ξ_1 < ξ_2 < ⋯ < ξ_k

and therefore

ξ_k − ξ_0 > ξ_{k−1} − ξ_0 > ξ_{k−2} − ξ_0 > ⋯ > ξ_1 − ξ_0.

Since each of these differences is positive, we have

(ξ_k − ξ_0)² > (ξ_{k−1} − ξ_0)² > (ξ_{k−2} − ξ_0)² > ⋯ > (ξ_1 − ξ_0)².

Thus, for every integer k such that 1 < k < N − d + 1, we have

0 < ||x^1 − x^0|| = √( Σ_{i=0}^{d−1} (ξ_{i+1} − ξ_i)² ) < √( Σ_{i=0}^{d−1} (ξ_{i+k} − ξ_i)² ) = ||x^k − x^0||.

Therefore s(0) = 1. The proof for the case where T is strictly decreasing is similar. 

Lemma 2.2.2:
If T = ξ_0, ξ_1, ξ_2, ..., ξ_N is a strictly monotonic time series, then

1. for all integers k such that t + 1 < k ≤ N − d + 1 we have ||x^{t+1} − x^t|| < ||x^k − x^t||, and

2. for all integers k such that 0 ≤ k < t − 1 we have ||x^{t−1} − x^t|| < ||x^k − x^t||.

Proof:
Suppose that T is strictly increasing. To prove the first claim, note that for all integers k such that t + 1 < k ≤ N − d + 1, we have

||x^{t+1} − x^t|| = √( Σ_{i=0}^{d−1} (ξ_{t+1+i} − ξ_{t+i})² ) < √( Σ_{i=0}^{d−1} (ξ_{k+i} − ξ_{t+i})² ) = ||x^k − x^t||,

since k > t + 1 implies ξ_{t+i} < ξ_{t+1+i} < ξ_{k+i} for every i.

Similarly, the second claim follows by noting that for all integers k such that 0 ≤ k < t − 1, we have

||x^{t−1} − x^t|| = √( Σ_{i=0}^{d−1} (ξ_{t−1+i} − ξ_{t+i})² ) < √( Σ_{i=0}^{d−1} (ξ_{k+i} − ξ_{t+i})² ) = ||x^k − x^t||,

since k < t − 1 implies ξ_{k+i} < ξ_{t−1+i} < ξ_{t+i} for every i. The case where T is strictly decreasing is handled similarly. 

Lemma 2.2.3:
If T = ξ_0, ξ_1, ξ_2, ..., ξ_N is a strictly monotonic time series, then for every integer t such that 1 ≤ t ≤ N − d and for every distance biased operator (0, z⁺, B) we have that either s(t) = t + 1 or s(t) = t − 1.
Proof:
We will prove this by induction on t. When t = 0, the statement holds by Lemma 2.2.1. Suppose now that the statement is true for t − 1, which means that s(t−1) = t − 2 or s(t−1) = t.

If

0 < ||x^{s(t−1)+1} − x^t|| < z⁺,

then our algorithm for choosing s(t) simply sets s(t) = s(t−1) + 1, which is either t − 1 or t + 1, so the statement holds.

If

||x^{s(t−1)+1} − x^t|| ≥ z⁺,

then if there is no k for which 0 < ||x^k − x^t|| < z⁺, our algorithm sets s(t) = s(t−1) + 1 and we are done by the arguments in the previous case. Otherwise, by Lemma 2.2.2 we set s(t) = t − 1 or s(t) = t + 1.

Finally, it cannot be the case that

||x^{s(t−1)+1} − x^t|| = 0,

since if s(t−1) = t − 2 this would give x^{t−1} = x^t, which is impossible since T is strictly monotonic; similarly, if s(t−1) = t then it would give x^{t+1} = x^t, which is also impossible since T is strictly monotonic. 

Our plan from here is to get our hands on a family of time series that will turn out to be primitive,
in either the forward or backward sense, as defined in Section 1.3.10. In order to arrive at this, let
us introduce the notion of convex and concave sequences.

Definition 2.2.2:
If ξ0 , ξ1 , ξ2 , . . . , ξN is a sequence of real numbers such that for all integers t such that 0 ≤ t ≤
N − 2 we have ξt+1 < (ξt + ξt+2 )/2, then the sequence is strictly convex.

Definition 2.2.3:
If ξ0 , ξ1 , ξ2 , . . . , ξN is a sequence of real numbers such that for all integers t such that 0 ≤ t ≤
N − 2 we have ξt+1 > (ξt + ξt+2 )/2, then the sequence is strictly concave.

Lemma 2.2.4:
Let T = ξ_0, ξ_1, ξ_2, ..., ξ_N be a strictly monotonic time series. Then the sequence of points E(T, 1, d) = x^0, x^1, ..., x^{N−d+1} in R^d is such that

1. if T is strictly convex then for all positive integers t and k such that t < k ≤ N − d + 1 we have ||x^t − x^{t−1}|| < ||x^k − x^{k−1}||;

2. if T is strictly concave then for all positive integers t and k such that t < k ≤ N − d + 1 we have ||x^t − x^{t−1}|| > ||x^k − x^{k−1}||.

Proof:
Let us fix an integer t, where 0 ≤ t ≤ N − d. First let us show that the first statement is true for k = t + 1. For every integer i such that 0 ≤ i + t ≤ N − 2 we have 2ξ_{i+t} < ξ_{i+t−1} + ξ_{i+t+1} by the definition of a strictly convex time series.

Thus, if T is strictly increasing then ξ_{i+t} − ξ_{i+t−1} < ξ_{i+t+1} − ξ_{i+t}, which implies (ξ_{i+t} − ξ_{i+t−1})² < (ξ_{i+t+1} − ξ_{i+t})².

If T is strictly decreasing then ξ_{i+t} − ξ_{i+t+1} < ξ_{i+t−1} − ξ_{i+t}, which implies (ξ_{i+t} − ξ_{i+t+1})² < (ξ_{i+t−1} − ξ_{i+t})².

In either case, this implies

√( Σ_{i=0}^{d−1} (ξ_{t−1+i} − ξ_{t+i})² ) < √( Σ_{i=0}^{d−1} (ξ_{t+1+i} − ξ_{t+i})² ),

and thus ||x^{t−1} − x^t|| < ||x^{t+1} − x^t||. Iterating this inequality gives us the desired result. The proof in the concave case follows along the same lines. 

We are now ready to return to the notion of a primitive trajectory.

Theorem 2.2.1:
If T = ξ_0, ξ_1, ξ_2, ..., ξ_N is a strictly monotonic time series, then

1. if T is strictly convex, then for every distance biased operator (0, z⁺, B) such that ||x^0 − x^1|| < z⁺ ≤ ||x^1 − x^2|| we have that E(T, 1, d) is backward-primitive;

2. if T is strictly concave, then for every distance biased operator (0, z⁺, B) we have that E(T, 1, d) is forward-primitive.

Proof:
Let us first prove the second statement by induction. We know that s(0) = 1 by Lemma 2.2.1. Suppose now that s(t−1) = t for some t such that t < N − d − 1. If 0 < ||x^{s(t−1)+1} − x^t|| < z⁺ then we are done, since our algorithm sets s(t) = s(t−1) + 1 = t + 1. Otherwise, by Lemma 2.2.3 there is only one other candidate to consider: x^{t−1}. However, by the previous Lemma we know that ||x^{t−1} − x^t|| > ||x^{t+1} − x^t|| ≥ z⁺, so that x^{t−1} is not a candidate either. Thus, the algorithm chooses s(t) = t + 1 by default, and we are done.

Let us now prove the first statement. First, we know that s(0) = 1. We will prove the statement by induction on t. Note that we have chosen z⁺ conveniently enough so that s(1) = 0. Now take any t such that t < N − d. We inductively assume s(t−1) = t − 2. We have two candidates to consider for s(t): t − 1 and t + 1. By the previous Lemma, ||x^{t+1} − x^t|| > ||x^2 − x^1|| ≥ z⁺, so that our algorithm resorts to the default choice of s(t) = s(t−1) + 1 = t − 1. 

Corollary 2.2.1:
Let T = ξ_0, ξ_1, ..., ξ_N be a strictly monotonic time series. We have the following.

1. If T is strictly convex then for every distance biased operator (0, z⁺, B) such that ||x^0 − x^1|| < z⁺ ≤ ||x^1 − x^2|| we have

λ(E(T, 1, d), (0, z⁺, B)) = ( −2 ln ||x^1 − x^0|| + ln ||x^2 − x^1|| + ln ||x^{N−d+1} − x^{N−d}|| ) / (N − d + 1).    (2.1)

2. If T is strictly concave then for every distance biased operator (0, z⁺, B) we have

λ(E(T, 1, d), (0, z⁺, B)) = ( − ln ||x^1 − x^0|| − ln ||x^{N−d} − x^{N−d−1}|| + 2 ln ||x^{N−d+1} − x^{N−d}|| ) / (N − d + 1).    (2.2)

Proof:
In either case, by Lemma 2.2.1 we know that s(0) = 1. Let us suppose that T is strictly convex. Then by Theorem 2.2.1 we know that E(T, 1, d) is backward-primitive, and thus we can apply Theorem 1.3.7.

Suppose now that T is strictly concave. Then by Theorem 2.2.1 we know that E(T, 1, d) is forward-primitive, and we can apply Theorem 1.3.6. 

Corollary 2.2.2:
Let T = ξ_0, ξ_1, ..., ξ_N be a strictly monotonic time series that is strictly convex. If we let T̄ denote T in reverse order (T̄ = ω_0, ω_1, ..., ω_N, where ω_i = ξ_{N−i}), then for every distance biased operator (0, z⁺, B) such that ||x^0 − x^1|| < z⁺ ≤ ||x^1 − x^2|| we have

λ(E(T, 1, d), (0, z⁺, B)) = −λ(E(T̄, 1, d), (0, z⁺, B)).

Proof:
For every d and for all integers i such that 0 ≤ i ≤ N − d + 1, let us write

y^i = (ξ_{N−i}, ξ_{N−i−1}, ..., ξ_{N−i−d+1}),

so that E(T̄, 1, d) = y^0, y^1, ..., y^{N−d+1}. Now we may note that for every integer i such that 1 ≤ i ≤ N − d + 1 we have

||y^i − y^{i−1}|| = ||(ξ_{N−i} − ξ_{N−i+1}, ξ_{N−i−1} − ξ_{N−i}, ..., ξ_{N−i−d+1} − ξ_{N−i−d+2})|| = ||x^{N−i−d+1} − x^{N−i−d+2}||.

If T is strictly convex and strictly monotonic then T̄ is strictly concave and strictly monotonic. Thus we may apply equation (2.2) to get that

λ(E(T̄, 1, d), (0, z⁺, B)) = ( − ln ||y^1 − y^0|| − ln ||y^{N−d} − y^{N−d−1}|| + 2 ln ||y^{N−d+1} − y^{N−d}|| ) / (N − d + 1)
                          = ( − ln ||x^{N−d} − x^{N−d+1}|| − ln ||x^1 − x^2|| + 2 ln ||x^0 − x^1|| ) / (N − d + 1)
                          = −λ(E(T, 1, d), (0, z⁺, B)),

where the last equality is equation (2.1). 

Similarly, we have the following.

Corollary 2.2.3:
Let T = ξ_0, ξ_1, ..., ξ_N be a strictly monotonic time series that is strictly concave. If we let T̄ denote T in reverse order (T̄ = ω_0, ω_1, ..., ω_N, where ω_i = ξ_{N−i}), then for every distance biased operator (0, z⁺, B) such that ||x^{N−d+1} − x^{N−d}|| < z⁺ ≤ ||x^{N−d} − x^{N−d−1}|| we have

λ(E(T, 1, d), (0, z⁺, B)) = −λ(E(T̄, 1, d), (0, z⁺, B)).

As we have mentioned before, there is often much emphasis placed on whether the maximal Lyapunov exponent is positive or negative. The following Corollary tells us that this is easily deduced for the types of time series we have been considering.

Corollary 2.2.4:
If T = ξ_0, ξ_1, ..., ξ_N is a strictly monotonic time series, then

1. if T is strictly convex, then for every distance biased operator (0, z⁺, B) such that ||x^0 − x^1|| < z⁺ ≤ ||x^1 − x^2|| we have

λ(E(T, 1, d), (0, z⁺, B)) > 0;

2. if T is strictly concave, then for every distance biased operator (0, z⁺, B) we have

λ(E(T, 1, d), (0, z⁺, B)) < 0.

Proof:
Suppose T is strictly convex. Then by equation (2.1) we have

λ(E(T, 1, d), (0, z⁺, B)) = ( −2 ln ||x^1 − x^0|| + ln ||x^2 − x^1|| + ln ||x^{N−d+1} − x^{N−d}|| ) / (N − d + 1).

By Lemma 2.2.4 we know that ||x^2 − x^1|| > ||x^1 − x^0|| and ||x^{N−d+1} − x^{N−d}|| > ||x^1 − x^0||, so that λ(E(T, 1, d), (0, z⁺, B)) > 0. The strictly concave case is dealt with similarly. 

Now that we have an example of a time series that has a negative Lyapunov exponent, we can return to Corollary 1.3.1. Combining it with the second part of Corollary 2.2.4, it tells us that if we take a strictly monotonic, strictly concave time series, its Lyapunov exponent becomes nonnegative by simply appending the first point to the end of the time series.
In other words, given a time series T = ξ_0, ξ_1, ..., ξ_N that is strictly monotonic and strictly concave, for every distance biased operator (0, z⁺, B) we have that

λ(E(T, 1, d), (0, z⁺, B)) < 0,    where T = ξ_0, ξ_1, ..., ξ_N,
λ(E(T_0, 1, 1), (0, z⁺, B)) ≥ 0,  where T_0 = ξ_0, ξ_1, ..., ξ_N, ξ_0,
λ(E(T_1, 1, 2), (0, z⁺, B)) ≥ 0,  where T_1 = ξ_0, ξ_1, ..., ξ_N, ξ_0, ξ_1,
λ(E(T_2, 1, 3), (0, z⁺, B)) ≥ 0,  where T_2 = ξ_0, ξ_1, ..., ξ_N, ξ_0, ξ_1, ξ_2,

and so on.
This unusual result demonstrates that the Lyapunov exponent of a time series may be quite sensitive
to the addition of a single point. In pictures, we have the following.


Figure 2.4: Time series T = ξ0 , ξ1 , . . . , ξ97 , where ξt = t. The first time series has a negative Lya-
punov exponent in every embedding dimension. The second, third and fourth time series have
nonnegative Lyapunov exponents in embedding dimensions 1, 2 and 3, respectively.

Before concluding, let us note that most of the results from this section actually hold for all operators,
and not just distance biased operators. In particular, all of the results for strictly monotonic strictly
concave time series will hold. The fact that consecutive distances shrink means that we will never
be in a situation where we need to consider new candidates; the default is always used. The case of
strictly monotonic strictly convex time series is almost identical, except that the default is not used
for s(1) (the default is 2 in this case). In order to keep the arguments for showing that s(1) = 0
simple, we restricted our attention to distance biased operators only.

Concluding Remarks and Further Work.

We have introduced the necessary background for understanding what Lyapunov exponents are with respect to pure mathematics. In doing so, we have given a treatment of many of the notions typically found in the theory of dynamical systems, in particular those of fixed points, stability, and periodic trajectories. Once a foundation was in place, we described what the spectrum of Lyapunov exponents of a trajectory of a map is, and how it is calculated. We then gave a rigorous exposition of the algorithm given in [40] for estimating the maximal Lyapunov exponent of a trajectory. Finally, we ended Chapter 1 with some definitions which will facilitate future arguments in the theory of Lyapunov exponents.
Chapter 2 explained how the maximal Lyapunov exponent of a time series (with respect to various parameters) can be calculated, via the algorithm found in [40]. Results on strictly monotonic and strictly concave/convex time series were then presented, showing how the previously laid foundation can be used to rigorously prove statements about particular time series, without the use of numerical estimation.
The application of the theory of Lyapunov exponents to time series is complicated and still seems
to require some leaps of faith. We have developed what we feel to be a starting point for a theory
that was in desperate need of clarification. With it in place, the door is open not only to give precise
answers, but to ask precise questions. For example, is there a bound on the maximal Lyapunov
exponent of all time series that take on values within a specified range?
Intimately connected to the theory we have just presented is that of Takens’ Theorem. In a sense,
it seems to be abused in the same way Lyapunov exponents are, and often by the same people. We
hope to provide a similar treatment of this issue in the future, so that it is not blindly appealed to
when it happens to sound convenient.

Bibliography

[1] K. T. Alligood, T. D. Sauer, and J. A. Yorke, Chaos, Springer, 2000.

[2] V. I. Arnold and A. Avez, Ergodic problems in classical mechanics, New York: Benjamin, 1968.
[3] M. Barnsley, Fractals everywhere, Morgan Kaufmann, 2000.
[4] G. D. Birkhoff, Proof of the ergodic theorem, Proc. Natl. Acad. Sci. USA 17 (1931), 656–660.

[5] L. S. Block and W. A. Coppel, Dynamics in one dimension, Lecture notes in mathematics, vol.
1513, Springer-Verlag, 1992.
[6] V. Chvátal, Notes on the maximal Lyapunov exponent of a time series,
http://users.encs.concordia.ca/~chvatal/neuro/lyapunov.pdf, 2009.
[7] I.P. Cornfeld, S.V. Fomin, and Y. G. Sinai, Ergodic theory, Springer-Verlag, 1982.

[8] R. L. Devaney, An introduction to chaotic dynamical systems, Addison-Wesley Publishing Company, 1989.
[9] P. G. Drazin, Nonlinear systems, Cambridge University Press, 1992.
[10] S. N. Elaydi, Discrete chaos, Chapman & Hall/CRC, 2008.

[11] R. H. Enns and G. McGuire, Nonlinear physics with mathematica for scientists and engineers,
Birkhäuser, 2001.
[12] J. Gleick, Chaos: Making a new science, Penguin, 1988.
[13] M. Hénon, A two-dimensional mapping with a strange attractor, Communications in Mathematical Physics 50 (1976), 69–77.
[14] R. C. Hilborn, Sea gulls, butterflies, and grasshoppers: A brief history of the butterfly effect in
nonlinear dynamics, American Journal of Physics 72 (2004), 425–427.
[15] K. Hoffman and R. A. Kunze, Linear algebra, Prentice Hall, 1971.

[16] A. V. Holden, Chaos, John Wiley & Sons Australia, 1986.


[17] R. A. Horn and C. R. Johnson, Matrix analysis, Cambridge University Press, 1990.

[18] L. D. Iasemidis, D.-S. Shiau, J. C. Sackellares, P. M. Pardalos, and A. Prasad, Dynamical
resetting of the human brain at epileptic seizures: Application of nonlinear dynamics and global
optimization techniques, IEEE Transactions On Biomedical Engineering 51 (2004), no. 3.
[19] L. D. Iasemidis, On the dynamics of the human brain in temporal lobe epilepsy, Ph.D. thesis,
University of Michigan, 1991.

[20] J. L. Kaplan and J. A. Yorke, Functional differential equations and approximations of fixed
points, Lecture notes in Mathematics 730 (1979).
[21] E. N. Lorenz, The predictability of hydrodynamic flows, Trans. N. Y. Acad. Sci. 25 (1963), 409–432.

[22] S. Lynch, Dynamical systems with applications using matlab, Birkhäuser, 2004.
[23] R. M. May, Simple mathematical models with very complicated dynamics, Nature 261 (1976),
459.
[24] S. P. Nair, D.-S. Shiau, J. C. Principe, L. D. Iasemidis, P. M. Pardalos, W. M. Norman, P. R. Carney, K. M. Kelly, and J. C. Sackellares, An investigation of EEG dynamics in an animal model of temporal lobe epilepsy using the maximum Lyapunov exponent, Experimental Neurology 216 (2009).
[25] V. Oseledets, A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical
systems, Trudy Moskov. Mat. Obšč 19 (1968), 179–210 (Russian).

[26] V. Oseledets, Oseledets’ theorem, http://www.scholarpedia.org/article/Oseledets_theorem, 2008.
[27] N. H. Packard, J. P. Crutchfield, J. D. Farmer, and R. S. Shaw, Geometry from a time series,
Physical Review Letters 45 (1980), 712–716.
[28] Y. Pesin and V. Climenhaga, Lectures on fractal geometry and dynamical systems, American
Mathematical Society, 2009.
[29] C. Robinson, Dynamical systems: stability, symbolic dynamics, and chaos, CRC, 1998.
[30] D. Ruelle, Chaotic evolution and strange attractors, Lincei Lectures, Cambridge University
Press, 1989.

[31] S. Sastry, Nonlinear systems: analysis, stability, and control, Springer, 1999.
[32] T. Sauer, J. A. Yorke, and M. Casdagli, Embedology, Journal of Statistical Physics 65 (1991).
[33] E. R. Scheinerman, Invitation to dynamical systems, Prentice-Hall, 1996.
[34] H. G. Schuster, Deterministic chaos: an introduction, Wiley-VCH, 1988.

[35] L. Smith, Linear algebra, Springer, 1998.


[36] J. C. Sprott, Chaos and time-series analysis, Oxford University Press, 2003.

[37] S. H. Strogatz, Nonlinear dynamics and chaos: With applications to physics, biology, chemistry
and engineering, Westview Press, 1994.
[38] F. Takens, Detecting strange attractors in turbulence, Dynamical Systems and Turbulence, Lec-
ture Notes in Mathematics, vol. 898, 1981, pp. 366–381.

[39] P. F. Verhulst, Recherches mathématiques sur la loi d’accroissement de la population, Nouv. mém. de l’Académie Royale des Sci. et Belles-Lettres de Bruxelles 18 (1845), 1–41.
[40] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, Determining Lyapunov exponents from
a time series, Physica D: Nonlinear Phenomena 16 (1985), 285–317.

Lyapunov Exponents and Epilepsy.

In this section we will briefly describe some of the claims in the literature ([10], [2]) about Lyapunov
exponents and their relation to the prediction of epileptic seizures.

Introduction.

Sufferers of epilepsy must currently live with the fact that seizures tend to occur spontaneously, often with little or no warning. In attempts to study the disease, patients with severe cases of epilepsy are continuously watched by medical staff for weeks at a time. During these periods EEG (electroencephalogram) or ECoG (electrocorticography) recordings are made. The former results from multiple electrodes placed on the scalp of the patient. The latter is the result of intracranial measurements, which involve invasive surgery so that a grid of electrodes may be placed directly on the cortex of the patient. ECoG recordings therefore yield far cleaner data for analysis, but the procedure is reserved for patients with the most severe forms of the disease.

Figure 5: Montage of electrodes for EEG. (Image borrowed from reference [11]).

To the untrained eye, an EEG/ECoG recording looks like nothing more than an indecipherable
random series of data. However, specialized neurologists working in the field of epilepsy are trained
to read these recordings, and seem to develop some intuition as to what seizure activity looks like.
Mere visual inspection allows them to distinguish between preictal, ictal and postictal stages of a
seizure (corresponding to before the seizure, during the seizure, and after the seizure, respectively).

Figure 6: Temporal seizure. (Image borrowed from reference [11]).

Until recently, visual inspection of EEG data was a time-consuming but unfortunately necessary task for neurologists; reading through days of EEG data had no shortcuts. This was the problem of seizure detection: given an EEG recording, can we tell, to a reliable degree of accuracy, when a seizure might have occurred? Although the solution to this problem may offer only a small improvement, if any, to the quality of life of the patient, it is a great time saver for the medical staff. This problem was essentially solved by Jean Gotman in the 1980s ([1]). Neurologists now have software at their disposal that reads through EEG data and indicates the points in time that might be of interest, allowing the neurologist to ignore much of the reading.
In contrast, the problem of seizure prediction asks: given some continuous EEG data, is a seizure about to occur? If so, when? In 1991, the Ph.D. thesis of Leonidas Iasemidis [6] was one of the first of many publications attempting to use techniques from dynamical systems to solve this problem. In particular, emphasis was placed on the calculation of Lyapunov exponents from a time series via the algorithm developed by Wolf et al. in 1985 [12]. We will now say a few words on how this is done.

How they do it.

A patient is hooked up to the necessary equipment while EEG and/or ECoG recordings are made for hours at a time. These measurements constitute a sequence of real numbers, and this sequence is the time series under our consideration. In fact, each electrode gives rise to its own time series; we mainly focus on just one particular electrode, and hence one particular time series. During the monitoring period, an epileptic seizure occurs, typically lasting a few minutes. Given an EEG/ECoG time series, a time delay τ and an embedding dimension d are chosen. We will not defend or justify the choices here, but for the sake of having some numbers to look at, [2] and [10] each use τ ≈ 14 msec and an embedding dimension of d = 7. With this in place, the time series is embedded using these values in the same way we described in the beginning of Section 2.1, and is now considered as the trajectory of some d-dimensional point. This trajectory is divided into small windows, or epochs, of around 10 to 12 seconds (again, see [10]). Next, a modified³ version of the algorithm found in [12] is run on each of these epochs, resulting in a sequence of values, each of which is referred to as STLmax or sometimes Lmax, depending on the paper ([2] and [10], respectively), and is claimed to be an estimate of the maximal Lyapunov exponent of the epoch, or as “a reflexion of the chaoticity of the signal” ([2]).
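As a concrete sketch of this pipeline, the following embeds a scalar series with delay τ and dimension d and cuts the result into epochs. The sampling rate, the stand-in signal, and the exact epoch length are our own illustrative assumptions, not the choices of [2] or [10].

```python
import numpy as np

fs = 200                            # assumed sampling rate (Hz)
x = np.random.randn(fs * 600)       # stand-in for ten minutes of one EEG channel

tau = max(1, round(0.014 * fs))     # delay of ~14 msec, in samples
d = 7                               # embedding dimension

def delay_embed(x, d, tau):
    """Rows are the delay vectors (x_t, x_{t+tau}, ..., x_{t+(d-1)tau})."""
    n = len(x) - (d - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(d)])

Y = delay_embed(x, d, tau)

# Cut the embedded trajectory into ~10-second epochs; a Wolf-style
# estimator run on each epoch would then yield the sequence of Lmax values.
epoch = 10 * fs
epochs = [Y[i : i + epoch] for i in range(0, len(Y) - epoch + 1, epoch)]
```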

What they find.

The general findings in [10] are that Lmax takes on a mean value⁴ of approximately 6 bits/sec during the preictal stage, with occasional drops to 5 bits/sec. Once a clinical seizure appears to be taking place, a drastic drop in Lmax occurs, reaching as low as 2 bits/sec. Finally, the seizure enters the postictal stage and Lmax jumps back up to approximately 8 bits/sec. The authors claim not only that their algorithms detect these seizures (by drops in Lmax during the ictal stage), but that the drops in the preictal stage, which can occur more than 20 minutes before the onset of the seizure, can be used to predict that a seizure is on the way. The authors refer to these drops as entrainment.
Whether or not these methods have validity or meaning in terms of pure mathematics is difficult to argue. However, the fact that interesting and useful results seem to be produced begs for as much justification as possible, and we hope to continue developing and contributing to this process.

³ The algorithm is essentially the same, but a slightly different operator is used. See [2] for the details.
⁴ The maximal Lyapunov exponent is calculated in base 2 instead of base e as we have presented it.
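Regarding footnote ⁴: a rate reported in bits/sec corresponds, under the natural-logarithm convention of Chapter 1, to multiplication by ln 2; the arithmetic below is ours, for orientation only.

\[
\lambda_{\text{nats/sec}} \;=\; \lambda_{\text{bits/sec}} \cdot \ln 2,
\qquad
6~\text{bits/sec} \;\approx\; 4.16~\text{nats/sec}.
\]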

Bibliography for predicting epileptic seizures

[1] J. Gotman, Automatic recognition of interictal spikes, Electroencephalography and Clinical Neurophysiology Supplement (1985), 93–114.
[2] L. D. Iasemidis, J. C. Principe, and J. C. Sackellares, Measurement and quantification of spatio-
temporal dynamics of human epileptic seizures, Nonlinear Biomedical Signal Processing (2000),
294–318.
[3] L. D. Iasemidis and J. C. Sackellares, The evolution with time of the spatial distribution of the
largest Lyapunov exponent on the human epileptic cortex, Measuring Chaos in the human brain,
World Scientific, 1991.
[4] L. D. Iasemidis, D.-S. Shiau, J. C. Sackellares, P. M. Pardalos, and A. Prasad, Dynamical
resetting of the human brain at epileptic seizures: Application of nonlinear dynamics and global
optimization techniques, IEEE Transactions On Biomedical Engineering 51 (2004), no. 3.
[5] L. D. Iasemidis, K. Tsakalis, J. C. Sackellares, and P. M. Pardalos, Comment on “Inability of
Lyapunov exponents to predict epileptic seizures”, Physical Review Letters 94 (2005).
[6] L. D. Iasemidis, On the dynamics of the human brain in temporal lobe epilepsy, Ph.D. thesis,
University of Michigan, 1991.
[7] A. D. Krystal, C. Zaidmana, H. S. Greenside, R. D. Weinera, and C. E. Coffey, The largest
Lyapunov exponent of the EEG during ECT seizures as a measure of ECT seizure adequacy,
Electroencephalography and Clinical Neurophysiology 103 (1997).
[8] Y.-C. Lai, M. A. F. Harrison, M. G. Frei, and I. Osorio, Inability of Lyapunov exponents to predict epileptic seizures, Physical Review Letters 91 (2003).
[9] S. P. Nair, D.-S. Shiau, J. C. Principe, L. D. Iasemidis, P. M. Pardalos, W. M. Norman, P. R. Carney, K. M. Kelly, and J. C. Sackellares, An investigation of EEG dynamics in an animal model of temporal lobe epilepsy using the maximum Lyapunov exponent, Experimental Neurology 216 (2009).
[10] J. C. Sackellares, L. D. Iasemidis, R. L. Gilmore, and S. N. Roper, Epilepsy - when chaos fails, Chaos in the Brain?, pp. 112–133, World Scientific, 2000.

[11] W. O. Tatum, Handbook of EEG interpretation, Demos Medical Publishing, 2007.
[12] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, Determining Lyapunov exponents from
a time series, Physica D: Nonlinear Phenomena 16 (1985), 285–317.

Linear Algebra

Norms

Definition .0.4:
Let V be a vector space over R. A function || · || : V → R is a norm if for all x and y such that
x, y ∈ V ,

1. ||x|| ≥ 0
2. ||x|| = 0 ⇔ x=0
3. ||cx|| = |c| ||x|| for all c such that c ∈ R

4. ||x + y|| ≤ ||x|| + ||y||

In particular, the only norm we use in this thesis is the Euclidean norm.

Definition .0.5:
When v = (v1, v2, . . . , vn) is a vector in Rn, we use ||v|| to denote the usual Euclidean norm,
\[
\|v\| = \sqrt{\sum_{i=1}^{n} |v_i|^2}.
\]

Definition .0.6:
When M is an m × n matrix, the Euclidean norm of M is

||M || = max{||M v|| : v ∈ Rn and ||v|| ≤ 1}.

Any standard text on linear algebra should contain proofs that these are indeed norms. For instance
see [2] or [1].

We also make use of the following fact.

Lemma .0.5:
If M is an m × n matrix and v ∈ Rn , then ||M v|| ≤ ||M || ||v||.

Proof:
If v = 0 the inequality is trivial, so let v be any non-zero vector in Rn and let w = v/||v||. Note that ||w|| = 1, so that ||Mw|| ≤ ||M|| by definition of the matrix norm. Now
\[
\|Mv\| = \|v\|\,\|Mw\| \le \|M\|\,\|v\|. \qquad\square
\]
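Definition .0.6 and Lemma .0.5 are easy to check numerically: np.linalg.norm(M, 2) computes exactly the maximum in Definition .0.6 (the largest singular value of M). The random M and v below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 5))
v = rng.standard_normal(5)

opnorm = np.linalg.norm(M, 2)   # Euclidean (operator) norm of M
# Lemma .0.5: ||Mv|| <= ||M|| ||v|| (tolerance guards against rounding).
assert np.linalg.norm(M @ v) <= opnorm * np.linalg.norm(v) + 1e-12
```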

Jordan Canonical Form.

Here we will introduce some of the definitions and claims about Jordan normal form. Material from
this section can be found in Section 3.1 of [2], with the exception of Lemmas .0.6 and .0.7.

Definition .0.7:
A Jordan block Jk(λ) is a k × k upper triangular matrix of the form
\[
J_k(\lambda) =
\begin{pmatrix}
\lambda & 1 & & & \\
 & \lambda & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda & 1 \\
 & & & & \lambda
\end{pmatrix}.
\]
The k diagonal entries are all λ, the k − 1 superdiagonal entries are all 1, and all other entries are 0.

Definition .0.8:
A Jordan matrix J is an n × n matrix that is the direct sum of Jordan blocks:
\[
J =
\begin{pmatrix}
J_{n_1}(\lambda_1) & & & \\
 & J_{n_2}(\lambda_2) & & \\
 & & \ddots & \\
 & & & J_{n_k}(\lambda_k)
\end{pmatrix},
\]
where n1 + n2 + · · · + nk = n. The diagonal blocks are Jordan blocks; all other entries are 0.

Theorem .0.2:
Let A be an n × n matrix with entries in C. There is an invertible n × n matrix S such that
\[
A = SJS^{-1},
\]
where
\[
J =
\begin{pmatrix}
J_{n_1}(\lambda_1) & & & \\
 & J_{n_2}(\lambda_2) & & \\
 & & \ddots & \\
 & & & J_{n_k}(\lambda_k)
\end{pmatrix}
\]
and n1 + n2 + · · · + nk = n. Furthermore, λ1, λ2, . . . , λk are the eigenvalues of A, which may not be distinct.
Proof:
See [2]. □

Lemma .0.6:
Let Jk(λ) be a Jordan block, let t be a positive integer, and let aij(t) denote the entry in the ith row and jth column of Jk(λ)^t. Then
\[
a_{i,j}(t) = \binom{t}{j-i}\,\lambda^{t-(j-i)},
\]
with the convention that \binom{t}{i} = 0 if i < 0 or i > t.


Proof:
We will prove the statement by induction on t. When t = 1 the statement is trivial. Now suppose the statement holds for t − 1, so that
\[
a_{i,j}(t-1) = \binom{t-1}{j-i}\,\lambda^{t-1-(j-i)}
\]
for all integers i and j such that 1 ≤ i ≤ k and 1 ≤ j ≤ k. Note that the entry a_{i,j}(t) of Jk(λ)^t is calculated by multiplying Jk(λ)^{t−1} on the right by the relatively simple matrix Jk(λ), so that
\begin{align*}
a_{i,j}(t) &= a_{i,j-1}(t-1) + a_{i,j}(t-1)\,\lambda \\
 &= \binom{t-1}{j-1-i}\,\lambda^{t-1-(j-1-i)} + \binom{t-1}{j-i}\,\lambda^{t-1-(j-i)} \\
 &= \binom{t}{j-i}\,\lambda^{t-(j-i)}. \qquad\square
\end{align*}
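Lemma .0.6 can be sanity-checked numerically; in the sketch below (with arbitrarily chosen k, λ, and t) a matrix power of a Jordan block is compared against the binomial formula.

```python
import numpy as np
from math import comb

k, lam, t = 4, 0.5, 6
J = lam * np.eye(k) + np.diag(np.ones(k - 1), 1)   # the Jordan block J_k(lam)

Jt = np.linalg.matrix_power(J, t)
formula = np.array([[comb(t, j - i) * lam ** (t - (j - i)) if j >= i else 0.0
                     for j in range(k)]
                    for i in range(k)])            # binom(t, j-i) * lam^(t-(j-i))

assert np.allclose(Jt, formula)
```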

Lemma .0.7:
If J is an n × n Jordan matrix, then
\[
J^t =
\begin{pmatrix}
J_{n_1}(\lambda_1)^t & & & \\
 & J_{n_2}(\lambda_2)^t & & \\
 & & \ddots & \\
 & & & J_{n_k}(\lambda_k)^t
\end{pmatrix}
\]
for every positive integer t. (All entries outside the diagonal blocks are 0.)
Proof:
By induction on t. When t = 1 the statement is trivial. Suppose the statement holds for t − 1; then J^t = J^{t−1}J clearly has the correct form by inspecting the product
\[
\begin{pmatrix}
J_{n_1}(\lambda_1)^{t-1} & & \\
 & \ddots & \\
 & & J_{n_k}(\lambda_k)^{t-1}
\end{pmatrix}
\begin{pmatrix}
J_{n_1}(\lambda_1) & & \\
 & \ddots & \\
 & & J_{n_k}(\lambda_k)
\end{pmatrix}. \qquad\square
\]

The Spectral Theorem.

Here we state the theorem as it is found on p. 316 of [3]. Its proof can be found there as well.

Theorem .0.3 (The Spectral Theorem):


Let T : V → V be a self-adjoint⁵ linear transformation on the finite-dimensional inner product space V. Then there exists an orthonormal basis {v1, . . . , vn} of V and numbers λ1, . . . , λn such that
\[
T(v_i) = \lambda_i v_i
\]
for all integers i such that 1 ≤ i ≤ n.

⁵ T* : V → V is the adjoint of T if ⟨Tx, y⟩ = ⟨x, T*y⟩ for all x, y such that x, y ∈ V. Furthermore, T is self-adjoint if T = T*.

In particular, in the case of matrices with entries in R, the adjoint is the transpose, so every matrix M for which M = M^T, that is, every real symmetric matrix, is self-adjoint.
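For real symmetric matrices the conclusion of the theorem can be observed directly with numpy's eigh, which is specialized to self-adjoint matrices; the random matrix below is just an example.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
M = (A + A.T) / 2                       # a real symmetric, hence self-adjoint, matrix

evals, Q = np.linalg.eigh(M)
assert np.allclose(Q.T @ Q, np.eye(4))  # the eigenvectors form an orthonormal basis
assert np.allclose(M @ Q, Q * evals)    # T(v_i) = lambda_i v_i, column by column
```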

Other Lemmas.

The following Lemmas are used in Section 1.3.7.

Lemma .0.8:
Let M be an n × n matrix having n distinct, non-zero eigenvalues. Then M and M T have the same
eigenvalues.
Proof:
Suppose λ is an eigenvalue of M, so that Mv = λv for some non-zero v. Since (M − λI)v = 0, we know that (M − λI) is not invertible. To complete the proof we will show that (M − λI)^T = (M^T − λI) is not invertible either, which allows us to conclude that λ is an eigenvalue of M^T. To this end, let N = M − λI and suppose N^T is invertible. Transposing I = N^T(N^T)^{−1} gives I = ((N^T)^{−1})^T N, so N has a left inverse and, being square, is invertible, which is contrary to our assumption. □

Lemma .0.9:
Let M be an n × n matrix with entries in R having n distinct, non-zero eigenvalues λ1 , . . . , λn . Let
vi and wi (i = 1, . . . , n) be the corresponding eigenvectors of M and M T , respectively. Then vi and
wj are orthogonal whenever i ≠ j.

Proof:
For every i, Mvi = λi vi, and so vi^T M^T = λi vi^T. Now for every j ≠ i, vi^T M^T wj = λi vi^T wj. Since wj is an eigenvector of M^T, it follows that λj vi^T wj = λi vi^T wj. Since λi ≠ λj, we must have vi^T wj = 0. □
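Lemmas .0.9 and .0.10 can be illustrated numerically. In the sketch below (the construction of M is our own) M = SDS⁻¹ has distinct eigenvalues, the columns of S are eigenvectors of M, and the columns of (S⁻¹)^T are eigenvectors of M^T in the same order; the pairings vi^T wj then vanish exactly when i ≠ j.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((4, 4))
D = np.diag([1.0, 2.0, 3.0, 4.0])   # distinct, non-zero eigenvalues
M = S @ D @ np.linalg.inv(S)

V = S                                # columns: eigenvectors of M
W = np.linalg.inv(S).T               # columns: eigenvectors of M^T, matching order

G = V.T @ W                          # G[i, j] = v_i^T w_j
assert np.allclose(G, np.eye(4))     # zero off the diagonal, non-zero on it
```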

Lemma .0.10:
If v1, . . . , vn and w1, . . . , wn are eigenvectors of M and M^T, respectively, and vj = \sum_{i=1}^{n} a_i w_i, then aj ≠ 0 and vj^T wj ≠ 0.

Proof:
We have 0 ≠ vj^T vj = vj^T \sum_{i=1}^{n} a_i w_i = a_j vj^T wj, since vj^T wi = 0 for all i ≠ j by Lemma .0.9. □

Lemma .0.11:
Let M be an n×n matrix with n distinct eigenvalues λ1 , . . . , λn . Then the corresponding eigenvectors
v1 , . . . , vn are linearly independent.

Proof:
Suppose the eigenvectors are linearly dependent. Then there exists a smallest positive integer k with 2 ≤ k ≤ n and constants c1, . . . , ck−1 such that
\[
v_k = c_1 v_1 + c_2 v_2 + \cdots + c_{k-1} v_{k-1}.
\]
Multiplying by M, we get
\[
M v_k = c_1 M v_1 + c_2 M v_2 + \cdots + c_{k-1} M v_{k-1},
\]
which yields
\[
\lambda_k v_k = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \cdots + c_{k-1} \lambda_{k-1} v_{k-1}.
\]
However, we also have
\[
\lambda_k v_k = c_1 \lambda_k v_1 + c_2 \lambda_k v_2 + \cdots + c_{k-1} \lambda_k v_{k-1},
\]
so that
\[
c_1(\lambda_k - \lambda_1) v_1 + c_2(\lambda_k - \lambda_2) v_2 + \cdots + c_{k-1}(\lambda_k - \lambda_{k-1}) v_{k-1} = 0.
\]
By the minimality of k, the vectors v1, . . . , vk−1 are linearly independent, so every coefficient ci(λk − λi) vanishes; since the eigenvalues are distinct, every ci = 0, making vk = 0. This contradicts vk being an eigenvector. □
References

[1] K. Hoffman and R. A. Kunze, Linear algebra, Prentice Hall, 1971.
[2] R. A. Horn and C. R. Johnson, Matrix analysis, Cambridge University Press, 1990.
[3] L. Smith, Linear algebra, Springer, 1998.

