2017 s2 Math3325 Essay
Keeley Hoek
October 27, 2017
In this essay we present a gentle introduction to several fundamental results of Ergodic theory. We do
this as much as possible from the perspective of functional analysis and spectral theory, developing the
necessary measure-theoretic tools as we go. We introduce the fundamental notions of ergodicity and
mixing of systems, naturally motivating the latter from the statement of the Mean Ergodic Theorem,
before discussing applications. Generalisations of the fundamental tools considered here yield proofs
of a vast array of results spanning the breadth of mathematics.
By the use of methods derived from ergodic theory, it is frequently found that previous results
may be sharpened or at least may be produced more efficiently. For instance, Szemerédi’s theorem of
arithmetic combinatorics was first proved via an intricate combinatorial argument, while Furstenberg’s
ergodic approach effectively generalises the Poincaré recurrence theorem—from which Szemerédi’s the-
orem then follows as a corollary. The progress of Einsiedler, Katok, and Lindenstrauss on Littlewood’s
conjecture—namely showing that the set of exceptions to the conjecture has Hausdorff dimension zero—
is another good example.
a natural measure, obtained locally in coordinate charts from the Lebesgue measure. In this context,
Liouville’s theorem guarantees that the measure of any measurable subset of the phase space is invari-
ant under the action of evolution transformations. The fact that many other generalised (non-classical)
systems obey² this property motivates the following definition.
Definition 1.1. Let $(X, \mathcal{M}, \mu)$ be a measure space and let $\psi : X \to X$ be a measurable function. Then the
map $\psi$ is $\mu$-preserving (or measure-preserving), and $\mu$ is $\psi$-invariant, if
\[ \mu(\psi^{-1}(S)) = \mu(S) \]
for every $S \in \mathcal{M}$.
The idea of a measure-preserving transformation extends naturally to maps between measure spaces,
but we will not deal with such maps here. The notion of a µ-preserving map then leads to the following
conveniently packaged object.
Working within the framework of modern measure theory, the following foundational result of er-
godic theory now follows quickly. The theorem answers a question of a form typical of ergodic theory;
for a given initial state, how often will we return to a fixed neighbourhood of that state? In this case, the
answer is (almost certainly) infinitely often.
Theorem 1.3 (Poincaré). Let $(X, \mathcal{M}, \mu, \psi)$ be a probability system, and let $S \in \mathcal{M}$. Then for almost every
$x \in S$ there exists a strictly increasing sequence $(n_k)$ of natural numbers such that $\psi^{n_k}(x) \in S$ for every $k \in \mathbb{N}$
(with an exponent denoting repeated composition).
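has measure zero. Let $\psi^{-k}$ denote the $k$-fold composition $\psi^{-1} \circ \cdots \circ \psi^{-1}$. As $S \setminus \psi^{-n}(S)$ is exactly the set
of $x \in S$ such that $\psi^n(x) \notin S$, it is clear that
\[ N = \liminf_{n \to \infty} \, (S \setminus \psi^{-n}(S)) = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} (S \setminus \psi^{-k}(S)). \]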
with $R \subset X$ (certainly) measurable, and satisfying $\psi^{-i}(R) \cap \psi^{-j}(R) = \emptyset$ for every $i \neq j$. The idea is that
the sets $\psi^{-i}(R)$ “wander” about the finite measure space $X$ for varying $i$, in that each is disjoint from
every other. Assuming the claim, it follows immediately that
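\[
\begin{aligned}
\mu(N) &= \mu\left( \bigcup_{n=1}^{\infty} (S \cap \psi^{-n}(R)) \right) \\
&\leq \mu\left( \bigcup_{n=1}^{\infty} \psi^{-n}(R) \right) \\
&\leq \sum_{n=1}^{\infty} \mu(\psi^{-n}(R)) \\
&\leq \mu(X).
\end{aligned}
\]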
However, we have µ(ψ−n (R)) = µ(R) for every n ∈ N by the µ-invariance of ψ, and hence as µ(X ) = 1 < ∞,
we must have µ(R) = 0. Therefore µ(N ) = 0, as required.
² For generalised systems, we require that there exists a measure on the associated phase space which is invariant under the
evolution transformation.
To see (1), simply note that if x ∈ S then
as desired. It remains to show that the sets $\psi^{-i}(R)$ and $\psi^{-j}(R)$ are pairwise disjoint for $i \neq j$. Suppose
that $x \in \psi^{-i}(R) \cap \psi^{-j}(R)$ for some $i, j \in \mathbb{N}$ with $j \leq i$. Then we have $\psi^{j}(x) \in R$ and $\psi^{i}(x) \in R \subset S$. The
latter condition is equivalent to $\psi^{i-j}(\psi^{j}(x)) \in R \subset S$, which exactly means that the image under $\psi^{i-j}$
of an element of $R$ is an element of $R \subset S$. Positivity of $i - j$ would contradict the definition of $R$, and
therefore $i = j$. This completes the proof.
Poincaré originally proved this result in the case of the gravitational interaction of celestial bodies
([4]). It should be noted that the finiteness of X (with respect to µ) was critical to the proof of the result;
certainly a rocket on a solar escape trajectory (with a presumably infinite volume phase space X ) need
not return to the launch site—and certainly not infinitely often! Despite the ease with which this result
was produced, it is a substantially more difficult problem to characterise the rate at which points return
to any S ∈ M (see for instance [2]).
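As a small numerical illustration of Theorem 1.3 (not part of the argument above), the following Python sketch lists the return times of one orbit to a set $S$. It assumes a specific system chosen only for illustration: the rotation $x \mapsto x + \alpha \pmod 1$ of $[0, 1)$ with Lebesgue measure, which is measure-preserving, with $\alpha = \sqrt{2} - 1$ and $S = [0, 0.1)$. The helper name `return_times` and all parameter values are hypothetical.

```python
import math

def return_times(x0, alpha, lo, hi, n_steps):
    """Times 1 <= n <= n_steps at which the orbit of x0 under
    x -> x + alpha (mod 1) lies in the interval [lo, hi)."""
    times = []
    for n in range(1, n_steps + 1):
        x = (x0 + n * alpha) % 1.0
        if lo <= x < hi:
            times.append(n)
    return times

alpha = math.sqrt(2) - 1          # illustrative irrational rotation number
n_steps = 10_000
times = return_times(0.05, alpha, 0.0, 0.1, n_steps)

# Fraction of the first n_steps iterates that land back in S = [0, 0.1).
return_frequency = len(times) / n_steps
print(times[:5], return_frequency)
```

In this example the starting point keeps returning to $S$ for as long as we iterate, and the observed frequency of returns is close to $\mu(S) = 0.1$. The theorem itself says nothing about this frequency; it only guarantees infinitely many returns for almost every starting point.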
In the nomenclature of ergodic theory, the “ergodicity” of a system refers to the tendency of a given
state to traverse all other possible states as it evolves. Intuitively, this requirement means that the
average behaviour of a system is constant in time, and may be determined by examining the “average”
system state. Many “ergodic theorems” give precise characterisations of the extent to which exactly this
temporal-spatial equivalence holds under specific regimes. Hence we make the following definition.
2 Koopman operators
We now introduce an operator associated to each measurable transformation ψ : X → X . This opera-
tor permits the tools of functional analysis to be brought to bear on the study of ψ, and yields many
spectral/operator-theoretic characterisations of properties, such as the ergodicity, of ψ itself.
Definition 2.1. Given a measure-preserving map $\psi : X \to X$, it is natural to consider the induced map
$U_\psi : L^2(X) \to L^2(X)$, defined by sending $f \mapsto f \circ \psi$. We call $U_\psi$ the Koopman operator associated with $\psi$.
Proof. As ψ is measurable, (2) holds for the characteristic function of any measurable subset of X . The
claim then follows immediately from the definition of the integral in terms of simple functions.
The study of the elementary properties of Koopman operators begins with the observation that they
are linear and respect products of functions. For example, from these facts it will follow that they are
isometries and are unitary when surjective. In fact, acting on the continuous complex-valued functions
$C(X)$ on $X$, the Koopman operator is even a $C^*$-algebra homomorphism when $X$ is compact.³
³ See Theorem 4.13 of [5].
Proposition 2.3. The operator Uψ preserves the inner product on the Hilbert space L 2 (X ) (i.e. Uψ is an
isometry). Furthermore, if ψ is invertible then Uψ is surjective and therefore unitary.
by Lemma 2.2. If $\psi$ is invertible then $U_\psi(f \circ \psi^{-1}) = f \circ \psi^{-1} \circ \psi = f$ for every $f \in L^2(X)$, showing that $U_\psi$
is surjective.
Proposition 2.4. A measure-preserving map ψ : X → X is ergodic if and only if Uψ has the eigenvalue 1
with multiplicity 1.
Proof. Suppose Uψ has the eigenvalue 1 with multiplicity 1, and furthermore that ψ−1 (S) = S for some
S ∈ M . Then χS ◦ ψ = Uψ (χS ) = χS , and hence χS is an eigenfunction of Uψ with eigenvalue 1. Now,
Uψ certainly sends constant functions to themselves, and hence (as the eigenspace associated with 1
has multiplicity 1) the eigenfunctions of Uψ associated with the eigenvalue 1 are precisely the constant
functions. Therefore χS is constant almost everywhere on X , which implies that either χS = 1 or χS = 0
with one of these equalities holding almost everywhere. Hence µ(S) ∈ {0, 1}, and therefore ψ is ergodic,
as required.
Now let $\psi : X \to X$ be an ergodic measure-preserving map, and suppose $f \circ \psi = f$ for some
measurable $f : X \to \mathbb{C}$. By the linearity of $U_\psi$, it suffices to show that $f$ is constant in the case that $f$ is
real-valued. We claim that for each $n \in \mathbb{Z}$ the fact that $\psi$ is ergodic implies that there is exactly one $k \in \mathbb{Z}$
such that $f(x) \in C_{n,k} = [k 2^{-n}, (k + 1) 2^{-n}) \subset \mathbb{R}$ for almost every $x \in X$. It will then follow that almost
every $f(x)$ is contained in a single half-open interval of arbitrarily small diameter, and therefore that
$f(x)$ is constant almost everywhere. Hence the eigenspace of $U_\psi$ corresponding to the eigenvalue 1 has
dimension 1 (consisting of only almost everywhere constant functions), as required.
It remains to prove that $f(x) \in C_{n,k}$ almost everywhere for exactly one $k \in \mathbb{Z}$, for each $n \in \mathbb{Z}$. Fix
an $n \in \mathbb{Z}$. As $\mu(X) = 1$, there is certainly some $k \in \mathbb{Z}$ such that $S = f^{-1}(C_{n,k})$ has nonzero measure ($X$
is the countable union of the disjoint sets $f^{-1}(C_{n,j})$ with $j$ varying over the integers, and hence the
contrary hypothesis would contradict the countable additivity of $\mu$). Note that we have $\chi_{C_{n,k}} \circ f = \chi_S$.
Thus, as $f \circ \psi = f$, we have that $\chi_S \circ \psi = \chi_S$ by the associativity of composition. But $\chi_S \circ \psi = \chi_{\psi^{-1}(S)}$
by the definition of the characteristic function, and thus $\psi^{-1}(S) = S$ (deleting a null-measure set from
$S$ if necessary). The fact that $\psi$ is ergodic then immediately implies that $\mu(S) \in \{0, 1\}$. Now, $\mu(S) > 0$ by
assumption, and hence $\mu(S) = 1$. Therefore $X \setminus S$ is a $\mu$-null set, and the claim follows.
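To make Proposition 2.4 concrete, the following Python sketch checks numerically how a Koopman operator acts on some explicit functions. It assumes an illustrative system not discussed above: the rotation $\psi(x) = x + \alpha \pmod 1$ on $[0, 1)$ with $\alpha = \sqrt{2} - 1$, and the functions $e_m(x) = e^{2\pi i m x}$. The names `koopman`, `character` and `eigen_defect` are hypothetical helpers.

```python
import cmath
import math

ALPHA = math.sqrt(2) - 1  # illustrative irrational rotation number (an assumption)

def psi(x):
    """Rotation of [0, 1) by ALPHA."""
    return (x + ALPHA) % 1.0

def character(m):
    """The function e_m(x) = exp(2*pi*i*m*x) on [0, 1)."""
    return lambda x: cmath.exp(2j * math.pi * m * x)

def koopman(f):
    """Koopman operator of psi: U f = f o psi."""
    return lambda x: f(psi(x))

def eigen_defect(m, grid=100):
    """Largest value on a grid of |(U e_m)(x) - exp(2*pi*i*m*ALPHA) * e_m(x)|."""
    e_m = character(m)
    u_e_m = koopman(e_m)
    lam = cmath.exp(2j * math.pi * m * ALPHA)
    return max(abs(u_e_m(j / grid) - lam * e_m(j / grid)) for j in range(grid))

# e_m is an eigenfunction of U with eigenvalue exp(2*pi*i*m*ALPHA) ...
defects = {m: eigen_defect(m) for m in range(-3, 4)}
# ... and this eigenvalue is 1 only for the constant function e_0.
distance_to_one = {m: abs(cmath.exp(2j * math.pi * m * ALPHA) - 1) for m in range(-3, 4)}
print(defects, distance_to_one)
```

Among these functions only the constant $e_0$ is fixed by $U_\psi$, which is consistent with the eigenvalue 1 having multiplicity 1 for an irrational rotation. The check covers only finitely many $m$ and is an illustration rather than a proof of ergodicity.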
3 Ergodic theorems
In this section we will develop the necessary operator-theoretic machinery to prove a fundamental result
of ergodic theory, namely von Neumann’s Mean Ergodic Theorem. The proof of the Mean Ergodic The-
orem, in combination with the so-called Pointwise Ergodic Theorem of Birkhoff which followed shortly
thereafter, marked the birth of modern ergodic theory.
As suggested by Proposition 2.4, the subspace fixed by the Koopman operator associated with a given
measure-preserving transformation is of considerable interest when studying the transformation itself.
We first make the following natural definition.
The subset $F_\psi \subset L^2(X)$ is a subspace of $L^2(X)$ precisely because it is the kernel of the linear map
$1 - U_\psi$. Furthermore, $F_\psi$ is closed because it is the preimage of the closed set $\{0\}$ under the continuous map $1 - U_\psi$. It is then
immediate from the theory of Hilbert spaces that $L^2(X) = F_\psi \oplus F_\psi^{\perp}$.
The Mean Ergodic Theorem is essentially the statement that, when considering the average behaviour
of a system over long times, the subspace $F_\psi$ of observables of a measurable system $(X, \mathcal{M}, \mu, \psi)$ is the
only interesting set of observables.
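Theorem 3.2 (von Neumann). Let $(X, \mathcal{M}, \mu, \psi)$ be a measurable system, and let $f \in L^2(X)$. As $F_\psi$ is a
closed subspace of $L^2(X)$, we can write $f = \tilde{f} + g$ with $\tilde{f} \in F_\psi$ and $g \in F_\psi^{\perp}$. Then
\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^k f \to \tilde{f} \]
in $L^2(X)$ as $n \to \infty$. Moreover, $\psi$ is ergodic if and only if
\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^k f \to \int_X f \, d\mu \tag{3} \]
in $L^2(X)$ as $n \to \infty$, for every $f \in L^2(X)$.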
The following lemma makes the utility of Lemma 3.3 clear.
Lemma 3.4. The closure $\overline{G_\psi}$ is annihilated in mean by $U_\psi$, in that for every $f \in \overline{G_\psi}$ we have
\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^k f \to 0 \tag{4} \]
in $L^2(X)$ as $n \to \infty$.
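Proof. Let $g \in \overline{G_\psi}$, and fix a sequence $(g_n)$ in $G_\psi$ converging to $g$, writing $g_n = U_\psi f_n - f_n$ with $f_n \in L^2(X)$.
Then for each $n, m \in \mathbb{N}$ we have
\[ \frac{1}{m} \sum_{k=0}^{m-1} U_\psi^k g_n = \frac{1}{m} \sum_{k=0}^{m-1} (U_\psi^{k+1} f_n - U_\psi^k f_n) = \frac{1}{m} (U_\psi^m f_n - f_n). \]
Now, as $U_\psi$ is an isometry we have $\|U_\psi^m f_n - f_n\| \leq \|U_\psi^m f_n\| + \|f_n\| \leq 2\|f_n\|$, and hence
$\frac{1}{m} \sum_{k=0}^{m-1} U_\psi^k g_n \to 0$ as $m \to \infty$. The condition (4) now follows from the triangle inequality, because again
since $U_\psi$ is an isometry we have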
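\[
\begin{aligned}
\frac{1}{m} \left\| \sum_{k=0}^{m-1} U_\psi^k g \right\|_{L^2}
&\leq \frac{1}{m} \left\| \sum_{k=0}^{m-1} U_\psi^k (g - g_n) \right\|_{L^2} + \frac{1}{m} \left\| \sum_{k=0}^{m-1} U_\psi^k g_n \right\|_{L^2} \\
&\leq \frac{1}{m} \sum_{k=0}^{m-1} \left\| g - g_n \right\|_{L^2} + \frac{1}{m} \left\| \sum_{k=0}^{m-1} U_\psi^k g_n \right\|_{L^2} \\
&\leq \left\| g - g_n \right\|_{L^2} + \frac{1}{m} \left\| \sum_{k=0}^{m-1} U_\psi^k g_n \right\|_{L^2}.
\end{aligned}
\]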
In particular taking n, and then m, large enough, the right hand side goes to zero and the claim follows.
\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^k f \to \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^k \tilde{f} + 0 = \tilde{f}, \]
which proves the first part of the theorem. To see the second part, observe that when $\psi$ is ergodic
Proposition 2.4 implies that $F_\psi$ is exactly the set of almost everywhere constant functions in $L^2(X)$. Thus $\tilde{f}$ is
constant almost everywhere. As $(X, \mathcal{M}, \mu)$ is a probability space it follows that
\[ \int_X f \, d\mu = \int_X \tilde{f} \, d\mu + \int_X g \, d\mu = \tilde{f} + \int_X g \, d\mu \]
almost everywhere, and thus it suffices to show $\int_X g \, d\mu = 0$. By Lemma 3.4 we may take a sequence $(f_n)$
such that $U_\psi f_n - f_n$ converges to $g$ in $L^2$. As we have the inequality
\[ \left| \int_X g \, d\mu - \int_X (U_\psi f_n - f_n) \, d\mu \right| \leq \int_X |g - (U_\psi f_n - f_n)| \, d\mu \to 0, \]
it follows that $\int_X (U_\psi f_n - f_n) \, d\mu \to \int_X g \, d\mu$. However,
\[ \int_X (U_\psi f_n - f_n) \, d\mu = \int_X (U_\psi f_n - U_\psi f_n) \, d\mu = 0 \]
by Lemma 2.2, which yields the desired equality $\int_X g \, d\mu = 0$.
Conversely, suppose that we have (3) for every $f \in L^2(X)$. Fix any $S \in \mathcal{M}$ such that $\psi^{-1}(S) = S$. As
argued in the proof of Proposition 2.4 we have $U_\psi^k \chi_S = \chi_S$ for every $k \in \mathbb{N}$, and hence (3) immediately gives
\[ \chi_S = \frac{1}{n} \sum_{k=0}^{n-1} \chi_S \to \int_X \chi_S \, d\mu = \mu(S). \]
Therefore $\chi_S$ is equal almost everywhere on $X$ to the constant function $\mu(S)$, and hence $\mu(S) \in \{0, 1\}$, as
required. Therefore $\psi$ is ergodic, and this completes the proof of the theorem.
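The convergence (3) can be observed numerically. The following Python sketch is a minimal illustration under stated assumptions, not part of the proof: it takes the irrational rotation $\psi(x) = x + \alpha \pmod 1$ of $[0, 1)$ with $\alpha = \sqrt{2} - 1$ (a standard ergodic example) and the observable $f(x) = \cos^2(2\pi x)$, whose integral is $1/2$, and estimates the $L^2$ distance between the averages $\frac{1}{n} \sum_{k=0}^{n-1} U_\psi^k f$ and the constant $1/2$ on a uniform grid. The names `ergodic_average` and `l2_error` are hypothetical.

```python
import math

ALPHA = math.sqrt(2) - 1  # illustrative irrational rotation number (an assumption)

def psi(x):
    """Rotation of [0, 1) by ALPHA."""
    return (x + ALPHA) % 1.0

def f(x):
    """Observable with integral 1/2 over [0, 1)."""
    return math.cos(2 * math.pi * x) ** 2

def ergodic_average(x, n):
    """(1/n) * sum_{k=0}^{n-1} f(psi^k(x)), i.e. the n-th average of U^k f at x."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = psi(x)
    return total / n

def l2_error(n, grid=200):
    """L^2 distance between the n-th average and the constant 1/2,
    approximated by a mean over `grid` equally spaced points."""
    squares = ((ergodic_average(j / grid, n) - 0.5) ** 2 for j in range(grid))
    return math.sqrt(sum(squares) / grid)

errors = {n: l2_error(n) for n in (10, 100, 1000)}
print(errors)
```

For this particular trigonometric observable the error decays on the order of $1/n$. The theorem itself gives no rate in general, so this behaviour should be read as a feature of the example.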
A stronger version of the Poincaré recurrence theorem, which includes a statement about the ex-
pected time between recurrences, follows directly from the Mean Ergodic Theorem. This is but a sin-
gle example of many which suggest the Mean Ergodic Theorem is an important result. The following
theorem of Birkhoff (published less than a year following von Neumann’s result, and leading to some
controversy), gives an analogous pointwise version of the Mean Ergodic Theorem; for this reason it is
sometimes known as the Pointwise Ergodic Theorem. Note that we have explicitly avoided the use of
the notation “Uψ ” as Theorem 3.5 is a statement regarding L 1 functions.
Theorem 3.5 (Birkhoff). Let $(X, \mathcal{M}, \mu, \psi)$ be a measurable system, and let $f \in L^1(X)$. Then there exists
$\tilde{f} \in L^1(X)$ such that for almost every $x \in X$ we have
\[ \frac{1}{n} \sum_{k=0}^{n-1} f(\psi^k(x)) \to \tilde{f}(x). \]
In our restricted setting, the Pointwise Ergodic Theorem may be used to deduce the Mean Ergodic
Theorem. However, the latter should not be considered completely a corollary of the former, as the
Mean Ergodic Theorem is itself a corollary of a statement about retractions on Hilbert spaces which
does not follow from the pointwise version. Further, the Mean Ergodic Theorem generalises to arbitrary
measure-preserving actions of locally compact amenable groups and Følner sequences, while this is
more difficult in the pointwise case.
4 Mixing
The Mean Ergodic Theorem makes a statement which warrants further investigation. Namely, for a
given measurable transformation, we ask what more can be said about the convergence of spatial and
temporal averages. To this end, let (X , M , µ, ψ) be a measurable system with ψ ergodic, and suppose
R, S ∈ M . Considering the characteristic functions χR and χS , the convergence implied by Theorem 3.2
gives that as n → ∞ we have (taking the inner product of both sides of the statement of the theorem)
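\[ \frac{1}{n} \sum_{k=0}^{n-1} (U_\psi^k \chi_R, \chi_S) = \frac{1}{n} \sum_{k=0}^{n-1} \int_X \chi_{\psi^{-k}(R)} \chi_S \, d\mu \to \int_X \chi_R \, d\mu \int_X \chi_S \, d\mu, \]
or equivalently
\[ \frac{1}{n} \sum_{k=0}^{n-1} \mu(\psi^{-k}(R) \cap S) \to \mu(R)\mu(S). \tag{5} \]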
In fact, it is not too difficult to check that the converse is true and hence (5) yields an alternate charac-
terisation of ergodicity.
It is natural to then ask when ψ is such that (5) may be strengthened to the requirement that the
expressions µ(ψ−n (R) ∩ S) (and not an average of such expressions) converge to µ(R)µ(S) as n → ∞.
Intuitively, this corresponds to the strengthening of the requirement that ψ stirs the entire phase space
on average (in that ψ is ergodic) to the requirement that ψ “thoroughly mixes” the entire phase space.
This motivates the following definition.
Definition 4.1. Let $(X, \mathcal{M}, \mu, \psi)$ be a measurable system. Then $\psi$ is mixing (or strongly-mixing) if for
every $R, S \in \mathcal{M}$ we have
\[ \mu(\psi^{-n}(R) \cap S) \to \mu(R)\mu(S) \]
as $n \to \infty$.
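The gap between the averaged statement (5) and Definition 4.1 can be seen in a simple example. The Python sketch below assumes, purely for illustration, the irrational rotation $\psi(x) = x + \alpha \pmod 1$ of $[0, 1)$ with $\alpha = \sqrt{2} - 1$ and the sets $R = S = [0, 1/2)$, for which $\mu(\psi^{-n}(R) \cap S)$ is the overlap of two arcs and can be computed exactly. The helper names `arc_overlap` and `correlation` are hypothetical.

```python
import math

ALPHA = math.sqrt(2) - 1  # illustrative irrational rotation number (an assumption)

def arc_overlap(a, len_a, b, len_b):
    """Lebesgue measure of the intersection of the arcs [a, a + len_a) and
    [b, b + len_b), both taken mod 1 (lengths assumed to be at most 1)."""
    a %= 1.0
    b %= 1.0
    total = 0.0
    # Compare one lift of the first arc with three lifts of the second.
    for shift in (-1, 0, 1):
        lo = max(a, b + shift)
        hi = min(a + len_a, b + shift + len_b)
        total += max(0.0, hi - lo)
    return total

def correlation(n, r=(0.0, 0.5), s=(0.0, 0.5)):
    """mu(psi^{-n}(R) intersect S) for arcs R = [r0, r0 + r1) and S = [s0, s0 + s1).
    For the rotation, psi^{-n}(R) is the arc R shifted by -n * ALPHA."""
    return arc_overlap(r[0] - n * ALPHA, r[1], s[0], s[1])

N = 5000
terms = [correlation(n) for n in range(N)]
cesaro_mean = sum(terms) / N                             # left-hand side of (5)
tail_spread = max(terms[-1000:]) - min(terms[-1000:])    # oscillation of late terms
print(cesaro_mean, tail_spread)
```

Here the averages approach $\mu(R)\mu(S) = 1/4$ as in (5), while the individual terms keep oscillating between values near $0$ and near $1/2$. This suggests that the rotation is ergodic without being mixing.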
As it turns out, the requirement that a measurable transformation ψ be mixing is typically stronger
than one needs to prove theorems pertaining to the ergodic behaviour of ψ. This is perhaps plausible in
view of the proofs of ergodic theorems analogous to those presented above; we tend to care only about the
average behaviour of ψ, such as that described by (5). Thus, we relax Definition 4.1 to require something
slightly stronger than that which is automatic by the Mean Ergodic Theorem.
Definition 4.2. Let $(X, \mathcal{M}, \mu, \psi)$ be a measurable system. Then $\psi$ is weakly-mixing if for every $R, S \in \mathcal{M}$
we have
\[ \frac{1}{n} \sum_{k=0}^{n-1} \left| \mu(\psi^{-k}(R) \cap S) - \mu(R)\mu(S) \right| \to 0 \]
as $n \to \infty$. This is equivalent to requiring that $\psi$ satisfy Definition 4.1 with the limit taken over the
complement of a subset $K \subset \mathbb{N}$ of zero upper density.
In light of (5), it is automatically the case that every weakly-mixing measurable transformation ψ
is ergodic. In complete analogy with the spectral characterisation of ergodicity above, and perhaps as
a testament to the deep interplay between spectral theory and ergodic theory, we have the following
characterisation of weakly-mixing measurable transformations.
Proposition 4.3. A measure-preserving map $\psi : X \to X$ is weakly-mixing if and only if 1 is the only
eigenvalue of $U_\psi$. By Proposition 2.4 this eigenvalue has multiplicity 1, and in this case $U_\psi$ is said to have
continuous spectrum.
Proof. First suppose $\psi : X \to X$ is weakly-mixing, and that $U_\psi f = \lambda f$ for some $\lambda \in \mathbb{C}$. By Proposition 2.4
it suffices to show that $\lambda = 1$ for every nonzero such $f$. Then $\int f \, d\mu = \int U_\psi f \, d\mu = \lambda \int f \, d\mu$, and therefore
$\int f \, d\mu = 0$ assuming $\lambda \neq 1$. Exactly as the definition of ergodicity was recast into the language of measure
theory to give the equivalent condition (5), we have the following equivalent definition of weakly-mixing:
for every $f, g \in L^2(X)$ we have
\[ \frac{1}{n} \sum_{k=0}^{n-1} \left| (U_\psi^k f, g) - \int f \, d\mu \int g \, d\mu \right| \to 0. \tag{6} \]
\[ \frac{1}{n} \sum_{k=0}^{n-1} \left| (U_\psi^k f, f) \right| \to 0. \tag{7} \]
In keeping with our emphasis on operator-theoretic techniques, we directly apply the spectral theorem,
which gives (noting that the spectrum of $U_\psi$ is contained in the unit circle $S^1 \subset \mathbb{C}$ because $U_\psi$ is an
isometry on $L^2(X)$) a finite measure $\nu$ associated to $U_\psi$ and $f$ such that
\[ (U_\psi^k f, f) = \int_{S^1} \lambda^k \, d\nu(\lambda), \]
with the factor $\lambda^k$ coming from the functional calculus associated with $U_\psi$. Note that $\nu$ is zero on
measurable sets disjoint from the spectrum of $U_\psi$. By (7) it remains to show that
\[ \frac{1}{n} \sum_{k=0}^{n-1} \left| \int_{S^1} \lambda^k \, d\nu(\lambda) \right| \to 0, \tag{8} \]
does not suffice. We note that (8) holds if it holds with the terms in the sum replaced by their squares.
This fact permits the calculation of the sharper bound of [4] (we omit the expansion and manipulation
of the integral here):
\[ \frac{1}{n} \sum_{k=0}^{n-1} \left| \int_{S^1} \lambda^k \, d\nu(\lambda) \right|^2 \leq \int_{S^1 \times S^1} \left( \frac{1}{n} \sum_{k=0}^{n-1} \frac{\lambda^k}{\eta^k} \right) d(\nu \times \nu)(\lambda, \eta). \]
As $S^1 \times S^1$ is compact and $\nu$ (and thus $\nu \times \nu$) is a finite measure, it can be shown that the right-hand side
converges to zero by the dominated convergence theorem. We note that the integral of the expression $\lambda^k / \eta^k$ is
well-defined by the fact that the spectral theorem gives a measure $\nu$ which assigns points zero measure.
In fact, for the dominated convergence theorem to yield the result, we require that the entire diagonal
of $S^1 \times S^1$ is a $(\nu \times \nu)$-null set, but this is the case because $\nu$ has no atoms.
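To see why the absence of atoms matters in this last step, the following Python sketch computes the squared averages for a measure that does have atoms. The measure $\nu = \tfrac{1}{2}\delta_{\lambda_1} + \tfrac{1}{2}\delta_{\lambda_2}$ with $\lambda_j = e^{2\pi i \alpha_j}$, $\alpha_1 = \sqrt{2}$ and $\alpha_2 = \sqrt{3}$ is an assumption made for illustration, and the function names are hypothetical. For such a measure the $(\nu \times \nu)$-mass of the diagonal is $\sum_j \nu(\{\lambda_j\})^2$, and the averages tend to this value rather than to zero.

```python
import cmath
import math

def fourier_coefficient(alphas, weights, k):
    """Integral of lambda^k against nu = sum_j weights[j] * delta_{exp(2*pi*i*alphas[j])}."""
    return sum(w * cmath.exp(2j * math.pi * k * a) for a, w in zip(alphas, weights))

def cesaro_square_average(alphas, weights, n):
    """(1/n) * sum_{k=0}^{n-1} |integral of lambda^k d nu|^2."""
    return sum(abs(fourier_coefficient(alphas, weights, k)) ** 2 for k in range(n)) / n

alphas = [math.sqrt(2), math.sqrt(3)]   # angles of the two atoms (illustrative)
weights = [0.5, 0.5]                    # mass of each atom

# (nu x nu)-measure of the diagonal: sum of the squared atom masses.
diagonal_mass = sum(w * w for w in weights)
average = cesaro_square_average(alphas, weights, 10_000)
print(average, diagonal_mass)
```

The averages settle near $1/2$, the mass of the diagonal, so (8) fails for this $\nu$. When $\nu$ has no atoms the diagonal carries no mass and the same computation gives a limit of zero, which is the case needed in the proof.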
Such a family $\{\psi_t\}$ is sometimes called a measurable flow on $(X, \mathcal{M}, \mu)$. We call a 1-parameter group of
measure-preserving transformations $\{\psi_t\}_{t \in \mathbb{R}}$ ergodic if $\psi_t$ is ergodic for every $t \in \mathbb{R}$.
Theorem 5.2. Let $(X, \mathcal{M}, \mu)$ be a probability space, and let $\{\psi_t\}_{t \in \mathbb{R}}$ be an ergodic 1-parameter group of
measure-preserving transformations. If $f \in L^2(X)$, then
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T U_{\psi_t} f \, dt = \int_X f \, d\mu. \]
Proof. We obtain the result from Theorem 3.2 by the following trick; we define $g = \int_0^1 U_{\psi_t} f \, dt$ and note
that $g$ is measurable and $L^2$ integrable by Fubini’s theorem. Observe that for $T \in \mathbb{N}$ we have
\[ \int_0^T U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} \int_k^{k+1} U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} \int_0^1 U_{\psi_k} U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} U_{\psi_1}^k \int_0^1 U_{\psi_t} f \, dt, \]
and hence
\[ \int_0^T U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} U_{\psi_1}^k g. \]
\[ \frac{1}{T} \sum_{k=0}^{T-1} U_{\psi_1}^k g \to \int_X \int_0^1 (U_{\psi_t} f)(x) \, dt \, d\mu(x) = \int_0^1 \int_X U_{\psi_t} f \, d\mu \, dt = \int_X f \, d\mu, \]
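it remains to show that
\[ \frac{1}{T} \left\| \int_{\lfloor T \rfloor}^{T} U_{\psi_t} f \, dt \right\| \to 0 \]
as $T \to \infty$. Fortunately, deferring to the discrete case may take us even further; let $h = \int_0^1 |U_{\psi_t} f| \, dt$, which
is integrable because $g \in L^2(X)$. Then as $T \to \infty$ Theorem 3.2 implies (noting that $\lceil T \rceil / \lfloor T \rfloor \to 1$)
\[ \frac{1}{\lfloor T \rfloor} \int_{\lfloor T \rfloor}^{\lceil T \rceil} |U_{\psi_t} f| \, dt = \frac{\lceil T \rceil}{\lfloor T \rfloor} \frac{1}{\lceil T \rceil} \sum_{k=0}^{\lceil T \rceil - 1} U_{\psi_1}^k h - \frac{1}{\lfloor T \rfloor} \sum_{k=0}^{\lfloor T \rfloor - 1} U_{\psi_1}^k h \to 0. \tag{9} \]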
By (9) the right hand side converges to zero in L 2 (X ), which completes the proof.
Using a completely analogous trick to that of the previous proof, a continuous analog of Theorem 3.5
may be established. However, the statement of the result requires additional technical machinery—
particularly the notion of conditional expectation—which we have avoided here.
for every n, m ∈ R, where square brackets denote the distance from their argument to the nearest integer.
In 2003, Einsiedler, Katok and Lindenstrauss [3] applied techniques of ergodic theory and measure
invariance to show that the set of pairs (n, m) such that (10) does not hold has Hausdorff dimension zero.
References
[1] Jon Aaronson. An introduction to infinite ergodic theory. Vol. 50. American Mathematical Society, 1997. DOI: 10.1112/S0024609398275436.
[2] L. Barreira and B. Saussol. “Hausdorff Dimension of Measures via Poincaré Recurrence”. Communications in Mathematical Physics 219.2 (2001), pp. 443–463. DOI: 10.1007/s002200100427.
[3] Manfred Einsiedler, Anatole Katok, and Elon Lindenstrauss. “Invariant measures and the set of exceptions to Littlewood’s conjecture”. Annals of Mathematics (2006), pp. 513–560. DOI: 10.4007/annals.2006.164.513.
[4] Manfred Einsiedler and Thomas Ward. Ergodic theory with a view towards number theory. 1st ed. Vol. 259. Graduate Texts in Mathematics. London: Springer-Verlag London, 2011. ISBN: 978-0-85729-020-5. DOI: 10.1017/S0143385711001088.
[5] Tanja Eisner, Bálint Farkas, Markus Haase, and Rainer Nagel. Operator theoretic aspects of ergodic theory. Vol. 272. Springer, 2015. DOI: 10.1007/978-3-319-16898-2.
[6] Paul Erdős and Paul Turán. “On some sequences of integers”. Journal of the London Mathematical Society 1.4 (1936), pp. 261–264. DOI: 10.1112/jlms/s1-11.4.261.
[7] Frederick Riesz. “Some mean ergodic theorems”. Journal of the London Mathematical Society 1.4 (1938), pp. 274–278. DOI: 10.1112/jlms/s1-13.4.274.
[8] Norbert Wiener. “The ergodic theorem”. Duke Math. J. 5.1 (Mar. 1939), pp. 1–18. DOI: 10.1215/S0012-7094-39-00501-6.