2017 s2 Math3325 Essay


A spectral entrée to the Ergodic theory proper,

lightly seasoned with a physical perspective

Keeley Hoek
October 27, 2017

In this essay we present a gentle introduction to several fundamental results of Ergodic theory. We do
this as much as possible from the perspective of functional analysis and spectral theory, developing the
necessary measure-theoretic tools as we go. We introduce the fundamental notions of ergodicity and
mixing of systems, naturally motivating the latter from the statement of the Mean Ergodic Theorem,
before discussing applications. Generalisations of the fundamental tools considered here yield proofs
of a vast array of results spanning the breadth of mathematics.
By the use of methods derived from ergodic theory, it is frequently found that previous results
may be sharpened or at least may be produced more efficiently. For instance, Szemerédi’s theorem of
arithmetic combinatorics was first proved via an intricate combinatorial argument, while Furstenberg’s
ergodic approach effectively generalises the Poincaré recurrence theorem, from which Szemerédi’s
theorem then follows as a corollary. The progress of Einsiedler, Katok, and Lindenstrauss on Littlewood’s
conjecture (namely, showing that the set of exceptions to the conjecture has Hausdorff dimension zero)
is another good example.

1 The idea of Ergodic theory


A central objective of Ergodic theory is the study of the behaviour of a system evolving either discretely
or continuously. For a system evolving discretely we consider the behaviour of the system as the number
of evolution steps becomes arbitrarily large, while for continuously parametrised systems we examine
the system for arbitrarily large real parameters. In this sense, we consider systems over “long timescales”.
While many authors (such as [4]) cite the scope of Ergodic theory as being particularly difficult to define,
so-called ergodic theorems—many of which are each considered a fundamental result of the theory—
are of precisely this nature. Namely, they characterise the mean value over long timescales of observables
(state-functions) of a given system. We begin by establishing a framework in which we may meaningfully
speak about the states of a system, and of observables associated with those states.
Intuitively, a physical (dynamical) system is defined by a state object living in the space of all possi-
ble system states—this space is known as the corresponding phase space of a system. In order to make
sense of notions such as “almost every state”, we require that the phase space be equipped with a mea-
sure, turning the phase space into a measure space. The measure of an entire phase space (X , M , µ) may
be “small” in that µ(X ) < ∞, or “large” in the sense that µ(X ) = ∞. For instance, compare a particle with
finite energy constrained within a bounded box, to a particle allowed to escape from the solar system
and enter deep space. While an ergodic theory of both small and large phase spaces (in the above sense)
is well-developed, it may come as little surprise that the results of each branch of the theory differ
significantly.¹ Here we restrict ourselves to the case where µ(X) = 1, where (X, M, µ) is called a probability
space (any finite measure space may be made into such a space by a trivial normalisation of the mea-
sure). This case is most faithful to the historical foundations of the theory. We additionally require that
all spaces considered throughout are separable, in order that the Hilbert space L 2 associated with each
probability space is itself separable.
We will first consider systems which evolve discretely, in which each state x ∈ X evolves under the
action of a measurable transformation ψ : X → X (we will see that the continuous case may be built on
top of the discrete one; we simply consider a continuously parametrised family of discrete
transformations). A particular example of such a system is a dynamical system of classical mechanics. In the
formalism of Hamiltonian mechanics, such a system’s phase space is a symplectic manifold equipped with
a natural measure, obtained locally in coordinate charts from the Lebesgue measure. In this context,
Liouville’s theorem guarantees that the measure of any measurable subset of the phase space is
invariant under the action of evolution transformations. The fact that many other generalised (non-classical)
systems obey² this property motivates the following definition.

¹ Aaronson in [1] provides an introductory survey of results in the infinite measure case.

Definition 1.1. Let (X , M , µ) be a measure space and let ψ : X → X be a measurable function. Then the
map ψ is µ- or measure-preserving and µ is ψ-invariant if

µ(ψ−1 (S)) = µ(S)

for every S ∈ M .

The idea of a measure-preserving transformation extends naturally to maps between measure spaces,
but we will not deal with such maps here. The notion of a µ-preserving map then leads to the following
conveniently packaged object.

Definition 1.2. A measurable system is a tuple (X, M, µ, ψ) consisting of a probability space (X, M, µ)
and a µ-preserving map ψ : X → X.
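To make Definitions 1.1 and 1.2 concrete, the following minimal sketch checks the condition µ(ψ⁻¹(S)) = µ(S) on a finite phase space, where every subset is measurable and it is enough to test singletons. The space Z/6 with the uniform measure, and the helper names `preimage` and `is_measure_preserving`, are illustrative assumptions and do not appear in the essay.

```python
from fractions import Fraction

def preimage(psi, subset, space):
    """Return psi^{-1}(subset) = {x in space : psi(x) in subset}."""
    return {x for x in space if psi(x) in subset}

def is_measure_preserving(psi, space, mu):
    """Check mu(psi^{-1}({y})) == mu({y}) for every point y of a finite space.

    On a finite space with all subsets measurable, additivity extends the
    identity from singletons to every subset.
    """
    for y in space:
        pre = preimage(psi, {y}, space)
        if sum((mu[x] for x in pre), Fraction(0)) != mu[y]:
            return False
    return True

# Hypothetical example: X = Z/6 with the uniform probability measure.
space = set(range(6))
uniform = {x: Fraction(1, 6) for x in space}

rotation = lambda x: (x + 1) % 6   # a bijection, so it preserves the uniform measure
doubling = lambda x: (2 * x) % 6   # not injective: the preimage of {1} is empty

print(is_measure_preserving(rotation, space, uniform))  # True
print(is_measure_preserving(doubling, space, uniform))  # False
```

On a finite space with the uniform measure, the measure-preserving maps are exactly the bijections, which is why the second map fails the check.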

Working within the framework of modern measure theory, the following foundational result of
ergodic theory now follows quickly. The theorem answers a question of a form typical of ergodic theory:
for a given initial state, how often will we return to a fixed neighbourhood of that state? In this case, the
answer is (almost certainly) infinitely often.

Theorem 1.3 (Poincaré). Let (X, M, µ, ψ) be a measurable system, and let S ∈ M. Then for almost every
x ∈ S there exists a strictly increasing sequence (n_k) of natural numbers such that ψ^{n_k}(x) ∈ S for every k ∈ N
(with an exponent denoting repeated composition).

Proof. Let S ∈ M. We will show that the set

\[ N = \{x \in S : \psi^{n}(x) \in S \text{ for only finitely many } n\} \]

has measure zero. Let ψ^{−k} denote the k-fold composition ψ^{−1} ∘ ··· ∘ ψ^{−1}. As S \ ψ^{−n}(S) is exactly the set
of x ∈ S such that ψ^n(x) ∉ S, it is clear that

\[ N = \liminf_{n\to\infty} \bigl(S \setminus \psi^{-n}(S)\bigr) = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} \bigl(S \setminus \psi^{-k}(S)\bigr). \]

Now, we claim that

\[ N = \bigcup_{n=0}^{\infty} \bigl(S \cap \psi^{-n}(R)\bigr) \quad\text{where}\quad R = \bigcap_{n=1}^{\infty} \bigl(S \setminus \psi^{-n}(S)\bigr), \tag{1} \]

with R ⊂ X (certainly) measurable, and satisfying ψ^{−i}(R) ∩ ψ^{−j}(R) = ∅ for every i ≠ j. The idea is that
the sets ψ^{−i}(R) “wander” about the finite measure space X for varying i, in that each is disjoint from
every other. Assuming the claim, it follows immediately that

\[
\begin{aligned}
\mu(N) &= \mu\Bigl(\bigcup_{n=0}^{\infty} \bigl(S \cap \psi^{-n}(R)\bigr)\Bigr) \\
&\le \mu\Bigl(\bigcup_{n=0}^{\infty} \psi^{-n}(R)\Bigr) \\
&= \sum_{n=0}^{\infty} \mu(\psi^{-n}(R)) \\
&\le \mu(X).
\end{aligned}
\]

However, we have µ(ψ^{−n}(R)) = µ(R) for every n ∈ N by the µ-invariance of ψ, and hence as µ(X) = 1 < ∞,
we must have µ(R) = 0. Therefore µ(N) = 0, as required.
² For generalised systems, we require that there exists a measure on the associated phase space which is invariant under the
evolution transformation.
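To see (1), simply note that if x ∈ S then

\[ x \in \psi^{-n}(R) \iff \psi^{n}(x) \in R \iff \psi^{n}(x) \in S \text{ and } \psi^{k}(x) \notin S \text{ for every } k > n. \]

Hence a point x ∈ S lies in S ∩ ψ^{−n}(R) exactly when n is the last time at which the orbit of x visits S.
Every x ∈ N has such a last visit time n ≥ 0 (with n = 0 if x never returns), and conversely every
x ∈ S ∩ ψ^{−n}(R) returns to S at most n times. Therefore

\[ N = \bigcup_{n=0}^{\infty} \bigl(S \cap \psi^{-n}(R)\bigr), \]

as desired. It remains to show that the sets ψ^{−i}(R) and ψ^{−j}(R) are disjoint for i ≠ j. Suppose
that x ∈ ψ^{−i}(R) ∩ ψ^{−j}(R) for some i, j ∈ N with j ≤ i. Then we have ψ^j(x) ∈ R and ψ^i(x) ∈ R ⊂ S. The
latter condition is equivalent to ψ^{i−j}(ψ^j(x)) ∈ R ⊂ S, which exactly means that the image under ψ^{i−j}
of an element of R is an element of R ⊂ S. Positivity of i − j would contradict the definition of R, and
therefore i = j. This completes the proof.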

Poincaré originally proved this result in the case of the gravitational interaction of celestial bodies
([4]). It should be noted that the finiteness of X (with respect to µ) was critical to the proof of the result;
certainly a rocket on a solar escape trajectory (with a presumably infinite volume phase space X ) need
not return to the launch site—and certainly not infinitely often! Despite the ease with which this result
was produced, it is a substantially more difficult problem to characterise the rate at which points return
to any S ∈ M (see for instance [2]).
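As a numerical illustration of Theorem 1.3, the sketch below records the return times of a single point to a set S under an irrational rotation of the circle. The choice of system (rotation of [0, 1) by √2 − 1), the set S = [0, 0.1), the starting point, and the helper name `return_times` are all assumptions made for illustration; a finite simulation can of course only suggest, not prove, that returns continue forever.

```python
import math

def return_times(psi, x, in_set, n_steps):
    """Return the list of times 1 <= n <= n_steps with psi^n(x) in the set."""
    times = []
    y = x
    for n in range(1, n_steps + 1):
        y = psi(y)
        if in_set(y):
            times.append(n)
    return times

alpha = math.sqrt(2) - 1                 # assumed irrational rotation angle
rotate = lambda x: (x + alpha) % 1.0     # preserves Lebesgue measure on [0, 1)
in_S = lambda x: 0.0 <= x < 0.1          # the set S = [0, 0.1), of measure 0.1

times = return_times(rotate, 0.05, in_S, 10_000)
print(times[:5])                         # the first few return times
print(len(times) / 10_000)               # fraction of time spent in S
```

The printed fraction should be close to µ(S) = 0.1 for this particular system, which hints at the quantitative refinements of recurrence mentioned above.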
In the nomenclature of Ergodic theory, the “ergodicity” of a system refers to the tendency of a given
state to traverse all other possible states as it evolves. This requirement intuitively means that the av-
erage behaviour of a system is constant in time, and may be determined by examining the “average”
system state. Many “ergodic theorems” give precise characterisations of the extent to which exactly this
temporal-spatial equivalence holds under specific regimes. Hence we make the following definition.

Definition 1.4. A measure-preserving map ψ : X → X on a probability space (X, M, µ) is ergodic if for
every S ∈ M satisfying ψ^{−1}(S) = S we have µ(S) ∈ {0, 1}.
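Definition 1.4 can be checked by brute force on a small finite system. The sketch below enumerates every subset S with ψ⁻¹(S) = S for a map on Z/6 with the uniform measure, under which a set has measure 0 or 1 exactly when it is empty or the whole space. The example maps and the helper names `invariant_subsets` and `is_ergodic_uniform` are hypothetical.

```python
from itertools import combinations

def invariant_subsets(psi, space):
    """List every subset S of a finite space with psi^{-1}(S) == S."""
    pts = sorted(space)
    found = []
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            s = set(combo)
            if {x for x in pts if psi(x) in s} == s:
                found.append(s)
    return found

def is_ergodic_uniform(psi, space):
    """Ergodicity for the uniform measure: invariant sets are empty or everything."""
    n = len(list(space))
    return all(len(s) in (0, n) for s in invariant_subsets(psi, space))

step_one = lambda x: (x + 1) % 6   # a single 6-cycle
step_two = lambda x: (x + 2) % 6   # splits Z/6 into even and odd residues

print(is_ergodic_uniform(step_one, range(6)))   # True
print(is_ergodic_uniform(step_two, range(6)))   # False
print(invariant_subsets(step_two, range(6)))    # includes {0, 2, 4} and {1, 3, 5}
```

The enumeration is exponential in the size of the space, so this is only a sanity check for toy examples.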

2 Koopman operators
We now introduce an operator associated to each measurable transformation ψ : X → X . This opera-
tor permits the tools of functional analysis to be brought to bear on the study of ψ, and yields many
spectral/operator-theoretic characterisations of properties, such as the ergodicity, of ψ itself.

Definition 2.1. Given a measure-preserving map ψ : X → X, it is natural to consider the induced map
U_ψ : L²(X) → L²(X), defined by sending f ↦ f ∘ ψ. We call U_ψ the Koopman operator associated with ψ.

Functions f ∈ L²(X) form a large class of complex-valued functions of state. Thus it is natural to
interpret such functions f : X → C as observables of the system with phase space X. Physically, we
might think of L 2 (X ) as being a class of properties of the system associated with X , where each may be
possible (in principle) to observe. The action of Uψ on some such observable f then simply yields a new
observable which gives the value of f when evaluated a single time-step in the future.
To begin exploring Uψ , we first require a technical lemma—in fact the converse of the lemma also
holds, and hence yields an equivalent characterisation of measure-preserving maps.

Lemma 2.2. If ψ : X → X is a measure-preserving map and f ∈ L²(X), then

\[ \int_X f \, d\mu = \int_X f \circ \psi \, d\mu. \tag{2} \]

Proof. As ψ is measure-preserving, (2) holds for the characteristic function of any measurable subset S of X,
because χ_S ∘ ψ = χ_{ψ^{−1}(S)}. The claim then follows immediately from the definition of the integral in terms
of simple functions.

The study of the elementary properties of Koopman operators begins with the observation that they
are linear and respect products of functions. For example, from these facts it will follow that they are
isometries and are unitary when surjective. In fact, acting on the continuous complex-valued functions
C(X) on X, the Koopman operator is even a C*-algebra homomorphism when X is compact.³

³ See Theorem 4.13 of [5].
Proposition 2.3. The operator Uψ preserves the inner product on the Hilbert space L 2 (X ) (i.e. Uψ is an
isometry). Furthermore, if ψ is invertible then Uψ is surjective and therefore unitary.

Proof. For every f, g ∈ L²(X) we may directly compute

\[ (U_\psi f, U_\psi g) = \int_X (f \circ \psi)\,\overline{(g \circ \psi)} \, d\mu = \int_X (f \cdot \overline{g}) \circ \psi \, d\mu = \int_X f \cdot \overline{g} \, d\mu = (f, g) \]

by Lemma 2.2. If ψ is invertible then U_ψ(f ∘ ψ^{−1}) = f ∘ ψ^{−1} ∘ ψ = f for every f ∈ L²(X), showing that U_ψ
is surjective.
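On a finite phase space the Koopman operator is simply "relabel the values of f along ψ", and Proposition 2.3 can be checked numerically. The sketch below stores observables as dictionaries; the five-point space, the particular observables, and the names `koopman`, `inner`, and `squash` are assumptions made for illustration.

```python
def koopman(psi, f):
    """Return U_psi f = f o psi for an observable stored as a dict x -> f(x)."""
    return {x: f[psi(x)] for x in f}

def inner(f, g, mu):
    """L^2 inner product (f, g) = sum_x f(x) * conj(g(x)) * mu({x})."""
    return sum(f[x] * g[x].conjugate() * mu[x] for x in f)

n = 5
space = range(n)
mu = {x: 1.0 / n for x in space}          # uniform probability measure
rotation = lambda x: (x + 2) % n          # a bijection, hence measure-preserving

f = {x: complex(x, 1) for x in space}
g = {x: complex(1, x * x) for x in space}

lhs = inner(koopman(rotation, f), koopman(rotation, g), mu)
rhs = inner(f, g, mu)
print(abs(lhs - rhs) < 1e-12)             # the inner product is preserved

# A map that is not measure-preserving breaks the identity.
squash = lambda x: 0
delta = {x: complex(1 if x == 1 else 0) for x in space}
print(inner(koopman(squash, delta), koopman(squash, delta), mu))  # norm collapses to 0
print(inner(delta, delta, mu))                                    # original norm squared
```

The second half shows why the measure-preserving hypothesis matters: composing with a constant map sends a nonzero observable to zero, so the induced operator is not an isometry.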

We will shortly discover some interesting characterisations of ψ based on properties of U_ψ, a
consequence of the fact that very many ergodic-theoretic properties of ψ are witnessed as spectral properties
of the associated Koopman operator. The following proposition is the first such example, and may be
proven by a standard measure-theoretic argument.

Proposition 2.4. A measure-preserving map ψ : X → X is ergodic if and only if Uψ has the eigenvalue 1
with multiplicity 1.

Proof. Suppose Uψ has the eigenvalue 1 with multiplicity 1, and furthermore that ψ−1 (S) = S for some
S ∈ M . Then χS ◦ ψ = Uψ (χS ) = χS , and hence χS is an eigenfunction of Uψ with eigenvalue 1. Now,
Uψ certainly sends constant functions to themselves, and hence (as the eigenspace associated with 1
has multiplicity 1) the eigenfunctions of Uψ associated with the eigenvalue 1 are precisely the constant
functions. Therefore χS is constant almost everywhere on X , which implies that either χS = 1 or χS = 0
with one of these equalities holding almost everywhere. Hence µ(S) ∈ {0, 1}, and therefore ψ is ergodic,
as required.
Now let ψ : X → X be an ergodic measure-preserving map, and suppose f ∘ ψ = f for some
f ∈ L²(X). By the linearity of U_ψ, it suffices to show that f is constant in the case that f is
real-valued. We claim that for each n ∈ Z the fact that ψ is ergodic implies that there is exactly one k ∈ Z
such that f(x) ∈ C_{n,k} = [k2^{−n}, (k + 1)2^{−n}) ⊂ R for almost every x ∈ X. It will then follow that almost
every f(x) is contained in a single half-open interval of arbitrarily small diameter, and therefore that
f is constant almost everywhere. Hence the eigenspace of U_ψ corresponding to the eigenvalue 1 has
multiplicity 1 (consisting of only almost everywhere constant functions), as required.
It remains to prove that f(x) ∈ C_{n,k} almost everywhere for exactly one k ∈ Z, for each n ∈ Z. Fix
an n ∈ Z. As µ(X) = 1, there is certainly some k ∈ Z such that S = f^{−1}(C_{n,k}) has nonzero measure (X
is the countable union of the disjoint sets f^{−1}(C_{n,j}) with j varying over the integers, and hence the
contrary hypothesis would contradict the countable additivity of µ). Note that we have χ_{C_{n,k}} ∘ f = χ_S.
Thus, as f ∘ ψ = f, we have that χ_S ∘ ψ = χ_S by the associativity of composition. But χ_S ∘ ψ = χ_{ψ^{−1}(S)}
by the definition of the characteristic function, and thus ψ^{−1}(S) = S (deleting a null-measure set from
S if necessary). The fact that ψ is ergodic then immediately implies that µ(S) ∈ {0, 1}. Now, µ(S) > 0 by
assumption, and hence µ(S) = 1. Therefore X \ S is a µ-null set, and the claim follows.
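For a bijection ψ of a finite set with the uniform measure, Proposition 2.4 has a very concrete form: an observable satisfies f ∘ ψ = f exactly when it is constant on each cycle of ψ, so the multiplicity of the eigenvalue 1 equals the number of cycles. The sketch below counts cycles; the helper names `cycles` and `fixed_space_dimension` and the example maps are hypothetical, and the code assumes its input is a bijection.

```python
def cycles(psi, space):
    """Decompose a bijection psi of a finite space into its cycles (orbits)."""
    seen, result = set(), []
    for start in sorted(space):
        if start in seen:
            continue
        orbit, x = [], start
        while x not in seen:
            seen.add(x)
            orbit.append(x)
            x = psi(x)
        result.append(orbit)
    return result

def fixed_space_dimension(psi, space):
    """Dimension of {f : f o psi = f}: one free constant per cycle of psi."""
    return len(cycles(psi, space))

for step in (1, 2, 3):
    psi = lambda x, s=step: (x + s) % 6
    print(step, cycles(psi, range(6)), fixed_space_dimension(psi, range(6)))
```

Only the step-one rotation has a one-dimensional fixed space, matching the fact that it is the only ergodic map of the three.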

3 Ergodic theorems
In this section we will develop the necessary operator-theoretic machinery to prove a fundamental result
of ergodic theory, namely von Neumann’s Mean Ergodic Theorem. The proof of the Mean Ergodic The-
orem, in combination with the so-called Pointwise Ergodic Theorem of Birkhoff which followed shortly
thereafter, marked the birth of modern ergodic theory.
As suggested by Proposition 2.4, the subspace fixed by the Koopman operator associated with a given
measure-preserving transformation is of considerable interest when studying the transformation itself.
We first make the following natural definition.

Definition 3.1. We denote the fixed subspace of a measure-preserving transformation ψ on a probability
space X by

\[ F_\psi = \ker(1 - U_\psi) = \{ f \in L^2(X) : U_\psi f = f \}, \]

where we let 1 : L²(X) → L²(X) be the identity map. The set F_ψ is exactly the subspace of L²(X) fixed by
precomposition with ψ.

The subset F_ψ ⊂ L²(X) is a subspace of L²(X) precisely because it is the kernel of the linear map
1 − U_ψ. Furthermore, F_ψ is closed because it is the preimage of the closed set {0} under the continuous
map 1 − U_ψ. It is then immediate from the theory of Hilbert spaces that L²(X) = F_ψ ⊕ F_ψ^⊥.
The Mean Ergodic Theorem is essentially the statement that, when considering the average behaviour
of a system over long times, the subspace F_ψ of observables of a measurable system (X, M, µ, ψ) is the
only interesting set of observables.

Theorem 3.2 (von Neumann). Let (X, M, µ, ψ) be a measurable system, and let f ∈ L²(X). As F_ψ is a
closed subspace of L²(X), we can write f = f̃ + g with f̃ ∈ F_ψ and g ∈ F_ψ^⊥. Then

\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^{k} f \to \tilde{f} \]

in L²(X) as n → ∞. In particular, if ψ is ergodic then

\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^{k} f \to \int_X f \, d\mu \tag{3} \]

in L²(X). Conversely, if ψ satisfies (3) for every f ∈ L²(X) then ψ is ergodic.


The quantity $\frac{1}{n}\sum_{k=0}^{n-1} U_\psi^{k} f$ is an observable which gives the mean value of f over n consecutive
evolutions of the system, while $\int_X f \, d\mu$ is merely the average value of the observable f on the entire phase
space.
Thus, put another way, the Mean Ergodic Theorem says that the mean of the value of an observable
averaged over a large number of evolution-steps of an ergodic measurable system converges to the
mean value of that observable on the entire space. Interpreting the map ψ : X → X as a “time-step”,
the former average is a “time-average” in some sense. Further, the theorem states that this property is
necessary and sufficient for ψ to be ergodic.
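The equivalence of time and space averages can be observed numerically. The sketch below approximates the L² distance between the time average of f(x) = cos²(2πx) and its space average 1/2 for two rotations of the circle [0, 1): one by the irrational angle √2 − 1 (ergodic) and one by 1/2 (not ergodic). The choice of system, the observable, the Riemann-sum grid, and the helper names are assumptions made for illustration.

```python
import math

def time_average(f, psi, x, n):
    """Return (1/n) * sum_{k<n} f(psi^k(x))."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = psi(x)
    return total / n

def l2_error(f, psi, space_mean, n, grid=200):
    """Riemann-sum approximation of || A_n f - space_mean ||_{L^2([0,1))}."""
    pts = [(j + 0.5) / grid for j in range(grid)]
    sq = sum((time_average(f, psi, x, n) - space_mean) ** 2 for x in pts) / grid
    return math.sqrt(sq)

f = lambda x: math.cos(2 * math.pi * x) ** 2      # space average is exactly 1/2
irrational = lambda x: (x + math.sqrt(2) - 1) % 1.0
half_turn = lambda x: (x + 0.5) % 1.0             # every orbit has only two points

err_irrational = l2_error(f, irrational, 0.5, 1000)
err_rational = l2_error(f, half_turn, 0.5, 1000)
print(err_irrational)   # small: time averages approach the space average
print(err_rational)     # stays bounded away from zero
```

For the half-turn, f is itself invariant, so the time average is f rather than the constant 1/2; this is the decomposition f = f̃ + g of Theorem 3.2 with f̃ = f.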

In order to prove the theorem, we will show that the time averages of observables in F_ψ^⊥ go to zero. This
deviates from von Neumann’s original argument, which invoked powerful machinery of operator theory
including the spectral theory of unitary operators (in particular, their eigenvalues), and was quite
elaborate. For our purposes the following shortcut, executed via a characterisation of F_ψ^⊥ first established
by Riesz in [7], greatly simplifies the proof.
Lemma 3.3. Let (X, M, µ, ψ) be a measurable system. The set $F_\psi^{\perp}$ is the closure (in L²) of the set
$G_\psi = \{U_\psi f - f : f \in L^2(X)\}$, and equivalently $F_\psi = \overline{G_\psi}^{\perp}$.

Proof. We will directly show the necessary set-theoretic inclusions; first consider any $g \in G_\psi^{\perp}$. The
convenience of our (trick) definition of $G_\psi$ is then immediately clear, as we must have the useful condition
$(g, U_\psi f) = (g, f)$ for every f ∈ L²(X). Thus (using this condition with f = g, together with Proposition 2.3)

\[
\begin{aligned}
\|U_\psi g - g\|^2 &= (U_\psi g, U_\psi g) - (U_\psi g, g) - (g, U_\psi g) + (g, g) \\
&= (U_\psi g, g) - (U_\psi g, g) - (g, g) + (g, g) \\
&= 0,
\end{aligned}
\]

and hence $U_\psi g = g$. Thus $g \in F_\psi$, and it follows that $\overline{G_\psi}^{\perp} \subset F_\psi$ by the elementary fact that $\overline{G_\psi}^{\perp} = G_\psi^{\perp}$.
Conversely, suppose $g \in F_\psi$. Then

\[ (g, U_\psi f - f) = (g, U_\psi f) - (g, f) = (U_\psi g, U_\psi f) - (g, f) = 0 \]

for every f ∈ L²(X). Therefore we have $F_\psi \subset G_\psi^{\perp} = \overline{G_\psi}^{\perp}$, which completes the proof.

In light of the following lemma, the utility of Lemma 3.3 is greatly elucidated.
Lemma 3.4. The closure $\overline{G_\psi}$ is annihilated in mean by $U_\psi$, in that for every $f \in \overline{G_\psi}$ we have

\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^{k} f \to 0, \tag{4} \]

where convergence is understood in the L² sense.

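Proof. Fix $g \in \overline{G_\psi}$ and a sequence $(g_n)$ in $G_\psi$ converging to g, where $g_n = U_\psi f_n - f_n$ for some
$f_n \in L^2(X)$. Then for each n, m ∈ N we have

\[ \frac{1}{m} \sum_{k=0}^{m-1} U_\psi^{k} g_n = \frac{1}{m} \sum_{k=0}^{m-1} \bigl(U_\psi^{k+1} f_n - U_\psi^{k} f_n\bigr) = \frac{1}{m} \bigl(U_\psi^{m} f_n - f_n\bigr). \]

Now, as $U_\psi$ is an isometry we have $\|U_\psi^{m} f_n - f_n\| \le \|U_\psi^{m} f_n\| + \|f_n\| \le 2\|f_n\|$, and hence
$\frac{1}{m}\sum_{k=0}^{m-1} U_\psi^{k} g_n \to 0$ as m → ∞. The condition (4) now follows from the triangle inequality, because
again since $U_\psi$ is an isometry we have

\[
\begin{aligned}
\frac{1}{m} \Bigl\| \sum_{k=0}^{m-1} U_\psi^{k} g \Bigr\|_{L^2}
&\le \frac{1}{m} \Bigl\| \sum_{k=0}^{m-1} U_\psi^{k} (g - g_n) \Bigr\|_{L^2} + \frac{1}{m} \Bigl\| \sum_{k=0}^{m-1} U_\psi^{k} g_n \Bigr\|_{L^2} \\
&\le \frac{1}{m} \sum_{k=0}^{m-1} \| g - g_n \|_{L^2} + \frac{1}{m} \Bigl\| \sum_{k=0}^{m-1} U_\psi^{k} g_n \Bigr\|_{L^2} \\
&= \| g - g_n \|_{L^2} + \frac{1}{m} \Bigl\| \sum_{k=0}^{m-1} U_\psi^{k} g_n \Bigr\|_{L^2}.
\end{aligned}
\]

In particular taking n, and then m, large enough, the right hand side becomes arbitrarily small and the claim follows.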
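The telescoping identity at the heart of this proof can be verified exactly on a finite system. In the sketch below, g = U_ψ f − f is built from an arbitrary observable f on Z/7, and the average of g along orbits is compared with (U_ψ^m f − f)/m using exact rational arithmetic. The map, the observable, and the helper names `orbit_average` and `iterate` are assumptions made for illustration.

```python
from fractions import Fraction

def orbit_average(psi, g, m):
    """Return the observable (1/m) * sum_{k<m} U^k g, i.e. x -> mean of g along the orbit of x."""
    out = {}
    for x in g:
        y, total = x, Fraction(0)
        for _ in range(m):
            total += g[y]
            y = psi(y)
        out[x] = total / m
    return out

def iterate(psi, x, m):
    """Return psi^m(x)."""
    for _ in range(m):
        x = psi(x)
    return x

psi = lambda x: (3 * x + 1) % 7                  # hypothetical bijection of Z/7
f = {x: Fraction(x * x, 3) for x in range(7)}    # an arbitrary observable
g = {x: f[psi(x)] - f[x] for x in f}             # g = U f - f, an element of G_psi

m = 10
lhs = orbit_average(psi, g, m)
rhs = {x: (f[iterate(psi, x, m)] - f[x]) / m for x in f}   # (U^m f - f) / m
print(lhs == rhs)                                # the sum telescopes exactly
print(max(abs(v) for v in lhs.values()))         # decays like 1/m as m grows
```

Because the right-hand side is bounded by twice the size of f divided by m, the averages of any such g are forced to zero, which is exactly the mechanism of Lemma 3.4.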

We are now in a position to prove the Mean Ergodic Theorem.

Proof of Theorem 3.2. Fix some f ∈ L²(X), and by the decomposition $L^2(X) = F_\psi \oplus F_\psi^{\perp}$ write $f = \tilde{f} + g$
with $\tilde{f} \in F_\psi$ and $g \in F_\psi^{\perp}$. Then by the linearity of the limit and the fact that $U_\psi^{k} \tilde{f} = \tilde{f}$ for every k ≥ 0,
Lemma 3.3 and Lemma 3.4 imply that

\[ \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^{k} f = \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^{k} \tilde{f} + \frac{1}{n} \sum_{k=0}^{n-1} U_\psi^{k} g \to \tilde{f} + 0 = \tilde{f}, \]

which proves the first part of the theorem. To see the second part, observe that when ψ is ergodic
Proposition 2.4 implies that $F_\psi$ is exactly the set of almost everywhere constant functions in L²(X). Thus $\tilde{f}$ is
constant almost everywhere. As (X, M, µ) is a probability space it follows that

\[ \int_X f \, d\mu = \int_X \tilde{f} \, d\mu + \int_X g \, d\mu = \tilde{f} + \int_X g \, d\mu \]

almost everywhere, and thus it suffices to show $\int_X g \, d\mu = 0$. By Lemma 3.3 we may take a sequence $(f_n)$
such that $U_\psi f_n - f_n$ converges to g in L². As we have the inequality

\[ \Bigl| \int_X g \, d\mu - \int_X (U_\psi f_n - f_n) \, d\mu \Bigr| \le \int_X |g - (U_\psi f_n - f_n)| \, d\mu \le \| g - (U_\psi f_n - f_n) \|_{L^2} \to 0, \]

it follows that $\int_X (U_\psi f_n - f_n) \, d\mu \to \int_X g \, d\mu$. However,

\[ \int_X (U_\psi f_n - f_n) \, d\mu = \int_X (U_\psi f_n - U_\psi f_n) \, d\mu = 0 \]

by Lemma 2.2, which yields the desired equality $\int_X g \, d\mu = 0$.
Conversely, suppose that we have (3) for every f ∈ L²(X). Fix any S ∈ M such that ψ^{−1}(S) = S. As
argued in the proof of Proposition 2.4 we have $U_\psi^{k} \chi_S = \chi_S$ for every k ∈ N, and hence (3) immediately gives

\[ \chi_S = \frac{1}{n} \sum_{k=0}^{n-1} \chi_S \to \int_X \chi_S \, d\mu = \mu(S). \]

Therefore χ_S is equal almost everywhere on X to the constant function µ(S), and hence µ(S) ∈ {0, 1}, as
required. Therefore ψ is ergodic, and this completes the proof of the theorem.

A stronger version of the Poincaré recurrence theorem, which includes a statement about the ex-
pected time between recurrences, follows directly from the Mean Ergodic Theorem. This is but a sin-
gle example of many which suggest the Mean Ergodic Theorem is an important result. The following
theorem of Birkhoff (published less than a year following von Neumann’s result, and leading to some
controversy), gives an analogous pointwise version of the Mean Ergodic theorem; for this reason it is
sometimes known as the Pointwise Ergodic Theorem. Note that we have explicitly avoided the use of
the notation “Uψ ” as Theorem 3.5 is a statement regarding L 1 functions.

Theorem 3.5 (Birkhoff). Let (X, M, µ, ψ) be a measurable system, and let f ∈ L¹(X). Then there exists
f̃ ∈ L¹(X) such that for almost every x ∈ X we have

\[ \frac{1}{n} \sum_{k=0}^{n-1} f(\psi^{k}(x)) \to \tilde{f}(x). \]

Furthermore the function f̃ is invariant under precomposition with ψ, and satisfies $\int_X \tilde{f} \, d\mu = \int_X f \, d\mu$. If
in addition ψ is ergodic, then f̃ is constant almost everywhere.

In our restricted setting, the Pointwise Ergodic Theorem may be used to deduce the Mean Ergodic
Theorem. However, the latter should not be considered completely a corollary of the former, as the
Mean Ergodic Theorem is itself a corollary of a statement about retractions on Hilbert spaces which
does not follow from the pointwise version. Further, the Mean Ergodic Theorem generalises to arbitrary
measure-preserving actions of locally compact amenable groups and Følner sequences, while this is
more difficult in the pointwise case.

4 Mixing
The Mean Ergodic Theorem makes a statement which warrants further investigation. Namely, for a
given measurable transformation, we ask what more can be said about the convergence of spatial and
temporal averages. To this end, let (X, M, µ, ψ) be a measurable system with ψ ergodic, and suppose
R, S ∈ M. Considering the characteristic functions χ_R and χ_S, the convergence implied by Theorem 3.2
gives that as n → ∞ we have (taking the inner product of both sides of the statement of the theorem with χ_S)

\[ \frac{1}{n} \sum_{k=0}^{n-1} (U_\psi^{k} \chi_R, \chi_S) = \frac{1}{n} \sum_{k=0}^{n-1} \int_X \chi_{\psi^{-k}(R)} \, \chi_S \, d\mu \to \int_X \chi_R \, d\mu \int_X \chi_S \, d\mu, \]

or equivalently

\[ \frac{1}{n} \sum_{k=0}^{n-1} \mu(\psi^{-k}(R) \cap S) \to \mu(R)\mu(S). \tag{5} \]

In fact, it is not too difficult to check that the converse is true and hence (5) yields an alternate charac-
terisation of ergodicity.
It is natural to then ask when ψ is such that (5) may be strengthened to the requirement that the
expressions µ(ψ−n (R) ∩ S) (and not an average of such expressions) converge to µ(R)µ(S) as n → ∞.
Intuitively, this corresponds to the strengthening of the requirement that ψ stirs the entire phase space
on average (in that ψ is ergodic) to the requirement that ψ “thoroughly mixes” the entire phase space.
This motivates the following definition.

Definition 4.1. Let (X , M , µ, ψ) be a measurable system. Then ψ is mixing (or strongly-mixing) if for
every R, S ∈ M we have
µ(ψ−n (R) ∩ S) → µ(R)µ(S)
as n → ∞.

As it turns out, the requirement that a measurable transformation ψ be mixing is typically stronger
than one needs to prove theorems pertaining to the ergodic behaviour of ψ. This is perhaps plausible in
view of the proof of ergodic theorems analogous to those presented above; we tend to care only about the
average behaviour of ψ, such as that described by (5). Thus, we relax Definition 4.1 to require something
slightly stronger than that which is automatic by the Mean Ergodic Theorem.

Definition 4.2. Let (X, M, µ, ψ) be a measurable system. Then ψ is weakly-mixing if for every R, S ∈ M
we have

\[ \frac{1}{n} \sum_{k=0}^{n-1} \bigl| \mu(\psi^{-k}(R) \cap S) - \mu(R)\mu(S) \bigr| \to 0 \]

as n → ∞. This is equivalent to requiring that ψ satisfy Definition 4.1 with the limit taken over the
complement of a subset K ⊂ N of zero upper density.
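The difference between convergence on average, as in (5), and the convergence required by Definition 4.1 can be seen in two standard examples on the circle [0, 1), both chosen here as assumptions for illustration: the doubling map x ↦ 2x mod 1 and an irrational rotation, each with R = S = [0, 1/2). For the doubling map the correlation is computed exactly on a dyadic grid; for the rotation the sketch uses the closed form |1/2 − {nα}| for the overlap of two half-circles, which is a hand derivation rather than something stated in the essay.

```python
import math
from fractions import Fraction

def doubling_correlation(n, depth=12):
    """mu(psi^{-n}(R) & S) for psi(x) = 2x mod 1 and R = S = [0, 1/2), via dyadic cells."""
    assert 0 <= n < depth
    count = 0
    for j in range(2 ** depth):                # cell j is [j / 2^depth, (j + 1) / 2^depth)
        in_S = j < 2 ** (depth - 1)            # first binary digit of x is 0
        tail = j % 2 ** (depth - n)            # binary digits of psi^n(x)
        in_R = tail < 2 ** (depth - n - 1)     # first binary digit of psi^n(x) is 0
        count += in_S and in_R
    return Fraction(count, 2 ** depth)

def rotation_correlation(n, alpha):
    """mu(psi^{-n}(R) & S) for psi(x) = x + alpha mod 1 and R = S = [0, 1/2)."""
    d = (n * alpha) % 1.0
    return abs(0.5 - d)

alpha = math.sqrt(2) - 1
doubling_vals = [doubling_correlation(n) for n in range(1, 9)]
rotation_vals = [rotation_correlation(n, alpha) for n in range(1, 2001)]

print(doubling_vals)                               # settles at mu(R) * mu(S) = 1/4
print(min(rotation_vals), max(rotation_vals))      # keeps oscillating
print(sum(rotation_vals) / len(rotation_vals))     # yet the average is close to 1/4
```

The rotation satisfies (5) but its individual correlations never settle, so it is ergodic without being mixing, whereas the doubling map reaches the product µ(R)µ(S) after a single step for these particular sets.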
In light of (5), it is automatically the case that every weakly-mixing measurable transformation ψ
is ergodic. In complete analogy with the spectral characterisation of ergodicity above, and perhaps as
a testament to the deep interplay between spectral theory and ergodic theory, we have the following
characterisation of weakly-mixing measurable transformations.
Proposition 4.3. A measure-preserving map ψ : X → X is weakly-mixing if and only if 1 is the only
eigenvalue of U_ψ and this eigenvalue has multiplicity 1 (as in Proposition 2.4). In this case U_ψ is said to have
continuous spectrum.
Proof. First suppose ψ : X → X is weakly-mixing, and that U_ψ f = λf for some λ ∈ C and some nonzero f ∈ L²(X).
As ψ is then ergodic, by Proposition 2.4 it suffices to show that λ = 1 for every such f. By Lemma 2.2 we have
$\int f \, d\mu = \int U_\psi f \, d\mu = \lambda \int f \, d\mu$, and therefore $\int f \, d\mu = 0$ assuming λ ≠ 1. Exactly as the definition of
ergodicity was recast into the language of measure theory to give the equivalent condition (5), we have the
following equivalent definition of weakly-mixing: for every f, g ∈ L²(X) we have

\[ \frac{1}{n} \sum_{k=0}^{n-1} \Bigl| (U_\psi^{k} f, g) - \int_X f \, d\mu \; \overline{\int_X g \, d\mu} \Bigr| \to 0. \tag{6} \]

Fixing f as above, we have

\[ \frac{1}{n} \sum_{k=0}^{n-1} |\lambda^{k} (f, g)| \to 0. \]

But |λ| = 1 as U_ψ is an isometry, and hence (f, g) = 0 for every g ∈ L²(X). Thus we have f = 0, a contradiction;
hence λ = 1, as required.
The other direction is considerably more difficult, though only in that each known proof makes use
of a significant result from functional analysis. Using the trick of Einsiedler and Ward [4], by the
polarisation identity it suffices to show that (6) holds for g = f. Then subtracting the constant $\int f \, d\mu$ from f if
necessary, it is enough that as n → ∞ we have

\[ \frac{1}{n} \sum_{k=0}^{n-1} |(U_\psi^{k} f, f)| \to 0. \tag{7} \]


In keeping with our emphasis on operator-theoretic techniques, we directly apply the spectral theorem,
which gives (noting that the spectrum of U_ψ is contained in the unit circle S¹ ⊂ C because U_ψ is an
isometry on L²(X)) a finite measure ν associated to U_ψ and f such that

\[ (U_\psi^{k} f, f) = \int_{S^1} \lambda^{k} \, d\nu(\lambda), \]

with the factor λ^k coming from the functional calculus associated with U_ψ. Note that ν is zero on
measurable sets disjoint from the spectrum of U_ψ. By (7) it remains to show that

\[ \frac{1}{n} \sum_{k=0}^{n-1} \Bigl| \int_{S^1} \lambda^{k} \, d\nu(\lambda) \Bigr| \to 0, \tag{8} \]

and in fact the trivial bound

\[ \frac{1}{n} \sum_{k=0}^{n-1} \Bigl| \int_{S^1} \lambda^{k} \, d\nu(\lambda) \Bigr| \le \frac{1}{n} \sum_{k=0}^{n-1} \int_{S^1} 1 \, d\nu(\lambda) = \nu(S^1) \]

does not suffice. We note that (8) holds if it holds with the terms in the sum replaced by their squares.
This fact permits the calculation of the sharper bound of [4] (we omit the expansion and manipulation
of the integral here):

\[ \frac{1}{n} \sum_{k=0}^{n-1} \Bigl| \int_{S^1} \lambda^{k} \, d\nu(\lambda) \Bigr|^{2} \le \int_{S^1 \times S^1} \Bigl( \frac{1}{n} \sum_{k=0}^{n-1} \frac{\lambda^{k}}{\eta^{k}} \Bigr) \, d(\nu \times \nu)(\lambda, \eta). \]

As S¹ × S¹ is compact and ν (and thus ν × ν) is a finite measure, it can be shown that the right-hand side
converges to zero by the dominated convergence theorem. We note that the integral of the expression $\lambda^{k} / \eta^{k}$ is
well-defined by the fact that the spectral theorem gives a measure ν which assigns points zero measure.
In fact, for the dominated convergence theorem to yield the result, we require that the entire diagonal
of S¹ × S¹ is a (ν × ν)-null set, but this is the case because ν has no atoms.
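The reduction to squares and the role of the diagonal in the argument above can be recorded in two short steps. The following derivation is a sketch only; the shorthand $a_k$ and $z$ is introduced here for convenience and is not notation from the essay.

```latex
% Shorthand (an assumption for this sketch): a_k = | \int_{S^1} \lambda^k \, d\nu(\lambda) | \ge 0.
% By the Cauchy--Schwarz inequality applied to (a_0, ..., a_{n-1}) and (1, ..., 1),
\[
  \Bigl( \frac{1}{n} \sum_{k=0}^{n-1} a_k \Bigr)^{2}
  \le \frac{1}{n} \sum_{k=0}^{n-1} a_k^{2},
\]
% so the averages of a_k tend to zero whenever the averages of a_k^2 do.
% For \lambda, \eta \in S^1 with \lambda \neq \eta, write z = \lambda / \eta \neq 1. Then
\[
  \Bigl| \frac{1}{n} \sum_{k=0}^{n-1} z^{k} \Bigr|
  = \frac{1}{n} \Bigl| \frac{1 - z^{n}}{1 - z} \Bigr|
  \le \frac{2}{n \, |1 - z|} \to 0,
\]
% while the average is identically 1 on the diagonal \lambda = \eta.
```

The integrand is bounded by 1 in absolute value and tends to zero off the diagonal, so dominated convergence applies once the diagonal is known to be null for ν × ν.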

5 Ergodic theorems, revisited


Given the abundance of examples of time-continuously evolving systems in our physical world, the
reader may have been disheartened by our insistence on proving theorems pertaining to only discrete
transformations of measurable systems. However, the case of continuous transformations is not much
more difficult to develop. In fact, the proofs of analogous fundamental theorems for the continuous case
benefit significantly from the discrete results. With the aim of proving a continuous version of the Mean
Ergodic Theorem, we first define the continuous-analog of a measure-preserving transformation.

Definition 5.1. A 1-parameter group of measure-preserving transformations on a probability space (X, M, µ)
is a family {ψ_t}_{t∈R} of µ-preserving maps ψ_t : X → X satisfying the properties that

1. for each ψ_t and ψ_s we have ψ_t ∘ ψ_s = ψ_{t+s}, and

2. the map ψ_0 is the identity on X.

Such a family {ψ_t} is sometimes called a measurable flow on (X, M, µ). We call a 1-parameter group of
measure-preserving transformations {ψ_t}_{t∈R} ergodic if ψ_t is ergodic for every nonzero t ∈ R.

Indeed, the historical development of ergodic theory considered measure-preserving
transformations induced by R-actions; the discrete case of a Z-action was an afterthought.⁴ Completing this
section, we will see the advantage of the more contemporary development given here, where the simpler
discrete version of Theorem 3.2 may be used to bootstrap the proof of a completely analogous result for
the continuous case.

Theorem 5.2. Let (X, M, µ) be a probability space, and let {ψ_t}_{t∈R} be an ergodic 1-parameter group of
measure-preserving transformations. If f ∈ L²(X), then

\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T U_{\psi_t} f \, dt = \int_X f \, d\mu. \]

Proof. We obtain the result from Theorem 3.2 by the following trick; we define $g = \int_0^1 U_{\psi_t} f \, dt$ and note
that g is measurable and L² integrable by Fubini’s theorem. Observe that for T ∈ N we have

\[ \int_0^T U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} \int_k^{k+1} U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} \int_0^1 U_{\psi_k} U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} U_{\psi_1}^{k} \int_0^1 U_{\psi_t} f \, dt, \]

and hence

\[ \int_0^T U_{\psi_t} f \, dt = \sum_{k=0}^{T-1} U_{\psi_1}^{k} g. \]

As ψ_1 is ergodic, this immediately implies that as T → ∞ (with T ∈ N) we have

\[ \frac{1}{T} \sum_{k=0}^{T-1} U_{\psi_1}^{k} g \to \int_X \int_0^1 (U_{\psi_t} f)(x) \, dt \, d\mu(x) = \int_0^1 \int_X U_{\psi_t} f \, d\mu \, dt = \int_X f \, d\mu, \]

by Fubini’s theorem⁵, as desired.


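As we have (where ⌊·⌋ and ⌈·⌉ denote the floor and ceiling functions, respectively)

\[ \frac{1}{T} \Bigl\| \int_0^{T} U_{\psi_t} f \, dt \Bigr\|_{L^2} \le \frac{1}{T} \Bigl\| \int_0^{\lfloor T \rfloor} U_{\psi_t} f \, dt \Bigr\|_{L^2} + \frac{1}{T} \Bigl\| \int_{\lfloor T \rfloor}^{T} U_{\psi_t} f \, dt \Bigr\|_{L^2} \]

it remains to show that

\[ \frac{1}{T} \Bigl\| \int_{\lfloor T \rfloor}^{T} U_{\psi_t} f \, dt \Bigr\|_{L^2} \to 0 \]

as T → ∞. Fortunately, deferring to the discrete case may take us even further; let $h = \int_0^1 |U_{\psi_t} f| \, dt$, which
lies in L²(X) because f ∈ L²(X). Then as T → ∞ Theorem 3.2 implies (noting that ⌈T⌉/⌊T⌋ → 1)

\[ \frac{1}{\lfloor T \rfloor} \int_{\lfloor T \rfloor}^{\lceil T \rceil} |U_{\psi_t} f| \, dt = \frac{\lceil T \rceil}{\lfloor T \rfloor} \, \frac{1}{\lceil T \rceil} \sum_{k=0}^{\lceil T \rceil - 1} U_{\psi_1}^{k} h - \frac{1}{\lfloor T \rfloor} \sum_{k=0}^{\lfloor T \rfloor - 1} U_{\psi_1}^{k} h \to 0. \tag{9} \]

However, we have the bound

\[
\begin{aligned}
\frac{1}{T^2} \Bigl\| \int_{\lfloor T \rfloor}^{T} U_{\psi_t} f \, dt \Bigr\|_{L^2}^{2}
&\le \frac{1}{\lfloor T \rfloor^{2}} \int_X \Bigl| \int_{\lfloor T \rfloor}^{T} U_{\psi_t} f(x) \, dt \Bigr|^{2} d\mu(x) \\
&\le \int_X \Bigl| \frac{1}{\lfloor T \rfloor} \int_{\lfloor T \rfloor}^{\lceil T \rceil} |U_{\psi_t} f(x)| \, dt \Bigr|^{2} d\mu(x).
\end{aligned}
\]

By (9), in which the convergence holds in L²(X), the right hand side converges to zero, which completes the proof.

⁴ See [5].
⁵ Integrability of g is immediate from the fact that g ∈ L²(X) and µ(X) < ∞.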
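A continuous-time average of the kind appearing in Theorem 5.2 can be approximated numerically. The sketch below uses the flow ψ_t(x) = x + tα mod 1 on the circle with α = √2 − 1, the observable f(x) = cos²(2πx), and a midpoint-rule quadrature; all of these, and the helper names, are assumptions made for illustration. Note that only the time-one map of this flow is an ergodic rotation (ψ_t is the identity when tα is an integer), which is the property the proof above actually uses.

```python
import math

ALPHA = math.sqrt(2) - 1      # assumed irrational speed, so that psi_1 is an ergodic rotation

def flow(t, x):
    """psi_t(x) = x + t * ALPHA (mod 1), a measurable flow on the circle [0, 1)."""
    return (x + t * ALPHA) % 1.0

def continuous_time_average(f, x, T, steps_per_unit=100):
    """Midpoint-rule approximation of (1/T) * integral_0^T f(psi_t(x)) dt."""
    n = int(T * steps_per_unit)
    h = T / n
    return sum(f(flow((i + 0.5) * h, x)) for i in range(n)) * h / T

f = lambda x: math.cos(2 * math.pi * x) ** 2     # space average is exactly 1/2

for T in (10, 100, 1000):
    print(T, continuous_time_average(f, 0.3, T))
```

The printed values should approach the space average 1/2 as T grows, at a rate of order 1/T for this smooth observable.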

Using a completely analogous trick to that of the previous proof, a continuous analog of Theorem 3.5
may be established. However, the statement of the result requires additional technical machinery—
particularly the notion of conditional expectation—which we have avoided here.

6 The main course


Just as the Mean Ergodic Theorem may be extended to the case of continuous transformations of
measurable systems, extensions to probabilistic and quantum-mechanical regimes also exist. The size of
the body of ergodic theorems based on different averaging schemes cannot be overstated. In contrast,
so-called “local ergodic theorems”, first introduced by Wiener [8], make statements regarding the local
properties of averages (i.e. how they change over small times). This is directly analogous to the state-
ment which the fundamental theorem of calculus makes regarding the “local behaviour” of integrals.
The extensions of the results developed here have very significant implications for an
unexpectedly large number of mathematical disciplines. For example, consider the 1936 conjecture of Erdős
and Turán [6] that every subset of N of positive upper density contains arithmetic progressions of
every length. This result is known as Szemerédi’s theorem after Endre Szemerédi, who gave an intricate
combinatorial proof in 1975. Famously, two years later Furstenberg provided a strengthening of the
Poincaré recurrence theorem (Theorem 1.3), where he studied “multiple recurrences”, from which Sze-
merédi’s theorem followed as a special case. In doing this, Furstenberg’s general method revolutionised
our capability to translate between ergodic-theoretic results and number-theoretic ones (see [4]).
As one may have already observed, in proving our Mean Ergodic Theorem we obtained no quanti-
tative estimate on the rate at which we should expect convergence to occur. In general, this is a much
more difficult question to answer, but its study has great utility in obtaining new results. For instance,
such ideas undeniably influenced Green and Tao’s proof that the primes contain arithmetic progressions
of arbitrary length, an extension of Szemerédi’s theorem [5]. Also of note is the work of Einsiedler, Katok
and Lindenstrauss on Littlewood’s conjecture: the claim that

\[ \liminf_{k \to \infty} k \, [nk] \, [mk] = 0 \tag{10} \]

for every n, m ∈ R, where square brackets denote the distance from their argument to the nearest
integer. In 2003, Einsiedler, Katok and Lindenstrauss [3] applied techniques of ergodic theory and measure
invariance to show that the set of pairs (n, m) such that (10) does not hold has Hausdorff dimension zero.
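The quantity in (10) is easy to explore numerically, although a finite floating-point search says nothing rigorous about the limit inferior. The sketch below tracks the running minimum of k[kα][kβ] for one sample pair; the names `alpha`, `beta`, `dist_to_nearest_integer`, and `littlewood_running_min`, as well as the choice of √2 and √3, are assumptions made for illustration.

```python
import math

def dist_to_nearest_integer(x):
    """The bracket of (10): distance from x to the nearest integer."""
    return abs(x - round(x))

def littlewood_running_min(alpha, beta, k_max):
    """Running minimum of k * [k * alpha] * [k * beta] over 1 <= k <= k_max."""
    best = math.inf
    mins = []
    for k in range(1, k_max + 1):
        value = k * dist_to_nearest_integer(k * alpha) * dist_to_nearest_integer(k * beta)
        best = min(best, value)
        mins.append(best)
    return mins

alpha, beta = math.sqrt(2), math.sqrt(3)     # hypothetical sample pair
mins = littlewood_running_min(alpha, beta, 10_000)
for k in (10, 100, 1000, 10_000):
    print(k, mins[k - 1])
```

Rounding error grows with k, so results for very large k should be treated with caution or recomputed in higher precision.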

References
[1] Jon Aaronson. An introduction to infinite ergodic theory. 50. American Mathematical Society, 1997. DOI: 10.1112/S0024609398275436.
[2] L Barreira and B Saussol. “Hausdorff Dimension of Measures via Poincaré Recurrence”. Communications in Mathematical Physics 219.2 (2001), pp. 443–463. DOI: 10.1007/s002200100427.
[3] Manfred Einsiedler, Anatole Katok, and Elon Lindenstrauss. “Invariant measures and the set of exceptions to Littlewood’s conjecture”. Annals of Mathematics (2006), pp. 513–560. DOI: 10.4007/annals.2006.164.513.
[4] Manfred Einsiedler and Thomas Ward. Ergodic theory with a view towards number theory. 1st ed. Vol. 259. Graduate Texts in Mathematics. London: Springer-Verlag London, 2011. ISBN: 978-0-85729-020-5. DOI: 10.1017/S0143385711001088.
[5] Tanja Eisner, Bálint Farkas, Markus Haase, and Rainer Nagel. Operator theoretic aspects of ergodic theory. Vol. 272. Springer, 2015. DOI: 10.1007/978-3-319-16898-2.
[6] Paul Erdős and Paul Turán. “On some sequences of integers”. Journal of the London Mathematical Society 1.4 (1936), pp. 261–264. DOI: 10.1112/jlms/s1-11.4.261.
[7] Frederick Riesz. “Some mean ergodic theorems”. Journal of the London Mathematical Society 1.4 (1938), pp. 274–278. DOI: 10.1112/jlms/s1-13.4.274.
[8] Norbert Wiener. “The ergodic theorem”. Duke Math. J. 5.1 (Mar. 1939), pp. 1–18. DOI: 10.1215/S0012-7094-39-00501-6.
