9-Richtmyer - Principles of Advanced Mathematical Physics II
Monographs
in Physics
W. Beiglböck
M. Goldhaber
E. H. Lieb
W. Thirring
Series Editors
Robert D. Richtmyer
Principles of Advanced
Mathematical Physics
Volume II
With 60 Figures
Springer-Verlag
New York Heidelberg Berlin
Robert D. Richtmyer
Department of Mathematics
University of Colorado
Boulder, Colorado 80309
USA
Editors:
Wolf Beiglböck
Institut für Angewandte Mathematik
Universität Heidelberg
Im Neuenheimer Feld 5
D-6900 Heidelberg 1
Federal Republic of Germany

Maurice Goldhaber
Department of Physics
Brookhaven National Laboratory
Associated Universities, Inc.
Upton, NY 11973
USA
Richtmyer, Robert D
Principles of advanced mathematical physics.
Contents
Preface to Volume II
19 Continuous Groups
19.1 Orthogonal and rotation groups
19.2 The rotation group SO(3); Euler's theorem
19.3 Unitary groups
References
Index
Preface to Volume II
The first eleven chapters in this volume, 18 through 28, contain material
that was developed in the third year of the three-year mathematical physics
sequence at the University of Colorado. The central concepts are groups,
manifolds, and differential geometry. I wish to thank Professors Wesley
Brittin and Russel Dubisch for extensive discussions of this material, and
I wish to thank Professor Wolf Beiglbock for advice and suggestions on the
overall plan and on the material on group representations.
The material in the last three chapters, related broadly to recent work in
differentiable dynamical systems, has been discussed in special courses on
hydrodynamic stability and seminars on mathematical physics. That
material is somewhat less well organized than the older subjects, but has
been included because it contains various concepts of great potential value
in physical science.
Note. In some books, the axiom iii above is replaced by the fully equivalent
axiom that G contains a unique identity element e and that each element
a of G has a unique inverse $a^{-1}$; see next section.
As a first example, let G be the set of all rotations in the plane: let $R_\varphi$
denote the transformation in which a point x, y is moved to, or mapped
onto, the point x', y', where
$$x' = x\cos\varphi - y\sin\varphi, \qquad y' = x\sin\varphi + y\cos\varphi. \tag{18.1-1}$$
If the transformations $R_{\varphi_1}$ and $R_{\varphi_2}$ are performed in succession, the result
is a rotation through the angle $\varphi_1 + \varphi_2$, i.e., it is the transformation $R_{\varphi_1 + \varphi_2}$.
It is easily verified that the set $\{R_\varphi : 0 \le \varphi < 2\pi\}$ of all such rotations satisfies
the group axioms.
A rotation in 3 dimensions may be described by first choosing a direction
through the origin and then performing a rotation through some angle about
that direction as a fixed axis. It follows from Euler's theorem, proved in
Section 19.2, below, that the resultant of two such transformations, per-
formed in succession, is another such, i.e., is a rotation through some angle
about some axis. [This seems evident (because everyone knows that it is true)
until one tries to prove it.] In consequence, the set of all rotations in 3
dimensions is a group. The group of all rotations in n dimensions is denoted
by SO(n), for reasons that will appear.
As a third example, consider the set of all rotations in 3 dimensions
under which a cube, centered at the origin, is invariant (i.e., is mapped into a
cube that coincides with the original cube). One can rotate the cube through
90°, 180°, or 270° about an axis through the midpoints of opposite faces,
through 180° about an axis through the midpoints of opposite edges, or
through 120° or 240° about an axis through opposite vertices. It is easily
verified that these transformations (including the identity transformation)
form a group of 24 elements. More generally, the set of all transformations
of a specified kind (e.g., rotations, general linear transformations, rigid
motions, conformal mappings) under which a given figure is invariant is a
group, because the figure is clearly invariant under composition and inverses
of such mappings. The rigid motions under which a crystal lattice is invariant
constitute the space group of the crystal; see Section 18.13.
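The count of 24 rotations of the cube can be checked by brute force. The following is a minimal sketch, assuming the cube is aligned with the coordinate axes and that the 90° rotations about the z and x axes generate the whole group; the names `ROT_Z`, `ROT_X`, and `generate_group` are illustrative only.

```python
from collections import Counter

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
ROT_Z = ((0, -1, 0), (1, 0, 0), (0, 0, 1))  # 90 degrees about the z axis
ROT_X = ((1, 0, 0), (0, 0, -1), (0, 1, 0))  # 90 degrees about the x axis


def matmul(a, b):
    """Product of two 3x3 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def generate_group(generators):
    """Close the set {identity} under left multiplication by the generators."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


cube_rotations = generate_group([ROT_Z, ROT_X])

# The trace 1 + 2 cos(angle) separates the rotation angles:
# 3 -> identity, 1 -> 90/270 degrees, -1 -> 180 degrees, 0 -> 120/240 degrees.
trace_counts = Counter(m[0][0] + m[1][1] + m[2][2] for m in cube_rotations)
print(len(cube_rotations), dict(trace_counts))
```

Grouping by trace reproduces the classification in the text: one identity, six rotations through 90° or 270°, nine through 180° (three face axes and six edge axes), and eight through 120° or 240°.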
The set of all permutations of n objects is a group; such groups are dis-
cussed in Section 18.4.
Certain sets of real or complex numbers or quaternions are groups with
respect to addition or multiplication, e.g., the set of all integers (positive,
negative, and zero) under addition, the set of all positive real numbers under
multiplication, the integers 0, 1, ... , m - 1 under addition modulo m, or
the set of all nonzero (real) quaternions under multiplication.
When addition is the rule of composition, $a \circ b$ is denoted by a + b, the
inverse of a by −a, and the identity by 0. Often the little circle is omitted
and the composition of two elements a and b is written simply as a product ab.
A finite group can be fully described by its multiplication table. For example,
Klein's 4-group $V_4$ is defined by

        e   a   b   c
    e   e   a   b   c
    a   a   e   c   b
    b   b   c   e   a
    c   c   b   a   e

which means that $a \circ b = c$, etc. Each group element appears just once in
each row and once in each column; furthermore, all rows are different and
all columns are different. Any square arrangement of letters having this
property is called a latin square (Euler). Any latin square defines an abstract
group, provided that the multiplicative structure thus determined has an
identity and satisfies the associative law.
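The conditions just stated (latin square, identity, associativity) can be tested mechanically for a small table. A minimal sketch for the table of Klein's 4-group, with the table stored as strings; the helper names are illustrative.

```python
from itertools import product

ELEMENTS = "eabc"
ROWS = ["eabc", "aecb", "bcea", "cbae"]  # row x, column y holds x o y


def mul(x, y):
    """Look up x o y in the multiplication table."""
    return ROWS[ELEMENTS.index(x)][ELEMENTS.index(y)]


def is_latin_square():
    """Every element appears exactly once in each row and each column."""
    rows_ok = all(sorted(row) == sorted(ELEMENTS) for row in ROWS)
    cols_ok = all(
        sorted(row[j] for row in ROWS) == sorted(ELEMENTS)
        for j in range(len(ELEMENTS))
    )
    return rows_ok and cols_ok


def has_identity(e="e"):
    return all(mul(e, x) == x and mul(x, e) == x for x in ELEMENTS)


def is_associative():
    return all(
        mul(mul(x, y), z) == mul(x, mul(y, z))
        for x, y, z in product(ELEMENTS, repeat=3)
    )


def is_commutative():
    return all(mul(x, y) == mul(y, x) for x, y in product(ELEMENTS, repeat=2))


print(is_latin_square(), has_identity(), is_associative(), is_commutative())
```

The associativity check runs over all 64 triples, which is the part that the latin-square property alone does not guarantee.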
Abstract group theory deals with the relations indicated in the multiplica-
tion table and completely ignores the inherent nature of the elements, a, b,
etc. In contrast with calculus, real and complex analysis, differential equations,
and other subjects in analysis (group theory belongs to algebra), numerical
quantities hardly ever appear, except integers for the purposes of enumera-
tion and counting.
The theory of groups plays a role in quantum mechanics, in the theory of
spectra, in the analysis of classical dynamical systems, in the theory of auto-
morphic functions, in the theory of algebraic equations, and so on.
The following laws are consequences of axioms i, ii, and iii of the preceding
section:
Law of cancellation: If a, b, c are any elements of a group G, then
$a \circ b = a \circ c$ implies $b = c$,
and
$b \circ a = c \circ a$ implies $b = c$.
Identity: In G there is a unique element e such that $a \circ e = e \circ a = a$ for
all a in G.
Inverses: If a is any element of G, there exists in G a unique element $a^{-1}$
such that $a \circ a^{-1} = a^{-1} \circ a = e$; furthermore, $(a \circ b)^{-1} = b^{-1} \circ a^{-1}$.
$$a^1 = a, \qquad a^2 = a \circ a, \qquad a^{-m} = (a^{-1})^m.$$
Clearly, these powers all commute, and $a^n \circ a^m = a^{n+m}$. Generally, two
1. What is the inverse of the element $R_\varphi$ in SO(2)? What is the identity element?
2. Show that SO(2) is commutative, while SO(3) is not.
3. Show that the group of rotations that leave a cube invariant is of order 24, as
claimed in Section 18.1.
4. Describe the group of rotations under which a right circular cylinder is in-
variant; same for a regular icosahedron.
5. Derive the three laws at the beginning of this section from the group axioms.
6. Determine which of the following are groups:
(a) The set of all nonzero complex numbers, under multiplication.
(b) The set of all nonzero n x n matrices under multiplication.
(c) The set of all positive rational numbers, under multiplication.
(d) The set of all positive irrational numbers, under multiplication.
(e) The set of all positive algebraic numbers, under multiplication.
(f) The set of all n x n matrices under addition.
(g) The set of all n × n matrices of the form $e^A$, under multiplication.
(h) The integers 1, 2, ..., p − 1, under multiplication modulo p, p a prime.
(i) The integers 1, 2, ..., m − 1, under multiplication modulo m, m composite.
(j) The set of all vectors in $E^3$, under vector addition.
18.3 Isomorphism
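$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$
then the mapping
$$\varphi:\ 1 \to I, \quad i \to A, \quad -1 \to B, \quad -i \to C$$
is an isomorphism of G onto G'; the law (18.3-1) is easily verified for each of
the 16 possible pairs (a, b) of elements of G. For example, $(-i) = (-1)(i)$;
hence $\varphi(-i)$ ought to be $= \varphi(-1)\varphi(i)$, i.e., C ought to $= BA$, which in fact
it is. It should be noted that the mapping
$$-1 \to B, \quad -i \to A$$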
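$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$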
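The verification of the product law over all 16 pairs, mentioned above for the group {1, i, −1, −i}, is easy to automate. The sketch below stores the complex units as integer pairs (real, imaginary) and the matrices as nested tuples; the particular entries of A and C, and the alternative assignment `phi_alt` (i → C, −i → A), are assumptions read off from the damaged text.

```python
def gmul(z, w):
    """Multiply two Gaussian integers given as (real, imag) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def matmul(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


ONE, I_UNIT, MINUS_ONE, MINUS_I = (1, 0), (0, 1), (-1, 0), (0, -1)
I2 = ((1, 0), (0, 1))
A = ((0, 1), (-1, 0))    # assumed form of A
B = ((-1, 0), (0, -1))
C = ((0, -1), (1, 0))    # assumed form of C

phi = {ONE: I2, I_UNIT: A, MINUS_ONE: B, MINUS_I: C}
phi_alt = {ONE: I2, I_UNIT: C, MINUS_ONE: B, MINUS_I: A}  # assumed alternative
phi_bad = {ONE: I2, I_UNIT: B, MINUS_ONE: A, MINUS_I: C}  # deliberately wrong


def is_isomorphism(mapping):
    """One-to-one, and phi(ab) = phi(a) phi(b) for all 16 pairs (a, b)."""
    one_to_one = len(set(mapping.values())) == len(mapping)
    multiplicative = all(
        mapping[gmul(a, b)] == matmul(mapping[a], mapping[b])
        for a in mapping
        for b in mapping
    )
    return one_to_one and multiplicative


print(is_isomorphism(phi), is_isomorphism(phi_alt), is_isomorphism(phi_bad))
```

The deliberately wrong assignment shows that being one-to-one is not enough; the product law is what singles out an isomorphism.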
Any mapping cp (not necessarily one-to-one or onto) of a group G into a
group G' such that (18.3-1) is satisfied is a homomorphism. If G is the group
GL(n, C) of all n x n nonsingular complex matrices under multiplication,
then the mapping A → det A is a homomorphism of G onto the group of all
nonzero complex numbers under multiplication. As a second example, let G
be the group $M_2$ of all rigid motions in a plane, i.e., the group of all trans-
formations of the form
$$T_{\theta,a,b}:\ \begin{cases} x \to x' = x\cos\theta - y\sin\theta + a, \\ y \to y' = x\sin\theta + y\cos\theta + b, \end{cases} \tag{18.3-2}$$
where $0 \le \theta < 2\pi$ and where a and b are arbitrary real numbers. Then the
mapping
$$T_{\theta,a,b} \to \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{18.3-3}$$
The main interest is in the case in which the image $G'_1 = \varphi(G)$ is a simpler
group than G (in this case $\varphi$ cannot be one-to-one) but yet $\varphi(G)$ is not merely
the trivial group {e'}. One may then regard the image as having the main
features of G but without some of the fine detail. Insofar as $\varphi(G)$ approximates
G, it does so accurately, in that the image of a product of two elements in G is
always the product of their images.
Note. All the conclusions hold, even if G' is not a group, but merely a set
in which a product x'y' is defined. In particular, it then follows that the subset
$G'_1$ is necessarily a group, even though G' may not be. All the arguments are
unchanged except the second step in part (2). To prove associativity in $G'_1$,
let x', y', and z' be elements in $G'_1$, hence elements of the form $\varphi(x)$, $\varphi(y)$, and
$\varphi(z)$. Then $(x'y')z' = (\varphi(x)\varphi(y))\varphi(z) = \varphi(xy)\varphi(z) = \varphi((xy)z) = \varphi(x(yz))$ (be-
cause multiplication is associative in G) $= \varphi(x)\varphi(yz) = \varphi(x)(\varphi(y)\varphi(z)) =
x'(y'z')$.
For any element x, the elements of the form $yxy^{-1}$ ($y \in G$) are the conju-
gates of x. A subgroup is normal if and only if it contains all the conjugates
of all its members. In an Abelian group, $yxy^{-1}$ is always equal to x, hence
every subgroup is normal.
18.6 Cosets
$G_0$ onto $S_y^{(l)}$ (and similarly for right cosets); hence, each coset contains the
same number (finite or infinite) of elements as $G_0$. Furthermore, it is easily
proved that any two left cosets (or any two right cosets) are either identical
or disjoint, so that the number of elements in G (if finite) is equal to the
number in $G_0$ times the number of (say left) cosets (including $G_0$ itself).
Lagrange's theorem for finite groups follows: the order of a subgroup divides
the order of the group. It follows, for example, that if the number of elements
in G is prime, then G has no subgroups other than {e} and G itself. The cosets
$S_y^{(l)}$ and $S_y^{(r)}$ are also denoted by $yG_0$ and $G_0y$. Note that the cosets, except $G_0$
itself, are not subgroups.
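The partition into cosets can be made concrete on a small case. A minimal sketch, assuming G is the group of all permutations of three symbols (written 0, 1, 2) and the subgroup consists of the identity and one transposition; permutations are tuples `p` with `p[i]` the image of `i`.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))


G = list(permutations(range(3)))
G0 = [(0, 1, 2), (1, 0, 2)]  # identity and the transposition of symbols 0 and 1

left_cosets = {frozenset(compose(y, h) for h in G0) for y in G}
right_cosets = {frozenset(compose(h, y) for h in G0) for y in G}

# Distinct cosets are disjoint and all have the size of G0, so they tile G.
sizes = {len(coset) for coset in left_cosets}
covered = set().union(*left_cosets)

print(len(left_cosets), sizes, len(G) == len(G0) * len(left_cosets))
print(left_cosets == right_cosets)
```

Here the left and right cosets differ as collections of sets, which reflects the fact that this particular subgroup is not normal.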
(18.7-1)
Theorem 2. If $G_0$ is a normal subgroup of G ($G_0 \triangleleft G$), then $S_{y_1}S_{y_2} = S_{y_1y_2}$
for all $y_1$ and $y_2$ in G. Conversely, if $G_0$ is a subgroup such that the product
of any two (say left) cosets is always a (left) coset, then $G_0 \triangleleft G$ (in which
case the distinction between right and left cosets disappears). Under these
circumstances, the collection of cosets
$$\{S_y : y \in G\}, \qquad \circ = \text{multiplication defined by (18.7-1)},$$
is a group, called the factor (or quotient) group of G with respect to $G_0$,
and is denoted by $G/G_0$. Furthermore, the mapping $\varphi_n: G \to G/G_0$ defined
by $\varphi_n(y) = S_y$ (each element of G is mapped onto the coset in which it resides)
is a homomorphism called the natural homomorphism of G onto $G/G_0$.
According to this theorem, whose proof is also left to the reader, one
can always construct a homomorphic image and a homomorphism corre-
sponding to any normal subgroup $G_0$. The theorem of the next section shows
that these are essentially the only homomorphisms of the given group G.
The reader is urged to complete the proof in detail, but let us point out
what has to be proved. To prove that $\Psi$ is well defined, one must show that
if $S_{y_1} = S_{y_2}$, then $\psi(y_1) = \psi(y_2)$. To show that $\Psi$ is one-to-one, one must show
that if $\psi(y_1) = \psi(y_2)$, then $S_{y_1} = S_{y_2}$. To show that $\Psi$ has the homomorphism
property, one must show that
$$\Psi(S_{y_1}S_{y_2}) = \Psi(S_{y_1})\Psi(S_{y_2}).$$
Lastly, it is obvious that $\Psi$ is onto, for any element of $\psi(G)$ is $\psi(y)$ for some
y in G.
(that is, the order of the subgroup generated by that element) must be either
1 or p, by Lagrange's theorem. (The only element of order 1 in a group is the
identity e, for to say that $a^l = e$, when l = 1, is equivalent to saying that
a = e.) If $G = \{e, a, a^2, \dots, a^{n-1}\}$ is a cyclic group of order n, and m is a
divisor of n, then the elements $e, a^m, a^{2m}, \dots$ constitute a cyclic subgroup of
order n/m, and these are the only subgroups of G. In an infinite cyclic group,
any element that is $\ne e$ generates an infinite cyclic subgroup; these are the
only nontrivial subgroups, and they are all distinct, but isomorphic.
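The statement about subgroups of a finite cyclic group can be illustrated in additive notation. A minimal sketch, assuming the cyclic group of order n is modeled by the integers 0, ..., n − 1 under addition modulo n, so that the power $a^k$ corresponds to the residue k.

```python
def generated_subgroup(g, n):
    """Cyclic subgroup of Z_n generated by g (multiples of g modulo n)."""
    return frozenset((k * g) % n for k in range(n))


def divisors(n):
    return [m for m in range(1, n + 1) if n % m == 0]


n = 12
# Every subgroup of a cyclic group is cyclic, so this set contains them all.
subgroups = {generated_subgroup(g, n) for g in range(n)}
orders = sorted(len(s) for s in subgroups)

# For each divisor m of n, the element corresponding to a^m has order n/m.
order_by_divisor = {m: len(generated_subgroup(m, n)) for m in divisors(n)}

print(orders)
print(order_by_divisor)
```

For n = 12 there is exactly one subgroup for each divisor, in agreement with the text.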
EXERCISE
1. Show that the set $\mathcal{T}$ of all left translations in G is itself a group under the usual
law for the composition of mappings, and that this group $\mathcal{T}$ is isomorphic to G.
N.B.1. Any homomorphism of a group G (abstract or otherwise) onto a
group of mappings is called a representation of G. The isomorphism $G \to \mathcal{T}$
is called the regular representation of G. If a representation is an isomorphism
(not merely a homomorphism), it is called faithful. The regular representa-
tion is faithful.
N.B.2. If G is a finite group, the mapping $T_a$ is a permutation of the elements
of G; therefore, $\mathcal{T}$ is some subgroup of $\mathcal{S}_n$, where n is the order of G, i.e.,
any finite group is isomorphic to some group of permutations. (This is Cayley's
theorem.)
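Cayley's theorem can be seen at work on a small table. A minimal sketch, assuming the left translation by a is the mapping x → ax and using the multiplication table of Klein's 4-group; each translation is stored as a permutation of the element indices.

```python
ELEMENTS = "eabc"
ROWS = ["eabc", "aecb", "bcea", "cbae"]  # row x, column y holds the product xy


def mul(x, y):
    return ROWS[ELEMENTS.index(x)][ELEMENTS.index(y)]


def left_translation(a):
    """The mapping x -> ax, written as a permutation of the indices 0..3."""
    return tuple(ELEMENTS.index(mul(a, x)) for x in ELEMENTS)


def compose(p, q):
    """The permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))


translations = {a: left_translation(a) for a in ELEMENTS}

# Homomorphism property: translating by b, then by a, is translating by ab.
is_homomorphism = all(
    compose(translations[a], translations[b]) == translations[mul(a, b)]
    for a in ELEMENTS
    for b in ELEMENTS
)
# Faithfulness: distinct elements give distinct permutations.
is_faithful = len(set(translations.values())) == len(ELEMENTS)

print(translations)
print(is_homomorphism, is_faithful)
```

The four permutations obtained form a subgroup of the permutations of four objects that is isomorphic to the original group.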
EXERCISE
2(a). Show that the set $\mathcal{I}$ of all the mappings $A_a$ in G is itself a group, under the
usual law of composition of mappings.
(b). If $G = \mathcal{A}_3$, find $\mathcal{I}$.
(c). If $G = \mathcal{A}_4$, find $\mathcal{I}$.
N.B.3. $\mathcal{I}$ is in any case some subgroup of $\mathcal{S}_{6/2} = \mathcal{S}_3$ for exercise 2(b)
and some subgroup of $\mathcal{S}_{12}$ for exercise 2(c).
N.B.4. "Find $\mathcal{I}$" means to identify $\mathcal{I}$ as being isomorphic to some known
group.
N.B.5. Each of the mappings $A_a$ is an automorphism called an inner auto-
morphism of G: (1) It is one-to-one and onto, because the equation $axa^{-1} = y$
can always be solved for a unique x ($x = a^{-1}ya$); (2) it has the homomorphism
Inner automorphisms of $\mathcal{S}_n$ have a special property. Let $\pi \in \mathcal{S}_n$, and let
$\pi$ be written as a product of independent cycles, as in (18.4-2) [two cycles are
independent if they contain no common symbol; (173) and (24) are indepen-
dent, but (173) and (34) are not], the longer cycles being written first and the
cycles of length 1 being included. Then the lengths of the cycles constitute a
partition of n, that is, a set of positive integers whose sum is n. The special
property referred to is that the image of $\pi$ under an inner automorphism of
$\mathcal{S}_n$ ($\pi \to \sigma\pi\sigma^{-1}$, where $\sigma$ is a given element of $\mathcal{S}_n$) always corresponds to the
same partition of n as $\pi$ itself, because if (a, b, ..., f) is any cycle, then
$\sigma(a, b, \dots, f)\sigma^{-1}$ is the cycle $(\sigma(a), \sigma(b), \dots, \sigma(f))$.
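Both claims, that conjugation preserves the partition and that it relabels a cycle symbol by symbol, can be checked exhaustively for small n. A minimal sketch, assuming the symbols are 0, ..., n − 1 and permutations are tuples `p` with `p[i]` the image of `i`; the function names are illustrative.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugate(sigma, pi):
    """The permutation sigma pi sigma^{-1}."""
    return compose(compose(sigma, pi), inverse(sigma))


def cycle_type(p):
    """Cycle lengths (including 1-cycles), longest first: a partition of n."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = p[x]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def from_cycle(cycle, n):
    """The permutation of 0..n-1 that is the single cycle (c0 c1 ... ck)."""
    p = list(range(n))
    for i, a in enumerate(cycle):
        p[a] = cycle[(i + 1) % len(cycle)]
    return tuple(p)


S4 = list(permutations(range(4)))
type_preserved = all(
    cycle_type(conjugate(s, p)) == cycle_type(p) for s in S4 for p in S4
)

# sigma (a b c) sigma^{-1} is the cycle (sigma(a) sigma(b) sigma(c)).
cycle = [0, 2, 3]
relabelled = all(
    conjugate(s, from_cycle(cycle, 4)) == from_cycle([s[a] for a in cycle], 4)
    for s in S4
)
print(type_preserved, relabelled, len({cycle_type(p) for p in S4}))
```

The last number printed is the number of distinct cycle types for four symbols, which equals the number of partitions of 4.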
If a and x are in a group G, then the element $axa^{-1}$ is called conjugate
to x. The inner automorphism $A_a: x \to axa^{-1}$ (a fixed) maps each group
element x onto one of its conjugates. If $G_0$ is a subgroup of G, then the set
$$G'_0 = \{axa^{-1} : x \in G_0\}$$
is also a subgroup, and is often denoted by $aG_0a^{-1}$; it is said to be conjugate to
$G_0$. If $aG_0a^{-1} = G_0$, for all a in G, then $G_0 \triangleleft G$. Hence, normal subgroups are
sometimes called "self-conjugate" subgroups. The conjugate elements of
$\mathcal{S}_n$ are those that have the same structure when written as products of
independent cycles, e.g., (1732)(56)(4) and (4531)(76)(2).
The symmetric group $\mathcal{S}_4$ has 28 subgroups in addition to the trivial sub-
group {e} and G itself. Arranged in classes of conjugate subgroups, they are:
Any subgroup in a given line of the table can be obtained from any other
subgroup in the same line by an inner automorphism of the whole group $\mathcal{S}_4$.
For instance, under the mapping $\pi \to (12)\pi(12)$, the elements of the group
{e, (134), (431)} go into the elements of the group {e, (234), (432)}. Therefore,
the only normal (i.e., invariant, i.e., self-conjugate) subgroups are $V_4$ and $\mathcal{A}_4$,
each of which occupies a line of the table by itself. However, each of the
subgroups in line iv is a normal subgroup of $V_4$ (which shows, incidentally,
that $G_1 \triangleleft G_2 \triangleleft G_3$ does not imply $G_1 \triangleleft G_3$). A complete so-called com-
position series (see below) of $\mathcal{S}_4$ is the series
(18.11-1)
For $n \ge 5$, $\mathcal{A}_n$ is a simple group (it has no nontrivial proper normal sub-
groups), so the composition series is merely
$$\{e\} \triangleleft \mathcal{A}_n \triangleleft \mathcal{S}_n.$$
EXERCISE
1. Show that $\mathcal{A}_5$ is a simple group. Outline of the proof: Assume that $G_0 \triangleleft \mathcal{A}_5$,
but $G_0 \ne \{e\}$; then, it must be proved that $G_0 = \mathcal{A}_5$. Show that $G_0$ must contain an
element $\pi$ of one of the types
(a) (a b c),
(b) (a b)(c d),
(c) (a b c d e).
Then it contains all elements $\sigma\pi\sigma^{-1}$, where $\sigma \in \mathcal{A}_5$; show that it contains all elements of
the type $\pi$. Show that if $G_0$ contains all elements of one of the above types, then it contains
elements (hence all elements) of both the other types, e.g., if $G_0$ contains type (a), then it
contains (1 2 3)(2 3 4) = (2 1)(3 4). Why does this proof fail for $\mathcal{A}_4$? The simplicity of
$\mathcal{A}_5$ is a key step in Galois' proof that the quintic equation cannot be solved by radicals.
EXERCISES
2. How does the Jordan-Hölder theorem apply to the composition series (18.11-1)
for $\mathcal{S}_4$?
Generators and Relations; Free Groups
It is an elementary exercise to verify that the set of all words, using a given
set S of generators, is a group G under this definition of product. It is called
the free group generated by S. The identity element is the empty word e,
and the inverse of $x_1x_2 \cdots x_k$ is the word $y_ky_{k-1} \cdots y_1$, where $y_i = \alpha^{-1}$ or $\alpha$
according as $x_i$ is $\alpha$ or $\alpha^{-1}$.
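The word arithmetic of a free group is short to write down. The sketch below assumes that the product of two words is their juxtaposition followed by cancellation of adjacent pairs $\alpha\alpha^{-1}$ and $\alpha^{-1}\alpha$ (the definition of the product is not reproduced in this passage); a word is a list of pairs (generator, exponent) with exponent +1 or −1.

```python
def reduce_word(word):
    """Cancel adjacent pairs such as (a, +1)(a, -1) until none remain."""
    stack = []
    for symbol, exponent in word:
        if stack and stack[-1] == (symbol, -exponent):
            stack.pop()  # the new letter cancels the previous one
        else:
            stack.append((symbol, exponent))
    return stack


def multiply(u, v):
    """Product of two words: juxtapose, then reduce."""
    return reduce_word(list(u) + list(v))


def inverse(word):
    """Reverse the order of the letters and invert each one."""
    return [(symbol, -exponent) for symbol, exponent in reversed(word)]


EMPTY = []  # the identity element (empty word)

w = [("a", 1), ("b", 1), ("a", -1), ("b", 1)]
u = [("b", -1), ("a", 1)]

print(multiply(w, inverse(w)))  # the empty word
print(multiply(w, u))
```

A single left-to-right pass with a stack suffices because every cancellation only exposes the letter immediately before the cancelled pair.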
Relations can be established among the elements of the group (it is then no
longer free) by means of equations $w_1 = e$, $w_2 = e$, etc., where $w_1$, $w_2$, etc.
are certain words. The relations establish a structure in the group.
If among the relations we have $aba^{-1}b^{-1} = e$ (which is equivalent to
ab = ba) for every pair a, b in S, then the group is Abelian. If these are the
only relations, then G is called a free Abelian group. Free groups and free
Abelian groups appear in Section 23.7, in the study of the kinds of multiple
connectedness that a manifold can possess. The structure of a free group or
of a free Abelian group is determined solely by the number of generators.
Any finite group G is equivalent (i.e., isomorphic) to a group defined by
generators and relations: S can be taken as the set of all group elements in G,
and the relations taken so as to give all the information provided by the
period of f(x) is of the form (18.13-3) with integer coefficients; then, the
vectors $v^{(1)}, \dots, v^{(n)}$ are said to constitute a fundamental set of periods.
Certain functions may not have a fundamental set, even though they are
periodic in the strict sense, for example, constant functions and functions that
are periodic in some coordinates and independent of the others. Such func-
tions will be called degenerate, and are excluded on the physical ground that
each atom occupies a certain volume, and that functions like the potential
and the charge density vary from the center of an atom to the outside, so that
some variation is unavoidable in any given direction in space. A multiply
periodic function is called nondegenerate if it has a fundamental set of periods.
The set of points x in $\mathbb{R}^n$ determined by
$$x = m_1 v^{(1)} + \cdots + m_n v^{(n)}, \tag{18.13-4}$$
where the $m_i$ are integers, is called the lattice of f(x).
If new vectors $v'^{(1)}, \dots, v'^{(n)}$ are given by
$$v'^{(j)} = m_{j1} v^{(1)} + \cdots + m_{jn} v^{(n)}, \tag{18.13-5}$$
where the $m_{jk}$ are integers such that
$$\det \begin{pmatrix} m_{11} & \cdots & m_{1n} \\ \vdots & & \vdots \\ m_{n1} & \cdots & m_{nn} \end{pmatrix} = \pm 1, \tag{18.13-6}$$
then $v'^{(1)}, \dots, v'^{(n)}$ also constitute a fundamental set of periods, because
when (18.13-5) is solved by the use of determinants, the $v^{(j)}$ are seen to be
linear combinations of the $v'^{(j)}$ with integer coefficients. Both fundamental
sets generate the same lattice.
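The role of the determinant condition can be seen in a two-dimensional case. A minimal sketch, assuming integer period vectors chosen only for illustration: when the determinant of the integer coefficient matrix is ±1, its inverse again has integer entries, so the old periods are integer combinations of the new ones.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse_unimodular(m):
    """Inverse of a 2x2 integer matrix with determinant +1 or -1 (adjugate / det)."""
    d = det2(m)
    if d not in (1, -1):
        raise ValueError("determinant must be +1 or -1")
    # Since d is +1 or -1, dividing by d is the same as multiplying by d.
    return [[d * m[1][1], -d * m[0][1]], [-d * m[1][0], d * m[0][0]]]


def combine(coeffs, basis):
    """Row j of the result is sum_k coeffs[j][k] * basis[k]."""
    return [
        tuple(sum(row[k] * basis[k][i] for k in range(2)) for i in range(2))
        for row in coeffs
    ]


old_basis = [(3, 0), (1, 2)]  # hypothetical fundamental periods
m = [[2, 1], [1, 1]]          # integer coefficients, determinant = 1

new_basis = combine(m, old_basis)
recovered = combine(inverse_unimodular(m), new_basis)

print(new_basis)
print(inverse_unimodular(m), recovered)
```

With a determinant other than ±1, the inverse would contain fractions and the new vectors would generate only a sublattice.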
would hold, for all n and k; for y irrational, the numbers ny − k are dense
in $\mathbb{R}$, so by continuity f(x, y) would have to be independent of x for all
irrational y, hence (by continuity again) for all y; hence f(x, y) would be
degenerate. Therefore shear must be excluded. By arguments of this sort, it is
concluded that the matrix M in (18.14-3) must be an orthogonal matrix.
The set of all orthogonal matrices M such that $(\xi, M)$ is in $G_s$ for some $\xi$
is also a group; it is called the point group of f(x) and is denoted by $G_p$.
Clearly the mapping
(18.4-6)
is a homomorphism, whose kernel is $\mathcal{T}$; hence, by the homomorphism law
for groups, $\mathcal{T}$ is a normal subgroup of $G_s$, and $G_p$ is isomorphic to the factor
group $G_s/\mathcal{T}$.
EXERCISE
1. Using (18.14-4), find the formula for $(\xi, M)^{-1}$. Then verify directly that $\mathcal{T}$
is a normal subgroup of $G_s$ by showing that, if $(\xi, I)$ is any pure translation (I being the
unit matrix), then any group element of the form $(\eta, M)(\xi, I)(\eta, M)^{-1}$ is also a pure
translation.
EXERCISES
2. (The purpose of this exercise is to show that the only possible pure rotational
symmetries of a 2-dimensional crystal are ones with an n-fold axis, where n = 1, 2, 3, 4,
or 6.) Consider a nondegenerate doubly periodic function f(x, y), and write it as f(z),
a (nonanalytic) real function of the complex variable z = x + iy. Let $\alpha$ and $\beta$ be a
fundamental pair of periods; then $\mathrm{Im}(\alpha/\beta) \ne 0$, and $f(z + n\alpha + m\beta) \equiv f(z)$, when n and
m are integers. By suitable scaling and suitable orientation of the x and y axes, take
$\beta = 1$ for simplicity. Assume that f(z) is also invariant under a rotation $z \to e^{i\theta}z$. From
the equations
$$f(z) = f(ze^{i\theta}),$$
$$f(z + 1) = f(ze^{i\theta} + e^{i\theta}),$$
$$f(z + \alpha) = f(ze^{i\theta} + \alpha e^{i\theta}),$$
conclude that $e^{i\theta}$ and $\alpha e^{i\theta}$ are also periods of f(z). Show from that that $\alpha$ satisfies the
equation
$$r\alpha^2 + (s - p)\alpha - q = 0,$$
For $\theta$ real, l must be in [−2, 2]. Conclude that the possible values of $\theta$ are 0, $\pi/3$,
$\pi/2$, $2\pi/3$, $\pi$.
3. Extend the conclusion of Exercise 2 to the 3-dimensional case, as follows:
Assume that the function f(x, y, z) = f(x) is triply periodic and that {u, v, w} is a
fundamental set of periods. Suppose, furthermore, that f(x) is invariant under a rotation
through an angle $\theta$ about some axis in space. By choosing the origin to lie on the axis,
the rotation can be written as x → Rx, where R is a 3 × 3 orthogonal matrix of deter-
minant = 1. Show that the vectors
$$u' = Ru - u,$$
$$v' = Rv - v,$$
$$w' = Rw - w,$$
are all periods of f(x), are all perpendicular to the axis of rotation, and are not collinear.
It follows that f(x) is doubly periodic in any plane perpendicular to the axis of rotation,
hence Exercise 2 applies.
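The restriction to n = 1, 2, 3, 4, 6 can be previewed numerically. The sketch below rests on one assumption that should be stated plainly: a rotation that maps a 2-dimensional lattice onto itself is represented, in a basis of fundamental periods, by an integer matrix, so its trace 2 cos θ must be an integer (necessarily between −2 and 2). The code simply scans the angles 2π/n.

```python
import math


def allowed_orders(max_order, tol=1e-9):
    """Orders n for which 2*cos(2*pi/n) is an integer, up to max_order."""
    allowed = []
    for n in range(1, max_order + 1):
        trace = 2.0 * math.cos(2.0 * math.pi / n)
        if abs(trace - round(trace)) < tol:
            allowed.append(n)
    return allowed


print(allowed_orders(24))
```

A finite scan is only an illustration; the exercises supply the actual argument that no other orders can occur.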
In some books, the restriction to 2-, 3-, 4-, and 6-fold rotation axes is
derived from a somewhat mysterious "principle of rational indices," which
is said to be of empirical origin. We have seen, however, that the restriction
follows directly from the existence of a triply periodic structure; hence, the
"principle of rational indices" is unnecessary.
The identity element of G is the pair (e, e'), where e and e' are the identity
elements of $H_0$ and of $K_0$, respectively; furthermore, $(h, k)^{-1} = (h^{-1}, k^{-1})$.
Let H and K be the subsets of G defined as
$$H = \{(h, e') : h \in H_0\} \quad \text{and} \quad K = \{(e, k) : k \in K_0\}.$$
It is easy to verify that G, H, and K are groups, that H and K are normal
subgroups of G, that $H \cong H_0$ and $K \cong K_0$, and lastly that $G = H \times K$.
EXERCISE
EXERCISE
2. Assume that the identity e is the only element common to two subgroups H
and K of a group G. Show that h in H commutes with every k in K if and only if $H \triangleleft G$
and $K \triangleleft G$.
In the next simplest case, the semidirect product, it is still assumed that
every g in G can be uniquely expressed as hk, with $h \in H$ and $k \in K$, but it is
assumed only that $H \triangleleft G$, while K is not necessarily normal. Then G is a
so-called semidirect product of H and K. Any coset of H in G (i.e., any element
of the factor group G/H) has a unique representation kH = Hk, with k in
K; furthermore, $Hk_1Hk_2 = Hk_1k_2$, and hence the factor group G/H is iso-
morphic with K. If $g_1$ and $g_2$ in G are expressed uniquely as $g_1 = h_1k_1$ and
$g_2 = h_2k_2$, then the unique expression of $g_1g_2 = g_3$ is $h_3k_3$, where
$$h_3 = h_1k_1h_2k_1^{-1}, \qquad k_3 = k_1k_2.$$
(Note that $k_1h_2k_1^{-1}$ is in H, because H is normal, but is not necessarily $= h_2$
unless K is also normal.)
The group G of rigid motions in the plane (or in n-space) provides an
example of a semidirect product. A rigid motion is a transformation
$$x \to x' = Mx + \xi, \tag{18.15-2}$$
where M is a 2 × 2 (or n × n) real orthogonal matrix with determinant = 1,
and $\xi$ is an arbitrary vector. G is generated by the group $\mathcal{T}$ of translations
$$x \to x' = x + \xi$$
Definition. Let H and K be any two groups (the law of composition will
be written multiplicatively in both), and let there be given a homomorphism
$k \to \tau(k)$ of K into the automorphism group of H [for fixed k, $\tau(k)$ maps h
onto $\tau(k)h$, for all h in H]. Then, the set of all pairs (h, k), with h in H and k
in K, and with the law of composition of such pairs given by
$$(h_1, k_1)(h_2, k_2) = (h_1\,\tau(k_1)h_2,\ k_1k_2) \tag{18.15-4}$$
is a group, called the semidirect product of H and K (or of H by K) and is
denoted by $H \times_\tau K$.
EXERCISES
3. Show that the identity element of $G = H \times_\tau K$ is (e, e'), where e and e' are the
identity elements of H and K.
4. Show that the element $(\tau(k^{-1})h^{-1}, k^{-1})$ is the inverse (i.e., both right and left
inverse) of (h, k).
5. Show that the associative law holds in G. Warning: $\tau(k)[h_1h_2]$ is not the same as
$[\tau(k)h_1]h_2$, because $\tau(k)$ is a mapping, not a group element. Exercises 3, 4, and 5 show
that G is a group, as claimed in the definition of semidirect product.
6. Now identify H and K with the subgroups {(h, e'): all h in H} and {(e, k): all
k in K}, respectively, and show that H is a normal subgroup of G.
7. Construct the factor group G/H, and show that it is isomorphic to K.
8. Conversely, suppose that a group G contains subgroups H and K, of which H
is normal, such that $H \cap K = \{e\}$, and such that the factor group G/H is isomorphic to
K; show that G is the semidirect product $H \times_\tau K$, where, for any k in K, $\tau(k)$ is the map-
ping $h \to khk^{-1}$, for all h in H.
9. Show that the semidirect product is equal to the direct product H × K if and
only if $\tau(k) \equiv I$ [that is, the homomorphism $k \to \tau(k)$ of K into the automorphism group
of H maps all of K onto the identity element (identity mapping of H onto itself)].
10. Show that K, as well as H, is a normal subgroup of G if and only if $\tau(k) \equiv I$,
i.e., if and only if the product is direct.
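Several of these exercises can be explored on a small finite case before attempting the proofs. The sketch below is built on stated assumptions: the law of composition is taken to be $(h_1, k_1)(h_2, k_2) = (h_1\,\tau(k_1)h_2,\ k_1k_2)$, which agrees with the formula for $h_3$, $k_3$ given earlier; H is the integers modulo N under addition, K is the integers modulo 2, and $\tau(1)$ is the automorphism h → −h. All names are illustrative.

```python
from itertools import product

N = 5  # order of H (integers mod N, written additively)


def tau(k, h):
    """Automorphism of H attached to k: identity for k = 0, negation for k = 1."""
    return h % N if k % 2 == 0 else (-h) % N


def mul(g1, g2):
    """Assumed semidirect composition law, in additive notation for H and K."""
    (h1, k1), (h2, k2) = g1, g2
    return ((h1 + tau(k1, h2)) % N, (k1 + k2) % 2)


def inv(g):
    """Inverse in the form (tau(k^-1) h^-1, k^-1) from Exercise 4."""
    h, k = g
    k_inv = (-k) % 2
    return (tau(k_inv, (-h) % N), k_inv)


G = list(product(range(N), range(2)))
IDENTITY = (0, 0)

identity_ok = all(mul(IDENTITY, g) == g == mul(g, IDENTITY) for g in G)
inverse_ok = all(mul(g, inv(g)) == IDENTITY == mul(inv(g), g) for g in G)
associative = all(
    mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in product(G, repeat=3)
)

H = [(h, 0) for h in range(N)]
K = [(0, k) for k in range(2)]
H_normal = all(mul(mul(g, x), inv(g)) in H for g in G for x in H)
K_normal = all(mul(mul(g, x), inv(g)) in K for g in G for x in K)

print(identity_ok, inverse_ok, associative, H_normal, K_normal)
```

With N = 5 the result is a non-Abelian group of order 10 in which H is normal and K is not, so the product is semidirect but not direct.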
Exercise 8 shows that the automorphisms $\tau(k)$ that appear in the semi-
direct product of two given groups H and K become inner automorphisms
of the group $H \times_\tau K$ that is being defined. This fact is somewhat obscured,
in the case of the rigid motion group G, by the use of the additive notation
for the translation group $\mathcal{T}$. If, instead, the translation $x \to x' = x + \xi$ is
denoted by $T_\xi$ and the multiplicative notation is used, so that $(\xi, M)$ is
simply the combined operation $T_\xi M$, then $T_{M\xi} = MT_\xi M^{-1}$, so that the
automorphism $\tau(M): \xi \to M\xi$ in $\mathcal{T}$ takes the form
$$\tau(M):\ T_\xi \to MT_\xi M^{-1},$$
hence is an inner automorphism in G.
According to the preceding section, the translation group $\mathcal{T}$ of a crystal
structure, given by (18.14-2), is a normal subgroup of the space group $G_s$,
and the point group $G_p$ is isomorphic to the factor group $G_s/\mathcal{T}$ [it is recalled
that the point group is the group of all rotations and reflections $x \to Mx$
such that $(\xi, M)$ is in $G_s$, for some $\xi$]. $G_s$ may or may not contain a subgroup,
say $G'_p$, isomorphic to $G_p$; if it does, then $\mathcal{T}$ and $G'_p$ can have only the element
e in common, because all the other elements of $\mathcal{T}$ are of infinite order (if
$T \in \mathcal{T}$, then $T^m \ne I$ for all $m \ne 0$), while all elements of $G_p$ are of finite
order. Therefore, according to Exercises 5 and 6, above, $G_s$ contains such a
subgroup if and only if it is a semidirect product $\mathcal{T} \times_\tau G_p$. In this case, the
space group is called symmorphic by the crystallographers.
When an x-ray crystallographer starts to analyze a set of x-ray reflection
data to determine a crystal structure, he often knows the point group in
advance, from measurement of the angles between crystal faces and cleavage
planes, and from other macroscopic properties of the crystals. However,
he cannot assume that the space group contains a copy of the point group,
i.e., that the space group is symmorphic.
A simple 2-dimensional example of a nonsymmorphic space group is
that of the function f(x, y), which is equal to 1 in the shaded triangles in
Figure 18.3, and 0 otherwise. In complex notation, the space group is
generated by the translations $z \to z + \alpha$, $z \to z + i\beta$ and the so-called
glide-reflection $z \to \bar{z} + \tfrac{1}{2}\alpha$; hence the point group contains the reflection
$z \to \bar{z}$, while the space group does not.
Figure 18.3
Continuous Groups
General linear, special linear, orthogonal, and unitary groups; rotation and
Lorentz groups; Euler's theorem, the four components of the full Lorentz
group, the Thomas precession; group manifolds; intrinsic coordinates;
double connectivity of the rotation group; homomorphism of SU(2) onto
SO(3) and of SL(2) onto the proper Lorentz group; simplicity of the rotation and Lorentz groups.
Under the transformation (19.1-1), the length of any vector and the
angle between any two vectors are preserved, so that if x' = Rx and w' = Rw,
then $x' \cdot w' = x \cdot w$ for any two vectors x and w.
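and
$$R = \begin{pmatrix} R_{11} & \cdots & R_{1n} \\ \vdots & & \vdots \\ R_{n1} & \cdots & R_{nn} \end{pmatrix},$$
and we look for matrices R such that $(Rx) \cdot (Ry) = x \cdot y$ for all vectors
x and y. If $\xi^{(j)}$ denotes the vector whose jth component is = 1 and whose
other components are = 0, then, in particular, R must be such that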
is the jth column of R, it follows that the columns of R are pairwise orthogonal
unit vectors. Any such matrix is called orthogonal. Conversely, if R has that
property, then $Rx \cdot Ry = x \cdot y$ for all x, y. If $R^T$ denotes the transpose of R,
then the rows of $R^T$ are the columns of R, so that the law of matrix multi-
plication gives
$$R^TR = \begin{pmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{pmatrix} = I, \tag{19.1-3}$$
which is another characterization of an orthogonal matrix. Since $R^T = R^{-1}$
is the matrix of the inverse transformation, which also preserves dot products,
it follows that $R^T$ is also an orthogonal matrix; hence, the columns of $R^T$,
that is, the rows of R, are another set of n pairwise orthogonal unit vectors.
Since $\det R^T = \det R$, equation (19.1-3) shows that $\det R = \pm 1$. We now
define
$$O(n) = \{R : R = n \times n \text{ real orthogonal matrix}\}$$
($\circ$ = matrix multiplication) as the orthogonal group in n dimensions. Then,
the subgroup
$$SO(n) = \{R \in O(n) : \det R = 1\}$$
(i = 1, 2, 3). (19.2-1)
Since R is also a unitary matrix, we have $\|Rv_i\| = \|v_i\|$, where, for any
(generally complex) vector v, $\|v\|$ denotes $(|v_x|^2 + |v_y|^2 + |v_z|^2)^{1/2}$; therefore,
$$\det(\lambda I - R) = 0, \tag{19.2-3}$$
(19.2-4)
At least one of the roots is real; if the other two (say $\lambda_2$ and $\lambda_3$) are complex,
then $\lambda_3 = \bar{\lambda}_2$ and, by (19.2-2), $\lambda_2\lambda_3 = 1$; hence $\lambda_1 = 1$. If all three roots are
real, they can be 1, 1, 1 or 1, −1, −1. In any case there is always one root,
say $\lambda_1$, equal to +1; hence
which shows that the straight line through the origin in the direction of $v_1$
($v_1$ can be taken as real) is invariant under the transformation $x \to Rx$;
evidently this line is the axis of rotation.
$$u_2 = \frac{1}{\sqrt{2}}(v_2 + v_3), \qquad u_3 = \frac{i}{\sqrt{2}}(v_2 - v_3); \tag{19.2-5}$$
these also form an orthonormal set (they can all be taken as real, for $v_2$
and $v_3$ can be taken as complex conjugates), and
$$Ru_1 = u_1, \qquad Ru_2 = \cos\theta\, u_2 + \sin\theta\, u_3, \qquad Ru_3 = -\sin\theta\, u_2 + \cos\theta\, u_3. \tag{19.2-6}$$
It is seen that the transformation R is a rotation in planes perpendicular
to $u_1$.
The practical calculation of the angle and axis of rotation, when the
matrix $R$ is given, proceeds as follows: Since the sum of the eigenvalues
of a matrix is equal to its trace, the angle $\theta$ is given by
$$1 + e^{i\theta} + e^{-i\theta} = R_{11} + R_{22} + R_{33},$$
or
$$\cos\theta = \tfrac{1}{2}(R_{11} + R_{22} + R_{33} - 1). \tag{19.2-7}$$
Next, the axis of rotation is in the direction of the eigenvector $v$ (called $v_1$
above) that corresponds to the eigenvalue $\lambda = 1$; hence $Rv = v$. But, since
$R$ is an orthogonal matrix, $R^T R = I$; hence $v = R^T v$. Therefore, $(R - R^T)v
= 0$, so that the components $v_1$, $v_2$, and $v_3$ of $v$ are in the ratio
$$v_1 : v_2 : v_3 = (R_{32} - R_{23}) : (R_{13} - R_{31}) : (R_{21} - R_{12}). \tag{19.2-8}$$
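The recipe just described (angle from the trace, axis from $R - R^T$) can be sketched in a few lines of Python. This is only an illustration: the function names `axis_angle` and `rotation_about_z` are hypothetical, the matrix is stored as a list of rows, and the test matrix assumes the usual right-handed convention for a rotation about the $z$ axis.

```python
import math

def axis_angle(R):
    """Return (unit axis, angle in radians) of a proper rotation matrix R (3x3 list of rows)."""
    # Angle from the trace, trace(R) = 1 + 2 cos(theta); clamp against round-off.
    cos_theta = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    # (R - R^T) v = 0: the kernel of this antisymmetric matrix is spanned by
    # the three differences below.
    v = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    norm = math.sqrt(sum(c * c for c in v))
    if norm < 1e-12:
        # R = R^T (angle 0 or pi): the ratio degenerates to 0:0:0.
        raise ValueError("R is symmetric; axis not determined by R - R^T")
    return tuple(c / norm for c in v), theta

def rotation_about_z(angle):
    """Rotation through `angle` about the z axis (assumed right-handed), used as a test case."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

axis, theta = axis_angle(rotation_about_z(0.3))
```

When the angle is 0 or $\pi$ the matrix $R$ is symmetric, the three differences all vanish, and the axis has to be found some other way, for instance directly from $Rv = v$.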
EXERCISE
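According to the theory of special relativity, if $x, y, z, t$ and $x', y', z', t'$ are
Cartesian coordinates in two inertial frames of reference whose axes are
parallel, but are such that the second frame is moving relative to the first
with speed $V$ in the $+x$ direction, and if the origins are coincident at time
$t = t' = 0$, then
$$x' = \frac{x - Vt}{\sqrt{1 - V^2/c^2}}, \qquad y' = y, \qquad z' = z, \qquad t' = \frac{t - Vx/c^2}{\sqrt{1 - V^2/c^2}}. \tag{19.4-1}$$
$$\sinh\varphi = \frac{V/c}{\sqrt{1 - V^2/c^2}} \tag{19.4-2}$$
[Matrix $P(\varphi)$ of (19.4-4); surviving last row: $\sinh\varphi \quad 0 \quad 0 \quad \cosh\varphi$]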
In the remainder of this section, the summation convention will be used,
according to which any term containing a repeated index, say $\nu$ as in (19.4-3),
is understood to be summed for $\nu = 1, 2, 3, 4$, so that, with this convention,
(19.4-3) is written simply as $x'^{\mu} = p^{\mu}_{\nu} x^{\nu}$. In relativity theory, Greek indices
usually go from 1 to 4, and Latin indices from 1 to 3.
The set $\{P(\varphi)\}$ of matrices (or transformations) of this kind, obtained by
letting $\varphi$ take on all real values, is a group that will be denoted by $\mathcal{L}_x$; it
is a subgroup of the Lorentz group. Note that
$$P(\varphi_1)P(\varphi_2) = P(\varphi_1 + \varphi_2); \tag{19.4-5}$$
from this equation, the law of composition of (collinear) velocities can be
obtained, that is, the formula for the velocity with which a third frame moves
relative to the first in terms of the relative velocities of the second with
respect to the first and of the third with respect to the second; the derivation
is left as an exercise.
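A numerical illustration of how the additivity of the parameter $\varphi$ translates into a rule for velocities is sketched below. It is not the derivation asked for in the exercise. The sign convention of the matrix in `boost` (coordinates ordered as $x^1, x^2, x^3, x^4 = ct$, off-diagonal entries $-\sinh\varphi$) is an assumption, and the names `boost`, `matmul`, and `compose_velocities` are hypothetical; the only fact used from the text is that $\tanh\varphi = V/c$.

```python
import math

def boost(phi):
    """4x4 matrix P(phi) acting on (x1, x2, x3, x4 = ct); sign convention assumed."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    return [[ch, 0.0, 0.0, -sh],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-sh, 0.0, 0.0, ch]]

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def compose_velocities(beta1, beta2):
    """Velocity (in units of c) of frame 3 relative to frame 1, using tanh(phi) = V/c."""
    return math.tanh(math.atanh(beta1) + math.atanh(beta2))

product = matmul(boost(0.4), boost(0.7))   # agrees with boost(1.1) up to round-off
combined = compose_velocities(0.5, 0.5)    # stays below 1, i.e. below c
```

Because the parameters simply add, the composed velocity is always the hyperbolic tangent of a finite number and therefore never reaches the speed of light.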
If the second frame is obtained by merely rotating the first one in space,
then the transformation is given by a matrix of the form
$$R = \left(\begin{array}{ccc|c} & & & 0 \\ & R' & & 0 \\ & & & 0 \\ \hline 0 & 0 & 0 & 1 \end{array}\right), \tag{19.4-6}$$
where $R'$ denotes a 3 x 3 proper rotation matrix, that is, an element of SO(3).
The set of all such matrices (or transformations) is the rotation subgroup of
the Lorentz group, and will be denoted by $\mathcal{R}$.
The group generated by the elements of $\mathcal{L}_x$ together with those of $\mathcal{R}$,
i.e., the group consisting of all finite products $Q_1 Q_2 \cdots Q_j$, where each $Q_i$
is either of the form (19.4-4) or of the form (19.4-6), is called the proper (or
restricted) Lorentz group and is denoted by $\mathcal{L}_p$. This group is connected, in
the following sense:
PROOF. First, any element $P(\varphi)$ of $\mathcal{L}_x$ can be connected to the identity by letting $\varphi$
vary continuously from zero to its final value; second, any element $R$ of $\mathcal{R}$ can be
connected to the identity by letting the angle of rotation vary continuously from zero
to its final value; therefore, if $Q_0 = Q_1 Q_2 \cdots Q_j$, where each $Q_i$ is in $\mathcal{L}_x$ or $\mathcal{R}$, then the
interval $[0, \lambda_0]$ can be taken as $[0, j]$, and $Q(\lambda)$ can be chosen so that $Q(0) = I$,
$Q(1) = Q_1$, $Q(2) = Q_1 Q_2$, and so on, and finally $Q(j) = Q_1 Q_2 \cdots Q_j$, with continuous
variations in between.
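where
$$(g_{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = G, \tag{19.4-7}$$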
and where the summation convention applies to both $\mu$ and $\nu$. If the 4-vector
$x^{\mu}$ in $g_{\mu\nu}x^{\mu}x^{\nu}$ is replaced first by $x^{\mu} + y^{\mu}$, then by $x^{\mu} - y^{\mu}$, and the results
(19.4-8)
for $\nu = 1, 2, 3$,
(19.4-9)
for $\nu = 4$,
and
for $\nu \neq \lambda$. (19.4-10)
From this it follows that the inverse of Q is its transpose with signs changed
according to the pattern
therefore, $q^4_4$ is either $\geq 1$ or $\leq -1$.
Theorem. The proper Lorentz group $\mathcal{L}_p$, defined above as the group
generated by $\mathcal{L}_x$ and $\mathcal{R}$, consists of all those transformations $Q$ of $\mathcal{L}_f$ for
which $\det Q = +1$ and $q^4_4 \geq +1$.
PROOF. It is shown first that the connectedness of the group $\mathcal{L}_p$ implies that
$\det Q = +1$ and that $q^4_4 \geq 1$ for any $Q$ in $\mathcal{L}_p$: Let $Q$ be connected to the identity $I$,
as in the lemma; since $Q(0) = I$, we have $\det Q(0) = 1$ and $q(0)^4_4 = 1$; $\det Q(\lambda)$ and
$q(\lambda)^4_4$ are continuous, and hence cannot jump to negative values as $\lambda$ varies. (That
$\det Q$ is equal to 1 can also be seen directly from the decomposition $Q = Q_1 Q_2 \cdots Q_j$,
where each $Q_i$ is either in $\mathcal{L}_x$ or in $\mathcal{R}$.) Conversely, let $Q$ be any transformation in $\mathcal{L}_f$
such that $\det Q = 1$ and $q^4_4 \geq 1$. It will be shown that $Q$ can be expressed as $R_1 P R_2$,
where $R_1$ and $R_2$ are in $\mathcal{R}$ and $P$ is in $\mathcal{L}_x$; hence, $Q$ is in $\mathcal{L}_p$. (This shows, furthermore,
that three factors always suffice in the decomposition $Q_1 Q_2 \cdots Q_j$.) First, let $R_3$ and
$R_4$ be rotations that take the 3-vectors $(q^1_4, q^2_4, q^3_4)$ and $(q^4_1, q^4_2, q^4_3)$, respectively, into
the direction of the positive $x^1$ axis. Then,
$$R_3 Q R_4 = \left(\begin{array}{ccc|c} & & & q'^1_4 \\ & X' & & 0 \\ & & & 0 \\ \hline q'^4_1 & 0 & 0 & q'^4_4 \end{array}\right) = Q', \tag{19.4-11}$$
where $X'$ is some 3 x 3 matrix and where $q'^4_1$ and $q'^1_4$ are $\geq 0$, and $q'^4_4 = q^4_4 \geq 1$. Since
$(q'^4_1)^2 - (q'^4_4)^2$ and $(q'^1_4)^2 - (q'^4_4)^2$ both $= -1$, a parameter $\varphi$ can be chosen so that
the last row and the last column of $Q'$ coincide with those of $P(\varphi)$.
Therefore, if $Q'$ is multiplied by $P(-\varphi)$, the last row and the last column of the product
are the same as those of $P(-\varphi)P(\varphi) = I$; hence
$$P(-\varphi)Q' = \left(\begin{array}{ccc|c} & & & 0 \\ & X'' & & 0 \\ & & & 0 \\ \hline 0 & 0 & 0 & 1 \end{array}\right) = Q''.$$
This matrix, like $Q$ and $Q'$, is in $\mathcal{L}_f$, and hence leaves the fundamental quadratic form
invariant. Therefore, $X''$ leaves $(x^1)^2 + (x^2)^2 + (x^3)^2$ invariant, hence is in $O(3)$,
but $\det X'' = +1$, hence $X''$ is in $SO(3)$; i.e., $Q''$ is an element of $\mathcal{R}$, say $R_5$. That is,
$$P(-\varphi)R_3 Q R_4 = R_5,$$
and it follows that $Q$ is of the required form
$$Q = R_3^{-1}\,P(\varphi)\,R_5 R_4^{-1}, \qquad \text{with } R_1 = R_3^{-1} \text{ and } R_2 = R_5 R_4^{-1}.$$
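$$T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad S = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$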
is a subgroup of $\mathcal{L}_f$. Lastly, $\mathcal{L}_f$ itself is generated by the elements of $\mathcal{L}_p$
together with $T$ and $S$.
EXERCISES
the possibility of spatial reflections appears when r is odd, and of time reversal
when I is odd.
If each element
EXERCISE
Find an algebraic equation $F(x, y, z) = 0$ which determines the surface of the torus
in $V^3$ just as the equation $x^2 + y^2 + z^2 - 1 = 0$ determines the surface of a sphere.
Find a group of which the torus is the group manifold.
Intrinsic Coordinates in the Manifold of the Rotation Group 35
a fixed axis and then performing a rotation about that axis (Section 19.2).
If $R$ is the matrix of a rotation through the angle $\theta \geq 0$ (seen as clockwise
when looking along the positive direction of $\mathbf{k}$) about an axis having the
direction of the unit vector $\mathbf{k}$, then the numbers $\theta k_x, \theta k_y, \theta k_z$ ($= \theta_x, \theta_y, \theta_z$)
may be taken as intrinsic coordinates in SO(3), and we write $R = R(\boldsymbol\theta)$,
where $\boldsymbol\theta$ is the vector $\mathbf{k}\theta$. To make the coordinate system unique, $\boldsymbol\theta$ must be
restricted to the spherical ball $K = \{\boldsymbol\theta : \|\boldsymbol\theta\| \leq \pi\}$ in the coordinate space $\mathbb{R}^3$,
in which $\theta_x$, $\theta_y$, and $\theta_z$ are taken as Cartesian, and we must observe that
opposite ends of a diameter in $K$ correspond to the same element of SO(3);
otherwise, each point of $K$ corresponds to a unique element of SO(3), and
conversely.
The matrix R = R(9) is given explicitly in terms of the intrinsic coordinates
by the equation
To see this, note first that since $R$ is a nonsingular normal matrix, its logarithm
$A$ is well defined (though multivalued), so that
$$A = \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix}.$$
To see that $a$, $b$, and $c$ have been correctly interpreted in (19.6-1), note that,
with this interpretation, (1) the eigenvalues of $A$ are $0$, $\pm i\theta$, where $\theta =
\sqrt{\theta_x^2 + \theta_y^2 + \theta_z^2}$, so that the eigenvalues of $\exp A$ are 1 and $e^{\pm i\theta}$, (2) the first
eigenvector of $A$ (hence also of $\exp A$) is proportional to $\boldsymbol\theta$, (3) since $A$ is a
normal matrix, its eigenvectors can be taken as an orthonormal set, and (4)
it now follows, just as in Section 19.2, that $R$ represents a rotation through
an angle $\theta$ about an axis in the direction of $\boldsymbol\theta$. It is left as an exercise to verify
that (19.6-1) defines $R(\boldsymbol\theta)$ rather than $R(-\boldsymbol\theta)$, if the coordinate system is
right-handed.
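A minimal sketch of the map from intrinsic coordinates to a rotation matrix follows. It assumes the standard right-handed convention, in which $R(\boldsymbol\theta)$ is the exponential of the antisymmetric matrix $K\theta$ with $K x = \mathbf{k} \times x$; whether this matches the sign convention of (19.6-1) is exactly the point of the exercise above, so the choice here is an assumption. The closed form used is Rodrigues' rotation formula, and the function names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_from_vector(theta_vec):
    """R(theta) = I + sin(t) K + (1 - cos(t)) K^2, with t = |theta| and K x = k cross x."""
    tx, ty, tz = theta_vec
    t = math.sqrt(tx * tx + ty * ty + tz * tz)
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if t < 1e-15:
        return identity  # the centre of the ball K is the identity rotation
    kx, ky, kz = tx / t, ty / t, tz / t
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = matmul(K, K)
    s, c = math.sin(t), math.cos(t)
    return [[identity[i][j] + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]

# Opposite ends of a diameter of K give the same rotation (up to round-off).
north = rotation_from_vector((0.0, 0.0, math.pi))
south = rotation_from_vector((0.0, 0.0, -math.pi))
```

The two matrices `north` and `south` agree to rounding error, which is the identification of antipodal boundary points of the ball $K$ described in the text.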
By means of these intrinsic coordinates $\theta_x, \theta_y, \theta_z$, the properties of the
surface $\mathcal{S}$ in $V^9$ determined by the algebraic equations (19.5-1,2,3) can be
found. Each point of the ball $K$ corresponds to a unique point of the surface
$\mathcal{S}$, except that opposite ends of a diameter in $K$ correspond to the same point
of $\mathcal{S}$, and the nine coordinates of a point in $\mathcal{S}$ are continuous functions of
the intrinsic coordinates in $K$, according to equation (19.6-1). Therefore, $\mathcal{S}$
is a connected surface, because any point of the ball $K$ can be connected to
any other point of $K$ by a curve (in fact by a straight line segment) lying in $K$.
However, $\mathcal{S}$ is not simply connected.
A connected surface is called simply connected if, given any two points
$A$ and $B$ and any two curves $C_1$ and $C_2$ going from $A$ to $B$ in the surface, $C_1$
can be deformed into $C_2$ by a continuous deformation without leaving the
surface. The plane and the sphere are simply connected, while the torus, the
surface of a cylinder, and the annulus $R_1^2 < x^2 + y^2 < R_2^2$ are not. Among
solids, the ball, the cube, and the spherical shell $R_1^2 < x^2 + y^2 + z^2 < R_2^2$
are simply connected, while the solid torus and the pretzel are not.
To show that the manifold $\mathcal{S}$ of SO(3) is not simply connected, let $A$ be
the point of $\mathcal{S}$ corresponding to the center of the ball $K$, and let $B$ be the point
of $\mathcal{S}$ that corresponds to the two ends of a diameter of $K$. Then the two radii
which make up this diameter correspond, in $\mathcal{S}$, to two curves or paths going
from $A$ to $B$, and it is evident that neither can be deformed into the other.
However, any other path from $A$ to $B$ can be continuously deformed, in $\mathcal{S}$,
into one of these two.
The Homomorphism of SU(2) onto SO(3) 37
$$A = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix}, \tag{19.7-1}$$
which is evidently Hermitian and of trace zero. (The trace of a matrix is the
sum of its eigenvalues and is equal to the sum of its diagonal elements.) Let
U be any 2 x 2 unitary matrix of determinant = 1 [i.e., U E SU(2)], and call
A' = UAU*. (19.7-2)
Since the eigenvalues of A' are the same as those of A, the trace of A' is
also zero; A' is also Hermitian (A'* = A'), so it can be written as
$$A' = \begin{pmatrix} z' & x' - iy' \\ x' + iy' & -z' \end{pmatrix}, \tag{19.7-3}$$
where $x'$, $y'$ and $z'$ are real. Furthermore, $\det A' = \det A$, because of (19.7-2),
so
$$x'^2 + y'^2 + z'^2 = x^2 + y^2 + z^2. \tag{19.7-4}$$
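The construction can be tried out numerically. The sketch below builds the 3 x 3 matrix that sends $(x, y, z)$ to $(x', y', z')$ by applying $A' = UAU^*$ to the three coordinate unit vectors; the function names are hypothetical, and the sample $U$ (a diagonal element of SU(2)) is chosen only for illustration.

```python
import cmath
import math

def matmul2(A, B):
    """Product of two 2x2 complex matrices given as lists of rows."""
    return [[sum((A[i][k] * B[k][j] for k in range(2)), 0j) for j in range(2)]
            for i in range(2)]

def dagger(U):
    """Hermitian conjugate U* of a 2x2 matrix."""
    return [[complex(U[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation_from_su2(U):
    """Real 3x3 matrix taking (x, y, z) to (x', y', z') under A' = U A U*."""
    Ud = dagger(U)
    columns = []
    for x, y, z in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
        A = [[z, x - 1j * y], [x + 1j * y, -z]]       # Hermitian, trace zero
        Ap = matmul2(matmul2(U, A), Ud)
        # Read off x' + i y' from the lower-left entry and z' from the upper-left.
        columns.append((Ap[1][0].real, Ap[1][0].imag, Ap[0][0].real))
    return [[columns[j][i] for j in range(3)] for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

alpha = 0.6
U = [[cmath.exp(-0.5j * alpha), 0j], [0j, cmath.exp(0.5j * alpha)]]
R_plus = rotation_from_su2(U)
R_minus = rotation_from_su2([[-u for u in row] for row in U])  # -U gives the same matrix
```

For this diagonal $U$ the resulting matrix is a rotation through the angle `alpha` about the $z$ axis, and replacing $U$ by $-U$ leaves it unchanged, since the two signs cancel in $UAU^*$.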
The relation between x, y, z and x', y', z' is obviously linear for given U;
hence, if we define a 3 x 3 real matrix R = R( U) by
EXERCISE
$$A = \begin{pmatrix} t + z & x - iy \\ x + iy & t - z \end{pmatrix}$$
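some angle $\theta$ about a fixed direction. Let $\boldsymbol\theta$ be a vector in that direction and having
length $\|\boldsymbol\theta\| = \theta$, and call $R = R(\boldsymbol\theta)$, as in Section 19.6. Then $R_0 = R(\boldsymbol\theta_0)$ for some
$\boldsymbol\theta_0 \neq 0$. Now let $\boldsymbol\theta_1$ be any other vector having the same length as $\boldsymbol\theta_0$, $\|\boldsymbol\theta_1\| = \|\boldsymbol\theta_0\|$,
and let $R_1$ be a rotation that carries $\boldsymbol\theta_1$ into $\boldsymbol\theta_0$. Then $R_1^{-1} R(\boldsymbol\theta_0) R_1$ is in the subgroup
$G_0$, because $G_0$ is normal; but $R_1^{-1} R(\boldsymbol\theta_0) R_1 = R(\boldsymbol\theta_1)$, and hence the subgroup $G_0$
contains every $R(\boldsymbol\theta)$ for $\|\boldsymbol\theta\| = \|\boldsymbol\theta_0\|$. Next, $G_0$ also contains every element $R(\boldsymbol\theta)R(\boldsymbol\theta_0)$,
for $\|\boldsymbol\theta\| = \|\boldsymbol\theta_0\|$; this is $R(\boldsymbol\theta')$ for some $\boldsymbol\theta' = \boldsymbol\theta'(\boldsymbol\theta, \boldsymbol\theta_0)$. Clearly $\|\boldsymbol\theta'\|$ is a continuous
function of the components of $\boldsymbol\theta$. [Recall the explicit formula for $R(\boldsymbol\theta)$ in terms of
$\theta_x, \theta_y, \theta_z$ given in Section 19.6.] For $\boldsymbol\theta = -\boldsymbol\theta_0$ and $+\boldsymbol\theta_0$, $\|\boldsymbol\theta'\| = 0$ and $2\|\boldsymbol\theta_0\|$,
respectively. Therefore, for any $\theta$ in $[0, 2\|\boldsymbol\theta_0\|]$, $G_0$ contains at least one element $R(\boldsymbol\theta)$
such that $\|\boldsymbol\theta\| = \theta$. By the first argument, again, $G_0$ therefore contains every $R(\boldsymbol\theta)$ such
that $0 \leq \|\boldsymbol\theta\| \leq 2\|\boldsymbol\theta_0\|$; but for any such $R(\boldsymbol\theta)$, $G_0$ also contains $R(\boldsymbol\theta)R(\boldsymbol\theta) = R(2\boldsymbol\theta)$,
$R(3\boldsymbol\theta)$, etc.; hence $G_0$ contains every $R(\boldsymbol\theta)$ for $0 \leq \|\boldsymbol\theta\| \leq \pi$, that is, $G_0 = SO(3)$.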
The proper Lorentz group is also simple, although the proof is more
complicated.
CHAPTER 20
Group Representations I:
Rotations and Spherical Harmonics
In this chapter and the next two, three apparently remote subjects are shown
to have important interconnections: group representations, the classical
special functions, and quantum mechanics.
Group representations are intimately connected with various special
functions of mathematical physics. In a sense, the primary role of those
functions is to exhibit symmetry relations. For instance, the Legendre
functions appear (via spherical harmonics) in problems with spherical
symmetry in such diverse fields as electrostatics, acoustics, heat flow,
neutron transport, and the quantum mechanics of hydrogen-like atoms.
Bessel functions (of integer or half-odd-integer order) appear mostly in
problems of wave motion, but a closer examination shows that they are
associated more with certain symmetries than with the mechanism of the
waves. In fact, wave motion in a nonconstant (even spherically symmetric)
potential generally involves other functions, for example Laguerre functions
in hydrogen-like atoms, while Bessel functions appear when the system is
invariant under the full rigid-motion groups, not merely the rotation
subgroups (see next chapter). It will be seen in Section 20.5 that the trigonometric
functions appear, in the form $e^{im\varphi}$, in the representations of the two-dimensional
rotation group, and hence are associated with symmetry about an axis.
Vector and Tensor Transformation Laws 41
Note. The usage here differs slightly from that of Section 18.10, where a
representation of G was a homomorphism of G onto any group of (not
necessarily linear) transformations.
and where $(g_{ik})$ is a rotation matrix [an element of SO(3)], then the
components $T_{ij}$ of a second rank tensor transform according to the law
$$T'_{ij} = \sum_{(k,l)} g_{ik}\, g_{jl}\, T_{kl}. \tag{20.2-2}$$
If the nine quantities $T_{ij}$ are called $X_1, \ldots, X_9$ and are regarded as the
coordinates of a point $X$ in $\mathbb{R}^9$, then each transformation $x \to x'$ induces a
transformation $X \to X'$, and those transformations give a representation of
SO(3) on $\mathbb{R}^9$.
Transformation of the components of a third rank tensor gives a
representation on $\mathbb{R}^{27}$, and so on.
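The 9-dimensional representation can be written down explicitly as a 9 x 9 matrix whose entry in row $(i, j)$ and column $(k, l)$ is $g_{ik} g_{jl}$. The sketch below builds that matrix and checks numerically, for one pair of rotations, that products are mapped onto products. It is an illustration only, not a proof; the ordering of the index pairs and the function names are assumptions made for the example.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def tensor_rep(g):
    """9x9 matrix of T -> T' with T'_ij = sum_kl g_ik g_jl T_kl.

    Row index is the pair (i, j), column index the pair (k, l), both in
    row-major order (an arbitrary but consistent choice).
    """
    n = len(g)
    return [[g[i][k] * g[j][l] for k in range(n) for l in range(n)]
            for i in range(n) for j in range(n)]

g, h = rot_z(0.3), rot_x(0.5)
lhs = matmul(tensor_rep(g), tensor_rep(h))   # product of the images
rhs = tensor_rep(matmul(g, h))               # image of the product
```

The two 9 x 9 matrices `lhs` and `rhs` agree to rounding error, as the homomorphism property requires.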
More special representations can appear when certain symmetry relations
exist among the tensor components. Suppose that the $T_{ij}$ are the
components of the strain-rate tensor at some point in a fluid:
$$T_{ij} = \frac{\partial v_i}{\partial x_j},$$
where $v = v(x)$ is the velocity vector field. If the flow is irrotational, then
$$T_{ij} = T_{ji} \qquad (i, j = 1, 2, 3). \tag{20.2-3}$$
It is readily verified that these equations are invariant under rotations;
that is, if $T_{ij} = T_{ji}$ for all $i$ and $j$, then $T'_{ij} = T'_{ji}$ (see Exercise 2 below). Hence,
in this case, only six of the $T_{ij}$ are independent, say the quantities $Y_i$ defined as
$$Y_1 = T_{12}, \quad Y_2 = T_{23}, \quad Y_3 = T_{31}, \quad Y_4 = T_{11}, \quad Y_5 = T_{22}, \quad Y_6 = T_{33}.$$
Then the rotation $x \to x'$ induces a linear transformation $Y \to Y'$, and a
6-dimensional representation of SO(3) results.
If, furthermore, the fluid is incompressible, there is an additional relation
$$T_{11} + T_{22} + T_{33} = 0, \tag{20.2-4}$$
which is also invariant under rotations (see Exercise 3 below). In this case
$Y_6$ can be dropped (i.e., it can always be computed as $-Y_4 - Y_5$), and the
transformations of $Y_1, \ldots, Y_5$ give a 5-dimensional representation of SO(3).
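The invariance of symmetry and of zero trace under the tensor law can be checked numerically for a sample tensor, as a sanity check rather than a proof. In the sketch below the function names and the sample entries are hypothetical; the transformation is $T'_{ij} = \sum_{k,l} g_{ik} g_{jl} T_{kl}$ with $g$ a rotation.

```python
import math

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform_tensor(g, T):
    """Second rank tensor law: T'_ij = sum over k, l of g_ik g_jl T_kl."""
    return [[sum(g[i][k] * g[j][l] * T[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

# A symmetric tensor with zero trace (five independent numbers).
T = [[1.0, 2.0, 3.0],
     [2.0, -3.0, 4.0],
     [3.0, 4.0, 2.0]]

g = matmul(rot_z(0.7), rot_x(-1.2))      # a generic rotation
T_new = transform_tensor(g, T)
trace_new = T_new[0][0] + T_new[1][1] + T_new[2][2]
```

After the rotation, `T_new` is still symmetric and `trace_new` is still zero to rounding error, so the five numbers $Y_1, \ldots, Y_5$ are transformed among themselves.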
The relations (20.2-3) and (20.2-4) determine subspaces of $\mathbb{R}^9$ that are
invariant under the transformations (20.2-2); the invariance of those sub-
spaces permits the 9-dimensional representation to be reduced to the 6-
dimensional and 5-dimensional ones. [An 8-dimensional representation can
be obtained by using (20.2-4) alone.] It will appear below that the 5-dimension-
al subspace cannot be further decomposed into smaller invariant subspaces;
hence the 5-dimensional representation is irreducible. It will appear that for
each odd integer m there is an irreducible m-dimensional representation of
SO(3); when m > 1, that representation is faithful.
The original transformation law (20.2-1) for the components of a vector
x provides of course a 3-dimensional representation. (If the rotations are
regarded as constituting an abstract group, then that representation is no
more trivial than the others.) Furthermore, for the sake of completeness
(in a sense to be made precise later), we include the 1-dimensional representa-
tion given by the transformation law for scalars, which says simply that they
are not transformed at all. Each scalar is a real number $x$, and each group
element (rotation) is mapped onto the identity transformation $x \to x$ in $\mathbb{R}$.
[It should not be concluded that a 1-dimensional representation of a group
always consists only of the identity transformation. A 1-dimensional
representation of $GL(n, \mathbb{R})$ or $GL(n, \mathbb{C})$ is given by mapping the group element
$M$, an $n \times n$ matrix, onto the transformation $x \to (\det M)x$ in $\mathbb{R}$ or $\mathbb{C}$.]
Representations of the Lorentz groups (see Section 19.4) are given similarly
by the transformation laws of scalars, vectors (i.e., 4-vectors), and tensors,
under Lorentz transformations in space-time; they are of dimension
1,4,16, ....
A 6-dimensional representation of the (restricted) Lorentz group $\mathcal{L}_p$
is given by the transformation law for the components of the electric and
magnetic fields $E$ and $H$ in free space. (The interpretation of this
transformation law in terms of tensors is explained in Exercise 5 below.) Under the
particular Lorentz transformation (19.4-1), the electric and magnetic field
components transform according to the law (20.2-5), in which
$$\gamma = \left(1 - \frac{V^2}{c^2}\right)^{-1/2}.$$
(See any good book on electromagnetic theory.) Under a rotation, the
components of $E$ and those of $H$ transform independently according to
the usual vector law (20.2-1). According to Section 19.4 any element of $\mathcal{L}_p$
can be written in the form $R_1 P R_2$, where $R_1$ and $R_2$ are rotations in space and
$P$ is a transformation of the form (19.4-1); hence the general transformation
for $E$ and $H$ can be obtained by combining (20.2-1) and (20.2-5).
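As a hedged illustration of such a 6-dimensional transformation, the sketch below uses the textbook form of the field transformation for a boost along $+x$ in Gaussian units, with $\beta = V/c$. The choice of units and the sign conventions are assumptions taken from standard electromagnetic theory, not from the equations of this section, and the function name `boost_fields` is hypothetical.

```python
import math

def boost_fields(E, H, beta):
    """Fields seen from a frame moving with velocity beta*c along +x (Gaussian units assumed)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    # Components along the motion are unchanged; transverse components mix E and H.
    E_new = (Ex, gamma * (Ey - beta * Hz), gamma * (Ez + beta * Hy))
    H_new = (Hx, gamma * (Hy + beta * Ez), gamma * (Hz - beta * Ey))
    return E_new, H_new

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

E, H = (1.0, 2.0, -0.5), (0.3, -1.0, 0.8)
E1, H1 = boost_fields(E, H, 0.6)
E2, H2 = boost_fields(E1, H1, -0.6)   # boosting back should undo the first boost
```

Two combinations are left unchanged by this law, $E \cdot H$ and $E^2 - H^2$, and the boost with $-\beta$ inverts the boost with $+\beta$; both facts can be checked on the sample values above.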
One of the aims of the theory of group representations is to find all possible
transformation laws for physical quantities, that is, all possible
representations of physical symmetry groups. As seen in the above examples, there
are two main procedures: the building up of representations from simpler
ones by means of tensors, and the breaking down of representations into
subrepresentations. Still another procedure, based on the action of a group on
spaces of functions, is described below, starting in Section 20.5.
EXERCISES
1. Show that the representation of rotations (20.2-1) by the transformations
(20.2-2) is a homomorphism, as required. That is, show that if $p(g)$ is the
transformation induced in $\mathbb{R}^9$ by the rotation $g$, then $p(g)p(g') = p(gg')$.
2. Show that the symmetry or antisymmetry of a second rank tensor is preserved
under rotations, when the law (20.2-2) is used; that is, show that if $T_{ij} = T_{ji}$ (or $T_{ij} =
-T_{ji}$) for all $i, j$, then $T'_{ij} = T'_{ji}$ (or $T'_{ij} = -T'_{ji}$) for all $i, j$.
3. Show that the trace of a second-rank tensor is preserved under rotation; that is,
show that
$$T'_{11} + T'_{22} + T'_{33} = T_{11} + T_{22} + T_{33}.$$
4. Consider a general linear transformation of the form (20.2-1) in $\mathbb{R}^n$, where $(g_{ik})$
is an $n \times n$ nonsingular matrix, and show that the transformation is orthogonal
($g^T g = g g^T = I$) if and only if the trace of every second rank tensor is invariant under
(20.2-2).
5. Show that the transformation law (20.2-5) for the electromagnetic field under
the Lorentz transformation (19.4-1) can be obtained by transforming the antisymmetric
second rank tensor $T^{\mu\nu}$ defined by
Pauli's 1927 theory of electron spin and Dirac's 1928 relativistic wave equation
led to transformation laws under rotations and Lorentz transformations of a
different kind from the familiar transformation laws for vectors and tensors.
That led in turn to the theory of new objects called spinors, which take their
place in relativistic quantum mechanics alongside scalars, vectors, and
tensors, and are discussed in Chapter 22. The transformation laws for spinors
give so-called two-valued representations of SO(3) and $\mathcal{L}_p$, which, however,
are true representations of the covering groups SU(2) and SL(2, C) discussed
in Sections 19.7 and 19.8. Why the latter groups should appear at all in
physical problems seemed a paradox and is still left quite unclear in most
books on quantum mechanics. The resolution of the paradox, due mainly
to Hermann Weyl, is described in Chapter 22. It involves the so-called ray
representations, which are not true representations (as defined in this chapter),
but are nevertheless appropriate for quantum mechanical phenomena. In
Weyl 1928 it is shown that the ray representations of a group are precisely
determined by the true representations of its covering groups; hence,
the representations of SU(2) and $SL(2, \mathbb{C})$ play a role. The theory shows further
that since the manifolds of SU(2) and $SL(2, \mathbb{C})$ are simply connected, these
groups are the so-called universal covering groups of SO(3) and $\mathcal{L}_p$,
respectively (see Chapters 24 and 27), and that, in consequence, there are no
multivalued representations of SO(3) and $\mathcal{L}_p$ of multiplicity > 2. It is thus
concluded, on the basis of group theory, that vectors, tensors, and spinors
provide the only possible transformation laws of quantum mechanical
phenomena.
In classical physics a distinction is made between polar vectors (like
momentum and electric field) and axial vectors (like angular momentum
and magnetic field). The former change sign under an inversion $x \to -x$,
while the latter do not. Hence there are two (in fact only two) ways of
extending the transformation law for vectors to the full orthogonal group
$O(3)$.
The symmetry or antisymmetry of a many-particle wave function under
interchange (or more generally permutation) of identical particles gives a
simple representation of the relevant permutation group. More complicated
representations of the permutation groups appear in the theory of para-
statistics.
It will be proved that the subspace $X'$ not only contains this sum but contains
each term $c_m e^{im\varphi}$ individually; hence, it contains the subspace $X_m$ for each
$m$ such that $c_m \neq 0$. Namely, since $X'$ is invariant under all the transformations
(20.5-1) it contains all translates $f(\varphi - \alpha)$ of the given function $f(\varphi)$; hence
it contains any function of the form
$$\sum_{k=1}^{K} h_k f(\varphi - \alpha_k),$$
where the $h_k$ and $\alpha_k$ are constants. By choosing this sum as a Riemann sum
that approximates an integral and then going to the limit, it is seen that $X'$
contains any function of the form
$$\int h(\alpha) f(\varphi - \alpha)\, d\alpha,$$
the result is
$$P(g_2)P(g_1): f(x) \to f''(x) = f'(g_2^{-1}x) = f(g_1^{-1}(g_2^{-1}x)) = f((g_2 g_1)^{-1}x),$$
so that
$$P(g_2)P(g_1) = P(g_2 g_1). \tag{20.6-2}$$
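The role of the inverse in $f(x) \to f(g^{-1}x)$ is easy to test numerically. In the sketch below, which is only an illustration with hypothetical names, the group elements are rotations (so that $g^{-1} = g^T$), the operator $P(g)$ is a Python function acting on functions, and the two sides of the composition rule are compared at one sample point.

```python
import math

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def apply(A, x):
    """Matrix A acting on the point x."""
    return tuple(sum(A[i][k] * x[k] for k in range(3)) for i in range(3))

def P(g):
    """Operator P(g): f -> f', where f'(x) = f(g^{-1} x); g is a rotation, so g^{-1} = g^T."""
    g_inv = transpose(g)
    def operator(f):
        return lambda x: f(apply(g_inv, x))
    return operator

def f(x):
    """An arbitrary sample function on 3-space."""
    return x[0] + 2.0 * x[1] ** 2 + 3.0 * x[2] ** 3

g1, g2 = rot_z(0.4), rot_x(1.1)
point = (0.3, -0.7, 0.5)
lhs = P(g2)(P(g1)(f))(point)        # apply P(g1) first, then P(g2)
rhs = P(matmul(g2, g1))(f)(point)   # operator of the product g2 g1
```

The values `lhs` and `rhs` coincide to rounding error. Had the operator been defined with $g$ in place of $g^{-1}$, the order of the factors would come out reversed, which is why the inverse appears in the definition.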
Suppose, for example, that $G$ consists of all transformations in real 3-space
of the form
$$x \to x' = \begin{pmatrix} a & b & e \\ c & d & f \\ 0 & 0 & 1 \end{pmatrix} x,$$
where $ad - bc \neq 0$. Any plane $x_3 = \text{constant}$ is invariant under $G$ and
satisfies condition b, but the plane $x_3 = 0$ fails to satisfy a, because every
point of the plane $x_3 = 0$ is mapped onto itself under the mappings
$$x \to \begin{pmatrix} 1 & 0 & e \\ 0 & 1 & f \\ 0 & 0 & 1 \end{pmatrix} x.$$
Hence, only the planes $x_3 = \text{constant} \neq 0$ are homogeneous spaces for this
group.
A general procedure for finding irreducible representations of a continuous
group $G$ of linear transformations in $V^n$ consists of finding a homogeneous
space $S$ for $G$ in $V^n$, then letting $p$ be the representation $f(x) \to f(g^{-1}x)$
of $G$ on a space $X^\infty$ of functions on $S$, usually $L^2(S)$, then finding a complete
set of infinitesimal operators, and then finding minimal invariant subspaces
of $X^\infty$ with the help of the infinitesimal operators. The special functions
associated with the symmetries described by $G$ are the elements of the
invariant subspaces of $X^\infty$.
The procedure is described in more detail in Sec. 20.9 below for the case
of the rotation group SO(3). In that case, the invariant subspaces that will
be found are all finite-dimensional. It will be seen in the next chapter that the
same is true for any compact group. In that chapter, the question whether all
irreducible representations are found in this way is also answered, for
compact groups.
The theory for noncompact groups, where infinite-dimensional irreducible
representations can occur, is considerably beyond the scope of this book,
and we shall be content with a discussion of some of the main features of
three examples, in the next chapter: the rigid motion group $M_2$, where
Bessel functions appear, and the Lorentz group and $SL(2, \mathbb{C})$, where spinors
appear.
EXERCISE
Show that the set of left translations is effective and transitive for the group mani-
fold.
The methods outlined in Sections 20.6 and 20.7 are here applied to the
rotation group. Our approach contrasts with the one often taken, in that we
assume nothing in advance about the spherical harmonic functions, $Y_l^m(\theta, \varphi)$,
but derive those functions and their properties from group theory.
Let $g_{\boldsymbol\omega} = g_{\omega_x, \omega_y, \omega_z}$ be the matrix of a rotation through an angle $\|\boldsymbol\omega\|$ about
an axis in the direction of $\boldsymbol\omega$. That is, $g_{\boldsymbol\omega} = R(\boldsymbol\omega)$, where $R(\cdot)$ is defined by
(19.6-1). Let $X^\infty$ be the space of all infinitely differentiable functions $f(x)$
in $\mathbb{R}^3$. For each $g = g_{\boldsymbol\omega}$, an operator $p(g)$ on $X^\infty$ is defined, according to
(20.6-1), by the equation
$$(p(g)f)(x) = f(g^{-1}x). \tag{20.9-1}$$
These operators $p(g)$ constitute a representation of SO(3).
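The infinitesimal operators of this representation are obtained as follows:
Since the functions in $X^\infty$ are differentiable, the operator
$$\frac{1}{\omega}\left[p(g_{\omega,0,0}) - p(g_{0,0,0})\right]$$
has a limit as $\omega \to 0$, namely
$$L_1 \overset{\text{def}}{=} \left.\frac{d\,p(g_{\omega,0,0})}{d\omega}\right|_{\omega=0} = z\frac{\partial}{\partial y} - y\frac{\partial}{\partial z}. \tag{20.9-2}$$
Similarly,
$$L_2 \overset{\text{def}}{=} \left.\frac{d\,p(g_{0,\omega,0})}{d\omega}\right|_{\omega=0} = x\frac{\partial}{\partial z} - z\frac{\partial}{\partial x}, \qquad
L_3 \overset{\text{def}}{=} \left.\frac{d\,p(g_{0,0,\omega})}{d\omega}\right|_{\omega=0} = y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}. \tag{20.9-3}$$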
Except for a factor $i\hbar$, these are the quantum-mechanical operators for the
components of angular momentum (see Schiff 1955, Chapter IV). They obey
the commutation rules
$$[L_i, L_j] = L_k \qquad (i\,j\,k = 1\,2\,3,\ 2\,3\,1,\ 3\,1\,2). \tag{20.9-4}$$
$$T_i = \left.\frac{\partial g_{\omega_1,\omega_2,\omega_3}}{\partial\omega_i}\right|_{\boldsymbol\omega = 0} \qquad (i = 1, 2, 3),$$
then, according to (19.9-1),
o o -1
o o o
1 o o
hence $[T_i, T_j] = T_k$, where $i\,j\,k = 1\,2\,3$, $2\,3\,1$, or $3\,1\,2$. Since products are
mapped onto products in the representation $g \to p(g)$, the rules (20.9-4)
follow for any representation.
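The commutation rule for the matrices can be checked by direct multiplication. The sketch below assumes the standard right-handed generators, $(J_i)x = e_i \times x$, which are the derivatives at the identity of right-handed rotations about the coordinate axes; whether these coincide in sign with the matrices $T_i$ of the text depends on the convention fixed in (19.6-1), so the names `J1`, `J2`, `J3` are used here to keep the assumption visible.

```python
def matmul(A, B):
    """Product of two 3x3 integer matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

# Generators of right-handed rotations about the x, y, z axes (J_i x = e_i cross x).
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

c12 = commutator(J1, J2)
c23 = commutator(J2, J3)
c31 = commutator(J3, J1)
```

With these matrices the three commutators reproduce `J3`, `J1`, and `J2` exactly, which is the cyclic pattern of (20.9-4).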
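$$L_1 = \sin\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\,\frac{\partial}{\partial\varphi}, \qquad
L_2 = -\cos\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\sin\varphi\,\frac{\partial}{\partial\varphi}, \qquad
L_3 = -\frac{\partial}{\partial\varphi}. \tag{20.9-5}$$
The operators $L^{\pm}$, defined as $L_1 \pm iL_2$, are given by
$$L^{\pm} = e^{\pm i\varphi}\left(\mp i\,\frac{\partial}{\partial\theta} + \cot\theta\,\frac{\partial}{\partial\varphi}\right). \tag{20.9-6}$$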
(20.9-7)
Suppose that $f(\theta, \varphi)$ is a function in $X^\infty(S)$, not identically zero. We wish
to find the minimal invariant subspace $X_1$ containing $f(\theta, \varphi)$ and to choose
$f(\theta, \varphi)$ so as to make that subspace as small as possible, in a sense. If $f(\theta, \varphi) =
\sum_m g_m(\theta)e^{im\varphi}$, and if, for some $m^*$, $g_{m^*}(\theta)$ is not $\equiv 0$, then, by the argument of
Section 20.5, the subspace contains all multiples of the single term $g_{m^*}(\theta)e^{im^*\varphi}$.
The operators $L^+$ and $L^-$, when applied to a function of the form $f(\theta)e^{im\varphi}$,
give functions of the form $f_1(\theta)e^{i(m+1)\varphi}$ and $f_2(\theta)e^{i(m-1)\varphi}$ (for this reason, $L^+$
and $L^-$ are called raising and lowering operators), and these functions must
be in $X_1$, since $X_1$ is invariant; hence, $X_1$ contains functions of the form
$$\psi_m(\theta, \varphi) = g_m(\theta)e^{im\varphi} \tag{20.9-8}$$
for $m = m^* + 1$, $m^* + 2$, etc., and for $m = m^* - 1$, $m^* - 2$, etc.
It will appear that the functions $g_m(\theta)$ can be so chosen as to make $X_1$
finite-dimensional; to achieve this, $L^+\psi_m$ must be zero for some $m$, say,
$m = l$, and $L^-\psi_m$ must be zero for some $m \leq l$, say $m = l'$; it will be seen
below that $l' = -l$. From the first of these conditions, $g_l'(\theta) - l\cot\theta\, g_l(\theta) = 0$,
according to the formula (20.9-6) for $L^+$. The solution of this differential
equation is $g_l(\theta) = \text{const.}\,(\sin\theta)^l$, and since the functions in $X^\infty$ have no
singularities on the unit sphere, it follows that $l \geq 0$; hence,
$$\psi_l(\theta, \varphi) = C\,(e^{i\varphi}\sin\theta)^l, \tag{20.9-9}$$
where $C$ is a constant to be determined later. Starting with this function, a
sequence of functions $\psi_{l-1}, \psi_{l-2}, \ldots$ is obtained by repeated use of the
operator $L^-$, which transforms a function of the form $g(\theta)e^{im\varphi}$ into one of the
form $h(\theta)e^{i(m-1)\varphi}$. All these functions are in $X_1$. We shall now show first that
no new functions are obtained from these by the raising operator, i.e., that
$L^+\psi_{m-1}$ is the same function as $\psi_m$, except for normalization, and second that
$L^-\psi_{-l} = 0$, so that the sequence terminates at $m = -l$. We use induction
on decreasing $m$, starting with $m = l$: Assume that, for some $m$, $L^+\psi_m \propto \psi_{m+1}$,
i.e., that $L^-L^+\psi_m \propto \psi_m$, and note that this last is in any case true for $m = l$,
because $L^+\psi_l = 0$. According to (20.9-7),
$$L^+L^-\psi_m = L^-L^+\psi_m - 2m\,\psi_m, \tag{20.9-10}$$
and it follows that $L^+L^-\psi_m$ is also $\propto \psi_m$; hence $L^+\psi_{m-1} \propto \psi_m$, and the
induction follows.
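The step from $L^-L^+\psi_m$ to $L^+L^-\psi_m$ in this induction rests on the commutator of the raising and lowering operators. A short worked derivation is sketched here, assuming the commutation rule in the form $[L_1, L_2] = L_3$ and the expression $L_3 = -\partial/\partial\varphi$ acting on $\psi_m = g_m(\theta)e^{im\varphi}$:

```latex
\begin{aligned}
[L^+, L^-] &= [L_1 + iL_2,\; L_1 - iL_2]
            = -i\,[L_1, L_2] + i\,[L_2, L_1] = -2i\,L_3, \\
L_3\,\psi_m &= -\frac{\partial}{\partial\varphi}\bigl(g_m(\theta)\,e^{im\varphi}\bigr)
            = -im\,\psi_m, \\
L^+L^-\psi_m &= L^-L^+\psi_m + [L^+, L^-]\,\psi_m
            = L^-L^+\psi_m - 2m\,\psi_m .
\end{aligned}
```

Since the extra term is a multiple of $\psi_m$, proportionality of $L^-L^+\psi_m$ to $\psi_m$ carries over to $L^+L^-\psi_m$, which is all the induction needs.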
We now determine the functions $\psi_m$ more explicitly. Let the
proportionalities referred to be written as
$$L^+\psi_m = -i\alpha_m\psi_{m+1}, \qquad L^-\psi_{m+1} = -i\beta_m\psi_m. \tag{20.9-11}$$
Since each $\psi_m$ contains an arbitrary factor, these equations determine only
the product $\alpha_m\beta_m$, by the equation $L^-L^+\psi_m = -\alpha_m\beta_m\psi_m$. Hence, we can
take $\beta_m = \alpha_m$, for all $m$. It then follows from (20.9-10) that
$$\alpha_{m-1}^2 = \alpha_m^2 + 2m,$$
and this equation holds also for $m = l$, if $\alpha_l$ is set $= 0$. It follows by an induction
on decreasing $m$ that
$$\alpha_m^2 = (l + m + 1)(l - m).$$
It will be shown in this section that the properties of the spherical harmonics
follow from the representation theory of the rotation group, and that the
tesseral harmonics form a basis for the representation of SO(3).
If the functions $\psi_m(\theta, \varphi)$ of the preceding section are taken as a basis in
$X^{2l+1}$, then the transformations $p(g)$, when restricted to $X^{2l+1}$, are given by
$(2l + 1) \times (2l + 1)$ matrices. Before these matrices can be computed, the
functions $\psi_m$ must be discussed further; they will be denoted henceforth
by $Y_l^m(\theta, \varphi)$, to acknowledge the dependence on $l$. They are called tesseral
(surface) harmonics. A surface harmonic is a function $f(\theta, \varphi)$ such that $r^p f(\theta, \varphi)$
satisfies Laplace's equation in $x, y, z$, for some integer $p$, and it will be seen
that $r^l Y_l^m(\theta, \varphi)$ satisfies Laplace's equation. A tessera (which comes through
Latin from a Greek word meaning "four-cornered") is a curvilinear
rectangle such as the ones into which the sphere is divided by the zeros or nodal
lines of $\operatorname{Re} Y_l^m$ or $\operatorname{Im} Y_l^m$, which occur on certain circles of latitude $\theta = \text{const.}$
and certain meridians $\varphi = \text{const.}$
An inner product is defined in the space $X^\infty(S)$ of functions on the unit
sphere S, as follows:
because they are defined in all $L^2(S)$ and are invertible, and [since the
integral (20.10-1) is invariant under rotations] because
$$(p(g)f_1, p(g)f_2) = (f_1, f_2) \tag{20.10-2}$$
for all $f_1$ and $f_2$. It will be shown that the functions $Y_l^m$ are orthogonal with
respect to the inner product (20.10-1). If the constant $C$ in (20.9-9) is suitably
chosen (it can depend on $l$), they are also normalized. It will be proved that
they form a complete orthonormal set of functions on the sphere.
The $\varphi$ integration alone shows immediately that $Y_{l_1}^{m_1}$ and $Y_{l_2}^{m_2}$ are
orthogonal, if $m_1 \neq m_2$, because $Y_{l_1}^{m_1}\,\overline{Y_{l_2}^{m_2}}$ contains a factor $e^{i(m_1 - m_2)\varphi}$. It is
evident from (20.9-2, 3) that the operators $L_i$ are antisymmetric, i.e., $(L_i f, g) =
-(f, L_i g)$, and from this it follows that $L^-L^+$ is symmetric. Furthermore,
from (20.9-11),
$$L^-L^+Y_l^m = -(\alpha_l^m)^2\,Y_l^m, \tag{20.10-3}$$
where, with a slight improvement of notation,
$$(\alpha_l^m)^2 = (l + m + 1)(l - m), \tag{20.10-4}$$
according to (20.9-12). Therefore, the equation
$$(L^-L^+Y_{l_1}^m, Y_{l_2}^m) = (Y_{l_1}^m, L^-L^+Y_{l_2}^m)$$
is equivalent to
$$(\alpha_{l_1}^m)^2\,(Y_{l_1}^m, Y_{l_2}^m) = (\alpha_{l_2}^m)^2\,(Y_{l_1}^m, Y_{l_2}^m);$$
hence, since $\alpha_{l_1}^m \neq \alpha_{l_2}^m$ for $l_1 \neq l_2$, it is seen that the $Y_l^m$ are orthogonal.
We now show how to choose the constant $C$ in (20.9-9) so as to normalize
the functions $Y_l^m$. The adjoint of the operator $L^+$ is $-L^-$; hence
$$(L^+Y_l^m, Y_l^{m+1}) = (Y_l^m, -L^-Y_l^{m+1}).$$
It follows from (20.9-11), since $\beta_m = \alpha_m = \alpha_l^m$, that
$$(-i\alpha_l^m Y_l^{m+1}, Y_l^{m+1}) = (Y_l^m, i\alpha_l^m Y_l^m),$$
from which it is seen that $\|Y_l^m\|^2$ is independent of $m$, for given $l$. From the
equation (20.9-9) for $\psi_l = Y_l^l$,
$$\|Y_l^l\|^2 = 2\pi|C|^2\int_0^\pi \sin^{2l+1}\theta\,d\theta
= 4\pi|C|^2\,\frac{2\cdot 4\cdots(2l)}{1\cdot 3\cdots(2l+1)}
= 4\pi|C|^2\,\frac{(2^l\,l!)^2}{(2l+1)!}. \tag{20.10-5}$$
Therefore, if the constant $C$ is chosen as
$$C = C_l = \frac{(-1)^l}{2^l\,l!}\sqrt{\frac{(2l+1)!}{4\pi}}, \tag{20.10-6}$$
the functions $Y_l^m$ are normalized.
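The normalization can be checked numerically for small $l$. The sketch below evaluates $2\pi C_l^2 \int_0^\pi \sin^{2l+1}\theta\, d\theta$ by Simpson's rule, with $C_l = \frac{(-1)^l}{2^l\, l!}\sqrt{(2l+1)!/4\pi}$; the function names and the number of integration steps are arbitrary choices made for the illustration.

```python
import math

def norm_constant(l):
    """C_l = (-1)^l / (2^l l!) * sqrt((2l+1)! / (4 pi))."""
    return ((-1) ** l / (2 ** l * math.factorial(l))
            * math.sqrt(math.factorial(2 * l + 1) / (4.0 * math.pi)))

def norm_squared_top(l, steps=2000):
    """2 pi C_l^2 times the integral of sin^(2l+1) over [0, pi], by Simpson's rule."""
    h = math.pi / steps               # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.sin(i * h) ** (2 * l + 1)
    integral = total * h / 3.0
    return 2.0 * math.pi * norm_constant(l) ** 2 * integral

norms = [norm_squared_top(l) for l in range(6)]
```

Each entry of `norms` comes out equal to 1 within the accuracy of the quadrature, in agreement with the closed-form evaluation of the integral.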
With $Y_l^l$ given by (20.9-9) and (20.10-6), and the other $Y_l^m$ given in terms of
$Y_l^l$ by the recurrence relation (20.9-11), which says that $L^-Y_l^{m+1} = -i\alpha_l^m Y_l^m$,
we define new functions $P_l^m(w)$, called the associated Legendre functions,
for $-1 \leq w \leq 1$ by the equation
(20.11-6)
[For the present purpose, it would have been more reasonable to fix $Y_l^{-l}$
initially, rather than $Y_l^l$, and then determine the other $Y_l^m$ by means of the
raising operator $L^+$, rather than the lowering operator $L^-$. However, the
general relation between $Y_l^m$ and $Y_l^{-m}$, which is now needed, has independent
interest.] Complex conjugation interchanges $L^+$ and $L^-$; hence, the
conjugates of equations (20.9-11) are
$$L^-\bar\psi_m = i\alpha_m\bar\psi_{m+1}, \qquad L^+\bar\psi_{m+1} = i\alpha_m\bar\psi_m,$$
from which it is seen that the quantities $(-1)^m\bar\psi_m$ satisfy the same equations
as the quantities $\psi_{-m}$; hence
$$Y_l^{-m} = k(-1)^m\,\overline{Y_l^m},$$
where $k$ is a constant, which will soon be seen to be $= 1$. Since $C$ in (20.10-6)
is real, equations (20.9-9) and (20.10-7) show that $P_l^l(w)$ is real; then (20.11-1)
shows that all the $P_l^m(w)$ are real; hence, by (20.10-7), $Y_l^0$ is real. The above
equation, with $m = 0$, then shows that $k = 1$. Therefore,
$$Y_l^{-m} = (-1)^m\,\overline{Y_l^m}. \tag{20.11-7}$$
Since Yl = C(eiq> sin ey, with C given by (20.10-6), the above equation gives
y l- 1 explicitly, from which equation (20.11-5) for P I- I follows from (20.lQ-7).
(Some authors define y l- m to be the complex conjugate of Yr, after defining
the latter for m 2: O. The procedure followed here has some advantages; for
example, the matrices P~'m of the irreducible representations of the rotation
group, given below, are symmetric.)
Clearly Pi(w) is a polynomial, for even m. p?(w), usually denoted by Plw),
is the Legendre polynomial of degree I.
EXERCISES
1. Show that $\int_{-1}^{1}(1 - w^2)^l\,dw = [2l/(2l + 1)]\int_{-1}^{1}(1 - w^2)^{l-1}\,dw$, and use this result to show, by an obvious induction, that the integral in (20.10-5) has been correctly evaluated.
2. Express the operators $L_\pm$ in terms of the variables $w$ and $\varphi$, where $w = \cos\theta$, and derive the recurrence relationships (20.10-8,9) from (20.9-11).
3. For the special case $m = 0$, verify that the Rodrigues formula (20.11-6) gives a solution of Legendre's differential equation, which is (20.11-3) with $m = 0$. [The solution is $P_l(w)$.] You are welcome to do the same for $m \neq 0$; it is just more work.
4. Since $P_l^m$ and $P_l^{-m}$ satisfy the same equation (20.11-3) (this equation is unaltered by replacing $m$ by $-m$), which can have at most one solution regular at $w = 1$, they must be proportional. Find the proportionality constant. Further warning about notation: some authors define $P_l^{-m}$ to be $= P_l^m$.
5. Show that, as an alternative to (20.11-6), the equation
holds, for $-l \le m \le l$.
The first and third factors are diagonal matrices; the transformation $\rho(\alpha, 0, 0)$ merely replaces $\varphi$ in a function by $\varphi - \alpha$ and hence multiplies $Y_l^m$ by $e^{-i\alpha m}$; that is,
$\rho^l_{m'm}(\alpha, 0, 0) = e^{-i\alpha m'}\delta_{m'm}$.
Therefore, $\rho^l_{m'm}(\alpha, \beta, \gamma)$ can be written in the form
$\rho^l_{m'm}(\alpha, \beta, \gamma) = e^{-i\alpha m'}P^l_{m'm}(\cos\beta)\,e^{-i\gamma m}$. (20.12-2)
The functions $P^l_{m'm}(w)$ are closely related to the Jacobi polynomials; their properties are discussed at length in Gel'fand, Minlos, and Shapiro 1963 and in Vilenkin 1968, to which the reader is referred for details. (The definition of $P^l_{m'm}$ given below agrees with that in Gel'fand et al. and gives the complex conjugate of the function defined by Vilenkin.) The $P^l_{m'm}$ are defined by the equation
$C = i^{m'-m}2^{-l}\left(\frac{(l + m')!}{(l - m)!\,(l + m)!\,(l - m')!}\right)^{1/2}$ (20.12-4)
(It would perhaps be more logical to incorporate the factor $i^{m'-m}$ explicitly in $\rho^l_{m'm}$, rather than in $P^l_{m'm}$, which would then be a real function, for $-1 \le w \le 1$, but it is not customary to do so.)
For $m = 0$ (and for $m' = 0$), these functions are proportional to the associated Legendre functions. Comparison of (20.12-3) with (20.11-8) shows that
$\rho^l_{m'0}(\alpha, \beta, 0) = \sqrt{\frac{4\pi}{2l + 1}}\,Y_l^{m'}(\beta, \alpha - \pi/2)$. (20.12-6)
[This is the same as $\rho^l_{m'0}(\alpha, \beta, \gamma)$, because $\rho^l_{m'0}$ is independent of $\gamma$.]
EXERCISES
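$(P^1_{m'm}) = \begin{pmatrix} \frac{1+w}{2} & -i\sqrt{\frac{1-w^2}{2}} & \frac{w-1}{2} \\ -i\sqrt{\frac{1-w^2}{2}} & w & -i\sqrt{\frac{1-w^2}{2}} \\ \frac{w-1}{2} & -i\sqrt{\frac{1-w^2}{2}} & \frac{1+w}{2} \end{pmatrix}$,
where the rows are numbered by $m' = -1, 0, 1$ and the columns by $m = -1, 0, 1$. Note that this matrix is unitary.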
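The unitarity claim can be checked numerically. The sketch below assumes the $l = 1$ matrix has first row $(1+w)/2$, $-i\sqrt{(1-w^2)/2}$, $(w-1)/2$, middle row $-i\sqrt{(1-w^2)/2}$, $w$, $-i\sqrt{(1-w^2)/2}$, and last row equal to the first reversed; the helper names are illustrative.

```python
import math

def p1_matrix(w):
    """3x3 matrix (P^1_{m'm}(w)); rows m' = -1, 0, 1 and columns m = -1, 0, 1."""
    s = -1j * math.sqrt((1 - w * w) / 2)
    return [
        [(1 + w) / 2, s, (w - 1) / 2],
        [s, w, s],
        [(w - 1) / 2, s, (1 + w) / 2],
    ]

def is_unitary(a, tol=1e-12):
    """True if the rows of a are orthonormal under the Hermitian inner product."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            dot = sum(a[i][k] * complex(a[j][k]).conjugate() for k in range(n))
            if abs(dot - (1 if i == j else 0)) > tol:
                return False
    return True

def is_symmetric(a, tol=1e-12):
    """True if a equals its transpose (no complex conjugation)."""
    n = len(a)
    return all(abs(a[i][j] - a[j][i]) <= tol for i in range(n) for j in range(n))
```

Both checks should succeed for any $w$ in $[-1, 1]$; the symmetry check reflects the property $P^l_{m'm} = P^l_{mm'}$ asked for in the exercises.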
4. Identify the left member of (20.12-1) with $Y_l^m(\theta', \varphi')$, multiply the equation through by $r$, call $z' = r\cos\theta'$, $x' \pm iy' = r\sin\theta'\,e^{\pm i\varphi'}$, and similarly for $x$, $y$, $z$. Again, take $l = 1$. Then, using the result of Exercise 3, show that, for the case $g$ = rotation through the angle $\beta$ about the $x$ axis (i.e., $\alpha = \gamma = 0$), the transformation (20.12-1) reduces to
$x' = x$,
$y' = y\cos\beta + z\sin\beta$,
$z' = -y\sin\beta + z\cos\beta$.
5. Show that $P^l_{m'm} = P^l_{mm'}$.
6. Show that the appearance of $\alpha - \pi/2$, rather than $\alpha$ itself, in (20.12-6) could be avoided if the second step in the definition of the Euler angles were taken to be a rotation through the angle $\beta$ about the $y$ axis rather than the $x$ axis.
In equation (20.12-1), which tells how the functions $Y_l^m$, for given $l$, transform among themselves under a rotation $g$, we set $m = 0$, and we take $g$ to be the rotation with Euler angles $\alpha$, $\beta$, 0:
$\nabla^2 = \frac{1}{r^2}\left[\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} + L_-L_+ + L_3^2 - iL_3\right]$ (20.14-1)
for there are $2l + 1$ of them, and they are obviously independent because of the orthogonality of the $Y_l^m$. It is concluded that any harmonic polynomial can be expressed in terms of the functions $r^lY_l^m(\theta, \varphi)$.
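The count $2l + 1$ can be cross-checked by a dimension argument. The sketch below relies on the standard fact, assumed here rather than taken from the text, that the Laplacian maps homogeneous polynomials of degree $l$ in three variables onto those of degree $l - 2$, so that the harmonic ones form its kernel.

```python
from math import comb

def dim_homogeneous(l, n_vars=3):
    """Number of monomials of degree l in n_vars variables (0 for negative l)."""
    return comb(l + n_vars - 1, n_vars - 1) if l >= 0 else 0

def dim_harmonic(l):
    """Dimension of the kernel of the Laplacian on degree-l homogeneous polynomials,
    assuming the Laplacian is onto the degree-(l-2) polynomials."""
    return dim_homogeneous(l) - dim_homogeneous(l - 2)
```

For every $l$ the result agrees with the number of functions $r^lY_l^m$, $m = -l, \ldots, l$.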
According to the theory of the Dirichlet problem in potential theory, if $f(\theta, \varphi)$ is any continuous function on the unit sphere, there is a function $\psi(x, y, z)$ that satisfies $\nabla^2\psi = 0$ for $x^2 + y^2 + z^2 < 1$, is continuous for $x^2 + y^2 + z^2 \le 1$, and takes the values $f(\theta, \varphi)$ on the sphere $x^2 + y^2 + z^2 = 1$. (In fact, $\psi$ can be expressed in terms of $f$ by the Poisson integral formula.) $\psi$ is analytic in the ball and can be expressed as a power series in $x$, $y$, and $z$. The terms of this expansion of a given degree $l$ constitute a harmonic polynomial of degree $l$, and hence can be expressed in terms of the functions $r^lY_l^m(\theta, \varphi)$; that is,
$\psi(x, y, z) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_l^m r^l Y_l^m(\theta, \varphi)$.
It can be shown that, for a continuous boundary function $f$, the solution $\psi$ of the Dirichlet problem converges to $f$, uniformly in angle, as $r \to 1$. Hence, it converges in $L^2$; therefore,
in the sense of mean convergence. Since the continuous functions are dense in $L^2(S)$, it follows that the tesseral harmonics form a complete orthonormal set of functions on the sphere. It follows also that any distribution $f(\theta, \varphi)$ in $L^2(S)$ can be expanded as (20.14-2), where the coefficients are given by
proved that $X_{2l+1}$ is invariant also under the transformations $\rho(g)$, $g \in SO(3)$: Namely, $\rho(g)Y_l^m$ is a function of $\theta$ and $\varphi$ obtained by rotating the sphere by the rotation $g$ and carrying the values of $Y_l^m(\theta, \varphi)$ along; hence, $r^l\rho(g)Y_l^m$ is also a homogeneous harmonic polynomial of degree $l$ in $x$, $y$, $z$; hence, it can be expressed linearly in the polynomials $r^lY_l^{m'}(\theta, \varphi)$, $m' = l, l - 1, \ldots, -l$. Therefore, $X_{2l+1}$ is invariant.
The completeness of the set of tesseral harmonics shows that the space $L^2(S^2)$, where $S^2$ is the unit sphere in $\mathbb{R}^3$, is the direct sum (with respect to the $L^2$ norm) of the spaces $X_{2l+1}$ ($l = 0, 1, 2, \ldots$).
CHAPTER 21
lal = 1, 1131 = 1.
$\rho(g) = \begin{pmatrix} m \times m & (0) \\ (0) & (n - m) \times (n - m) \end{pmatrix}$, (21.2-1)
i.e., all matrix elements that connect the two subspaces are zero. (If $\rho$ is reducible but not decomposable, then it is possible to choose the basis so as to make all matrix elements zero in the lower left rectangular block shown, but not also in the upper right block.) If, for each $g$, $\rho_1(g)$ and $\rho_2(g)$ denote the $m \times m$ and $(n - m) \times (n - m)$ matrices shown, respectively, then each of the mappings $g \to \rho_1(g)$ and $g \to \rho_2(g)$ is a representation of $G$, and the representation $\rho$ is their direct sum; in symbols, $\rho = \rho_1 + \rho_2$.
If $X$ is an infinite-dimensional Banach or Hilbert space, $X_1$ is understood as a closed linear manifold in $X$; there is no loss of generality here, because the operators $\rho(g)$ are all bounded, hence the closure of an invariant linear manifold is invariant. Again, if $X = X_1 \oplus X_2$, and if $X_1$ and $X_2$ are invariant under all $\rho(g)$, then we write
$\rho = \rho_1 + \rho_2$,
where $\rho_1$ and $\rho_2$ are the restrictions of the representation $\rho$ to $X_1$ and $X_2$, respectively. (One of $\rho_1$, $\rho_2$ may be finite-dimensional.)
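A small concrete instance of a direct sum may help. The sketch below builds, for the group SO(2), the block-diagonal sum of the trivial one-dimensional representation and the defining two-dimensional rotation, and checks that the result is again a representation. The choice of group and of the two summands is an illustrative assumption, not taken from the text.

```python
import math

def rot2(t):
    """Defining 2x2 representation of SO(2): rotation through the angle t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b; all connecting elements are zero."""
    n, m = len(a), len(b)
    out = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            out[i][j] = a[i][j]
    for i in range(m):
        for j in range(m):
            out[n + i][n + j] = b[i][j]
    return out

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rho(t):
    """Direct sum of the trivial representation and rot2, a 3x3 matrix."""
    return direct_sum([[1.0]], rot2(t))
```

The homomorphism property `rho(s) rho(t) = rho(s + t)` holds block by block, which is the content of the statement that each block gives a representation.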
It may be that $X_1$ or $X_2$ contains further invariant subspaces such that $\rho_1$ or $\rho_2$ (or both) can be further decomposed, and so on. Then, with respect to a suitable basis in $X$, the matrices $\rho(g)$ contain a number of square blocks straddling the main diagonal, and all elements outside those blocks are zero. Each square block gives a representation of $G$, and $\rho$ is the direct sum of those representations. When this process has been carried as far as possible, it may turn out that the resulting representations into which $\rho$ has been decomposed are all irreducible. In this case, $\rho$ is called completely reducible. Then one can find the structure of all representations of $G$ by finding all minimal invariant subspaces of a sufficiently large initial space $X$, as was done in the preceding chapter for SO(2) and SO(3).
Comment. No use has been made of the fact that $G$ is a group. The same conclusions hold if the sets $\{\rho_1(g)\}$ and $\{\rho_2(g)\}$ are any two irreducible sets of square matrices. A set $\{M_i\}$ of $k \times k$ matrices is irreducible if there is no nontrivial proper subspace of $V^k$ that is invariant under all mappings $x \to M_i x$.
Corollary. Any matrix that commutes with all the matrices of an irreducible
representation (or an irreducible set of matrices) is a multiple of the identity.
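The corollary can be illustrated on a finite example. The sketch below uses the eight matrices $\pm I$, $\pm i\sigma_x$, $\pm i\sigma_y$, $\pm i\sigma_z$ (a two-dimensional irreducible representation of the quaternion group, chosen here purely for illustration and not taken from the text). Averaging $g M g^{-1}$ over the group produces a matrix that commutes with every group element, so by the corollary it must be a multiple of the identity.

```python
# Pauli matrices and the 2x2 identity.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def scale(c, a):
    """Multiply every element of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

# The eight matrices +-I, +-i*sigma_x, +-i*sigma_y, +-i*sigma_z.
GROUP = [scale(sign, g)
         for g in (I2, scale(1j, SX), scale(1j, SY), scale(1j, SZ))
         for sign in (1, -1)]

def group_average(m):
    """(1/|G|) * sum over g of g m g^(-1); g^(-1) = dagger(g) because g is unitary."""
    total = [[0j, 0j], [0j, 0j]]
    for g in GROUP:
        term = matmul(matmul(g, m), dagger(g))
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return scale(1 / len(GROUP), total)
```

For an arbitrary input matrix the average comes out as $(\operatorname{tr} M / 2)\,I$, in agreement with the corollary.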
PROOF. Each $\rho(h)$ commutes with all $\rho(g)$; hence each $\rho(h)$ is a multiple of the identity matrix; hence, $\rho$ would be reducible if it were not one-dimensional.
$\int_{\mathcal{S}} f(hg)\,w(g)\,d\mathcal{A}(g) = \int_{\mathcal{S}} f(g)\,w(g)\,d\mathcal{A}(g)$ (21.5-1)
for all $h$ in $G$, all continuous $f$ on $\mathcal{S}$,
where $d\mathcal{A}(g)$ is the $m$-dimensional volume element, or "element of area" on $\mathcal{S}$.
If $G$ is noncompact, so that the surface $\mathcal{S}$ extends to infinity in $V$, then the same is true (i.e., there is a weight function $w(g)$ on $\mathcal{S}$ such that (21.5-1) holds), provided that $f$ is a function such that the integrals converge.
The mapping $g \to hg$ of $G$ onto itself, for fixed $h$, is called a left translation in $G$. The equation above shows that the integral of a continuous function on $\mathcal{S}$ with respect to the weight function $w$ is invariant under all left translations in the group. The integral in (21.5-1) is called a left-invariant integral. A proof that such a function $w$ exists can be found in Wigner's book Gruppentheorie (1931) under the heading "Hurwitzsches Integral." See also Nachbin 1965.
There is a similar invariant integral for right translations. It can be proved
that, in particular, if the group G is compact, then the above integral (with
the same weight function w) is also invariant under right translations and
under inversions; that is
The proof can be found in Weyl's book, The Theory of Groups and Quantum
Mechanics (1932), Chapter III, Section 12.
EXERCISES
then $g$ is written as $g(\alpha, \beta, \gamma)$, and the variables $\alpha$, $\beta$, $\gamma$ are called the Euler angles of $g$. [Under the homomorphism of SU(2) onto SO(3) given in Section 19.7, they become the Euler angles of the rotation $R(g)$, with $\gamma$ restricted to the range $0 \le \gamma < 2\pi$; note that replacing $\gamma$ by $\gamma + 2\pi$ replaces $g$ by $-g$ and leaves $R(g)$ unaltered.] Show that if the Euler angles are taken as intrinsic coordinates in the group SU(2), then the element of area on $S^3$ is given by
$d\mathcal{A}(g) = \frac{\sin\beta}{8}\,d\alpha\,d\beta\,d\gamma$.
4. Show that
$g(\alpha, \beta, \gamma) = g(\alpha, 0, 0)\,g(0, \beta, 0)\,g(0, 0, \gamma)$. (21.5-3)
5. Show that, under the homomorphism of SU(2) onto SO(3) given in Section 19.7, if $R(g(\alpha, \beta, \gamma))$ is called $R(\alpha, \beta, \gamma)$, then $R(\alpha, 0, 0)$ is $= R(0, 0, \alpha)$ and is a rotation through $\alpha$ about the $z$ axis, while $R(0, \beta, 0)$ is a rotation through $\beta$ about the $x$ axis. Give the geometrical interpretation of the result of Exercise 4 as the law of composition of an arbitrary rotation in terms of successive rotations about the $z$, $x$, and $z$ axes, respectively.
6. Derive the formula for the area $A_n$ of the $n$-sphere (the unit sphere in $E^{n+1}$) from the evident equation
using the gamma function, and verify directly that the formula of Exercise 3 is correctly normalized. Conclude, on the basis of the 2-to-1 homomorphism of SU(2) onto SO(3), that the (3-dimensional) area of the surface that has been identified with the manifold of SO(3) (see Section 19.5) is equal to $\pi^2$. Show that the volume of an $n$-dimensional ball of radius $R$ is
$V_n = \frac{2\pi^{n/2}}{n\,\Gamma(n/2)}R^n$.
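The normalizations in Exercises 3 and 6 can be checked numerically. The sketch below assumes the standard closed form $A_n = 2\pi^{(n+1)/2}/\Gamma((n+1)/2)$ for the area of the unit $n$-sphere, and assumes Euler-angle ranges $0 \le \alpha < 2\pi$, $0 \le \beta \le \pi$, $0 \le \gamma < 4\pi$ on SU(2); the last range follows from the remark that $\gamma \to \gamma + 2\pi$ replaces $g$ by $-g$, while the first two are assumptions.

```python
import math

def sphere_area(n):
    """Area of the unit n-sphere in E^(n+1): 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

def ball_volume(n, radius=1.0):
    """Volume of the n-dimensional ball: 2 pi^(n/2) R^n / (n Gamma(n/2))."""
    return 2 * math.pi ** (n / 2) * radius ** n / (n * math.gamma(n / 2))

def su2_area_from_euler(steps=2000):
    """Integral of (sin beta / 8) d alpha d beta d gamma over the assumed ranges."""
    h = math.pi / steps
    beta_integral = h * sum(math.sin((k + 0.5) * h) for k in range(steps))
    return beta_integral / 8 * (2 * math.pi) * (4 * math.pi)
```

The Euler-angle integral reproduces $A_3 = 2\pi^2$, and half of that, $\pi^2$, is the area attributed to the manifold of SO(3).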
The system $\{Y_l^m\}$ of tesseral harmonics is not the only orthogonal function system that comes from the representations of SO(3). The tesseral harmonics are orthogonal on the unit 2-sphere, which was taken as the homogeneous space for the representation. However, it was pointed out in Section 20.8 that the group manifold can also be taken as the homogeneous space; then, a larger class of orthogonal functions appears; they are functions of the Euler angles $\alpha$, $\beta$, $\gamma$, which can be taken as intrinsic coordinates in SO(3). The theorem below deals with such function systems in general.
It is customary to denote the expression $w(g)\,d\mathcal{A}(g)$ that appears in the left-invariant integration over the group manifold simply by $dg$, and to write equation (21.5-1) as
(21.5-5)
= Lf1(g)fig)d g
$J = \int_G f(hg')\,dg' = \int_G f(g)\,dg$.
We consider the case where the function $f(g)$ is zero except for elements $g$ in a small neighborhood $\mathcal{N}$ of the identity element of the group, and $f(g) = 1$ for those elements. They occupy a small volume $V$ in the coordinate space near $\boldsymbol{\theta} = 0$; hence from the right member of the above equation we have $J \approx Vw(0)$. The left member comes from group elements $hg'$ in the neighborhood $\mathcal{N}$ of the identity, hence from $g'$ in a neighborhood of $h^{-1}$, having volume $V'$, so that $J \approx V'w(h^{-1})$; hence we must determine $V'$ to determine $w(h^{-1})$. If we call $g' = h^{-1}k$, then $k$ varies in $\mathcal{N}$. We denote by $\boldsymbol{\theta}(g)$ the coordinates of any group element $g$. Hence, if $\boldsymbol{\theta}$ and $\boldsymbol{\theta}'$ denote the coordinates of $k$ and $g' = h^{-1}k$, we have
$\boldsymbol{\theta} = \boldsymbol{\theta}(k)$, $\boldsymbol{\theta}' = \boldsymbol{\theta}(h^{-1}k) = \boldsymbol{\theta}'(\boldsymbol{\theta})$.
As $\boldsymbol{\theta}$ ranges through the volume $V$, $\boldsymbol{\theta}'$ ranges through $V'$; hence $V'$ is given in terms of the Jacobian as
EXERCISE
7. Let $G$ be the rotation group SO(3), let $\theta_x$, $\theta_y$, $\theta_z$ be the intrinsic coordinates introduced in Section 19.6, and let $\boldsymbol{\theta}$ be the vector with components $\theta_x$, $\theta_y$, $\theta_z$. Specifically, let $\boldsymbol{\theta}$ represent the group element $k$ in the above discussion, where $\|\boldsymbol{\theta}\| \ll 1$, and let $\boldsymbol{\theta}'$
We can take $h^{-1}$ as a rotation through an angle $\alpha$ about the $x$ axis, since $w(h^{-1})$ is independent of the direction of the rotation axis, i.e., we can take
Show that the coordinates $\theta'_x$, $\theta'_y$, $\theta'_z$ of the element $h^{-1}k$ are, to first order,
Hint: The angle of rotation and the direction of the axis are given by (19.2-7 and 8) for a given rotation matrix.
Remarks. (1) Recall that right- and left-invariant integration are the same on a compact group. (2) It has been assumed that $\int_G dg = 1$. (3) The matrix elements are assumed to refer to an orthonormal set of vectors, so that the matrices $(\rho_{mn}(g))$ of the unitary transformations $\rho(g)$ are unitary matrices.
72 Group Representations II: General; Rigid Motions; Bessel Functions
For the proof of the theorem, see Vilenkin 1968. The orthonormality of the
functions is a straightforward matter; their completeness is somewhat
deeper and was proved by F. Peter and H. Weyl in 1927.
For G = SO(3), the theorem says that the functions
$l = 0, 1, 2, \ldots$; $-l \le m', m \le l$
~ ~ ~
$M_2 \to M_3 \to \mathcal{P}_p$,
where an arrow indicates the subgroup relationship and where $\mathcal{P}_p$ is the proper Poincaré group (consisting of proper Lorentz transformations combined with displacements in space and time). Groups involving spatial inversion and time reversal are also of interest but would complicate the diagram. $M_2$ will be considered in some detail in this chapter.
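An element of $M_2$ is a mapping $g = g_{\xi,\eta,\theta}$ of the $x$, $y$ plane onto itself given by
$g: \begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta + \xi \\ x\sin\theta + y\cos\theta + \eta \end{pmatrix}$. (21.8-1)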
Verify that under this association the composition $g_2g_1$ of two mappings of the form (21.8-1) is associated with the product of the corresponding matrices of the form (21.8-2).
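The verification can be rehearsed numerically. The matrix form (21.8-2) is not reproduced above, so the sketch below assumes the usual homogeneous-coordinate form, a $3 \times 3$ matrix acting on column vectors $(x, y, 1)$; that form, and all function names, are assumptions made for illustration.

```python
import math

def apply_motion(g, point):
    """Apply g = (xi, eta, theta) to a point (x, y), as in (21.8-1)."""
    xi, eta, theta = g
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s + xi, x * s + y * c + eta)

def motion_matrix(g):
    """Assumed 3x3 matrix of g acting on column vectors (x, y, 1)."""
    xi, eta, theta = g
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, xi], [s, c, eta], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix with a 3-component column vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]
```

Applying $g_1$ and then $g_2$ to a point gives the same result as multiplying the point by the product matrix `matmul(motion_matrix(g2), motion_matrix(g1))`.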
21.9 Representations of M2
To find other representations, let $X^\infty$ denote the space of all infinitely differentiable functions $f(x, y)$ defined for all $x$ and $y$. Evidently, the plane is a homogeneous space for $M_2$. A representation of $M_2$ on $X^\infty$ is obtained by
$L_\pm = e^{\pm i\varphi}\left(\frac{\partial}{\partial r} \pm \frac{i}{r}\frac{\partial}{\partial\varphi}\right)$, (21.9-4)
$L_+L_- = L_-L_+ = \nabla^2$.
for each $m$. This requires, of course, that $g_m(r)$ be proportional to the Bessel function $J_m(\alpha r)$; Bessel functions are discussed in the next section.
$\psi_m(x, y) = i^{-m}e^{im\varphi}J_m(\alpha r)$.
The equations (21.10-2) then take the form
which are the recurrence relations for the Bessel functions. Elimination of $J_{m+1}$ gives
(21.11-2)
$J_m(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-iz\sin t + imt}\,dt$. (21.11-3)
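The integral representation can be compared with the power series for $J_m$. The sketch below evaluates $\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(mt - z\sin t)}\,dt$, which equals the more familiar form with exponent $i(z\sin t - mt)$ after the substitution $t \to -t$. The equally spaced rectangle rule is used because it converges very rapidly for smooth periodic integrands; the function names and the number of sample points are illustrative assumptions.

```python
import cmath
import math

def bessel_series(m, z, terms=40):
    """Power series for J_m(z), integer m >= 0."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + m))
               * (z / 2) ** (2 * k + m) for k in range(terms))

def bessel_integral(m, z, steps=256):
    """(1/2pi) * integral over [-pi, pi] of exp(i(m t - z sin t)) dt."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        t = -math.pi + k * h
        total += cmath.exp(1j * (m * t - z * math.sin(t)))
    return total * h / (2 * math.pi)
```

With the opposite relative sign between the two terms of the exponent, the same quadrature returns $J_{-m}(z) = (-1)^m J_m(z)$ instead, so the sign convention matters for odd $m$.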
(21.12-2)
where
(21.12-3)
(the prime does not denote differentiation); a change of variable $\chi - \theta \to \chi$ has been made in the integral, without altering the limits, because the integrand has period $2\pi$. Hence, the group element $g$ of $M_2$, which consists of a clockwise rotation through $\theta$ followed by a translation by $\boldsymbol{\xi}$, induces the transformation (21.12-3) in the space of functions $f$ of period $2\pi$. $f(\chi)$ and $f'(\chi)$ are now expanded in the Fourier series $\sum c_m e^{im\chi}$ and $\sum c'_m e^{im\chi}$, respectively; it is found that
$c'_m = \sum_{n=-\infty}^{\infty} \rho_{mn}c_n$,
where
(21.12-4)
where (21.11-3) has been used. It is seen that the Bessel functions appear not only in the characterization of the invariant subspaces of $X^\infty$, but also
space. Let $X_\alpha$ denote the space of all functions $u(x, y, z)$ that satisfy the 3-dimensional reduced wave equation $\nabla^2u + \alpha^2u = 0$ in all of $E^3$, for fixed $\alpha$. Then the representation of $M_3$ on $X_\alpha$ given by the association with each $g$ in $M_3$ of the transformation
$\rho(g): f(\mathbf{x}) \to f(g^{-1}\mathbf{x})$ (21.12-5)
of $X_\alpha$ onto itself is irreducible. From the infinitesimal operators of this representation, a basis in $X_\alpha$ is found, consisting of the functions
$Y_l^m(\theta, \varphi)\,r^{-1/2}J_{l+1/2}(\alpha r)$, $l = 0, 1, 2, \ldots$; $m = -l, -l + 1, \ldots, l$. (21.12-6)
Therefore, the so-called spherical Bessel functions
21.13 Characters
The concept of the character $\chi(g)$ of a representation plays an important role in representation theory. For compact groups, it is the key to establishing the completeness of a set of irreducible representations, hence to deciding whether all representations have been found. If $\rho$ is a representation on a finite-dimensional space $X_n$, so that the $\rho(g)$ are matrices with elements $\rho_{jk}(g)$, then $\chi(g) = \operatorname{tr}\rho(g) = \sum_{j=1}^{n}\rho_{jj}(g)$. Hence, $\chi$ is a scalar-valued function on the group $G$. If $G$ is compact, the characters $\chi^1$ and $\chi^2$ of two inequivalent irreducible representations are orthogonal with respect to the inner product (21.5-5):
(21.13-1)
$\int_G|\chi(g)|^2\,dg = 1$ (21.13-2)
EXERCISES
1. Show that if two finite-dimensional representations $\rho^1$ and $\rho^2$ are equivalent [i.e., if they have the same dimension and if there is a matrix $A \neq 0$ such that $A\rho^1(g) = \rho^2(g)A$ for all $g$], then $\chi^1(g) \equiv \chi^2(g)$.
2. Show that two rotations $g_1$ and $g_2$ through a given angle $\omega$ about two different axes are conjugate, i.e., that there is a rotation $h$ such that $g_1 = hg_2h^{-1}$.
3. Show that the characters of the irreducible representations $\rho^l$ of SO(3) are $\chi^l = \sin(l + \tfrac12)\alpha/\sin\tfrac12\alpha$ ($l = 0, 1, \ldots$), and show that they satisfy the orthonormality relation (21.13-1), where $dg$ is $= ((1 - \cos\alpha)/(4\pi^2\alpha^2))\,d^3\theta$, according to Exercise 7 in Section 21.5. Hint: It suffices to consider rotations about the $z$ axis, for which the matrices $\rho^l(g)$ are diagonal; see (20.12-2) and the preceding equation.
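The orthonormality asked for in Exercise 3 can be previewed numerically. Integrating $dg = ((1 - \cos\alpha)/(4\pi^2\alpha^2))\,d^3\theta$ over the directions of the rotation axis replaces $d^3\theta$ by $4\pi\alpha^2\,d\alpha$, which leaves the weight $(1 - \cos\alpha)/\pi$ on $0 \le \alpha \le \pi$. The sketch below uses that reduced weight; the function names and the midpoint rule are illustrative choices.

```python
import math

def chi(l, alpha):
    """Character of rho^l at rotation angle alpha: sin((l + 1/2) alpha) / sin(alpha / 2)."""
    half = math.sin(alpha / 2)
    if abs(half) < 1e-12:
        return 2 * l + 1.0  # limiting value at alpha = 0
    return math.sin((l + 0.5) * alpha) / half

def class_weight(alpha):
    """Weight on [0, pi] after integrating dg over the axis directions."""
    return (1 - math.cos(alpha)) / math.pi

def inner(l1, l2, steps=4000):
    """Midpoint-rule estimate of the integral of chi^l1 * chi^l2 over the group."""
    h = math.pi / steps
    return h * sum(chi(l1, (k + 0.5) * h) * chi(l2, (k + 0.5) * h)
                   * class_weight((k + 0.5) * h) for k in range(steps))
```

`inner(l1, l2)` should come out as 1 when the two indices agree and 0 otherwise, and `inner(0, 0)` also confirms that the total measure of the group is 1.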
To show the completeness of the characters $\chi^l$ for SO(3), we must show that if $\psi(\alpha)$ is any continuous function such that $(\chi^l, \psi) = 0$ for all $l$, then $\psi(\alpha) \equiv 0$. Since $1 - \cos\alpha = 2(\sin\tfrac12\alpha)^2$ and $d^3\theta \to 4\pi\alpha^2\,d\alpha$, this is equivalent to showing that if
for all $l$, then $\psi(\alpha) \equiv 0$. If we call $\tfrac12\alpha = t$ and $\sin\tfrac12\alpha\,\psi(\alpha) = x(t)$, this is equivalent to showing that if
$\int_0^{\pi/2} x(t)\sin(2l + 1)t\,dt = 0$
for all $l$, then $x(t) \equiv 0$. This is, however, the case, for if $x(t)$ is extended to the entire interval $-\pi \le t \le \pi$ by requiring it to be an odd function about $t = 0$ and an even one about $t = \pi/2$, then the functions $\sin(2l + 1)t$ just suffice for the Fourier series for $x(t)$. Since the characters $\chi^l$ form a complete set of functions, the representations $\rho^l$ ($l = 0, 1, 2, \ldots$) are all the irreducible representations of SO(3). This yields the answer to the question raised in Section 20.2 for the case of rotations of the Cartesian coordinate axes in 3-space: All possible nonrelativistic transformation laws of physical quantities are provided by the representations $\rho^l$ of SO(3).
CHAPTER 22
The purpose of this chapter is to elucidate one particular point in the application of group theory to quantum mechanics, namely the occurrence of double-valued or spin representations of the rotation and Lorentz groups.
It has been seen that, in classical physics, various sets of quantities transform, under rotations of the coordinate axes, so as to give representations of the rotation group. (The same applies in classical physics to other symmetry groups, such as the groups of rigid motion, the crystal symmetry groups, the Lorentz group, and so on.)
In quantum mechanics, on the other hand, some quantities transform, under rotations of the coordinate axes, like the components of spinors and thus give representations of SU(2) rather than of the rotation group SO(3). This was shown by Dirac (in somewhat different language) in his paper (1928) on the relativistic wave equation, and it was also implicit in Pauli's theory of the electron spin, published a year earlier. More generally, spinor components transform under a Lorentz transformation in $\mathcal{L}_p$ so as to give representations of SL(2, $\mathbb{C}$) rather than $\mathcal{L}_p$. This seemed rather surprising at the time, even though Dirac showed that all observable quantities transform like scalars, vectors, and tensors, i.e., according to the representations of SO(3) and $\mathcal{L}_p$. It was seen in Sections 19.7 and 19.8 that the homomorphisms of SU(2) and SL(2, $\mathbb{C}$) onto SO(3) and $\mathcal{L}_p$, respectively, are 2-to-1; hence a representation of the first group can associate two different matrices, $M$ and $-M$, with each element $g$ of the second group, i.e., with each of the transformations of space-time. This association is sometimes called a two-valued representation of the second group. How such representations arise is discussed in this chapter. It will be seen that the role of SU(2) and SL(2, $\mathbb{C}$) is to determine the so-called ray representations of the physically relevant groups SO(3) and $\mathcal{L}_p$.
Each possible state of a quantum mechanical system corresponds not to a single vector $\psi$ in a Hilbert space $\mathfrak{H}$, but to a ray $\{\alpha\psi\}$ consisting of all numerical multiples of $\psi$. If all vectors are normalized ($\|\psi\| = 1$, $\|\alpha\psi\| = 1$), then $\alpha$ has unit modulus ($|\alpha| = 1$), but its phase ($\arg\alpha$) is arbitrary. This arbitrariness affects the interpretation of representation theory, as will be seen.
Now suppose that for each $g$ in SO(3) a single unitary transformation $U(g)$ is somehow chosen from the corresponding equivalence class. If $\psi' = U(g)\psi$ and $\psi'' = U(h)\psi'$, then the resulting transformation matrix for the mapping $\psi \to \psi''$, i.e., $U(h)U(g)$, is not necessarily $= U(hg)$, but is $\sim U(hg)$. Hence, for each pair $h$, $g$ of rotations there is a phase factor $\gamma(h, g)$ such that
$U(h)U(g) = \gamma(h, g)U(hg)$, (22.2-1)
where $|\gamma(h, g)| = 1$. Possibilities for the choice of the function $\gamma(h, g)$ are discussed below.
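A concrete phase factor can be exhibited for rotations about a single axis. The sketch below picks, for a rotation through $\alpha$ about the $z$ axis, the representative $\mathrm{diag}(e^{-i\alpha/2}, e^{i\alpha/2})$ with $\alpha$ reduced to $[0, 2\pi)$. This particular way of choosing one $U(g)$ from each equivalence class is an illustrative assumption, and the function names are hypothetical.

```python
import cmath
import math

def u_z(alpha):
    """Diagonal entries of the chosen U(g) for a rotation through alpha about z,
    with alpha first reduced to the range [0, 2*pi)."""
    a = alpha % (2 * math.pi)
    return (cmath.exp(-0.5j * a), cmath.exp(0.5j * a))

def phase_factor(alpha_h, alpha_g):
    """gamma(h, g) defined by U(h)U(g) = gamma(h, g) U(hg) for rotations about z."""
    uh, ug = u_z(alpha_h), u_z(alpha_g)
    uhg = u_z(alpha_h + alpha_g)
    # All matrices are diagonal, so the ratio of the first entries gives gamma.
    return uh[0] * ug[0] / uhg[0]
```

With this choice the factor is $+1$ as long as the two angles add up to less than $2\pi$, and $-1$ once the sum wraps around.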
then
$V(g)V(h) = V(gh)$ for all $g$, $h$ in $\mathcal{N}_0$. (22.5-2)
To prove this, note that in any case
$V(g)V(h) = \delta(g, h)V(gh)$,
where $\delta(g, h)$ is a continuous function [compare with (22.2-1)]. It is seen from (22.5-1) that $\det V(g) = 1$ for all $g$, hence $\delta(g, h)^n = 1$, hence $\delta(g, h)$ is some $n$th root of unity for all $g$, $h$; but $\delta(e, e) = 1$, hence $\delta(g, h) \equiv 1$, by continuity, and (22.5-2) follows.
84 Group Representations and Quantum Mechanics
Similarly, for a system that is invariant not merely under the rotation group SO(3) but also under the entire proper Lorentz group $\mathcal{L}_p$, the transformation of the wave functions corresponding to a given transformation $g$ of $\mathcal{L}_p$ is not unique. Instead, there is a set $\{\alpha U: |\alpha| = 1\}$ of transformations corresponding to each $g$, and these sets are so correlated that one can choose transformations from them so as to give a representation of SL(2, $\mathbb{C}$) [which is related to $\mathcal{L}_p$ as SU(2) is to SO(3)]; this may be a representation of $\mathcal{L}_p$ itself, involving scalars, vectors, or general tensors, or it may be a two-valued representation (spin representation) of $\mathcal{L}_p$. In Dirac's theory of the electron, the transformation laws of the four components of the electron's wave function give a two-valued representation of $\mathcal{L}_p$ (see Dirac 1958, p. 258).
It is easy to see that a two-valued irreducible representation cannot be made into a single-valued one by somehow appropriately choosing one of the two matrices $U$ and $-U$ that represent each given $g$ of SO(3) (or $\mathcal{L}_p$); namely, if $U_0$ is a matrix that represents a rotation through $\pi$, in a two-valued irreducible representation, it can be shown that $U_0^2 = -I$, but $U_0^2$ represents the identity in SO(3), hence must be $= +I$ in any single-valued representation.
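For the two-dimensional spin representation the statement $U_0^2 = -I$ can be checked directly. The sketch below uses the SU(2) matrix $u_{\omega,0,0}$ of (22.7-3), with entries $\cos\omega/2$ on the diagonal and $-i\sin\omega/2$ off it, evaluated at $\omega = \pi$; the helper names are illustrative.

```python
import math

def u_x(omega):
    """SU(2) matrix u_{omega,0,0}: rotation through omega about the x axis."""
    c, s = math.cos(omega / 2), math.sin(omega / 2)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def negate(a):
    """The matrix -a."""
    return [[-x for x in row] for row in a]

def close_to(a, b, tol=1e-12):
    """True if two 2x2 matrices agree elementwise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

MINUS_I = [[-1, 0], [0, -1]]
```

Both candidates $U_0$ and $-U_0$ square to $-I$, so neither sign choice can yield $+I$ for the rotation through $2\pi$.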
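$u = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}: \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \to \begin{pmatrix} \alpha x_1 + \beta x_2 \\ \gamma x_1 + \delta x_2 \end{pmatrix}$, (22.7-1)
$u^{-1} = \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix}$. (22.7-2)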
Certain elements of the subgroup SU(2) are now considered. Let $\omega_1$, $\omega_2$, $\omega_3$ be the intrinsic coordinates in SO(3) defined in Section 19.6, let $g_{\omega_1,\omega_2,\omega_3}$ be the corresponding rotation matrix [element of SO(3)], and let $u_{\omega_1,\omega_2,\omega_3}$ be the elements of SU(2) that are mapped onto $g_{\omega_1,\omega_2,\omega_3}$ by the homomorphism of Section 19.7. In particular, one can take
$u_{\omega,0,0} = \begin{pmatrix} \cos\omega/2 & -i\sin\omega/2 \\ -i\sin\omega/2 & \cos\omega/2 \end{pmatrix}$,
$u_{0,\omega,0} = \begin{pmatrix} \cos\omega/2 & -\sin\omega/2 \\ \sin\omega/2 & \cos\omega/2 \end{pmatrix}$, (22.7-3)
$u_{0,0,\omega} = \begin{pmatrix} e^{-i\omega/2} & 0 \\ 0 & e^{i\omega/2} \end{pmatrix}$,
because a direct calculation, using the equations of Section 19.7, shows that the corresponding transformations from $x$, $y$, $z$ to $x'$, $y'$, $z'$ are those given by the matrices
~G
0
g..,OO 0
W
-~).
0
go"o ~ ( -w~ 0
0
en)
o,
0
(22.7-4)
~G
-w
goom 0
0 ~}
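in agreement with (19.6-1). Infinitesimal group elements of SU(2) are obtained accordingly:
$T_1 = \frac{\partial}{\partial\omega}u_{\omega,0,0}\Big|_{\omega=0} = \frac12\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}$,
$T_2 = \frac{\partial}{\partial\omega}u_{0,\omega,0}\Big|_{\omega=0} = \frac12\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, (22.7-5)
$T_3 = \frac{\partial}{\partial\omega}u_{0,0,\omega}\Big|_{\omega=0} = \frac12\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$.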
(22.7-6)
$L_3 = \frac{\partial}{\partial\omega}\rho(u_{0,0,\omega})\Big|_{\omega=0} = \frac{i}{2}\left(x_1\frac{\partial}{\partial x_1} - x_2\frac{\partial}{\partial x_2}\right)$
We note in passing that the matrices (22.7-5) can be regarded also as the infinitesimal group elements of the larger group SL(2, $\mathbb{C}$), for the following reason: First, it is easily verified that the matrices (22.7-3) are given in terms of the matrices $T_i$ by the equations
$u_{\omega,0,0} = \exp(\omega T_1)$,
From (22.7-5) it is seen that the right member of this last equation is of the form $\exp(iA)$, where $A$ is a general $2 \times 2$ Hermitian matrix of trace zero. If, now, $\omega_1$, $\omega_2$, and $\omega_3$ are allowed to take complex values, then it is of the form $\exp B$, where $B$ is a completely general $2 \times 2$ matrix of trace zero, and then $\exp B$ is a general $2 \times 2$ matrix of determinant $= 1$, i.e., a general element of the group SL(2, $\mathbb{C}$).
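The relation $u_{\omega,0,0} = \exp(\omega T_1)$ and the determinant property of $\exp B$ for trace-zero $B$ can be checked with a truncated Taylor series. The sketch below takes $T_1 = \frac12\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}$ from (22.7-5); the series length and the sample matrix $B$ used in the check are arbitrary illustrative choices.

```python
import math

T1 = [[0, -0.5j], [-0.5j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply every element of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    """exp(a) for a 2x2 matrix, by summing the Taylor series term by term."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = scale(1 / n, matmul(term, a))
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def u_x(omega):
    """Closed form of u_{omega,0,0} from (22.7-3)."""
    c, s = math.cos(omega / 2), math.sin(omega / 2)
    return [[c, -1j * s], [-1j * s, c]]

def det(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]
```

The same routine applied to any complex trace-zero matrix returns a matrix of determinant 1, which is the property used in the passage to SL(2, $\mathbb{C}$).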
For each value $0, \tfrac12, 1, \tfrac32, 2, \ldots$ of an index $l$, a subspace $X_{2l+1}$ of $X^\infty$ is defined as the space of all homogeneous polynomials in $x_1$ and $x_2$ of degree $2l$. From (22.7-2) it is seen that each operator $\rho(u)$ transforms any homogeneous polynomial into another homogeneous polynomial of the same degree; hence each subspace $X_{2l+1}$ is invariant under $\rho(u)$, not only for all $u$ in SU(2), but also for all $u$ in SL(2, $\mathbb{C}$).
It will be shown that the representation of SU(2) given by (22.7-2) on each subspace $X_{2l+1}$ (it will be called $D^l$) is irreducible; hence, the representation of
for some $\alpha$ in $[0, 2\pi]$. For such $u$, the operator $D^l(u)$ simply multiplies the basis vector $f_m$ (22.8-1) by $e^{im\alpha}$; hence $D^l(u)$ is a diagonal matrix, whose trace is
$\chi^l(\alpha) = \frac{\sin(l + \frac12)\alpha}{\sin\frac12\alpha}$, (22.9-1)
just as for the case in which $l$ is an integer, according to Exercise 3 at the end of Section 21.13. It was shown in that section that the functions (22.9-1), for $l = 0, \tfrac12, 1, \tfrac32, \ldots$, form a complete system for the expansion of functions depending only on the conjugacy class on the manifold of SU(2); hence, the representations $D^l$ exhaust the irreducible representations of SU(2).
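The trace formula holds for half-integer as well as integer $l$, as a direct summation of the diagonal entries $e^{im\alpha}$ shows. The sketch below uses exact fractions for $l$ and $m$; the function names are illustrative.

```python
import cmath
import math
from fractions import Fraction

def trace_dl(l, alpha):
    """Sum of exp(i m alpha) over m = -l, -l + 1, ..., l, for l in {0, 1/2, 1, ...}."""
    l = Fraction(l)
    count = int(2 * l) + 1  # the dimension 2l + 1
    return sum(cmath.exp(1j * float(-l + k) * alpha) for k in range(count))

def chi_formula(l, alpha):
    """Closed form sin((l + 1/2) alpha) / sin(alpha / 2) of (22.9-1)."""
    return math.sin((float(l) + 0.5) * alpha) / math.sin(alpha / 2)
```

For example, `trace_dl(Fraction(1, 2), alpha)` equals $2\cos(\alpha/2)$, the character of the two-dimensional spin representation.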
The method of homogeneous polynomials used in Section 22.8 for SU(2) can also be used for SL(2, $\mathbb{C}$), but here a new aspect appears. Given a representation $\rho$ of a group $G$, there are many ways in which another representation $\rho'$ can be obtained. Among them is
[which means that each matrix element $\rho_{mn}(u)$ is replaced by its complex conjugate], for then $\rho'(u_1u_2) = \rho'(u_1)\rho'(u_2)$, etc. Another possibility is
(22.11-2)
If $G$ is a unitary group, e.g., U($n$) or SU($n$), then (22.11-3) and (22.11-4) are the same, but otherwise, they are generally different.
We now show that if $G$ is SU(2), then the representation $\rho'$ given by (22.11-3) is equivalent to $\rho$; hence, in this case, no new representations are obtained by these methods, and that is why these methods were not used in Section 22.8.
Namely, call
so that generally
$y^{-1}\begin{pmatrix} a & b \\ c & d \end{pmatrix}y = \begin{pmatrix} d & -c \\ -b & a \end{pmatrix}$. (22.11-5)
In particular, for any $u$ in SU(2), $\bar u = y^{-1}uy$, which can be seen by writing $u$ as $\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$. Then, since $y$ is also in SU(2),
$\rho'(u) = \rho(y^{-1}uy) = \rho(y)^{-1}\rho(u)\rho(y)$,
for all $u$; hence $\rho$ and $\rho'$ are equivalent representations.
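The conjugation identity can be verified numerically. The explicit form of the matrix $y$ is not reproduced above; the sketch below assumes $y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, which is one matrix in SU(2) that satisfies (22.11-5). It also contrasts a unitary $u$ with a non-unitary $m$ of determinant 1, for which $y^{-1}my$ equals $(m^T)^{-1}$ but not $\bar m$.

```python
import math

Y = [[0, 1], [-1, 0]]        # assumed form of the matrix y
Y_INV = [[0, -1], [1, 0]]    # its inverse

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj(a):
    """Elementwise complex conjugate of a 2x2 matrix."""
    return [[complex(x).conjugate() for x in row] for row in a]

def su2(a, b):
    """SU(2) element (a, b; -conj(b), conj(a)), after scaling so |a|^2 + |b|^2 = 1."""
    n = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = complex(a) / n, complex(b) / n
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def conjugate_by_y(m):
    """The matrix y^(-1) m y."""
    return matmul(matmul(Y_INV, m), Y)

def close_to(a, b, tol=1e-12):
    """True if two 2x2 matrices agree elementwise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))
```

The failure of the second comparison for a non-unitary matrix is what makes the conjugate representation of SL(2, $\mathbb{C}$) inequivalent to the original one.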
When the representations (22.11-3) and (22.11-4) are extended from SU(2) to SL(2, $\mathbb{C}$) by writing
$\rho'(m) = \rho(\bar m)$ (22.11-6)
and
(22.11-7)
respectively, for $m$ in SL(2, $\mathbb{C}$), they are no longer identical, or even equivalent. The second one is equivalent to $\rho$, because $(m^T)^{-1} = y^{-1}my$, by (22.11-5), since $\det m = 1$, while (22.11-6) is inequivalent to $\rho$, for if the equations
$\rho(\bar m) = V^{-1}\rho(m)V$ (22.11-8)
held for all $m$, then $V$ would have to be $= \rho(y)$ in order that this equation be satisfied for $m \in$ SU(2), in which case it would not be satisfied for $m \notin$ SU(2), since $y^{-1}my$ is not in general $= \bar m$ for such $m$.
Clearly, then, SL(2, $\mathbb{C}$) has more representations, in some sense, than SU(2). To find them, we let $X^\infty$ denote the set of all complex-valued functions of two complex variables $x_1$ and $x_2$ that are $C^\infty$ in the real sense rather than entire analytic, in contrast with Section 22.10, and we denote these functions by $f(x_1, x_2, \bar x_1, \bar x_2)$, following the procedure of Section 22.10. In place of (22.7-2), we write
$\alpha\delta - \beta\gamma = 1$, (22.11-10)
i.e., any matrix in SL(2, $\mathbb{C}$). Now, in addition to the three matrices (22.7-3), which are in SU(2) and correspond to rotations in space, and which determine the infinitesimal operators $L_1$, $L_2$, and $L_3$ by (22.7-6), we have three additional matrices,
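$\begin{pmatrix} \cosh\omega/2 & \sinh\omega/2 \\ \sinh\omega/2 & \cosh\omega/2 \end{pmatrix}$ $(\omega = \varphi_1)$,
$\begin{pmatrix} \cosh\omega/2 & -i\sinh\omega/2 \\ i\sinh\omega/2 & \cosh\omega/2 \end{pmatrix}$, (22.11-11)
$\begin{pmatrix} e^{\omega/2} & 0 \\ 0 & e^{-\omega/2} \end{pmatrix}$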
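(22.11-12)
$K_3 = -\frac12\left(x_1\frac{\partial}{\partial x_1} - x_2\frac{\partial}{\partial x_2}\right) - \frac12\left(\bar x_1\frac{\partial}{\partial\bar x_1} - \bar x_2\frac{\partial}{\partial\bar x_2}\right)$.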
The complete commutation relations are
[Li' L j ] = Lk
[Ki' K j ] = Lk
(ijk = 123,231, or 312). (22.11-13)
[Ki' LJ = 0
[Ki' L j ] = -Kk
$L_\pm = L_1 \pm iL_2$, (22.11-14)
$\psi = \psi_{lml'm'} = c\,x_1^{l-m}x_2^{l+m}\bar x_1^{l'-m'}\bar x_2^{l'+m'}$, (22.12-1)
where
c2 = (1- m)! (/ + m)! (I' - m')! (I' + m')!,
where $l$ and $l'$ are any two of the numbers $0, \tfrac12, 1, \tfrac32, \ldots$, and where
$m = l, l - 1, \ldots, -l$,
$m' = l', l' - 1, \ldots, -l'$.
For given $l$ and $l'$, the space $X(l, l')$ spanned by the $\psi_{lml'm'}$ is the space of all homogeneous polynomials of degree $2l$ in the variables $x_1$ and $x_2$ and of degree $2l'$ in $\bar x_1$ and $\bar x_2$. This space has (complex) dimension $(2l + 1)(2l' + 1)$. It is clear from (22.11-9) that each subspace $X(l, l')$ is mapped into itself under every $\rho(u)$, and hence is an invariant subspace. From (22.11-12) we see that
22.13 Spinors
Spinors are sets of quantities related to SU(2) and SL(2, <C) in the same way
that the tensors (including vectors and scalars) of prequa.ntum physics are
related to the physical groups SO(3) and .ff'p' Their transformation laws
94 Group Representations and Quantum Mechanics
If we were concerned only with rotations of the x, y, z axes, and not with
Lorentz transformations, hence only with the group SU(2), this is all that
would need to be said. It was seen in the preceding section, however, that in
the study of the representations of SL(2, C), the matrix $\bar{u}$ plays a parallel role
with $u$. A dotted spinor of rank 1 is the association of a pair of complex numbers
$\xi_{\dot{1}}$, $\xi_{\dot{2}}$ with each frame of reference, according to the transformation law
Finally, a mixed spinor having $r$ undotted and $s$ dotted indices is the association
of $2^{r+s}$ complex numbers with each frame of reference, with the transformation
law
(22.13-4)
EXERCISES
1. Let $\xi_{\alpha\dot{\beta}}$ be a mixed spinor of rank 2, and define quantities $V_j$ ($j = 1, \ldots, 4$) by
VI = ~Ii + ~2i'
V 2 = ~Ii - ~2i,
V3 = ~ti + ~2i,
V4 = i(~li - ~2i)
Show that the quantities $V_1, \ldots, V_4$ transform like the components of a vector under rotations
and Lorentz transformations. Show similarly that a mixed spinor of rank $2r$ having $r$
dotted and $r$ undotted indices determines a tensor of rank $r$.
2. A spinor is called symmetric if it is symmetric in the dotted indices (unchanged
by any permutation of the dotted indices) and also symmetric in the undotted indices;
show that the transformation law of such a spinor gives a representation of SL(2, C),
which is equivalent to the representation $p(l, l')$ defined in the preceding section, where
$2l$ and $2l'$ are the numbers of dotted and undotted indices, respectively.
CHAPTER 23
Locally n-dimensional space; sphere; torus; disk; Mobius strip; Klein bottle;
identification of edges; coordinate charts; compatibility of charts; induced
topology; Hausdorff separation axiom; manifold; curves; functions on a
manifold; connectedness; simple connectedness; component; homotopic
curves; homotopy classes of curves; fundamental group; double
connectedness of SO(3); configuration space of a mechanical system;
Cartesian product manifolds.
The theory of manifolds is basic for the theory of Lie groups and Riemannian
and Einsteinian geometries. The introduction of the manifold concept into
general relativity around 1960, mainly by Martin Kruskal, put a new light
on that subject and clarified the topological properties, both local and global,
of space-time models. Statistical mechanics deals with flows on manifolds.
Other applications of manifolds to physics appear from time to time, because
of their basic geometric nature. Only finite-dimensional manifolds will be
discussed. For more general manifolds, see Lang 1962.
Figure 23.2 (a rectangle with corners A, B, C, D)
as if the rectangle were a narrow strip of paper, which has been bent into a
circle, has had one end twisted through a half turn, and then has had the edges
glued together.
If, in the above example, the edges AD and CB are also identified, in the
same manner, the Klein bottle is the result. (This would require considerable
stretching of the paper, to say nothing of the problem of self-intersection.)
Group manifolds were discussed in Section 19.5. The manifold of SO(3)
was realized as a certain 3-dimensional algebraic surface in a 9-dimensional
space. This surface is homeomorphic, in some neighborhood of each of its
points, to a region in $E^3$, but as a whole is not homeomorphic to any region in
$E^3$; it will be seen in Section 23.7 that it has a kind of connectivity that a
region in $E^3$ cannot have.
The manifold of SO(3) can also be regarded as obtainable by a 3-dimensional
version of the method of identification, used above for the Möbius
band. If $\theta_x$, $\theta_y$, $\theta_z$ are the intrinsic coordinates introduced in Section 19.6,
then each point of the ball $\|\theta\| \le \pi$ represents a single element of SO(3),
and conversely, except that any two antipodal points on the surface, $\theta$ and
$-\theta$, where $\|\theta\| = \pi$, represent the same element of SO(3) and must be identified
with each other. The identification cannot be achieved, in analogy with
the Möbius band, by distorting the sphere in 3-dimensional space and gluing
surfaces together, but evidently it can be achieved by suitably distorting the
sphere in a 9-dimensional space.
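
The antipodal identification can be checked numerically. The following is a minimal sketch, assuming that the vector $\theta$ is read as a rotation vector (axis $\theta/\|\theta\|$, angle $\|\theta\|$) and using Rodrigues' formula; the function names are illustrative and are not taken from the text.

```python
import math

def rotation_matrix(theta):
    """Rotation matrix for the rotation vector theta = (tx, ty, tz):
    axis theta/|theta|, angle |theta| (Rodrigues' formula)."""
    tx, ty, tz = theta
    angle = math.sqrt(tx * tx + ty * ty + tz * tz)
    if angle == 0.0:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    n = (tx / angle, ty / angle, tz / angle)
    c, s = math.cos(angle), math.sin(angle)
    # Cross-product matrix K of the unit axis n.
    K = [[0.0, -n[2], n[1]],
         [n[2], 0.0, -n[0]],
         [-n[1], n[0], 0.0]]
    # R = c I + s K + (1 - c) n n^T
    return [[(c if i == j else 0.0) + s * K[i][j] + (1.0 - c) * n[i] * n[j]
             for j in range(3)] for i in range(3)]

def max_difference(A, B):
    """Largest absolute entrywise difference of two 3x3 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

# A vector of length pi (surface of the ball) and its antipode.
surface = (math.pi / 3, 2 * math.pi / 3, 2 * math.pi / 3)
antipode = tuple(-t for t in surface)
print(max_difference(rotation_matrix(surface), rotation_matrix(antipode)))

# An interior vector and its negative give different rotations.
inner = tuple(0.5 * t for t in surface)
print(max_difference(rotation_matrix(inner),
                     rotation_matrix(tuple(-t for t in inner))))
```

The first printed difference is at the level of rounding error, while the second is of order one, which is the numerical counterpart of identifying only the antipodal points on the surface.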
According to Exercise 1 in Section 20.6, the manifold of SU(2) can be
realized as the 3-sphere, i.e., the unit sphere in $E^4$. This manifold is simply
connected but is also not homeomorphic to any region in $E^3$.
of an n-tuple $\{x^1, \ldots, x^n\} = x$ of real coordinates to each point P of a specified
subset $\mathfrak{U}$ of $\mathfrak{M}_0$ in such a way that the assignment $P \to x$ is a one-to-one
mapping $\varphi$ of $\mathfrak{U}$ onto a connected open set N in the coordinate space $\mathbb{R}^n$; one
writes $x = \varphi(P)$, and one refers to the triple $\{\mathfrak{U}, \varphi, N\}$ as a coordinate chart
in $\mathfrak{M}_0$. The notation is, of course, redundant, since $\mathfrak{U}$ and $\varphi$ determine N,
but it is convenient (see Note in Section 23.4). The vector $x = \varphi(P)$ is
sometimes called the coordinate of P.
For example, if $\theta$ and $\varphi$ are polar coordinates on the sphere ($\theta = x^1$ and
$\varphi = x^2$), then the mapping is from certain points of the sphere onto points
of the open rectangle ($0 < \theta < \pi$, $-\pi < \varphi < \pi$) in the $\theta$, $\varphi$ plane $\mathbb{R}^2$. To
make the mapping one-to-one, it is necessary to omit the north and south
poles, $\theta = 0$ and $\theta = \pi$, respectively, and the international date line, $\varphi = \pi$.
To describe the entire sphere, one might use the method of identification,
that is, extend the mapping to the boundary of the rectangle and then decree
that the points $\theta = 0$, $-\pi \le \varphi \le \pi$ are all one point at the north pole, also
the points $\theta = \pi$, $-\pi \le \varphi \le \pi$ at the south pole, and that, for each $\theta$ in
$(0, \pi)$, the points with $\varphi = +\pi$ and $\varphi = -\pi$ are the same point. However, in
order to be able to impose smoothness requirements and ensure that the
surface, when glued together, really looks like a sphere, not like a kreplach
or a sopaipilla, a different procedure is needed.
If $\{\mathfrak{U}_1, \varphi_1, N_1\}$ and $\{\mathfrak{U}_2, \varphi_2, N_2\}$ are two overlapping charts in $\mathfrak{M}_0$,
they establish a relation between the two sets of coordinates for points P in
the intersection $\mathfrak{U}_1 \cap \mathfrak{U}_2$, and this relation is one-to-one, because each of the
mappings $P \to \varphi_1(P)$ and $P \to \varphi_2(P)$ is one-to-one. If we write $x = \varphi_1(P)$
and $y = \varphi_2(P)$, then the resulting relation between x and y and its inverse
involve functions that will be denoted as follows:
i = 1, ... , n, (23.2-1)
i = 1, ... , n. (23.2-2)
Note. It follows from 2 that if either of the sets referred to in 1 is open in $\mathbb{R}^n$,
the other is, too.
Figure 23.3 Schematic sketch of two charts in a manifold.
Figure 23.3, where the various mappings indicated are the following:
If the polar angles $\theta$, $\varphi$ on the sphere are the coordinates in the first system,
a second system can be chosen so that the coordinates $\theta'$, $\varphi'$ are polar angles
with respect to different axes. For example, the north pole N' in the primed
system ($\theta' = 0$) might be taken as the point ($\theta = \pi/2$, $\varphi = \pi/2$) in the old
system, and the angle $\varphi'$ about this new north pole so chosen that the new
international date line is the portion ($\theta = \pi/2$, $-\pi/2 < \varphi < \pi/2$) of the old
equator. See Figure 23.4. It is clear that these two coordinate systems
together completely cover the sphere.
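
The covering claim can be illustrated numerically. The sketch below assumes the usual embedding $x = \sin\theta\cos\varphi$, $y = \sin\theta\sin\varphi$, $z = \cos\theta$ (an assumption, since the text does not fix Cartesian axes). Under it, the first chart omits the half great circle $\{y = 0, x \le 0\}$ (both poles and the date line), and the second omits $\{z = 0, x \ge 0\}$ (the new poles $(0, \pm 1, 0)$ and the new date line). The function names are illustrative.

```python
import math

TOL = 1e-9  # tolerance for deciding that a coordinate vanishes

def point(theta, phi):
    """Point of the unit sphere with polar angles (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def in_chart_1(p):
    """True unless p lies on the omitted set {y = 0, x <= 0}."""
    x, y, z = p
    return not (abs(y) <= TOL and x <= TOL)

def in_chart_2(p):
    """True unless p lies on the omitted set {z = 0, x >= 0}."""
    x, y, z = p
    return not (abs(z) <= TOL and x >= -TOL)

def chart_1_coords(p):
    """Polar angles (theta, phi) of a point in the first chart."""
    x, y, z = p
    return (math.acos(max(-1.0, min(1.0, z))), math.atan2(y, x))

def covered_everywhere(steps=60):
    """Check on a grid (boundary included) that every point is in a chart."""
    for i in range(steps + 1):
        for j in range(steps + 1):
            p = point(math.pi * i / steps, -math.pi + 2 * math.pi * j / steps)
            if not (in_chart_1(p) or in_chart_2(p)):
                return False
    return True

print(covered_everywhere())
```

The two omitted sets could meet only where $x = y = z = 0$, which is not a point of the sphere; the grid check merely confirms this on sample points, including the poles and both date lines.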
Figure 23.4
Definition of Manifold; Hausdorff Separation Axiom 101
EXERCISES
1. Find the transformations (23.2-1, 2) for this example, i.e., the relation between
$\theta$, $\varphi$ and $\theta'$, $\varphi'$.
2. Describe coordinate systems on the surface of the torus. Show that the torus
can be covered by two charts, but if simply connected charts are required, three are
needed.
Hausdorff space. (However, see Lang 1962.) One then requires the coordinate
systems to be continuous with respect to that topology. On the other hand,
the existence of coordinate systems restricts the topology considerably,
in fact, in such a way that the space is locally Euclidean (this refers to topo-
logical, not to metric properties). For the purpose of this book it seems better
to let the topological properties be entirely determined by the coordinate
systems. Then, only the familiar topological concepts of Euclidean spaces
are needed, except for one consideration: When a manifold is being con-
structed by piecing together two or more coordinate charts, care must be
used to ensure that the Hausdorff separation axiom is satisfied - this question
will be discussed below.
Note. The discussion starts with a space $\mathfrak{M}_0$, which is simply a collection
(uncountably infinite) of otherwise undefined elements, called points. In
some applications, the space $\mathfrak{M}_0$ is given in advance; for example, it may be
a group. In Riemannian geometry or general relativity, on the other hand,
one starts with a set of functions $g_{\mu\nu}(x^1, \ldots, x^n)$ defined for coordinates
$x^1, \ldots, x^n$ lying in a certain domain N of the coordinate space $\mathbb{R}^n$; each
point of N is then assumed to determine a point P of the Riemannian manifold
or the physical space being constructed or described. Then, the part of
the manifold or physical space thus described may be extended, by means of
coordinate transformations like (23.2-1, 2), and so on, until one believes, on
the basis of some criterion or other, that the complete manifold has been
defined (see, for example, Kruskal's criterion of geodesic completeness
described in Chapter 28). In this method, nothing is said in advance about
the abstract space $\mathfrak{M}_0$ or its subsets $\mathfrak{U}$, $\mathfrak{U}'$, etc., until the description is complete.
Each chart is specified by describing N, and nothing is said explicitly about
$\mathfrak{U}$ and $\varphi$; hence we prefer to retain "N" in the designation $\{\mathfrak{U}, \varphi, N\}$ of a chart.
Hausdorff Separation Axiom. If P and Q are any two distinct points, then
there are neighborhoods $\mathfrak{U}$ and $\mathfrak{V}$ of P and Q, respectively, such that
$\mathfrak{U} \cap \mathfrak{V} = \emptyset$.
Definition. An n-dimensional manifold $\mathfrak{M}$ is a space $\mathfrak{M}_0$ together with a (finite or)
countable set of compatible n-dimensional coordinate charts, which together
cover $\mathfrak{M}_0$ in such a way that the resulting topology satisfies the Hausdorff
separation axiom. It is understood that compatible coordinate systems can
be added or deleted at will, so long as the space $\mathfrak{M}_0$ is kept covered at all
times; the intrinsic properties of $\mathfrak{M}$ are those properties that are unaltered by
such additions or deletions.
EXERCISE
$x(t) = \varphi(P(t))$
are continuous functions of t, for all t for which they are defined, then $P(t)$
is called a curve or path in $\mathfrak{M}$. If the $f^i(t)$ are of class $C^k$, then $P(t)$ is said to be
of class $C^k$. As t varies in an interval $[t_1, t_2]$, the function $P(t)$ describes a
curve $\mathcal{C}$ going from the initial point $P(t_1)$ to the terminal point $P(t_2)$. It is
assumed that all curves are either piecewise differentiable or at least rectifiable
(i.e., that the image in any coordinate chart is such), unless otherwise
specified. To make that possible, it is assumed that all manifolds considered
are at least of class $C^1$.
Continuous or class $C^k$ functions of two or more variables $P(t, s, \ldots)$
are similarly defined.
If $\mathfrak{M}$ is such that, given any two points $P_1$ and $P_2$ in it, there is a curve $\mathcal{C}$
in $\mathfrak{M}$ that goes from $P_1$ to $P_2$, then $\mathfrak{M}$ is called pathwise or arcwise connected.
If $\mathfrak{M}$ is arcwise connected and is furthermore such that, given any two
curves $\mathcal{C}_1$ and $\mathcal{C}_2$ going from any point $P_1$ to any other point $P_2$, $\mathcal{C}_1$ can be
continuously deformed into $\mathcal{C}_2$ in $\mathfrak{M}$; i.e., if there is a continuous function
Warning. The concept of an open set in $\mathfrak{M}$ has nothing to do with the possible
embedding of $\mathfrak{M}$ in a space of higher dimension. For example, if the unit
sphere $x^2 + y^2 + z^2 = 1$ is regarded as a 2-dimensional manifold $\mathfrak{M}$, then
a polar cap, i.e., the set of all points north of a given circle of latitude, is an
open set in $\mathfrak{M}$ but not an open set in $\mathbb{R}^3$.
describes the curve $\mathcal{C}_1 \circ \mathcal{C}_2$. The law of composition of homotopy classes is
$[\mathcal{C}_1] \circ [\mathcal{C}_2] = [\mathcal{C}_1 \circ \mathcal{C}_2]$.
This applies when the terminal point of the curves of the first class is the same
as the initial point of the curves of the second class; otherwise, $[\mathcal{C}_1] \circ [\mathcal{C}_2]$ is
undefined. It is easy to give a formal proof that the result is independent of the
particular curves $\mathcal{C}_1$ and $\mathcal{C}_2$ chosen from the respective classes. The "product"
$[\mathcal{C}_1] \circ [\mathcal{C}_2]$ consists of all curves homotopic to the curve $\mathcal{C}_1 \circ \mathcal{C}_2$, such as the
curve $\mathcal{C}_3$ in Figure 23.6.
This law of composition is associative but does not make the set of all
homotopy classes into a group, because the composition is not defined for all
pairs of classes, and nothing has been said about inverses. However, a group
can be obtained as follows: A fixed base point $B_0$ is chosen, and consideration
is restricted to curves that start from $B_0$ and return to $B_0$. (In the definition
of homotopy, it was not excluded that the initial and terminal points might
coincide.) The set of all homotopy classes of such curves is a group, called the
fundamental group of the manifold, and is denoted by $\pi_1(\mathfrak{M})$. If $\mathcal{C}_0$ is a curve
Figure 23.6
Global Topology; Homotopic Curves; Fundamental Group 107
Figure 23.7
Figure 23.8
108 Elementary Theory of Manifolds
EXAMPLES
(1) If $\mathfrak{M}$ is simply connected, then its fundamental group $\pi_1(\mathfrak{M})$ is the trivial
group consisting of an identity element only.
(2) Let $\mathfrak{M}$ be the surface of a cylinder (finite or infinite, but finite in the drawing
of Figure 23.9). Let $\theta$, $z$ be cylindrical coordinates, and let values of $\theta$, $z$ be represented
on a strip in the plane, as shown. Each point of $\mathfrak{M}$ is multiply represented on the
strip, and in particular the base point $B_0$ of $\mathfrak{M}$ is represented by the points $B_0$, $B'_1$,
$B'_2$, etc. A curve in the strip, such as $\mathcal{C}$, going from $B_0$ to any other image of $B_0$, say
$B'_k$, is the image of a closed curve in $\mathfrak{M}$, and, conversely, every closed curve beginning
and ending at $B_0$ in $\mathfrak{M}$ has such an image; furthermore, $\mathcal{C}$ can be continuously
deformed, within the strip, keeping its ends fixed, into any other curve going from
$B_0$ to $B'_k$, such as $\mathcal{C}'$. Consequently, for each of the possible terminal points $B'_k$,
there is precisely one homotopy class of curves beginning and ending at $B_0$ in $\mathfrak{M}$;
$k$ is the net number of windings about the cylinder made by a curve of the class.
The composition of two such curves, say with terminal points $B'_k$ and $B'_l$, is a
curve with terminal point $B'_{k+l}$; hence $\pi_1(\mathfrak{M})$ is isomorphic to the additive group
of integers, that is, to the infinite cyclic group $C_\infty$. The annulus $a < x^2 + y^2 < b$,
the punctured plane $x^2 + y^2 > 0$, and the Möbius strip all have fundamental
groups isomorphic to $C_\infty$.
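
For the punctured plane, the integer $k$ attached to a closed curve can be computed as a winding number about the origin. The following is a minimal sketch for closed polygonal curves with small steps; the helper names `winding_number` and `circle_loop` are hypothetical and serve only to illustrate that composition of loops adds the integers.

```python
import math

def winding_number(points):
    """Net number of counterclockwise turns about the origin made by the
    closed polygon through `points` (the last point is joined to the first).
    Each step is assumed to subtend an angle of less than pi at the origin."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        # Signed angle from (x0, y0) to (x1, y1), in (-pi, pi].
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return round(total / (2.0 * math.pi))

def circle_loop(k, samples=200):
    """Loop on the unit circle, based at (1, 0), winding k times
    (clockwise if k is negative)."""
    n = samples * max(1, abs(k))
    return [(math.cos(2.0 * math.pi * k * j / n),
             math.sin(2.0 * math.pi * k * j / n)) for j in range(n)]

# Composition of loops: traverse the first, then the second.
composite = circle_loop(2) + circle_loop(-3)
print(winding_number(circle_loop(2)),
      winding_number(circle_loop(-3)),
      winding_number(composite))
```

The composite of a loop with $k = 2$ and a loop with $l = -3$ has winding number $k + l = -1$, in line with the isomorphism to the additive group of integers.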
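Figure 23.9 (the cylinder and the strip in the $\theta$, $z$ plane; the images $B_0$, $B'_{\pm 1}$, $B'_{\pm 2}$ of the base point lie at $\theta = 0, \pm 2\pi, \pm 4\pi$; the curves $\mathcal{C}$ and $\mathcal{C}'$ run from the initial point $B_0$ to a terminal point)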
(3) Let $\mathfrak{M}$ be the surface of the torus, which is given by the equations
$z = a \sin\alpha$,
$x = (A + a\cos\alpha)\cos\beta$,
$y = (A + a\cos\alpha)\sin\beta$,
where x, y, and z are Cartesian coordinates, a and A are constants (A > a > 0),
and $\alpha$ and $\beta$ are two angles or intrinsic coordinates on $\mathfrak{M}$. See Figure 23.10. If $\alpha$ and $\beta$
are allowed to vary unrestrictedly, then the number pairs $(\alpha, \beta)$ and $(\alpha + 2\pi k, \beta + 2\pi l)$
represent the same point of $\mathfrak{M}$. Let the base point $B_0$ be given by x = A + a, y = z = 0;
it is then represented by any of the points $(\alpha, \beta) = (2\pi k, 2\pi l)$ in a lattice in the $\alpha$, $\beta$
plane. Any curve from (0, 0) to $(2\pi k, 2\pi l)$ in the plane represents a closed curve
beginning and ending at $B_0$ in $\mathfrak{M}$ and can be continuously deformed into another curve
from (0, 0) to $(2\pi k, 2\pi l)$. Therefore, each integer pair (k, l) determines an element
of the fundamental group $\pi_1(\mathfrak{M})$. Clearly, the composition of the elements determined
by (k, l) and (k', l') is the element determined by (k + k', l + l'); that is, the fundamental
group of the torus is isomorphic to the direct product $C_\infty \times C_\infty$, i.e., to the
free abelian group on two generators.
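
The periodicity of the parametrization and the addition law for the integer pairs can be sketched as follows. The default values A = 2 and a = 1 are arbitrary choices satisfying A > a > 0, and the function names are illustrative.

```python
import math

def torus_point(alpha, beta, A=2.0, a=1.0):
    """Cartesian point of the torus with intrinsic coordinates (alpha, beta)."""
    return ((A + a * math.cos(alpha)) * math.cos(beta),
            (A + a * math.cos(alpha)) * math.sin(beta),
            a * math.sin(alpha))

def homotopy_class(alpha_end, beta_end):
    """Integer pair (k, l) of a curve in the (alpha, beta) plane that runs
    from (0, 0) to the lattice point (2 pi k, 2 pi l)."""
    return (round(alpha_end / (2.0 * math.pi)),
            round(beta_end / (2.0 * math.pi)))

def compose(class_1, class_2):
    """Composition of two elements of the fundamental group of the torus."""
    return (class_1[0] + class_2[0], class_1[1] + class_2[1])

# (alpha, beta) and (alpha + 2 pi k, beta + 2 pi l) give the same point.
p = torus_point(0.3, 1.1)
q = torus_point(0.3 + 2.0 * math.pi * 2, 1.1 - 2.0 * math.pi * 3)
print(max(abs(u - v) for u, v in zip(p, q)))

print(compose(homotopy_class(2.0 * math.pi, 4.0 * math.pi),
              homotopy_class(6.0 * math.pi, -10.0 * math.pi)))
```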
Figure 23.10 The 2-torus.
(4) Consider the manifold $\mathfrak{M}$ consisting of the plane with two points a and b
removed. In the theory of functions of a complex variable, a contour of integration
is specified by writing an equation such as
$J = \int^{(a+,\,a+,\,b-)} f(z)\,dz.$
Here, the expression (a+, a+, b-) indicates that the contour starts from some base
point B (not coinciding with a or with b), encircles the point a twice positively
(counterclockwise), then encircles the point b once negatively, and then returns to B,
as in Figure 23.11. In function theory it is taken as geometrically evident that this
procedure specifies the contour adequately, if f(z) is analytic except for branch
points at a and b; that is, any two contours which follow the above prescription
Figure 23.11 A contour in the complex plane.
can be deformed continuously into each other without crossing either branch point.
In other words, the expression (a+, a+, b-) determines a homotopy class of curves
in $\mathfrak{M}$, i.e., an element of $\pi_1(\mathfrak{M})$. This point of view will be accepted here. The simplest
nontrivial elements of the group $\pi_1(\mathfrak{M})$ are (a+) and (b+) and their inverses (a-)
and (b-); these group elements will be denoted by $\alpha$, $\beta$, $\alpha^{-1}$, and $\beta^{-1}$. The general
group element is of the form
Figure 23.12
(5) Let $\mathfrak{M}$ be the manifold of the rotation group SO(3). In Section 19.6 intrinsic
coordinates in $\mathfrak{M}$ were introduced as the three components of a vector $\theta$, which lies
in the spherical ball $K = \{\theta : \|\theta\| \le \pi\}$ in the coordinate space. If opposite ends of
each diameter of K are identified (regarded as the same point), then there is a one-to-one
correspondence between the points of $\mathfrak{M}$ and those of K. A nontrivial element
of $\pi_1(\mathfrak{M})$ is obtained by taking the base point B as the center of K, and by considering
a curve $\mathcal{C}$ that goes from B along a radius to a point A on the surface, then jumps to the
antipodal point A', then returns to B, as shown in Figure 23.13. This curve $\mathcal{C}$ cannot
be collapsed onto B by a continuous deformation, because (a) any curve that starts
and ends at B and has such a jump has total length (in K) at least $2\pi$, and (b) it is
intuitively clear on grounds of continuity that continuous deformation cannot make
the jump disappear. (This will be established more firmly in the next chapter.) Now
consider a curve $\mathcal{C}$ that starts at B and returns to B after making a finite number of
such jumps, let us say from $A_1$ to $A'_1$, from $A_2$ to $A'_2$, etc., where in each case the prime
denotes the antipodal point. By continuous deformation of the curve, consecutive
jumps can be made to disappear two at a time. Consider a portion of $\mathcal{C}$ that contains
two consecutive jumps, as in Figure 23.14, where it consists of the parts $PA_1$, $A'_1A_2$,
and $A'_2Q$. By moving the part $A'_1A_2$ to the surface of K and simultaneously drawing
the points $A_2$ and $A'_1$ (also $A_1$ and $A'_2$) together, the part $A'_1A_2$ can be made to
disappear, and what remains is a curve, like the dashed one, going from P to Q
without a jump. If this procedure is continued, the curve $\mathcal{C}$ can be collapsed onto the
base point B, if it had initially an even number of jumps, or onto a curve having a
single jump, if it had initially an odd number. Therefore, the group $\pi_1(SO(3))$ is
isomorphic to the cyclic group of order 2, consisting of just two elements. If initially $\mathcal{C}$
had infinitely many jumps, then many of these jumps would have to be very close
together, so that the value of $\|\theta\|$ would remain close to $\pi$ between them, and a
continuous deformation could be made to a curve that has a finite number of jumps.
These results will all be obtained more simply and rigorously in the next chapter by
means of the covering of SO(3) by SU(2).
ordered pair (P, P'), where P and P' are arbitrary points of $\mathfrak{M}$ and $\mathfrak{M}'$,
respectively. That is, $\mathfrak{M} \times \mathfrak{M}'$, as a set, is the Cartesian product of $\mathfrak{M}$ and $\mathfrak{M}'$
in the set-theoretical sense. (2) If $\{\mathfrak{U}, \varphi, N\}$ and $\{\mathfrak{U}', \varphi', N'\}$ are any charts in
$\mathfrak{M}$ and $\mathfrak{M}'$, respectively, then a chart $\{\mathfrak{U}'', \varphi'', N''\}$ is defined in $\mathfrak{M} \times \mathfrak{M}'$,
as follows: $\mathfrak{U}''$ is the set of all points (P, P') such that P is in $\mathfrak{U}$ and P' is in $\mathfrak{U}'$,
and $\varphi''((P, P'))$ is the (n + n')-component vector consisting of the components
of $\varphi(P)$ together with those of $\varphi'(P')$, i.e.,
Covering Manifolds
The 2-to-1 mapping $\psi$ of SU(2) onto SO(3) found in Section 19.7 is more than
merely a group homomorphism. It is also a mapping of the manifold of
SU(2) onto the manifold of SO(3) of the kind known as a covering. It is
locally a homeomorphism, in the sense that if P is any point on the manifold
of SU(2), and Q is its image in the manifold of SO(3), then P has a neighborhood
that is mapped homeomorphically by $\psi$ onto a neighborhood of Q.
Furthermore, if Q is any point of the second manifold, then there are always
two points P in the first manifold, each of which has such a neighborhood.
$\psi$ is a two-sheeted covering of SO(3) by SU(2).
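
The two-to-one property can be illustrated with one standard realization of the homomorphism, $R_{ij} = \tfrac{1}{2}\,\mathrm{tr}(\sigma_i\, u\, \sigma_j\, u^\dagger)$ with the Pauli matrices $\sigma_i$. This formula is an assumption made for the sketch; the mapping of Section 19.7 may differ from it by a convention. Since the formula is quadratic in $u$, the elements $u$ and $-u$ have the same image.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def su2(a, b):
    """Element of SU(2) built from complex a, b with |a|^2 + |b|^2 = 1."""
    a, b = complex(a), complex(b)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def rotation_from_su2(u):
    """3x3 matrix R with R[i][j] = (1/2) tr(sigma_i u sigma_j u^dagger)."""
    ud = dagger(u)
    R = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            m = matmul(matmul(SIGMA[i], u), matmul(SIGMA[j], ud))
            R[i][j] = 0.5 * (m[0][0] + m[1][1]).real
    return R

def det3(R):
    """Determinant of a 3x3 matrix."""
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))

u = su2(0.36 + 0.48j, 0.64 - 0.48j)          # |a| = 0.6, |b| = 0.8
minus_u = [[-entry for entry in row] for row in u]
R_plus, R_minus = rotation_from_su2(u), rotation_from_su2(minus_u)
print(max(abs(R_plus[i][j] - R_minus[i][j])
          for i in range(3) for j in range(3)))
print(det3(R_plus))
```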
A mapping $\psi: \mathfrak{M} \to \mathfrak{N}$ of a manifold $\mathfrak{M}$ to a manifold $\mathfrak{N}$ (they will be called
the "upper" and "lower" manifolds, respectively) is called a covering of $\mathfrak{N}$
by $\mathfrak{M}$ if it satisfies the following two requirements, of which the first says that
all of $\mathfrak{N}$ is covered, and the second tells just how it is covered: (a) $\psi$ is an onto
mapping; i.e., for each point Q in the lower manifold ($\mathfrak{N}$) there is at least one
point P in the upper one ($\mathfrak{M}$) such that $\psi(P) = Q$; (b) each point Q of the
lower manifold is in some neighborhood $\mathfrak{V}$ whose preimage $\psi^{-1}(\mathfrak{V})$ (this is
the set of all points of the upper manifold that are mapped onto points of
$\mathfrak{V}$) consists of one or more disjoint neighborhoods $\mathfrak{U}_1, \mathfrak{U}_2, \ldots$, or components
(one in each "sheet" of $\mathfrak{M}$), each of which is homeomorphic with $\mathfrak{V}$; that is,
for each j, the mapping $P \to \psi(P)$, restricted to $\mathfrak{U}_j$, is a one-to-one bicontinuous
mapping of $\mathfrak{U}_j$ onto $\mathfrak{V}$. A neighborhood $\mathfrak{V}$ in the lower manifold with
these properties is called a good neighborhood. (A mapping is called bicontinuous
if it and its inverse are both continuous.) If $x^1, \ldots, x^n$ are coordinates
of P in the neighborhood $\mathfrak{U}_j$ in $\mathfrak{M}$, and if $y^1, \ldots, y^n$ are coordinates
Definition and Examples 115
Note. The following 1-dimensional example shows that it would not have
been equivalent to require merely that each point P of the upper manifold
$\mathfrak{M}$ have a neighborhood that is mapped homeomorphically onto a neighborhood
of the lower manifold $\mathfrak{N}$: Let $\mathfrak{N}$ be the unit circle in a plane, and let $\mathfrak{M}$
be an open interval of length greater than $2\pi$ wrapped around the unit circle.
Then the points a and b of $\mathfrak{N}$ lying under the ends of $\mathfrak{M}$ (see Figure 24.1) do
not satisfy the conditions of the definition, although, since $\mathfrak{M}$ is open, each
of its points has a neighborhood that is mapped homeomorphically into $\mathfrak{N}$.
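
A small numerical sketch of this example, assuming that the interval is $(0, L)$ with $L = 2\pi + 1$ and that it is wrapped around the circle by $t \to e^{it}$ (both are choices made here for illustration). It counts the points of the interval lying over a given point of the circle; the count changes as the point passes the images a and b of the two ends, which cannot happen for a covering of a connected manifold.

```python
import math

def preimages(phi, length):
    """All t in the open interval (0, length) that lie over the point of the
    unit circle with angle phi, i.e., t = phi modulo 2 pi."""
    two_pi = 2.0 * math.pi
    t = phi % two_pi
    found = []
    while t < length:
        if t > 0.0:
            found.append(t)
        t += two_pi
    return found

L = 2.0 * math.pi + 1.0  # an interval somewhat longer than one full turn

# The point a lies at angle 0 (under the end t = 0),
# the point b at angle L - 2 pi = 1 (under the end t = L).
for phi in (-0.001, 0.001, 0.999, 1.001):
    print(phi, len(preimages(phi, L)))
```

Just to one side of a or b there is one point of the interval over the circle, and just to the other side there are two, so no neighborhood of a or b can be a good neighborhood.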
If $\mathfrak{M}$ denotes the Riemann surface of any algebraic function F(z), with all
branch points deleted, if $\mathfrak{N}$ denotes the complex plane, with the corresponding
points deleted, and if $\psi$ is the mapping that maps any point of $\mathfrak{M}$ onto the
point of $\mathfrak{N}$ directly underneath it (i.e., the point with the same value of z
attached), then $\psi$ is a covering of $\mathfrak{N}$ by $\mathfrak{M}$. If P is any point of $\mathfrak{N}$, then P has a
neighborhood $\mathfrak{V}$ which is simply connected and does not contain any of the
branch points of F(z). If a right cylinder is constructed with $\mathfrak{V}$ as base, then
this cylinder intersects each sheet of the Riemann surface in a neighborhood
$\mathfrak{U}$ which looks exactly like $\mathfrak{V}$. Hence, $\mathfrak{V}$ is a good neighborhood.
A given manifold $\mathfrak{N}$ may have many different covering manifolds $\mathfrak{M}$, and
may have many different coverings by a given $\mathfrak{M}$. If $\mathfrak{N}$ is the unit circle $|z| = 1$
in the z plane, then $\mathfrak{N}$ can be covered by the real line by the mapping
Figure 24.1
Figure 24.2
EXERCISE
Figure 24.3
Principles of Lifting 117
If $\mathfrak{M}$ and $\mathfrak{N}$ are $C^k$ manifolds, then the mapping $\psi$ is required to be of class
$C^k$; that is, if $\psi$ maps P onto $Q = \psi(P)$, if $\mathfrak{V}$ is a good neighborhood of Q, if
$\mathfrak{U}$ is the component of $\psi^{-1}(\mathfrak{V})$ containing P, and if $x^1, \ldots, x^n$ are coordinates
of P in $\mathfrak{U}$, while $y^1, \ldots, y^n$ are coordinates of Q in $\mathfrak{V}$, then, as P varies, the
$x^i$ are functions of class $C^k$ of the $y^i$, and conversely. (In nearly all cases of
interest, $\mathfrak{M}$ and $\mathfrak{N}$ are analytic manifolds, and these functions are analytic.)
If the covering $\psi$ is a one-to-one mapping, so that each of $\mathfrak{M}$ and $\mathfrak{N}$ is a
covering of the other, then $\psi$ is called a ($C^k$) homeomorphism and the manifolds
are called homeomorphic; they are topologically indistinguishable. In the case
$k = \infty$, a homeomorphism is sometimes called a diffeomorphism.
A partition of [0, 1] into closed intervals $[0, t_1], [t_1, t_2], \ldots, [t_{N-1}, 1]$ is called a
good partition if each segment $P_0([t_j, t_{j+1}])$ of the curve $\mathcal{C}_0$ lies in a good neighborhood.
A good partition exists, because each point of $\mathcal{C}_0$ is in a good neighborhood;
hence each t in [0, 1] is in an open interval I such that the curve segment $P_0(I)$ is
in a good neighborhood. These open intervals cover [0, 1]; by the Heine-Borel
theorem, a finite number of them cover [0, 1]; if these are arranged in order of
increasing t, then $t_1$ can be chosen in the intersection of the first and second of these
intervals, $t_2$ in the intersection of the second and third, etc. See Figure 24.4.
Thus, a good partition is obtained. Call $\mathfrak{U}_j$ the good neighborhood in which the curve
segment $P_0([t_j, t_{j+1}])$ lies. For each $j = 0, 1, \ldots, N - 1$, a segment $P_1([t_j, t_{j+1}])$ of
Figure 24.4
the curve $\mathcal{C}_1 = \{P_1(t) : 0 \le t \le 1\}$ in the upper manifold $\mathfrak{M}_1$ is now defined,
inductively, as follows: First let $\mathfrak{V}_0$ be the component of $\psi^{-1}(\mathfrak{U}_0)$ that contains the
base point $B_1$ of $\mathfrak{M}_1$, and define $P_1([0, t_1])$ to be $\tilde{\psi}^{-1}(P_0([0, t_1]))$, where $\tilde{\psi}$ is the
restriction of $\psi$ to $\mathfrak{V}_0$; $\tilde{\psi}$ is a homeomorphism of $\mathfrak{V}_0$ onto $\mathfrak{U}_0$; hence $P_1([0, t_1])$,
thus defined, is a curve segment in $\mathfrak{M}_1$. Now suppose $P_1([t_j, t_{j+1}])$ has been defined;
since $P_0(t_{j+1})$ lies in $\mathfrak{U}_{j+1}$ as well as in $\mathfrak{U}_j$, $\mathfrak{V}_{j+1}$ can be taken as the component of
$\psi^{-1}(\mathfrak{U}_{j+1})$ that contains the endpoint $P_1(t_{j+1})$ of the previously defined segment of
$\mathcal{C}_1$; then the segment $P_1([t_{j+1}, t_{j+2}])$ is defined as $\tilde{\psi}^{-1}(P_0([t_{j+1}, t_{j+2}]))$, where now
$\tilde{\psi}$ is the restriction of $\psi$ to $\mathfrak{V}_{j+1}$. In this way, the curve $\mathcal{C}_1$ in $\mathfrak{M}_1$ is constructed. It is
uniquely determined by the curve $\mathcal{C}_0$ in the lower manifold and the choice of the base
point $B_1$ in the upper one; in particular, it is independent of the choice of the good
partition of [0, 1], because any two partitions have a common refinement, and $\mathcal{C}_1$
is obviously not altered by refining the partition used (i.e., by adding further points
of subdivision of [0, 1]). Each of the curves $\mathcal{C}_0$ and $\mathcal{C}_1$ uniquely determines the other.
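
The inductive construction can be imitated numerically for a simple covering. The sketch below assumes the two-sheeted covering $w \to w^2$ of the unit circle by itself (a stand-in chosen for illustration, not an example worked in the text) and a curve sampled so finely that consecutive points lie in a common good neighborhood. At each step, the preimage on the same sheet as the previous lifted point is selected.

```python
import cmath
import math

def lift(curve, start):
    """Lift a sampled curve on the lower circle through w -> w*w.

    curve: list of closely spaced complex numbers z_j on the unit circle.
    start: base point in the upper circle, with start*start == curve[0].
    Returns points w_j with w_j*w_j == z_j, chosen by continuity."""
    lifted = [start]
    for z in curve[1:]:
        r = cmath.sqrt(z)          # the two preimages of z are r and -r
        prev = lifted[-1]
        lifted.append(r if abs(r - prev) <= abs(-r - prev) else -r)
    return lifted

def loop(k, samples=400):
    """Closed curve on the lower circle, based at 1, winding k times."""
    return [cmath.exp(2j * math.pi * k * j / samples)
            for j in range(samples + 1)]

once = lift(loop(1), 1 + 0j)
twice = lift(loop(2), 1 + 0j)
print(once[-1], twice[-1])
```

A loop that winds once lifts to a path ending at the other point over the base point, while a loop that winds twice lifts to a closed curve; the lift depends only on the lower curve and on the choice of base point in the upper manifold.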
Corollary. If two curves $\mathcal{C}_0$ and $\mathcal{C}'_0$ in the lower manifold, both running from
$B_0$ to some point $A_0$, are homotopic, i.e., if one of them can be deformed in $\mathfrak{M}_0$
continuously into the other, keeping the endpoints fixed, then the curves that
result from lifting them up to $\mathfrak{M}_1$ also have a common endpoint $A_1$ and are
homotopic in $\mathfrak{M}_1$.
SKETCH OF THE PROOF. Let $P_0(t, s)$ in the lemma be such that, for each fixed s
in [0, 1], $P_0(t, s)$ traces a curve from $B_0$ to $A_0$ as t increases from 0 to 1, and such
that for s = 0 this curve is $\mathcal{C}_0$ while for s = 1 it is $\mathcal{C}'_0$; then use a continuity argument,
based on a good neighborhood of $A_0$, to show that the terminal point of the lifted
curve, $P_1(1, s)$, cannot jump from one sheet of $\mathfrak{M}_1$ to another, as s varies.
Universal Covering Manifold 119
EXERCISE
Complete the proof by showing (a) that $\psi_{21}$ is an onto mapping, (b) that any $Q_1$
in $\mathfrak{M}_1$ has a neighborhood $\mathfrak{V}$ such that each component of $\psi_{21}^{-1}(\mathfrak{V})$ is homeomorphic
to $\mathfrak{V}$ under $\psi_{21}$, and (c) that if $\mathfrak{M}_1$ is also simply connected then $Q_1$ uniquely determines
$Q_2$, so that $\psi_{21}$ is one-to-one.
Note. The statement that two manifolds are homeomorphic says nothing
about how they might look if they are embedded in some Euclidean space of
higher dimension. A circle in the plane, as a one-dimensional manifold, is
homeomorphic to a simple knot in space; a simple loop of paper is not
homeomorphic to a Möbius band, but it is homeomorphic to a loop of paper
that has two half twists (one end was twisted through a full turn before being
glued to the other end) or any even number of half twists, and a Möbius band
is homeomorphic to such a loop with any odd number of half twists. This can
be seen by considering the method of identification of edges, discussed in
Section 23.1, for constructing these manifolds.
Figure 24.6
Comments on the Construction of Mathematical Models 121
$\mathbb{R}^n$ determine a point P of $\mathfrak{U}_\alpha$ with coordinates $\varphi^j(P) = x^j$, where the $x^j$ are the
components of x. $\widetilde{\mathfrak{M}}$ is made up of the points determined in this way by all
the charts $K_\alpha, K_\beta, \ldots, L_\zeta, L_\eta, \ldots$; they are all distinct points of $\widetilde{\mathfrak{M}}$ except
for the identifications to be made as we specify the overlap of the charts.
The overlap of K and L in $\mathfrak{M}$ is described, according to (24.5-1), by
(24.5-3)
which gives a one-to-one mapping from part of N onto a part of N' and hence
gives two coordinate systems in the region $\mathfrak{U} \cap \mathfrak{U}'$ of $\mathfrak{M}$. If $\alpha \sim \eta$, as above,
we specify that the overlap of $K_\alpha$ and $L_\eta$ is given by the same equations
(24.5-3), and we identify the point of $\mathfrak{U}_\alpha$ having given coordinates $x^1, \ldots, x^n$
with the point of $\mathfrak{U}'_\eta$ having corresponding coordinates $x'^1, \ldots, x'^n$ determined
by those equations.
Clearly, this procedure determines a manifold $\widetilde{\mathfrak{M}}$ of the same differentiability
class $C^k$ as $\mathfrak{M}$. The projection $\psi$ of $\widetilde{\mathfrak{M}}$ onto $\mathfrak{M}$ is easily defined by projecting
each $K_\alpha$ onto the corresponding K: Each point of $\mathfrak{U}_\alpha$ in $\widetilde{\mathfrak{M}}$ is projected onto
the point of $\mathfrak{U}$ in $\mathfrak{M}$ having the same coordinates $x^1, \ldots, x^n$. This projection $\psi$
is of class $C^k$ because in these coordinates it is just the identity mapping.
Lastly, to show that $\widetilde{\mathfrak{M}}$ is simply connected, we first choose a base point $A_0$
in $\widetilde{\mathfrak{M}}$ as one of the points that lie over the base point $B_0$ of $\mathfrak{M}$, as follows: $B_0$
lies in some chart in $\mathfrak{M}$, say L. Then the paths of the homotopy classes $\zeta, \eta, \ldots$
can be taken as closed paths beginning and ending at $B_0$. One of the classes,
say $\eta$, is the class of nullhomotopic paths, that is, paths that can be shrunk
continuously in $\mathfrak{M}$ to the point $B_0$. Let $A_0$ be the point of $L_\eta$ that lies over $B_0$,
i.e., has the same coordinates $x^1, \ldots, x^n$ in $L_\eta$ that $B_0$ has in L.
Given any chart $K_\alpha$ in $\widetilde{\mathfrak{M}}$, we choose one of the paths in the class $\alpha$ in $\mathfrak{M}$ and
lift it, according to the first principle of lifting in Section 24.2, up to $\widetilde{\mathfrak{M}}$ as a
unique path, which we call $\alpha'$, from the new base point $A_0$ to a point in $K_\alpha$.
Then, whenever charts $K_\alpha$ and $L_\zeta$ overlap, we have $\alpha \sim \zeta$; hence by the second
principle of lifting, the paths $\alpha'$ and $\zeta'$ are homotopic in $\widetilde{\mathfrak{M}}$, or, more precisely,
they become homotopic if they are so chosen as to have a common terminal
point in the intersection of $K_\alpha$ and $L_\zeta$.
Now let $\mathcal{C}$: $P(\lambda)$, $0 \le \lambda \le 1$, be any path in $\widetilde{\mathfrak{M}}$; we wish to show that it is
homotopic to any other path that also goes from P(0) to P(1), i.e., that $\widetilde{\mathfrak{M}}$
is simply connected. Each point $P(a)$ of $\mathcal{C}$ lies in some chart, i.e., $P(\lambda)$ lies in
that chart for $\lambda$ in some interval $(a - \varepsilon, a + \varepsilon)$. By the Heine-Borel theorem,
Figure 24.8
Manifolds Covered by a Given Manifold 125
Figure 24.9
See Figure 24.9. The right member here depends only on the initial and
terminal points P(0) and P(1) of $\mathcal{C}$; hence any other path from P(0) to P(1)
is homotopic to $\mathcal{C}$, as required.
$\pm 1, \pm 2, \ldots$, of the plane coincide with a single point of the cylinder. Therefore,
a manifold $\mathfrak{N}$ homeomorphic to the cylinder Z can be constructed by
defining each set $\{(x + 2\pi l, y) : l = 0, \pm 1, \ldots\} = \psi((x, y))$ to be a "point"
of $\mathfrak{N}$ and by defining charts in $\mathfrak{N}$ in an obvious way. The mapping $(x, y) \to
\psi((x, y))$ is then a projection of $\mathfrak{M}$ onto $\mathfrak{N}$. One says that all the points
$(x + 2\pi l, y)$ of each set have been identified (i.e., made identical). Note that the
factor $2\pi$ is irrelevant, because only the topological properties of $\mathfrak{N}$ are involved.
Identification of the points (x + n, y) or, more generally, of the points
(x + an, y), where a is any nonzero real number, would have the same effect.
Similarly, if, for each x, y in Wl, all the points of the form (x + I, y + m),
where I and m run independently over 0, 1, 2, ... , are identified, then the
resulting manifold 91 is the torus (more properly, is homeomorphic to the
torus).
Let $\mathfrak{M}$ be the infinite strip $-1 < x < 1$, $-\infty < y < \infty$. For given $x$, $y$ in $\mathfrak{M}$,
let the points $((-1)^l x, y + l)$, $l = 0, \pm 1, \pm 2, \ldots$, be identified. The resulting
manifold $\mathfrak{N}$ is the Möbius band.
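The identifications in these examples can be made concrete by choosing one representative from each set of identified points. The following is only a minimal sketch: the function names `cylinder_rep` and `mobius_rep`, and the choice of the ranges $[0, 2\pi)$ and $[0, 1)$ for the representatives, are assumptions made for illustration and do not come from the text.

```python
import math

def cylinder_rep(x, y):
    """Representative of the set {(x + 2*pi*l, y)} with x reduced to [0, 2*pi)."""
    return (x % (2 * math.pi), y)

def mobius_rep(x, y):
    """Representative of the set {((-1)**l * x, y + l)} with y reduced to [0, 1)."""
    l = -math.floor(y)            # the shift l that brings y + l into [0, 1)
    sign = -1 if l % 2 else 1     # (-1)**l, also valid for negative l
    return (sign * x, y + l)

# Two points of the strip that differ by l = 3 steps of the identification.
p = (0.5, 0.25)
q = (-0.5, 3.25)                  # ((-1)**3 * 0.5, 0.25 + 3)
print(mobius_rep(*p) == mobius_rep(*q))   # prints True
```

Two points of the strip are identified exactly when `mobius_rep` returns the same pair for both; the sign flip for odd `l` is what distinguishes the Möbius band from the cylinder.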
To generalize from these examples, let $\mathfrak{M}$ be any (connected) manifold.
Suppose that $\sigma$ is a homeomorphism (of class $C^k$, if $\mathfrak{M}$ is a $C^k$ manifold) of $\mathfrak{M}$
onto itself. Denote by $\sigma^l$ the $l$th iterate of $\sigma$, i.e.,
$$\sigma^l(P) = \underbrace{\sigma(\sigma(\cdots\sigma}_{l\ \text{repetitions}}(P)\cdots)),$$
and denote by $\sigma^{-l}$ the $l$th iterate of the inverse mapping $\sigma^{-1}$. For any point
$P$ in $\mathfrak{M}$, consider the point set
$$\psi(P) \overset{\text{def}}{=} \{\sigma^l(P): l = 0, \pm 1, \ldots\}.$$ (24.6-1)
[In the first example, $\sigma$ is the displacement $(x, y) \to (x + 2\pi, y)$ in the plane.]
Suppose further that $\sigma$ is such that the point set $\psi(P)$ is discrete in $\mathfrak{M}$, for
every $P$; that is, suppose that there is a neighborhood of $P$ that contains none
of the other points $\sigma^l(P)$ with $l \ne 0$. Then, since $\sigma$ is a homeomorphism, each
$\sigma^k(P)$ has a neighborhood that contains no $\sigma^l(P)$ with $l \ne k$.
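Under these assumptions, a manifold $\mathfrak{N}$ whose "points" are the sets $\psi(P)$
is now constructed:
$$\mathfrak{N} = \{\psi(P): P \in \mathfrak{M}\}.$$
To define charts in $\mathfrak{N}$, let $\{U, \varphi, N\}$ be a chart in $\mathfrak{M}$. Assume that $U$ is small
enough so that for no $P$ in $U$ is $\sigma(P)$ also in $U$. (If this is not the case, replace
the chart by a suitable subchart.) A chart $\{\tilde{U}, \tilde{\varphi}, \tilde{N}\}$ is then defined in $\mathfrak{N}$
as follows:
$$\tilde{U} = \{\psi(P): P \text{ in } U\}.$$
For $P$ in $U$, $\tilde{\varphi}(\psi(P)) \overset{\text{def}}{=} \varphi(P)$;
hence,
$$\tilde{N} = N.$$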
It is left as a quite obvious exercise to show that (a) $\tilde{\varphi}$ is one-to-one, (b) charts
defined in this way are pairwise compatible and cover $\mathfrak{N}$, (c) the mapping
$P \to \psi(P)$ is onto $\mathfrak{N}$, and (d) any neighborhood $\tilde{U}$ of the kind defined above
is a good neighborhood of each of its points, because the components of
$\psi^{-1}(\tilde{U})$ are the sets $\sigma^l(U)$, $l = 0, \pm 1, \ldots$. The conclusion is that $\psi$ is a covering
of $\mathfrak{N}$ by $\mathfrak{M}$.
It will now be shown that every covering of a manifold $\mathfrak{N}$ by a manifold $\mathfrak{M}$
is associated with a group of homeomorphisms in $\mathfrak{M}$ of the kind described
above.
Let $\mathfrak{M}$ and $\mathfrak{N}$ be connected $n$-dimensional manifolds, and assume that $\mathfrak{M}$
covers $\mathfrak{N}$ by a projection $\psi$ (which is assumed not to be one-to-one, so that
the covering is not trivial). Let $B_1$ and $B_0$ be basepoints in $\mathfrak{M}$ and $\mathfrak{N}$, where
$B_1$ lies over $B_0$, i.e., $\psi(B_1) = B_0$. We wish to show that, corresponding to
each other point $B_1'$ of $\mathfrak{M}$ that lies over $B_0$, there is a homeomorphism $\sigma$ of
$\mathfrak{M}$ that carries $B_1$ into $B_1'$.
The proof is quite simple if $\mathfrak{M}$ is simply connected, and that case will be
discussed first. Let $\mathcal{C}_1$ be a curve in $\mathfrak{M}$ from $B_1$ to $B_1'$, and let $\mathcal{C}_0$ be its image
in $\mathfrak{N}$; $\mathcal{C}_0$ is a closed curve beginning and ending at $B_0$; $\mathcal{C}_1$ and $\mathcal{C}_0$ will be
kept fixed during the discussion, and by means of them a homeomorphism
$\sigma$ of $\mathfrak{M}$ onto itself will be constructed, under which $B_1$ goes into $B_1'$. Let
But these are closed curves in $\mathfrak{N}$ beginning and ending at $A_0$, and they are
both homotopic to $\mathcal{C}_0$, and it follows from the corollary to the second
principle of lifting that p I l P'l and QI1 Q'l are homotopic, and hence that
Lie Groups
Lie group $G$; linear Lie group; tangent vector; Lie algebra $\Lambda$ of $G$; Lie product;
Jacobi identity; abstract Lie algebra; structure constants; local isomorphism
of SU(2) and SO(3); exponential mapping of $\Lambda$ into $G$; logarithmic (or normal)
coordinates in $G$; adjoint representations of Lie algebras and simply
connected Lie groups; the Campbell-Baker-Hausdorff formula; translation
of charts; ideals; simple Lie algebra; local and global homomorphisms of
groups; homomorphism theory; center of a group; center of an algebra;
covering group; direct and semidirect sums of Lie algebras; classification of
the simple Lie algebras.
The subject of this chapter is the advanced theory of continuous groups, often
called, inaccurately, the theory of Lie groups. Most of the groups themselves
play a role in physics and mathematics at a more elementary level. Among them
are the rotation and rigid motion groups, the Lorentz and Poincaré groups,
and the unitary and symplectic groups. What is new is the study of the groups
and the structure they comprise from a deeper analytic, algebraic, and
topological point of view. The key to the study is the theory of Lie algebras
and of the interaction between the groups and their algebras. That interaction
has played a role in quantum mechanics from the beginning, in that the
elements of the Lie algebras have appeared as operators derived from the
symmetries of a system. In the last 25 years much of the terminology and
certain specific groups, such as the groups derived from the Lie algebra $G_2$,
have appeared in particle physics. So far, the applications to particle physics
have been mostly heuristic, but it seems likely that as the physical theory
becomes fully developed, the details of the mathematical structure will be of
greater importance. Most presentations of the theory are quite recondite
and hence rather difficult for the nonspecialist. I have attempted to present
the subject in the most elementary way possible consistent with describing
the complete structure. For instance, a vector field on the group manifold
is defined as consisting of components subject to a transformation law, just
as elsewhere in physics, rather than as an abstract mapping (derivation) in an
algebra of $C^\infty$ functions.
obvious way, groups of matrices, namely quotient groups $G/H$ and semidirect
products. Every compact Lie group is linear, but the proof depends on quite
advanced developments in the theory. See Chevalley 1946. Two nonlinear
Lie groups are described in the Appendix to this chapter. The abstract
theory is presented below, but the specialization to matrices is mentioned at
various points; see Exercises 1-7 in Section 25.14.
Let $G$ be a group. Suppose that, in the space whose points are the elements
of $G$, there is defined an $n$-dimensional coordinate chart $\{U, \varphi, N\}$ such that
$U$ contains the identity element 1 of the group. (The symbol "1" is used
because "e" is needed for exponentiation.) It is assumed, for convenience,
that $\varphi$ maps 1 onto the origin of $\mathbb{R}^n$: $\varphi(1) = 0$. A subset $U_0$ of $U$ is called open
(as in Chapter 23) if $\varphi(U_0)$ is an open subset of $N$ in $\mathbb{R}^n$.
We assume that products and inverses of group elements are continuous
in this chart insofar as their coordinates are defined. It then follows that we
can define a smaller chart, with special properties, as follows: Let $g$ and $h$ be in
$U$. If $g$ and $h$ are close enough to 1, that is, if $\varphi(g)$ and $\varphi(h)$ are close enough
to the origin in $\mathbb{R}^n$, then $gh$, $g^{-1}$, and $h^{-1}$ are also close to 1. In particular, if
$g = h = 1$, then $gh$, $g^{-1}$, and $h^{-1}$ are equal to 1, and their coordinates are defined
and are all zero. Hence, by continuity there is a neighborhood $U_1$ of 1 such
that if $g$ and $h$ are in $U_1$, the coordinates of $gh$, $g^{-1}$, and $h^{-1}$ are defined and are in
the open set $N$ of $\mathbb{R}^n$. It is convenient to consider an even smaller neighborhood
$U_0 = U_1 \cap U_1^{-1}$, where $U_1^{-1} \overset{\text{def}}{=} \{g^{-1}: g \in U_1\}$, and to call $N_0 = \varphi(U_0) \subset N$.
Then, if $g$ and $h$ are in $U_0$, $gh$ is in $U$, while $g^{-1}$ and $h^{-1}$ are in $U_0$. A vector-valued
function $m(x_1, x_2)$ is therefore defined, for all $x_1$ and $x_2$ in $N_0$, by
The group $G$, together with the $n$-dimensional chart $\{U, \varphi, N\}$, will be called
an $n$-dimensional Lie group if the functions $m(\cdot, \cdot)$ and $I(\cdot)$ are defined in an
open set $N_0$, as described above, and are of class $C^4$. Later, further charts will
be obtained from $\{U, \varphi, N\}$, by use of the group operations, in such a way as
to make $G$ into a manifold.
All the groups described in Chapter 19 are Lie groups when coordinate
charts are suitably defined in them.
(Some authors require the manifold of a Lie group to be connected; for
reasons given in Section 25.11, that requirement is irrelevant.)
For example, let $G$ be the rotation group SO(3) with the intrinsic coordinates
$\theta_x, \theta_y, \theta_z$ discussed in Section 19.6. Then $U$ can be taken as the set of all group
elements for which $\|\theta\| < \pi$ (i.e., all for which $\|\theta\| \ne \pi$), and $U_0$ as the set
for which $\|\theta\| < \pi/2$. Hence $N$ is the interior of the ball $K$ in $\mathbb{R}^3$ described in
that section, and $N_0$ is the open ball of half the radius of $K$. The same coordinates
can be used for O(3); in that case the entire second component of the manifold
is outside $U$.
To derive the properties of Lie groups from the above definitions, the Lie
algebra $\Lambda = \Lambda(G)$ of a Lie group $G$ is constructed; $\Lambda$ is an $n$-dimensional
linear space of elements $\lambda, \mu, \ldots$, in which a multiplicative operation $[\lambda, \mu]$,
the so-called Lie product, is defined. The structure of $\Lambda$ is completely determined
by the properties of $G$ in any arbitrarily small neighborhood of 1;
on the other hand, $\Lambda$ completely determines many of the properties of $G$.
Then, the so-called exponential mapping from $\Lambda$ into $G$ is constructed; it
generalizes the mapping $M \to e^M$ for matrices. In some neighborhood of the
origin of $\Lambda$, the mapping is one-to-one, and the components of an element $\lambda$
serve, via the inverse mapping, as the so-called logarithmic coordinates in $G$.
Other coordinate charts are later obtained from this one by translations in $G$
and are related to it analytically. An explicit formula (the CBH formula;
see Section 25.10) then gives $m(\lambda, \mu)$ in terms of $\lambda$ and $\mu$ and shows that the
dependence of $gh$ on $g$ and $h$ is analytic in these coordinates. The formula
depends only on the structure of the Lie algebra, which shows that, in a
neighborhood of 1 where the logarithmic coordinates are defined, the
structure of the group depends entirely on its infinitesimal group elements.
The study of Lie groups combines analysis, algebra, and topology in
almost equal proportions. The application of powerful techniques of linear
algebra yields a complete classification of Lie algebras, from which follows a
classification of Lie groups. That may seem surprising in view of the origin
of Lie groups often as groups of nonlinear transformations-see Eisenhart
1933.
Since the functions $m(x, y)$ and $I(x)$ are of class $C^4$, they can be expanded in
Taylor's series in the components $x^i$ and $y^i$ of $x$ and $y$, through 3rd order
terms, with remainder terms of order 4. The group relation $a1 = 1a = a$ for
arbitrary $a$ shows that
$$m(x, 0) \equiv m(0, x) \equiv x$$ (25.2-1)
[it is recalled that $\varphi(1) = 0$]. Therefore, in the expansion of $m(x, y)$ about the
origin, the constant term vanishes, the linear part is $x + y$, and the quadratic
part contains terms like $x^j y^k$, but not terms like $x^j x^k$ or $y^j y^k$; that is,
(25.2-2)
where the $a$'s, $b$'s, and $c$'s are coefficients and where the summation convention
has been used.
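For orientation, an expansion of the kind just described can be written out as below. This is a reconstruction under stated assumptions, not a quotation: the index placement and the notation $O(4)$ for the remainder are chosen here so that the expression agrees with the verbal description (no constant term, linear part $x + y$, mixed quadratic terms only) and with the cubic terms that appear when the expansion is substituted into the associativity relation.

```latex
m^i(x, y) = x^i + y^i + a^i_{jk}\, x^j y^k
          + b^i_{jkl}\, x^j x^k y^l + c^i_{jkl}\, x^j y^k y^l + O(4)
```

Pure cubic terms in $x$ alone or in $y$ alone are absent for the same reason as pure quadratic ones: they would contradict $m(x, 0) \equiv m(0, x) \equiv x$.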
The associativity axiom of group theory imposes on $m(\cdot, \cdot)$ the restriction
$$m(m(x, y), z) = m(x, m(y, z)).$$ (25.2-3)
If only the linear and quadratic terms of the expansion of $m(\cdot, \cdot)$ are included
then (25.2-3) is satisfied automatically; nevertheless, associativity imposes
certain restrictions on the coefficients $a^i_{jk}$ of the quadratic part, which can
be seen only when the cubic terms are also included. Substitution of (25.2-2)
into (25.2-3) gives, after considerable cancellation,
$$a^i_{jk} a^j_{lm}\, x^l y^m z^k + b^i_{jkl}\,(x^j y^k + x^k y^j)\, z^l
= a^i_{jk} a^k_{lm}\, x^j y^l z^m + c^i_{jkl}\, x^j (y^k z^l + y^l z^k).$$ (25.2-4)
The $b$'s and $c$'s will now be eliminated from this relation. Since $j$, $k$, $l$, and $m$
are summation indices, they can be renamed in each term in such a manner
that the factors $x^k y^l z^m$ appear throughout; then, since the equation is an
identity in $x$, $y$, $z$, the net coefficient of $x^k y^l z^m$ must vanish; this gives
some interval : ;
the identity element 1. Such a curve is given by a function $g(t)$, defined for
$t \le \varepsilon$ ($\varepsilon > 0$), such that $g(0) = 1$ and such that the corresponding
curve $x(t) = \varphi(g(t))$ in the parameter space $\mathbb{R}^n$ has a tangent at
each point (including $t = 0$, which is in fact the only point that matters). If
$$x(t) = \varphi(g(t)) = \lambda t + \cdots,$$ (25.3-1)
then the components $\lambda^i$ of $\lambda$ transform as the components of a contravariant
vector at the point $x = 0$ of the manifold, when the coordinates are changed
(see Section 26.1). Namely, since
$$\lambda = \left.\frac{d}{dt}\,\varphi(g(t))\right|_{t=0},$$
(25.3-3)
Equation (25.3-2) shows that the function k(t) ~ k(Jt) is a curve in G
emanating from 1 and that $\nu$ is its tangent vector; $\nu$ is called the Lie product
of $\lambda$ and $\mu$ and is denoted by the Lie bracket expression
$$\nu = [\lambda, \mu].$$ (25.3-4)
[One can also establish directly from (25.3-3) that the $\nu^i$ transform as the
components of a vector, when coordinates are changed, after the rather
complicated transformation laws of the coefficients $a^i_{jk}$ have been worked
out.]
From (25.3-3) it follows that the Lie product is linear in each factor and is
antisymmetric: $[\mu, \lambda] = -[\lambda, \mu]$; from the identity (25.2-5), which was
deduced from the associativity in $G$, it follows that the Lie product also
satisfies the Jacobi identity
$$[\lambda, [\mu, \nu]] + [\mu, [\nu, \lambda]] + [\nu, [\lambda, \mu]] = 0.$$ (25.3-5)
An example of a Lie algebra is the algebra of vectors in $\mathbb{R}^3$, where the Lie
product is defined to be the vector product $[\lambda, \mu] = \lambda \times \mu$, in the notation of
Gibbs. The Jacobi identity can be verified either by writing (25.3-5) in components
or by use of the identity $\lambda \times (\mu \times \nu) = (\lambda \cdot \nu)\mu - (\lambda \cdot \mu)\nu$. Lie
algebras of matrices are discussed in Section 25.5, below.
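The verification of the Jacobi identity for the vector product can also be done numerically. The sketch below uses integer vectors so that the arithmetic is exact; the helper names `cross`, `dot`, `add`, and `jacobi_sum`, and the sample vectors, are illustrative choices and not part of the text.

```python
def cross(a, b):
    """Vector product a x b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product a . b."""
    return sum(x * y for x, y in zip(a, b))

def add(*vectors):
    """Componentwise sum of any number of vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def jacobi_sum(lam, mu, nu):
    """Left member of the Jacobi identity with [a, b] = a x b."""
    return add(cross(lam, cross(mu, nu)),
               cross(mu, cross(nu, lam)),
               cross(nu, cross(lam, mu)))

lam, mu, nu = (1, 2, 3), (-4, 0, 5), (2, -7, 1)
print(jacobi_sum(lam, mu, nu))    # prints (0, 0, 0)

# The triple-product identity used in the second method of verification.
lhs = cross(lam, cross(mu, nu))
rhs = tuple(dot(lam, nu) * m - dot(lam, mu) * n for m, n in zip(mu, nu))
print(lhs == rhs)                 # prints True
```

A check at sample vectors is of course not a proof; it only illustrates what the identity asserts.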
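$$\eta_2 = \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
\eta_3 = \frac{1}{2}\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix},$$
and that these matrices satisfy the same relations as the $E_i$, namely,
($ijk$ = 123, 231, or 312).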
According to (25.4-1), these relations determine the structure of the corresponding
Lie algebra completely. Therefore, if $\Lambda_3$ and $\Lambda_2$ denote the Lie
algebras of SO(3) and SU(2), respectively, then the linear mapping of $\Lambda_3$
onto $\Lambda_2$ induced by $E_i \to \eta_i$ ($i = 1, 2, 3$) is an isomorphism. Regarded as
abstract Lie algebras, $\Lambda_2$ and $\Lambda_3$ are identical, whereas the corresponding
groups are not. As will be seen later, the isomorphism of the algebras induces
a one-to-one mapping of the groups only in a neighborhood of 1. This
mapping is an isomorphism as far as it goes, but, when extended globally,
it becomes the 2-to-1 homomorphism of SU(2) onto SO(3) discussed in
Section 19.7.
Note that $\Lambda_2$ and $\Lambda_3$ are real Lie algebras. Although the matrices $\eta_1, \eta_2, \eta_3$
are complex, $\Lambda_2$ consists of the linear combinations of these matrices with
real coefficients.
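The commutation relations of the SU(2) generators can be checked directly. In the sketch below the matrices are taken to be $\eta_k = -\tfrac{i}{2}\sigma_k$, with $\sigma_k$ the Pauli matrices; this is an assumption made here for illustration (in particular the form of `eta1`), chosen so that the cyclic relations $[\eta_i, \eta_j] = \eta_k$ hold. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# Assumed generators eta_k = -(i/2) * sigma_k (sigma_k: Pauli matrices).
eta1 = [[0, -0.5j], [-0.5j, 0]]
eta2 = [[0, -0.5], [0.5, 0]]
eta3 = [[-0.5j, 0], [0, 0.5j]]

# Cyclic relations [eta_i, eta_j] = eta_k for ijk = 123, 231, 312.
print(bracket(eta1, eta2) == eta3)   # prints True
print(bracket(eta2, eta3) == eta1)   # prints True
print(bracket(eta3, eta1) == eta2)   # prints True
```

All entries are multiples of 1/2 or 1/4, so the floating-point comparisons here are exact. The real linear combinations of these three matrices are the traceless anti-Hermitian 2 x 2 matrices, which is why the algebra is real even though its elements have complex entries.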
Theorem. The solution $x(t)$ of (25.6-3) satisfies (25.6-2) for all $t$ and $s$ such
that $x(t)$, $x(s)$, and $x(t + s)$ are defined.
Note 1. This does not follow in any trivial way from the mere forms of the
equations (25.6-2, 3) because, as will be seen, the associative law of group
multiplication has to be used in the proof.
Note 2. Once the functional equation $g(t + s) = g(t)g(s)$ has been established
for $t$, $s$, and $t + s$ in an interval $(-T, T)$, the equation itself is then used to
define $g(t)$ for $t$ in $(-2T, 2T)$, then in $(-4T, 4T)$, etc. In consequence, $g(t)$ is
uniquely defined for all $t$ and satisfies the functional equation for all $t$ and
$s$ (details to be supplied by the reader).
Note 3. If $G$ is a group of matrices, so that the elements $\lambda$ of $\Lambda$ are also matrices,
then the corresponding equation
(25.6-5)
is usually established as follows: The matrix
$$\mu(s) = (e^{\lambda t})^{-1} e^{\lambda(t+s)},$$
as a function of $s$, satisfies the same differential equation and the same initial
condition as the function $e^{\lambda s}$, namely
$$\frac{d}{ds}\mu(s) = \mu(s)\lambda, \qquad \mu(0) = I;$$
since the solution of this initial value problem is unique, it follows that
$$(e^{\lambda t})^{-1} e^{\lambda(t+s)} = e^{\lambda s},$$
which is equivalent to (25.6-5). This argument is used as a model for the proof,
given below, for the abstract case.
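For matrices, the relation $(e^{\lambda t})^{-1} e^{\lambda(t+s)} = e^{\lambda s}$ can be observed numerically. The following is a rough sketch only: the function `expm` sums a truncated Taylor series (adequate for the small matrix used here, not a general-purpose method), and the generator `lam` of plane rotations and the values of `t` and `s` are arbitrary choices made for illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, c):
    """Matrix a multiplied by the scalar c."""
    return [[c * x for x in row] for row in a]

def expm(a, terms=30):
    """Matrix exponential by a truncated Taylor series sum of a**k / k!."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)      # now equals a**k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

lam = [[0.0, -1.0], [1.0, 0.0]]     # generator of rotations in the plane
t, s = 0.3, 0.5
lhs = expm(scale(lam, t + s))
rhs = matmul(expm(scale(lam, t)), expm(scale(lam, s)))
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff < 1e-12)             # prints True
```

For this generator $e^{\lambda t}$ is the rotation through the angle $t$, so the functional equation reduces to the addition theorems for sine and cosine.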
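For $t = 1$, we have
$$X(\lambda) = \varphi(e^{\lambda}).$$
Furthermore, at $\lambda = 0$, the Jacobian
$$\det\left(\frac{\partial X^i}{\partial \lambda^j}\right) = \det\left(q^i_j(\lambda, 0)\right)$$
is different from zero, because $q^i_j(0, 0) = \delta^i_j$. Therefore, the function $X(\lambda)$
has an inverse in some neighborhood of the origin, and the components of $\lambda$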
An Auxiliary Lemma on Inner Automorphisms; the Mappings $\mathrm{Ad}_\mu$
Lemma. Let $e^{\mu}$ be a fixed group element; for each $\lambda$ in $\Lambda$ let $g(t)$ be a smooth
curve, with $g(0) = 1$, whose tangent vector at 1 is $\lambda$, and let $\lambda'$ be the tangent
vector at 1 to the curve $e^{\mu} g(t) e^{-\mu}$; then, the mapping $\lambda \to \lambda'$ is a linear
transformation in $\Lambda$ given explicitly by
(25.7-1)
PROOF. For each fixed $s$ in the interval $[0, 1]$, the group automorphism $g(t) \to
e^{s\mu} g(t) e^{-s\mu}$ induces a mapping $\lambda \to \lambda(s)$, in the manner described in the lemma, and
it will be proved that $\lambda(s)$ satisfies the same differential equation in $s$ as $e^{s\,\mathrm{Ad}_\mu}\lambda$. The
logarithmic coordinate of the group element $g(t, s) = e^{s\mu} g(t) e^{-s\mu}$ is
(25.7-3)
where, as in the preceding sections, $m(\cdot, \cdot)$ gives the coordinate (here, the logarithmic
coordinate) of the product of two group elements in terms of the coordinates of the
factors. We recall the definition (25.6-4) of $q^i_j(x, y)$ and define similarly
$$p^i_j(x, y) = \frac{\partial}{\partial x^j}\, m^i(x, y).$$ (25.7-5)
Differentiating (25.7-4) with respect to $t$ shows that the tangent vector at $t = 0$ to
the curve $x(t, s_0 + s)$ is given by
(25.7-6)
From the expansion (25.2-2) of $m^i(x, y)$, the corresponding expansions of $p^i_j$ and $q^i_j$
are, to first order,
$$p^i_j = \delta^i_j + a^i_{jk}\, y^k + \cdots,$$
$$q^i_j = \delta^i_j + a^i_{kj}\, x^k + \cdots.$$
(These expansions and the quantities $p^i_j$ and $q^i_j$ now all refer to the logarithmic
coordinates.) From (25.7-6), then,
that is,
(25.7-7)
EXERCISE
Show that, for fixed $\mu$, the linear mapping $\lambda \to e^{\mathrm{Ad}_\mu}\lambda$ is an automorphism of $\Lambda$,
that is, (1) that it is one-to-one and (2) that
$$e^{\mathrm{Ad}_\mu}[\lambda, \nu] = [e^{\mathrm{Ad}_\mu}\lambda, e^{\mathrm{Ad}_\mu}\nu].$$ (25.7-8)
One says that $e^{\mathrm{Ad}_\mu}$ is the inner automorphism of $\Lambda$ induced by the inner
automorphism $g \to e^{\mu} g e^{-\mu}$ of $G$.
Comment. If $e^{\mu_1}$ and $e^{\mu_2}$ are any two group elements, and if $e^{\mu_3} = e^{\mu_1} e^{\mu_2}$, then
the automorphism
$$\frac{d}{dt}\, e^{t\lambda} = \lambda e^{t\lambda} = e^{t\lambda}\lambda.$$ (25.8-4)
are tangent vectors through the point $g(s, t)^{-1} g(s, t) = 1$ for all $s$ and $t$, i.e.,
are always in $\Lambda$ and can be further differentiated.
PROOF. To verify (25.8-6) for $s = s_0$, $t = t_0$, write $g(s, t) = g(s_0, t_0)\,\tilde{g}(s, t)$; then
$\alpha$ and $\beta$ can also be written as
$$\alpha(s, t) = \tilde{g}^{-1}\frac{\partial \tilde{g}}{\partial s}, \qquad
\beta(s, t) = \tilde{g}^{-1}\frac{\partial \tilde{g}}{\partial t}.$$
Since $\tilde{g}(s_0, t_0) = 1$, the expansions of the coordinates of $\tilde{g}$ and $\tilde{g}^{-1}$ in powers of
$s - s_0 = s_1$ and $t - t_0 = t_1$ start with the linear terms [$\varphi(1) = 0$ is assumed]:
(25.8-7)
hence,
$$y^i(s, t) \overset{\text{def}}{=} \varphi^i(\tilde{g}(s, t)^{-1}) = -\lambda^i s_1 - \mu^i t_1 - A^i s_1^2 - B^i s_1 t_1 - C^i t_1^2
+ a^i_{jk}(\lambda^j s_1 + \mu^j t_1)(\lambda^k s_1 + \mu^k t_1) + \cdots.$$ (25.8-8)
Since $\alpha(s, t)$ is the tangent vector to the curve obtained from $\tilde{g}(s, t)^{-1}\tilde{g}(s', t)$ by
varying $s'$, for given $s$ and $t$, and then setting $s' = s$, and similarly for $\beta(s, t)$, it follows
that
$$\alpha^i(s, t) = \left.\frac{\partial}{\partial s'}\, m^i(y(s, t), x(s', t))\right|_{s' = s},$$
$$\beta^i(s, t) = \left.\frac{\partial}{\partial t'}\, m^i(y(s, t), x(s, t'))\right|_{t' = t};$$
which is the desired result (25.8-6), by virtue of the definition (25.3-3, 4) of the Lie
product.
where
$$f(z) = \frac{1 - e^{-z}}{z} = 1 - \frac{1}{2!}\,z + \frac{1}{3!}\,z^2 - \cdots.$$ (25.9-2)
Comment 1. On the left side of (25.9-1), $de^{\lambda}/dt$ is a tangent vector at the point
$e^{\lambda}$ of the group $G$; multiplication of this vector on the left by $e^{-\lambda}$ transforms
it into a tangent vector at the point 1 of $G$, i.e., into an element of $\Lambda$.
Comment 2. On the right side of (25.9-1), since the transformation $\mathrm{Ad}_\lambda$ can be
represented by an $n \times n$ matrix, and since the series for $f(z)$ converges
absolutely for all $z$, it follows that $f(\mathrm{Ad}_\lambda)$ is a well-defined linear transformation
in $\Lambda$. [In particular, if $\lambda$ and $\lambda'$ commute, i.e., if $\mathrm{Ad}_\lambda \lambda' = [\lambda, \lambda'] = 0$, then
the right member of (25.9-1) is just $\lambda'$ itself.]
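The entire function $f(z) = (1 - e^{-z})/z$ and its power series can be compared numerically for a scalar argument. This is a small sketch under the assumption that a fixed number of terms suffices for moderate $|z|$; the names `f_series` and `f_closed` are illustrative. The same series, with powers of a matrix in place of powers of $z$, is what defines $f$ of a linear transformation.

```python
import math

def f_series(z, terms=40):
    """Partial sum of 1 - z/2! + z**2/3! - ..., i.e. sum of (-z)**k / (k+1)!."""
    total, term = 0.0, 1.0           # term holds (-z)**k / (k+1)!, starting at k = 0
    for k in range(terms):
        total += term
        term *= -z / (k + 2)         # step from index k to index k + 1
    return total

def f_closed(z):
    """Closed form (1 - exp(-z)) / z, with the limiting value 1 at z = 0."""
    if z == 0:
        return 1.0
    return -math.expm1(-z) / z       # expm1 avoids cancellation for small z

for z in (0.0, 0.5, -2.0, 3.0):
    print(z, math.isclose(f_series(z), f_closed(z), rel_tol=1e-12))
```

At $z = 0$ only the leading term survives, which is the scalar counterpart of the remark that the right member reduces to $\lambda'$ when $\lambda$ and $\lambda'$ commute.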
where $A$ is the matrix of the transformation $\mathrm{Ad}_{\lambda(t)}$; the solution that satisfies the
initial condition $\xi(0) = 0$ is
$$\xi(s) = \frac{1 - e^{-As}}{A}\,\lambda'.$$
[Note. $A$ is a singular matrix, because $\mathrm{Ad}_\lambda \lambda = 0$. The expression $(1 - e^{-As})/A$ denotes
the matrix obtained by substituting $A$ for $z$ in the entire function $(1 - e^{-zs})/z$.] Therefore,
as was to be proved.
This differential equation can be simplified by use of the lemma of Section 25.7,
which says, in the notation of Section 25.8, that the mapping $e^{\mathrm{Ad}_\mu}$ is the mapping
$\nu \to e^{\mu}\nu e^{-\mu}$; specifically,
$$e^{\mathrm{Ad}_{\sigma(t)}}\nu = e^{\sigma(t)}\nu e^{-\sigma(t)} = e^{\lambda}e^{t\mu}\nu e^{-t\mu}e^{-\lambda}
= e^{\mathrm{Ad}_\lambda}e^{t\,\mathrm{Ad}_\mu}\nu.$$
Now, the unknown appears only on the left, and the CBH formula follows directly
by integrating from $t = 0$ to $t = 1$ and by using the conditions
$$\sigma(0) = \log e^{\lambda} = \lambda, \qquad \sigma(1) = \log e^{\lambda}e^{\mu} = \sigma.$$
Since $\mathrm{Ad}_\lambda$ is the transformation $\nu \to [\lambda, \nu]$, the matrix elements of $\mathrm{Ad}_\lambda$ are
linear functions of the components of $\lambda$. Therefore, the matrix elements of the
transformations $\exp\{\mathrm{Ad}_\lambda\}$ and $\exp\{t\,\mathrm{Ad}_\mu\}$ are analytic functions of the
components of $\lambda$ and $\mu$. Since $\psi(z)$ is analytic, for $|z - 1| < 1$, it follows that,
for $\lambda$ and $\mu$ restricted to the neighborhood $N$ of the origin in $\Lambda$ referred to in
Comment 1, the components of $\sigma = \log e^{\lambda}e^{\mu}$ are analytic functions of the
components of $\lambda$ and $\mu$; by analytic continuation, they are analytic for all $\lambda$
and $\mu$ such that the logarithm is defined.
When logarithmic coordinates $\lambda^i$ are used, $\log e^{\lambda}e^{\mu}$ is simply the multiplication
function denoted by $m(\lambda, \mu)$ previously. Although this function was
only assumed to be of class $C^4$, it is now seen to be analytic, when logarithmic
coordinates are used. In these coordinates, the inversion function $I(\cdot)$ is given
by $I(\lambda) = -\lambda$, and hence is also analytic.
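A case in which the multiplication function $\log e^{\lambda}e^{\mu}$ can be checked exactly is that of strictly upper-triangular 3 x 3 matrices, where all double brackets vanish and the series for $\log e^{\lambda}e^{\mu}$ stops after the term $\tfrac{1}{2}[\lambda, \mu]$. The sketch below is an illustration under that assumption and is not the general formula; the helper names (`heis`, `exp_nilpotent`, and so on) and the sample entries are hypothetical, and rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*ms):
    """Entrywise sum of any number of matrices."""
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]

def scale(c, a):
    """Matrix a multiplied by the scalar c."""
    return [[c * x for x in row] for row in a]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def exp_nilpotent(n):
    """Exact exponential I + n + n*n/2 of a strictly upper-triangular 3x3 matrix."""
    identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    return add(identity, n, scale(Fraction(1, 2), matmul(n, n)))

def heis(a, b, c):
    """Strictly upper-triangular 3x3 matrix with entries a, b and corner entry c."""
    return [[Fraction(0), Fraction(a), Fraction(c)],
            [Fraction(0), Fraction(0), Fraction(b)],
            [Fraction(0), Fraction(0), Fraction(0)]]

lam, mu = heis(1, 2, 3), heis(-4, 5, 7)
sigma = add(lam, mu, scale(Fraction(1, 2), bracket(lam, mu)))
print(matmul(exp_nilpotent(lam), exp_nilpotent(mu)) == exp_nilpotent(sigma))  # prints True
```

In this example the components of $\sigma$ are polynomials in those of $\lambda$ and $\mu$, an extreme instance of the analyticity stated above.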
EXERCISE
Express the matrix elements of $\mathrm{Ad}_\lambda$ in terms of the components $\lambda^i$ of $\lambda$, for a given
basis $E_1, \ldots, E_n$ in $\Lambda$, and the corresponding structure constants $c^i_{jk}$.
then for each $g = a g_1$ in $a\mathfrak{W}$, ${}_a\varphi(g)$ is defined as $\log g_1$. Note that the image
of $a\mathfrak{W}$ under the mapping $g \to {}_a\varphi(g)$ is the same open set $N$ in the coordinate
space $\Lambda$ as the image of $\mathfrak{W}$ under the mapping $g \to \log g$. Right-translated
charts $\{\mathfrak{W}a, \varphi_a, N\}$ are similar.
Comment. If $a$ lies in $\mathfrak{W}$, then the left (also right) translation by $a$ is a homeomorphism
in the basic chart, as far as it is defined, because the coordinates of
$ag$ are continuous (in fact, analytic) functions of the coordinates of $g$, by the
CBH formula, and the coordinates of $g$ are continuous functions of the coordinates
of $ag$, because $g = a^{-1}(ag)$. Hence, any translated chart is compatible
with the basic chart, and it will be shown that any two translated
charts are also compatible with each other. $G$ thus becomes a manifold, and
Theorem 2 below then shows that the translations are homeomorphisms
in all $G$.
Figure 25.1
first condition for compatibility is satisfied. Second, it must be proved that the coordinates
${}_a\varphi(g) = \log g_1$ and ${}_b\varphi(g) = \log h_1$ are analytic functions of each other,
for all $g$ that can be written both as $a g_1$ and $b h_1$, with $g_1$ and $h_1$ in $\mathfrak{W}$. Since $b^{-1}a =
h_1 g_1^{-1}$, and since $h_1$ and $g_1$ are in $\mathfrak{W}$, it follows that $b^{-1}a$ is in $\mathfrak{V}$ (even though $a$ and $b$
themselves may not even be in the neighborhood $U$ where logarithmic coordinates
are defined). Hence, $\log h_1$, which is equal to $\log((b^{-1}a)g_1)$, depends analytically on
$\log g_1$, according to the CBH formula. Similarly, $\log g_1$ depends analytically on
$\log h_1$. Lastly, a similar argument shows the compatibility of two right-translated
charts or of a right-translated and a left-translated one.
hence the last member of this equation must be shown to depend analytically on
$\log g$ and $\log h$. The element $b^{-1}gb$ is obtained from $g$ by a left translation by $b^{-1}$
Lie Algebra Homomorphisms
kind such that any relation that holds among elements of the first structure
also holds among their images in the second structure. Some of these relations
generally turn out to be trivial ones in the second structure, like $1 \cdot 1 = 1$ or
$0 + 0 = 0$, while the remaining relations may be regarded as showing some
of the main features of the first structure, but with less fine detail. Just as for
groups, such a mapping exists if and only if the first structure contains a
particular kind of substructure (e.g., a normal subgroup), which can serve as
the kernel of the mapping; the theory shows how to reconstruct the mapping
in question, when the substructure is given, by first forming the so-called
quotient or factor structure (e.g., factor group) and then constructing the so-
called natural homomorphism of the first structure onto the quotient struc-
ture, which is then shown to be equivalent to the original homomorphism.
The execution of this program is straightforward for Lie algebras. For
Lie groups, each idea introduced has an immediate parallel in the corresponding
Lie algebras, and this interplay between the groups and their algebras
provides powerful techniques for the investigation of the groups.
If $\Lambda$ and $\tilde{\Lambda}$ are real Lie algebras, a homomorphism $\Lambda \to \tilde{\Lambda}$ is a mapping $\psi$
of $\Lambda$ into $\tilde{\Lambda}$ which preserves all the operations of the Lie algebra; that is, if
$\lambda$, $\mu$ are in $\Lambda$ and $a$, $b$ are in $\mathbb{R}$, then $\psi(a\lambda + b\mu) = a\psi(\lambda) + b\psi(\mu)$, and
$\psi([\lambda, \mu]) = [\psi(\lambda), \psi(\mu)]$. For complex Lie algebras, the definition is the
same, except that $\mathbb{C}$ replaces $\mathbb{R}$.
Comment. It is not really necessary to assume in advance that $\tilde{\Lambda}$ is a Lie algebra;
it only needs to be a structure in which $\lambda + \mu$, $a\lambda$, and $[\lambda, \mu]$ are defined.
However, the part of it onto which $\Lambda$ is mapped by the homomorphism, i.e.,
the image $\psi(\Lambda)$, is then necessarily a Lie algebra.
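A concrete homomorphism of this kind can be tested on the algebra of vectors in $\mathbb{R}^3$ with the vector product as Lie product. The sketch below assumes the mapping $\psi$ that sends a vector $\lambda$ to the 3 x 3 matrix of the linear transformation $\nu \to \lambda \times \nu$, with the matrix commutator as the Lie product in the target; the names `ad`, `commutator`, and the sample vectors are illustrative choices, and integer entries keep the comparison exact.

```python
def cross(a, b):
    """Vector product a x b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ad(l):
    """Matrix of the linear map v -> l x v (the assumed homomorphism psi)."""
    return [[0, -l[2], l[1]],
            [l[2], 0, -l[0]],
            [-l[1], l[0], 0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Matrix commutator ab - ba, the Lie product in the target structure."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

lam, mu = (1, 2, 3), (-4, 0, 5)

# psi([lam, mu]) == [psi(lam), psi(mu)]
print(ad(cross(lam, mu)) == commutator(ad(lam), ad(mu)))    # prints True

# psi(a*lam + b*mu) == a*psi(lam) + b*psi(mu), here with a = 2, b = -3
combo = tuple(2 * x - 3 * y for x, y in zip(lam, mu))
linear = [[2 * p - 3 * q for p, q in zip(r1, r2)]
          for r1, r2 in zip(ad(lam), ad(mu))]
print(ad(combo) == linear)                                  # prints True
```

That the bracket is preserved is equivalent to the Jacobi identity for the vector product; the image of the mapping consists of the antisymmetric 3 x 3 matrices.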
space, called the factor space of $\Lambda$ modulo $\Lambda_0$. We shall show that if $\Lambda_0$ is an
ideal, the factor space can be interpreted as a Lie algebra.
hence, if $\lambda - \lambda_1$ and $\mu - \mu_1$ are in $\Lambda_0$, then both terms on the right are in $\Lambda_0$; hence,
$[\lambda, \mu]$ and $[\lambda_1, \mu_1]$ are in the same residue class, as claimed. (2) Conversely, if, for
arbitrary $\lambda$ and $\mu$, $[\lambda, \mu + \sigma]$ is always in the same residue class as $[\lambda, \mu]$, for any $\sigma$ in
$\Lambda_0$, then $[\lambda, \sigma] \in \Lambda_0$, and it follows that $\Lambda_0$ is an ideal. (3) The definition $[\bar{\lambda}_1, \bar{\lambda}_2] =
\overline{[\lambda_1, \lambda_2]}$ of the product of two residue classes shows that the mapping $\lambda \to \bar{\lambda}$ is a
homomorphism, and then it follows from the comment after the definition of homomorphism
that the factor space is a Lie algebra.
In the case of Lie groups, a homomorphism must preserve not only all
algebraic relations but also all local topological and analytic relations
arising from the manifold structure.
If $G$ and $\tilde{G}$ are Lie groups, then a mapping $\Psi$ of $G$ into $\tilde{G}$ is called a Lie-group
homomorphism if:
1. It is a homomorphism in the sense of group theory: $\Psi(gh) = \Psi(g)\Psi(h)$;
$\Psi(g^{-1}) = \Psi(g)^{-1}$.
2. It is a continuous mapping; that is, if $\varphi$ and $\tilde{\varphi}$ are any coordinate
systems in $G$ and $\tilde{G}$, respectively, then each component of the vector
(25.13-1)
is a continuous function of the components of $x$, for all $x$ such that the
above expression is defined.
Remark 4. If $\Psi$ is one-to-one and onto, and if $\Psi^{-1}$ is also continuous, then $\Psi$
is a Lie-group isomorphism.
PROOF OF THEOREM 1. For any $\lambda$ sufficiently near the origin in $\Lambda$, we can define
an element of $\tilde{\Lambda}$ as
$$\tilde{\lambda} = \log(\Psi(e^{\lambda})).$$
We must show that the mapping $\lambda \to \tilde{\lambda}$ is linear and maps $[\lambda, \mu]$ onto $[\tilde{\lambda}, \tilde{\mu}]$. We
show first that it maps $t\lambda$ onto $t\tilde{\lambda}$ for real $t$, that is, that $\widetilde{t\lambda} = t\tilde{\lambda}$. For fixed $\lambda$, the set
$\Psi(e^{t\lambda})$, $t \in \mathbb{R}$, is a one-parameter subgroup of $\tilde{G}$ which includes the group element
$e^{\tilde{\lambda}}$; hence for each $t$ there is a real number $s = f(t)$ such that
Hence $f(t + s) = f(t) + f(s)$, and the only continuous functions with this property
are linear. Since also $f(0) = 0$, $f(1) = 1$, we see that $f(t) \equiv t$, as was to be shown.
We now apply the CBH formula to both sides of the equation $\Psi(e^{s\lambda}e^{t\mu}) = e^{s\tilde{\lambda}}e^{t\tilde{\mu}}$, and
we see that
$$\widetilde{s\lambda + t\mu + \tfrac{1}{2}st[\lambda, \mu] + \cdots} = s\tilde{\lambda} + t\tilde{\mu} + \tfrac{1}{2}st[\tilde{\lambda}, \tilde{\mu}] + \cdots$$ (25.13-2)
for all $s$ and $t$. We write $s = \varepsilon s'$, $t = \varepsilon t'$. Since $\widetilde{\varepsilon\lambda} = \varepsilon\tilde{\lambda}$, we can cancel one factor $\varepsilon$ on
each side of the above equation. Then, as $\varepsilon \to 0$, the quadratic and higher terms
vanish; hence
$$\widetilde{s'\lambda + t'\mu} = s'\tilde{\lambda} + t'\tilde{\mu},$$
and this establishes the full linearity of the mapping $\lambda \to \tilde{\lambda}$. Then the linear terms can
be dropped from both sides of (25.13-2), and by a similar argument we see that
This theorem does not have a global converse, but only a local one, and a
definition is needed before the converse can be stated. If $U$ is a neighborhood
of the identity 1 of a Lie group $G$, then an analytic mapping $\Psi$ of $U$ into a Lie
group $\tilde{G}$ such that $\Psi(gh) = \Psi(g)\Psi(h)$ whenever $g$, $h$, and $gh$ are in $U$ is called
a local homomorphism of $G$ into $\tilde{G}$. If the inverse mapping is also a local
homomorphism, i.e., is unique and analytic in some neighborhood of 1 in $\tilde{G}$,
then $\Psi$ is a local isomorphism. If, furthermore, $\tilde{G} = G$, then $\Psi$ is a local automorphism
of $G$.
We come now to the converse of Theorem 1.
Theorem 2. If $\Lambda$ and $\tilde{\Lambda}$ are the Lie algebras of $G$ and $\tilde{G}$, then any Lie-algebra
homomorphism $\psi: \Lambda \to \tilde{\Lambda}$ induces a local homomorphism $\Psi: G \to \tilde{G}$
given by the exponential mapping, namely $\Psi(e^{\lambda}) = e^{\psi(\lambda)}$, for $e^{\lambda}$ in a suitable
neighborhood of 1 in $G$.
PROOF. By the CBH formula,
$$\Psi(e^{\lambda}e^{\mu}) = \Psi(e^{\lambda + \mu + \frac{1}{2}[\lambda, \mu] + \cdots})
= e^{\psi(\lambda + \mu + \frac{1}{2}[\lambda, \mu] + \cdots)};$$
and since the CBH formula holds also in $\tilde{G}$, the above is equal to $e^{\psi(\lambda)}e^{\psi(\mu)}$; that is,
$\Psi(e^{\lambda}e^{\mu}) = \Psi(e^{\lambda})\Psi(e^{\mu})$, as was to be proved.
e~~ e~P).
Its Lie algebra $\Lambda$ is the commutative Lie algebra of matrices of the form
has the property that $g^2$ is the identity of $G$, and this property is not preserved
under $\Psi$, except for special choices of $a$ and $b$.
PROOF OF PART (a). For arbitrary fixed $g_0$, let $g = g_0 h$, where $h$ is in the neighborhood
$U$, so that the mapping $h \to \Psi(h)$ is analytic. Then, $\Phi(g) = \Phi(g_0)\Psi(h)$, but
products and inverses are analytic throughout both groups (Section 25.11); hence $h$
is analytic in $g$, and $\Phi(g)$ is analytic in $g$.
PROOF OF PART (b). Any $g$ in $G$ can be written as $g_1 g_2 \cdots g_k$, where all the factors
are in $U$; hence, if $\Phi$ is any extension of the homomorphism $\Psi$, then $\Phi(g) = \Phi(g_1) \cdots
\Phi(g_k) = \Psi(g_1) \cdots \Psi(g_k)$, which is completely determined by $\Psi$; hence any two such
extensions must agree for every $g$.
PROOF OF PART (c). As on previous occasions, we let $\mathfrak{V}$ be a subneighborhood of
$U$, containing 1, such that if $g$ and $h$ are in $\mathfrak{V}$, $g^{-1}$ and $h^{-1}$ are in $\mathfrak{V}$, while $gh$ and $hg$
are in $U$. The mapping $\Phi$ will be constructed. Let $h$ and $k$ be elements of $G$ joined by a
smooth curve $\mathcal{C}$. Let $\mathcal{C}$ be partitioned into smaller segments by points (group elements)
$g_0, g_1, \ldots, g_l$ so that $g_0 = h$ and $g_l = k$, while $g_i^{-1}g_{i+1}$ is always in $\mathfrak{V}$. We call
(which cancel) between the first two factors of 9 and by then noticing that
'P(9jlgl)'P(gjlgZ)'P(g119z) = 'P(9jI 92 )
In this way it is seen that $\tilde{g}' = \tilde{g}$, that is, that $\tilde{g}$ is unaltered by a continuous deformation
of the curve $\mathcal{C}$, keeping the endpoints $h$ and $k$ fixed. Lastly, if $G$ is simply connected,
any two curves from $h$ to $k$ are homotopic; hence $\tilde{g}$ depends only on $h$ and $k$, so that we
can write $\tilde{g} = \tilde{g}(h, k)$. Furthermore,
$$\tilde{g}(gh, gk) = \tilde{g}(h, k) \quad \text{for any } g \text{ in } G,$$
since the same is true of each factor $\Psi(g_i^{-1}g_{i+1})$. It follows that the mapping $\Phi$ of $G$
into $\tilde{G}$ given by $\Phi(g) = \tilde{g}(1, g)$ is the required extension of the local homomorphism
$\Psi$, because
$$\Phi(g_1 g_2) = \tilde{g}(1, g_1)\,\tilde{g}(g_1, g_1 g_2) = \tilde{g}(1, g_1)\,\tilde{g}(1, g_2),$$
by addition of curves in $G$.
Theorem 1. The kernel of a Lie group homomorphism $\Psi$ of $G$ onto $\tilde{G}$, that is,
the set
$$G_0 = \{g \in G: \Psi(g) = \tilde{1}\ (\text{identity of } \tilde{G})\}$$
is a closed normal subgroup of $G$.
Theorem 2. Let $G_0$ be a closed subgroup of $G$. Then the set $\Lambda_0$ of all tangent
vectors to $G_0$ at 1 is a subalgebra of the Lie algebra of $G$. If $G_0$ is a normal
subgroup, then $\Lambda_0$ is an ideal.
PROOF. Suppose $\lambda$ is in $\Lambda_0$. Then there is a smooth curve $g(t)$ that lies in $G_0$ and is
such that
$$\log(g(t)) = \lambda t + \cdots,$$
where the dots denote terms of order $t^2$. For each positive integer $m$, $g(t/m)^m$ is in
$G_0$, and, by the CBH formula,
where now the dots denote terms of order $t^2/m$. By letting $m \to \infty$, we see, since $G_0$
is a closed set, that $\lambda t$ is the coordinate of a point in $G_0$; hence, for $\lambda$ in $\Lambda_0$, $e^{\lambda}$ is in $G_0$.
If $\lambda_1$ and $\lambda_2$ are both in $\Lambda_0$, then, by the CBH formula, the vector
$$\log(e^{\lambda_1 t}e^{\lambda_2 t}) = t(\lambda_1 + \lambda_2) + \tfrac{1}{2}t^2[\lambda_1, \lambda_2] + \cdots$$
is the coordinate of a point in $G_0$. By an argument similar to the one above, we see
from the linear terms that $\lambda_1 + \lambda_2$ is in $\Lambda_0$, and more generally so is $t\lambda_1 + s\lambda_2$ for
real $t$ and $s$, so that $\Lambda_0$ is a subspace. In a like manner we see from the quadratic
terms that $[\lambda_1, \lambda_2]$ is in $\Lambda_0$; hence $\Lambda_0$ is a subalgebra. Lastly, if $G_0$ is a normal subgroup,
only one of $\lambda_1$ and $\lambda_2$ need be in $\Lambda_0$, and we see that $\Lambda_0$ is an ideal.
The next two theorems show that, if $G_0$ is a closed normal subgroup, then
the factor group $G/G_0$ can be endowed with a manifold structure which makes
it into a Lie group; the Lie algebra of this group is isomorphic to $\Lambda/\Lambda_0$;
the natural homomorphism of $G$ onto $G/G_0$ is analytic with respect to this
manifold structure. The last theorem of the section is the homomorphism
law itself.
It is easily seen that the last n - k components $\lambda_{k+1}, \ldots, \lambda_n$ of $\lambda$ with respect
to the basis $E_1, \ldots, E_n$ described above can be taken as coordinates in $G/G_0$,
for they are constant in each coset (of $G_0$ in G), or more precisely in the
intersection of the coset and a suitable neighborhood $\mathfrak{N}$ of 1 in G in which
logarithmic coordinates can be used. Namely, if $e^{\lambda'} = e^{\lambda} e^{\mu}$, where $\mu$ is in $A_0$,
so that $e^{\lambda'}$ and $e^{\lambda}$ are in the same coset, then, by the CBH formula,
$$\lambda' = \lambda + \mu + \tfrac{1}{2}[\lambda, \mu] + \cdots;$$
all the terms on the right, starting with $\mu$, are in $A_0$, because $A_0$ is an ideal;
hence $\lambda'$ and $\lambda$ have the same last n - k components. It is also clear that these
last components are different (i.e., at least one of them is different) in different
cosets. In these coordinates, the product and inversion functions $m(\cdot,\cdot)$
and $i(\cdot)$ in $G/G_0$ are continuous (in fact analytic): For any product $e^{\lambda} e^{\lambda'} = e^{\lambda''}$
in G, all n components of $\lambda''$ depend continuously on all the components of $\lambda$
and of $\lambda'$; hence in particular the same is true of the last n - k components,
which are the coordinates of the corresponding cosets. The natural
homomorphism of G onto $G/G_0$, in which a group element $e^{\lambda}$ is mapped onto the
coset in which it lies, is continuous in these coordinates, for the mapping
consists in simply ignoring the first k components of $\lambda$.
In this way the following theorem is established.
Theorem 4. Let G, $G_0$, A, $A_0$, $E_1, \ldots, E_n$, $\lambda_1, \ldots, \lambda_n$ be as above. Then
$\lambda_{k+1}, \ldots, \lambda_n$ can be taken as coordinates $\varphi(\bar g)$ ($\bar g$ denotes a coset) in a subset
$\Pi$ of $G/G_0$ (consisting of cosets that intersect the neighborhood $\mathfrak{N}$ in G),
thus defining a chart which makes $G/G_0$ an (n - k)-dimensional Lie group,
called the Lie factor group (it is also denoted by $G/G_0$) of G with respect to
$G_0$. The natural homomorphism of G onto $G/G_0$ is continuous, and hence is
a Lie group homomorphism (hence is analytic).
158 Lie Groups
The following example shows that the conclusions of the theorem need
not hold if $G_0$ is not a closed subgroup: Let G be the 2-dimensional torus group
consisting of matrices
where e is a fixed real irrational number. The manifold of G is a torus, and that
of $G_0$ is a helical curve everywhere dense on the torus. $G_0$ is a normal subgroup,
but is not closed. The Lie algebra A is a plane, and, under the mapping
$e^{\lambda} \to \lambda$ of G onto A, $G_0$ maps onto a set of parallel straight lines dense in the
plane. $A_0$ is that line of the set that passes through the origin. With respect
to the basis ($E_1$, $E_2$) referred to, where now $E_1$ lies in $A_0$, the second coordinate
$\lambda_2$ has different values on different lines of the set that constitutes $G_0$, and
since any neighborhood of the origin intersects infinitely many of the lines,
$\lambda_2$ is not constant in $G_0$ or in any coset.
The last two theorems are now stated without proof.
The first seven of the exercises below deal with the somewhat elusive
question, mentioned in Section 25.1, as to which Lie groups have faithful
representations, and hence are linear groups, i.e., can be regarded as groups
of matrices (or of the corresponding linear transformations), as is normally
true of the groups that appear in applications. Every Lie algebra A has at
least one representation, the so-called adjoint representation of A on itself
(Exercise 2). By means of the exponential mapping, this gives a local
representation of a Lie group G on its algebra A. This may or may not be extendable
to a representation on all of G, and, if it can, it may or may not be faithful.
Exercise 8 deals with covering groups. SU(2) is the universal covering
group of SO(3). The basic theorem of Section 24.3 on covering manifolds
shows that there is no group not isomorphic to SU(2) that covers SU(2).
Hence, as stated in Section 21.1, there are no multivalued representations of
SO(3) except the 2-valued ones.
Law of Homomorphism for Lie Groups 159
EXERCISES
The center C of a group G is the set of those group elements that commute with
every group element, i.e.,
$$C = \{g \in G : gh = hg \ \ \forall h \text{ in } G\}.$$
Similarly, the center Z of a Lie algebra A is the set of those elements that commute with
every element in the algebra, i.e.,
$$Z = \{\lambda \in A : [\lambda, \mu] = 0 \ \ \forall \mu \text{ in } A\}.$$
1. Show that the center of a Lie group is a closed normal subgroup, and the center
of a Lie algebra is an ideal.
2. Recall that, for any $\lambda$ in A, $\mathrm{Ad}_\lambda$ is the linear transformation $\mu \to [\lambda, \mu]$ of A
into itself. Show that if the Lie product of two such transformations is defined in the
usual way
$$[\mathrm{Ad}_\lambda, \mathrm{Ad}_\mu] = \mathrm{Ad}_\lambda \mathrm{Ad}_\mu - \mathrm{Ad}_\mu \mathrm{Ad}_\lambda,$$
whereupon the set $\{\mathrm{Ad}_\lambda : \lambda \in A\}$ of all such transformations becomes a Lie algebra,
then the mapping $\lambda \to \mathrm{Ad}_\lambda$ is a homomorphism of A onto this new algebra. This
homomorphism is called the adjoint representation of A (on itself).
3. Show that if A is center-free, which means that Z = {0}, then the adjoint
representation is faithful (i.e., the above homomorphism is an isomorphism).
4. If G is simply connected, then the local homomorphism $e^{\lambda} \to e^{\mathrm{Ad}_\lambda}$ of G onto a
group of linear transformations in A, discussed in Section 25.7, can be extended to a
homomorphism of all G, called the adjoint representation of G on A. Show that a necessary
condition for this homomorphism to be an isomorphism is that G be center-free, which
means that C = {1}; then the homomorphism is locally an isomorphism.
5. To illustrate Exercise 4, let G be SU(2) and take the 2 x 2 matrices $T_1$, $T_2$, $T_3$
defined by (21.2-4) as a basis of the Lie algebra A of SU(2). Then, the transformations
$\mathrm{Ad}_\lambda$ are represented by 3 x 3 real matrices. Show that the matrices $e^{\mathrm{Ad}_\lambda}$ constitute the
group SO(3) and that the homomorphism $e^{\lambda} \to e^{\mathrm{Ad}_\lambda}$ is then the familiar 2-to-1
homomorphism of SU(2) onto SO(3). What are the centers of SU(2) and SO(3)?
6. Show that if C is the center of a group G, then the factor group G/C is not
necessarily center-free by considering the finite group
$$G = \{\pm 1, \pm i, \pm j, \pm k\},$$
where i, j, k are the quaternion units, which satisfy the equations
$$i^2 = j^2 = k^2 = -1,$$
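The claim of this exercise can be checked by brute force. The sketch below is a hedged illustration, not part of the text: it assumes the group in question is the eight-element quaternion group with elements ±1, ±i, ±j, ±k and the usual relations ij = k, jk = i, ki = j, and it encodes each element as a hypothetical pair (sign, unit). It computes the center, forms the cosets of the center, and tests whether the factor group is center-free.

```python
def unit_mul(a, b):
    """Product of two quaternion units from '1', 'i', 'j', 'k' as (sign, unit)."""
    if a == "1":
        return (1, b)
    if b == "1":
        return (1, a)
    if a == b:
        return (-1, "1")          # i^2 = j^2 = k^2 = -1
    cyclic = {("i", "j"): "k", ("j", "k"): "i", ("k", "i"): "j"}
    if (a, b) in cyclic:
        return (1, cyclic[(a, b)])
    return (-1, cyclic[(b, a)])   # reversed order changes the sign

def mul(g, h):
    sign, unit = unit_mul(g[1], h[1])
    return (g[0] * h[0] * sign, unit)

# Assumption: the group consists of the eight elements +-1, +-i, +-j, +-k.
Q8 = [(s, u) for s in (1, -1) for u in "1ijk"]

def center(elements):
    return [g for g in elements if all(mul(g, h) == mul(h, g) for h in elements)]

C = center(Q8)

def coset(g):
    """The coset gC, represented as a frozenset of group elements."""
    return frozenset(mul(g, c) for c in C)

cosets = {coset(g) for g in Q8}

def coset_mul(a, b):
    # Well defined because C is a normal subgroup; any representatives will do.
    return coset(mul(next(iter(a)), next(iter(b))))

quotient_center = [a for a in cosets
                   if all(coset_mul(a, b) == coset_mul(b, a) for b in cosets)]

print("center:", sorted(C))
print("order of G/C:", len(cosets))
print("order of the center of G/C:", len(quotient_center))
```

Under these assumptions the factor group turns out to be commutative, so its center is the whole factor group rather than the identity alone.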
$g'(0) = h'(0) = 1'$, while $g'(1) = g'$ and $h'(1) = h'$. Let g(s) and h(s) be the projections of
$g'(s)$ and $h'(s)$ down into $\mathfrak{M}$; i.e., $g(s) = \psi(g'(s))$ and $h(s) = \psi(h'(s))$. Then, k(s), defined
as g(s)h(s), is a curve in $\mathfrak{M}$ starting at 1. Let $k'(s)$ be the curve in $\mathfrak{M}'$ that results from
lifting k(s) up to $\mathfrak{M}'$ in such a way that $k'(0) = 1'$. (See Section 24.2.) Then, the product
$g'h'$ in $\mathfrak{M}'$ is defined to be $k'(1)$. Show that this definition is consistent (i.e., independent
of the choice of the curves; recall that $\mathfrak{M}'$ is simply connected) and that it makes G'
into a Lie group. Show that the projection $\psi$ is a Lie group homomorphism of G' onto G.
Show that if $g'_0$ lies over 1 [i.e., if $\psi(g'_0) = 1$; i.e., if $g'_0$ is in the kernel of the homomorphism
just referred to], then $g'_0$ commutes with every $h'$ in G'. Hint: Choose the defining curves
in $\mathfrak{M}'$ in such a way that $h'(s) = 1'$ for $0 \le s \le \tfrac{1}{2}$ while $g'_0(s) = g'_0$ for $\tfrac{1}{2} \le s \le 1$.
The concepts introduced in this section are analogous to the direct and
semidirect products of groups, defined in Section 22.9, in connection with the
crystallographic space groups. Suppose that a Lie algebra can be decomposed
as the direct sum (in the vector space sense) of two subspaces $A_1$ and $A_2$;
i.e., any $\lambda$ in A can be uniquely decomposed as $\lambda_1 + \lambda_2$, where $\lambda_1$ is in $A_1$ and
$\lambda_2$ is in $A_2$. Suppose, further, that $[\lambda_1, \lambda_2] = 0$ for every $\lambda_1$ in $A_1$ and $\lambda_2$ in $A_2$.
Then $A_1$ and $A_2$ are both ideals in A, and A is said to be their direct sum.
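Now suppose that A is the direct sum (in the vector space sense) of $A_0$
and M, where $A_0$ is an ideal, while M is merely a subalgebra. Then, for $\lambda_1$
and $\lambda_2$ in $A_0$ and $\mu_1$ and $\mu_2$ in M,
$$[\lambda_1 + \mu_1, \lambda_2 + \mu_2] = [\lambda_1, \lambda_2] + [\mu_1, \lambda_2] + [\lambda_1, \mu_2] + [\mu_1, \mu_2].$$
The first three terms on the right are all in $A_0$ (because $A_0$ is an ideal) and can
be rewritten as
$$[\lambda_1, \lambda_2] + \mathrm{Ad}_{\mu_1} \lambda_2 - \mathrm{Ad}_{\mu_2} \lambda_1.$$
The mapping
$$\mu \to \mathrm{Ad}_\mu$$
is a representation of M on $A_0$, according to Exercise 2 of the preceding
section, because, for $\mu$ and $\nu$ in M,
$$\mathrm{Ad}_{[\mu, \nu]} = \mathrm{Ad}_\mu \mathrm{Ad}_\nu - \mathrm{Ad}_\nu \mathrm{Ad}_\mu.$$
$\rho(x) \circ y + x \circ \rho(y)$, where the circle denotes the multiplication in the algebra.]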
Direct and Semidirect Sums of Lie Algebras 161
Now let $A_0$ and M be any given Lie algebras, and let the mapping
EXERCISE
1. Show that the product just defined satisfies the Jacobi identity.
EXERCISE
2. Let $G_0$ and H be closed subgroups of a Lie group G, where $G_0$ is normal. Assume
that each g in G has a unique representation as $g_0 h$, where $g_0$ and h are in $G_0$ and H,
respectively. Let A, $A_0$, and M be the Lie algebras of G, $G_0$, and H, and show that
$A = A_0 \oplus_\rho M$, where, for any $\mu$ in M, $\rho(\mu) = \mathrm{Ad}_\mu$. Hint: For $\lambda$ in $A_0$ and $\mu$ in M, let
$\{\lambda, \mu\}$ denote $\log(e^{\lambda} e^{\mu})$ and find the Lie product of two such curly bracket expressions
by applying the expansion of the CBH formula to
[Diagram: Lie group --> real Lie algebra --> complex Lie algebra; simple complex Lie algebra --> simple real Lie algebras --> Lie groups]
Each Lie group determines a unique real Lie algebra, which in turn deter-
mines a unique complex Lie algebra, by a process called complexification,
described below. The complex case is simpler than the real case, just as in
elementary matrix theory, because the complex number system $\mathbb{C}$ is
algebraically closed, while $\mathbb{R}$ is not. (It is recalled that a real matrix generally
has complex eigenvalues and eigenvectors.) There exists a complete
classification of the simple complex Lie algebras into four main series of algebras and
five so-called exceptional algebras. The next step is to find all the simple
real algebras whose complexification leads to a given complex algebra.
This step is carried through in Hausner and Schwartz 1968, where the reader
can find a complete classification of the simple real algebras. The result is
considerably more elaborate than the classification of the complex algebras,
but it is still two steps removed from a classification of the Lie groups; for
this, one must first find all possible repeated indirect sums of 1-dimensional
and simple algebras, as described at the end of the preceding section, and
then find all (say connected) Lie groups that yield a given real Lie algebra.
We shall sketch the development very briefly through the classification of
the simple complex algebras. For the algebraic details and the many lemmas
needed for the proofs, the reader is referred to Hausner and Schwartz 1968.
As indicated in the preceding section, we are mainly interested in the simple
algebras, but, in the analysis of them, certain nonsimple algebras appear,
namely the semisimple, solvable, and nilpotent Lie algebras. To define those,
we note first that if $A_1$ and $A_2$ are any ideals in a real or complex Lie algebra
A, then $[A_1, A_2]$, defined as the subspace spanned by elements of the form
$[\lambda_1, \lambda_2]$, where $\lambda_1$ is in $A_1$ and $\lambda_2$ is in $A_2$, namely, the subspace
$$[A_1, A_2] = \operatorname{span}\{[\lambda_1, \lambda_2] : \lambda_1 \in A_1, \ \lambda_2 \in A_2\}$$
is an ideal contained in both $A_1$ and $A_2$. We then define two descending
sequences of ideals in A, namely,
$$A^1 = A \supset A^2 \supset A^3 \supset \cdots$$
and
inductively by
$$A^{n+1} = [A, A^n],$$
Classification of the Simple Complex Lie Algebras 163
where $V_{\alpha_j}$ is the weight space that corresponds to the weight $\alpha_j(\cdot)$, j = 1, ..., k.
weight spaces of this representation are then called roots, root vectors, and
root spaces of M in A. If $\alpha(\cdot)$ is a root, the corresponding root space is
denoted by $A_\alpha$; it is a subspace of A. From the nilpotence of M it follows that
the zero function, $\alpha(\lambda) = 0$ for all $\lambda$, is one of the roots, and the corresponding
root space, called $A_0$, contains M. If the nilpotent subalgebra M can be so
chosen that $A_0 = M$, then M is a Cartan subalgebra of A. A basic theorem
states that every complex Lie algebra has a Cartan subalgebra.
It turns out that if A is a complex semisimple Lie algebra, then (a) the
Cartan subalgebra M is commutative, (b) for each $\alpha \ne 0$, the root space
$A_\alpha$ is one-dimensional, (c) if $\alpha$ is a root, $-\alpha$ is also a root, and (d) if $\lambda$ and $\lambda'$
are nonzero vectors in $A_\alpha$ and $A_{-\alpha}$, then $[\lambda, \lambda']$ is a nonzero vector in M,
and $(\lambda, \lambda') \ne 0$. We number the nonzero roots $\alpha_1, \alpha_2, \ldots, \alpha_k$; we choose
vectors $\lambda_i$ and $\lambda_{-i}$ in $A_{\alpha_i}$ and $A_{-\alpha_i}$, so normalized that $(\lambda_i, \lambda_{-i}) = 1$, and we
call
i = 1, ..., k.
It can be shown that the vectors $\mu_i$ span M.
It follows from (a) and (b) in the preceding paragraph that for a semisimple
algebra, only ordinary root vectors (i.e., no generalized ones) appear. For
the root vectors $\lambda_\alpha$ ($\alpha \ne 0$), that follows from the one-dimensionality of $A_\alpha$;
and every vector $\nu$ in $A_0 = M$ is a root vector, because $\mathrm{Ad}_\mu \nu = 0$ for all $\mu$ in M.
The Cartan subalgebra is not unique, but it can be shown that if M' is
any other Cartan subalgebra in A, then M and M' have the same dimension,
and there is an automorphism of A that carries M onto M'; hence, either of
them can be used to investigate the structure of A.
It is found that the configuration of the vectors $\mu_i$ completely determines
the Lie algebra. The description of this configuration is greatly simplified by
the fortunate fact that if $M_r$ denotes the real vector space consisting of
linear combinations of the $\mu_i$ with real coefficients, then the natural bilinear
form $(\cdot, \cdot)$ is real and positive definite in $M_r$; hence, $M_r$ is a Euclidean space,
if $(\cdot, \cdot)$ is taken as the scalar product. It can be shown that the real dimension
of $M_r$ is the same as the complex dimension of M, and we call it m. The
(complex) dimension of A is then m + 2k. The length of a vector $\mu$ in $M_r$ is
$\|\mu\| = (\mu, \mu)^{1/2}$, and the angle between two such vectors is
$$\cos \angle(\mu, \nu) = \frac{(\mu, \nu)}{\|\mu\| \, \|\nu\|}.$$
The star in $M_r$ consisting of the vectors $\mu_i$, thought of as radiating out from
the origin, has a rather high degree of symmetry and can be described in the
following terms: (1) For any given simple algebra A, either all the $\mu_i$ have the
same length, or there are just two lengths, some of the $\mu_i$ being short, and
the others long. (2) The angle between any two of the vectors is an integer
multiple of 30° or 45°. (3) If the angle is 30° or 150°, one vector is long and the
other short; the ratio of the lengths is $\sqrt{3}$. If the angle is 45° or 135°, the ratio
of the lengths is $\sqrt{2}$. If the angle is 60° or 120°, the two vectors have the same
length. (4) The entire star is symmetric with respect to reflection in each
hyperplane perpendicular to one of the $\mu_i$. Every minimal star that satisfies
these conditions determines a unique simple complex Lie algebra, and
different stars determine different algebras. If A is merely semisimple and is a
direct sum $A = A_1 \oplus \cdots \oplus A_k$ of simple algebras, then $M_r$ is spanned by k
mutually orthogonal subspaces, each containing the star of one of the simple
algebras.
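Conditions (2) to (4) can be tested mechanically on a concrete star. The following sketch is an illustration only: it assumes a standard coordinate realization of the two-dimensional star with two lengths and 45° angles (the vectors (±1, 0), (0, ±1), (±1, ±1), usually associated with $B_2$); these coordinates and the helper names are assumptions of the example, not data taken from the text.

```python
import math
from fractions import Fraction

# Assumed coordinates for a two-length planar star (commonly used for B_2).
B2_STAR = [(1, 0), (-1, 0), (0, 1), (0, -1),
           (1, 1), (1, -1), (-1, 1), (-1, -1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle_deg(u, v):
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def reflect(v, normal):
    """Reflect v in the hyperplane perpendicular to `normal` (exact arithmetic)."""
    f = Fraction(2 * dot(v, normal), dot(normal, normal))
    return tuple(a - f * b for a, b in zip(v, normal))

def angles_in_star(star):
    """Condition (2): the set of angles between distinct vectors, in degrees."""
    return sorted({round(angle_deg(u, v)) for u in star for v in star if u != v})

def length_rule_holds(star):
    """Condition (3), 45/135 degree case: squared lengths are in the ratio 2."""
    for u in star:
        for v in star:
            if round(angle_deg(u, v)) in (45, 135):
                long_sq = max(dot(u, u), dot(v, v))
                short_sq = min(dot(u, u), dot(v, v))
                if long_sq != 2 * short_sq:
                    return False
    return True

def closed_under_reflections(star):
    """Condition (4): each reflection maps the star onto itself."""
    members = set(star)
    return all(reflect(v, n) in members for v in star for n in star)

print(angles_in_star(B2_STAR))
print(length_rule_holds(B2_STAR), closed_under_reflections(B2_STAR))
```

The same three functions can be applied to any other candidate list of vectors; a list that fails `closed_under_reflections` cannot be a star in the sense described above.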
We now assume that A is simple. When $M_r$ is one-dimensional, the star
consists of two opposed vectors of equal length, and the algebra A, called
$A_1$, has dimension l = 3. When $M_r$ is two-dimensional, there are three
possible stars, shown in Figure 25.2 together with the designations and
dimensions l of the corresponding algebras, which are called $A_2$, $B_2$, and $G_2$.
When $M_r$ is three-dimensional, there are again three possible stars,
corresponding to algebras called $A_3$, $B_3$, and $C_3$. The star of $A_3$ consists of six
pairs of opposed vectors $\mu_i$ and $\mu_{-i}$, all equal in length, and extending from
the origin to the midpoints of the edges of a cube; the angles that occur are
60°, 90°, 120°, and 180°. In the star of the algebra $B_3$ there are six pairs of
long vectors, arranged as for $A_3$, extending to the edges of a cube, and three
pairs of mutually orthogonal short vectors making angles of 45° with the
nearest long ones, the length ratio being $\sqrt{2}$; the short vectors extend to the
midpoints of the faces of the cube referred to. The star of $C_3$ is the same as
that of $B_3$, but with the long and short vectors interchanged, so that the star
fits into a rhombic dodecahedron. The dimension number l is equal to 8, 10,
14, 15, 21, and 21, for the algebras $A_2$, $B_2$, $G_2$, $A_3$, $B_3$, and $C_3$, respectively.
It is of course meaningless to talk about the lengths and directions of the
vectors $\lambda_i$, because they lie in the complex space A, for which $(\cdot, \cdot)$ is not
even a Hermitian inner product. What is meaningful is to find the Lie
products $[\lambda, \mu]$ for sufficiently many pairs $\lambda$, $\mu$ so as to determine the structure
of A. That is best done by means of the models described in the next section.
To determine the possible stars, when $M_r$ is of more than 3 dimensions,
one makes use of a device due to the Soviet mathematician E. B. Dynkin. A
simple set of vectors in the star is a certain set $\Pi$ of just m of the vectors $\mu_i$
(m is always < 2k) such that all the vectors of the star can be obtained by
repeated additions and subtractions, starting with the vectors of $\Pi$, and
such that only one set of vectors satisfying the conditions (1)-(4), above, i.e.,
only one star, can be obtained in this way. It can be proved that it is always
possible to choose a simple set of vectors. Furthermore, although the set $\Pi$
is not in general unique, if $\Pi'$ is another simple set, then there is an
automorphism of A under which M is invariant and $\Pi$ is carried into $\Pi'$; hence
it is unimportant which simple set is used. The possible angles between any
two vectors of $\Pi$ are 90°, 120°, 135°, and 150°. A Dynkin diagram is a set of m
points or small circles on a plane, one for each vector in $\Pi$. If the angle
between two vectors in $\Pi$ is 120°, 135°, or 150°, then the corresponding
points of the diagram are joined by a single, double, or triple line
respectively; if the angle is 90°, the points are not directly connected. If the angle is
135° or 150°, the point corresponding to the shorter vector is indicated by an
asterisk. Then, a number of things can be proved about the diagrams of
simple complex algebras; for example, a diagram can contain no loops, it is
connected, it can contain at most one double or triple line, it can have at
most one branching, and so on. In consequence of these rules, it is found that
there can be just seven types of Dynkin diagrams, as follows (m is the number
of points and is equal to the dimension of M, l is the dimension of A):
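$F_4$: four points in a chain, the middle pair joined by a double line; l = 52
$G_2$: two points joined by a triple line; l = 14
Figure 25.3 Dynkin diagrams for the simple complex Lie algebras.
Types $A_m$, $B_m$, $C_m$, and $D_m$ constitute the regular series, and the remaining
five algebras are called exceptional.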
continue to denote the elements of the algebras by the symbols $\lambda, \mu, \ldots$, even
though other symbols might seem more appropriate for matrices.
1. $A_m$ consists of all (m + 1) x (m + 1) complex matrices of trace zero.
See exercises below. For $B_m$ and $D_m$ it is necessary to introduce the
antidiagonal matrix
$$J = (J_{ij}), \qquad J_{ij} = \delta_{i,\,p+1-j} \qquad (i, j = 1, \ldots, p),$$
that is, the p x p matrix with ones on the antidiagonal and zeros elsewhere.
2. $B_m$ consists of all (2m + 1) x (2m + 1) complex matrices $\lambda$ such that
$\lambda J + J \lambda^T = 0$ (p = 2m + 1).
3. $D_m$ consists of all 2m x 2m complex matrices $\lambda$ such that $\lambda J + J \lambda^T = 0$
(p = 2m).
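For $C_m$ it is necessary to introduce the 2m x 2m antidiagonal matrix
$$J' = \begin{pmatrix} 0 & J \\ -J & 0 \end{pmatrix},$$
where J here denotes the m x m antidiagonal matrix of ones, so that J' has
+1 on one half of its antidiagonal, -1 on the other half, and zeros elsewhere.
4. $C_m$ consists of all 2m x 2m complex matrices $\lambda$ such that $\lambda J' + J' \lambda^T = 0$.
The following exercises concern the series $A_m$. The series $B_m$, $C_m$, and $D_m$
are similar. The models of the exceptional algebras are more complicated and
are given in Hausner and Schwartz.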
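The dimensions quoted earlier for the low-rank algebras can be recovered from these matrix models by counting the solutions of the defining linear condition. The sketch below is a hedged illustration: it treats $\lambda K + K \lambda^T = 0$ as a homogeneous linear system in the $p^2$ entries of $\lambda$ and computes the dimension of its solution space by exact Gaussian elimination. The sign placement in `symplectic_antidiag` (+1 on the upper half of the antidiagonal, -1 on the lower half) is an assumption; the opposite choice defines the same algebra.

```python
from fractions import Fraction

def antidiag(p):
    """p x p matrix with ones on the antidiagonal."""
    return [[1 if i + j == p - 1 else 0 for j in range(p)] for i in range(p)]

def symplectic_antidiag(m):
    """2m x 2m antidiagonal matrix: +1 on the upper half, -1 on the lower half."""
    p = 2 * m
    return [[(1 if i < m else -1) if i + j == p - 1 else 0 for j in range(p)]
            for i in range(p)]

def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def solution_dimension(K):
    """Dimension of the space of matrices lam with lam K + K lam^T = 0."""
    p = len(K)
    system = []
    for i in range(p):
        for j in range(p):
            row = [0] * (p * p)       # unknown lam[a][b] has column index a*p + b
            for a in range(p):
                row[i * p + a] += K[a][j]   # (lam K)_ij   = sum_a lam[i][a] K[a][j]
                row[j * p + a] += K[i][a]   # (K lam^T)_ij = sum_a K[i][a] lam[j][a]
            system.append(row)
    return p * p - rank(system)

for name, K in [("B_2", antidiag(5)), ("B_3", antidiag(7)),
                ("D_3", antidiag(6)), ("C_3", symplectic_antidiag(3))]:
    print(name, solution_dimension(K))
```

The value obtained for $D_3$ coincides with the dimension of $A_3$, in line with the fact that only $A_3$, $B_3$, and $C_3$ were listed as distinct algebras with a three-dimensional $M_r$.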
EXERCISES
1. Let N be the Lie algebra of (m + 1) x (m + 1) complex matrices $\lambda$, with
$[\lambda, \mu] = \lambda\mu - \mu\lambda$. Compute the natural bilinear form
$$(\lambda, \mu) = \operatorname{tr}(\mathrm{Ad}_\lambda \mathrm{Ad}_\mu).$$
[Note that $\mathrm{Ad}_\lambda$ and $\mathrm{Ad}_\mu$ are linear transformations in an $(m + 1)^2$ dimensional space,
namely N.] Show that
$$(\lambda, \mu) = 2[(m + 1)\operatorname{tr}(\lambda\mu) - (\operatorname{tr}\lambda)(\operatorname{tr}\mu)].$$
Show that $(\lambda, \mu)$ is singular in N but nonsingular in the subalgebra $A = A_m$ of matrices
of trace zero, so that $A_m$ is semisimple.
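The trace formula of this exercise can be checked numerically before it is proved. The sketch below is an illustration with assumed sample matrices: it computes the trace of $\mathrm{Ad}_\lambda \mathrm{Ad}_\mu$ directly, by applying the double commutator to each matrix unit and reading off the diagonal coefficient, and compares it with the closed form $2[n \operatorname{tr}(\lambda\mu) - (\operatorname{tr}\lambda)(\operatorname{tr}\mu)]$ for n x n matrices (n = m + 1). The function names are hypothetical.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = mul(a, b), mul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def unit(n, r, s):
    """Matrix unit with a single 1 in position (r, s)."""
    return [[1 if (i, j) == (r, s) else 0 for j in range(n)] for i in range(n)]

def killing_form(x, y):
    """tr(Ad_x Ad_y) on the space of all n x n matrices."""
    n = len(x)
    total = 0
    for r in range(n):
        for s in range(n):
            image = bracket(x, bracket(y, unit(n, r, s)))
            total += image[r][s]      # coefficient of the same unit in its image
    return total

def closed_form(x, y):
    n = len(x)
    return 2 * (n * trace(mul(x, y)) - trace(x) * trace(y))

# Sample 3 x 3 matrices (assumption: arbitrary integers chosen for the check).
X = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
Y = [[0, 1, 1], [2, 0, -3], [1, 1, 1]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(killing_form(X, Y) == closed_form(X, Y))
print(killing_form(IDENTITY, X))   # the identity pairs to zero with everything
```

The second printed value illustrates why the form is singular on the full matrix algebra: multiples of the identity commute with everything, so they lie in the null space of the form.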
2. Let M denote the commutative subalgebra of the Lie algebra A of Exercise 1
consisting of diagonal matrices (with trace zero). Consider the adjoint representation
of M on A: $\mu \to \mathrm{Ad}_\mu$, where
$$(\mathrm{Ad}_\mu \lambda)_{rs} = (\mu_{rr} - \mu_{ss})\lambda_{rs} \qquad (r, s = 1, \ldots, n).$$
Consider the roots and root vectors of this representation. Show that the root space $A_0$,
which consists of all matrices $\lambda$ such that $(\mu_{rr} - \mu_{ss})^k \lambda_{rs} = 0$ for some k, for all $\mu$ in M,
consists also of the diagonal matrices; hence $A_0 = M$, hence M is a Cartan subalgebra.
Show that the other roots $\alpha(\cdot)$ and corresponding root vectors $\lambda_\alpha$ are obtained by
choosing fixed i and k and setting
$$\alpha(\mu) = \mu_{ii} - \mu_{kk},$$
$$\lambda_\alpha = \lambda(i, k),$$
Models of the Simple Complex Lie Algebras 169
Show that the angle between $\mu_i$ and $\mu_{i+1}$ is 120° and that otherwise the angle between
$\mu_i$ and $\mu_j$ is 90°, so that the Dynkin diagram of A is as given above for $A_m$, namely,
For the classification and models of the simple real Lie algebras, which are needed
for a classification of Lie groups, the reader is referred to Hausner and Schwartz. It is
recalled that if A is a simple real Lie algebra, its complexification is either simple or is
the direct sum of two identical (i.e., isomorphic) simple complex algebras. Hence, to
classify the simple real algebras, one must examine each simple complex algebra and then
find all simple real algebras from which it can be obtained by complexification.
Given a simple complex algebra, one possible choice of the real algebra consists of the
same elements, but regarded as a linear space over the real field $\mathbb{R}$, rather than $\mathbb{C}$, as the field
of scalars, but that is not the only possibility. Other possibilities are found by
considering the so-called conjugations in the complex algebra. A conjugation in a complex Lie algebra A is
an antilinear mapping C [that is, $C(a\lambda + b\mu) = \bar{a}\,C\lambda + \bar{b}\,C\mu$], which preserves Lie
products (that is, $C[\lambda, \mu] = [C\lambda, C\mu]$), and whose square is the identity mapping
[that is, $C(C\lambda) = \lambda$]. The set of all $\lambda$ in A such that $C\lambda = \lambda$, with $\mathbb{R}$ as the field of
scalars, is a simple real Lie algebra. A complete analysis of the conjugations in the
simple complex Lie algebras, and the enumeration of the resulting simple real algebras,
is given in Hausner and Schwartz. As an example, if A is the simple complex algebra
$A_1$ of 2 x 2 matrices of trace zero, there are three corresponding simple real algebras,
namely $A_1$ itself (with $\mathbb{R}$ as the field of scalars) and
and
$QA_1$ = {2 x 2 matrices of the form iH, where H is Hermitian and of trace zero}.
We mention in passing that some of the corresponding Lie groups are SL(2, $\mathbb{C}$), $\mathcal{L}_p$,
SL(2, $\mathbb{R}$), SU(2), and SO(3).
Corresponding to each simple complex algebra $A_m$, for m > 1, there are
4 + [(m + 1)/2] simple real algebras, where [ ] denotes integer part.
Corresponding to the exceptional algebra $G_2$ there are three real algebras, called $G_2$
(over $\mathbb{R}$), $HG_2^{(3)}$, and $HG_2^{(1)}$.
In this appendix, we give two examples of Lie groups that are not linear, that
is, have no faithful finite-dimensional representations, hence cannot be
realized as groups of matrices.
For the first example, let G denote the so-called Heisenberg group
$$g_{x,0,0}^{-1} = g_{-x,0,0},$$
$$g_{0,y,0}^{-1} = g_{0,-y,0}, \tag{25.A-1}$$
$$g_{x,0,0}\, g_{0,y,0}\, g_{x,0,0}^{-1}\, g_{0,y,0}^{-1} = g_{0,0,xy}.$$
It follows that if $\rho$ is any representation of G, then $\rho(g_{0,0,z})$ is a
unimodular matrix for every z, because $\det(\rho(g_{0,0,xy}))$ is equal to
$$\det \rho(g_{x,0,0}) \, \det \rho(g_{0,y,0}) \, \det \rho(g_{x,0,0})^{-1} \, \det \rho(g_{0,y,0})^{-1} = 1.$$
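The relations (25.A-1) are easy to confirm by direct matrix multiplication. The sketch below rests on an assumption: it takes $g_{x,y,z}$ to be the upper unitriangular 3 x 3 matrix with x in position (1, 2), y in position (2, 3), and z in position (1, 3), which is one common parametrization of the Heisenberg group and is consistent with the commutator relation above. The function names are made up for the example.

```python
from fractions import Fraction

def g(x, y, z):
    """Assumed parametrization: unitriangular matrix with entries x, y, z."""
    return [[1, x, z],
            [0, 1, y],
            [0, 0, 1]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inv(m):
    """Inverse of g(x, y, z), namely g(-x, -y, xy - z)."""
    x, y, z = m[0][1], m[1][2], m[0][2]
    return g(-x, -y, x * y - z)

def commutator(a, b):
    """Group commutator a b a^{-1} b^{-1}."""
    return mul(mul(a, b), mul(inv(a), inv(b)))

x, y = Fraction(3, 4), Fraction(-5, 2)
print(commutator(g(x, 0, 0), g(0, y, 0)) == g(0, 0, x * y))
```

Since every element $g_{0,0,z}$ arises as such a commutator (take x = z, y = 1), the determinant argument in the text applies to all of them.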
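Now let $G_0$ denote the normal subgroup
$$G_0 = \left\{ \begin{pmatrix} 1 & 0 & z \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} : z = 0, \pm 1, \pm 2, \ldots \right\}$$
of 3 x 3 matrices. It is easily seen that, in analogy with (25.A-1),
$$\bar g_{x,0,0}\, \bar g_{0,y,0}\, \bar g_{x,0,0}^{-1}\, \bar g_{0,y,0}^{-1} = \bar g_{0,0,z},$$
where $z \equiv xy \pmod 1$. Hence, as above, if $\rho$ is any representation of $G/G_0$,
then $\det \rho(\bar g) = 1$ for every $\bar g$ in the subgroup
$$H = \{\bar g_{0,0,z} : 0 \le z < 1\} < G/G_0.$$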
But H is isomorphic to SO(2), with $2\pi z$ playing the role of $\theta$; hence H is
compact and Abelian. According to the general theory of representations
given in Sections 21.1-21.4, every representation of a compact group is
equivalent to a unitary representation, and every unitary representation of
$$\rho(\bar g_{0,0,z}) = \begin{pmatrix} e^{2\pi i n_1 z} & & (0) \\ & \ddots & \\ (0) & & e^{2\pi i n_m z} \end{pmatrix}$$
RM = (~ ~), ac = 1.
M= (
COS e (2S.A-2)
sine
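This is the desired canonical form. It follows that the manifold of SL(2, $\mathbb{R}$)
is the Cartesian product of a circle and two lines, $C_1 \times \mathbb{R}^2$. Since the universal
covering of $C_1$ is $\mathbb{R}$, the manifold of G is $\mathbb{R}^3$.
The Lie algebra A of SL(2, $\mathbb{R}$) has as basis the matrices
$$L_1 = \left.\frac{\partial M}{\partial \theta}\right|_0 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$
$$L_2 = \left.\frac{\partial M}{\partial x}\right|_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
$$L_3 = \left.\frac{\partial M}{\partial y}\right|_0 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$$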
Appendix to Chapter 25-Two Nonlinear Lie Groups 173
where, in each case, the subscript zero indicates that the matrix in question is
evaluated at $\theta = x = y = 0$. A direct calculation shows that
$$[L_1, L_2] = 2L_1 + 4L_3,$$
$$[L_2, L_3] = 2L_3, \tag{25.A-3}$$
$$[L_3, L_1] = L_2.$$
These equations can be solved for $L_1$, $L_2$, and $L_3$; i.e., the Lie products on the
left also form a basis for A. It follows from the definition of the Lie product
in terms of commutators in the group that the group is generated by
commutators, i.e., by elements of the form $g h g^{-1} h^{-1}$. By the argument used in
the first example, it then follows that if $\rho$ is any representation, then $\det \rho(g) = 1$
for all g in SL(2, $\mathbb{R}$).
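The commutation relations (25.A-3), and the statement that they can be solved for the basis elements, can be verified with a few lines of exact arithmetic. The sketch below is an illustration under an assumption: it uses the 2 x 2 matrices $L_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, $L_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $L_3 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ as the basis, a choice that is consistent with (25.A-3) but should be treated as a reconstruction rather than a quotation.

```python
from fractions import Fraction

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comb(*terms):
    """Linear combination c1*M1 + c2*M2 + ... of 2 x 2 matrices."""
    return [[sum(Fraction(c) * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def bracket(a, b):
    return comb((1, mul(a, b)), (-1, mul(b, a)))

# Assumed basis of the Lie algebra of SL(2, R), consistent with (25.A-3).
L1 = [[0, -1], [1, 0]]
L2 = [[1, 0], [0, -1]]
L3 = [[0, 1], [0, 0]]

relations_hold = (
    bracket(L1, L2) == comb((2, L1), (4, L3))
    and bracket(L2, L3) == comb((2, L3))
    and bracket(L3, L1) == comb((1, L2))
)

# Solving (25.A-3) for the basis in terms of the Lie products on the left.
L1_from_brackets = comb((Fraction(1, 2), bracket(L1, L2)), (-1, bracket(L2, L3)))
L2_from_brackets = bracket(L3, L1)
L3_from_brackets = comb((Fraction(1, 2), bracket(L2, L3)))

print(relations_hold)
print(L1_from_brackets == L1, L2_from_brackets == L2, L3_from_brackets == L3)
```

The second line of output expresses the point made in the text: each basis element is itself a linear combination of Lie products, so the algebra coincides with its own derived algebra.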
Since G and SL(2, $\mathbb{R}$) are isomorphic in a neighborhood of the identity
element, they have identical Lie algebras, and A is also (in the sense of
isomorphism) the Lie algebra of G. Hence, if $\rho$ is any representation of G, it
follows that $\det \rho(g) = 1$ for all g in G. That is, any representation of G is
unimodular.
It can be shown that the Lie algebra A is simple. Namely, if A has an ideal
J that contains a nonzero vector $\lambda = aL_1 + bL_2 + cL_3$, then J also contains
the three vectors $[L_j, \lambda]$ and the nine vectors $[L_k, [L_j, \lambda]]$. A direct
calculation, starting with (25.A-3), shows that $L_1$, $L_2$, and $L_3$ can be expressed as
suitable linear combinations of those 13 vectors (it is not necessary to go to
higher Lie products); hence J coincides with A, i.e., A is a simple Lie algebra.
Let $g_{\theta,x,y}$ denote the element (25.A-2) of SL(2, $\mathbb{R}$), where $0 \le \theta \le 2\pi$.
Then $\theta$, x, y, with $\theta$ unrestricted, can be taken as coordinates of elements
$g_{\theta,x,y}$ in G in such a way that in the covering of SL(2, $\mathbb{R}$) by G, the element
$g_{\theta',x,y}$ of G lies over the element $g_{\theta,x,y}$ of SL(2, $\mathbb{R}$) if $\theta' \equiv \theta \pmod{2\pi}$.
Now let $\rho$ be a representation of G on an m-dimensional vector space
$V^m = \mathbb{C}^m$. It will be shown that $\rho$ is nonfaithful. A representation of A, which
will also be called simply $\rho$, is induced in the usual way:
Scalar, vector, and tensor fields; Lie brackets; covariant and contravariant
vectors; transformation laws; inner and outer multiplication; contraction;
quotient law; derivations; metric tensor; definite and indefinite metric;
Riemannian and pseudo-Riemannian manifolds; raising and lowering of
indices; geodesics; Euler variational equation; natural, affine, or preferred
parameter; Christoffel three-index symbols; spacelike, null, and timelike
geodesics; initial-value and two-point problems of geodesics; Volterra
integral equations; Picard iterations; Whitehead's theorem; continuation of
geodesics; affinely connected manifolds; Riemannian and
pseudo-Riemannian covering manifolds.
$$v_i(x^1, \ldots, x^n) = \frac{\partial}{\partial x^i} \hat f(x^1, \ldots, x^n), \tag{26.1-4}$$
or, more concisely,
$$v_i(x) = \frac{\partial}{\partial x^i} \hat f(x) \qquad (i = 1, \ldots, n). \tag{26.1-5}$$
then the relation between the two sets of functions $\{v_i\}$ and $\{v'_i\}$ in the overlap
of the two coordinate systems is
(26.1-9)
(26.1-10)
derivative to learn what the independent variables are; if a prime appears, the
independent variables are $x'^1, \ldots, x'^n$, while if a double prime appears, the
independent variables are $x''^1, \ldots, x''^n$, etc. These conventions are standard in
this subject.
A covariant vector field on $\mathfrak{M}$ is defined as a collection of sets $\{v_i\}$ of n
functions each, one set associated with each chart in $\mathfrak{M}$, such that the
transformation law (26.1-10) holds for any two such sets in the overlap of the
corresponding charts.
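A numerical check may help fix the meaning of this definition. The sketch below is an illustration under stated assumptions: it takes the covariant law in its standard form $v'_i = (\partial x^j / \partial x'^i)\, v_j$ (assumed here to be the content of (26.1-10)), uses Cartesian coordinates (x, y) as the unprimed chart and polar coordinates (r, θ) as the primed chart on the plane, and picks the hypothetical scalar f = x²y. The gradient components transformed by the law are compared with the derivatives of f taken directly in polar coordinates.

```python
import math

def f_cart(x, y):
    """Hypothetical scalar field used for the check."""
    return x * x * y

def grad_cart(x, y):
    """Components v_j = df/dx^j in the Cartesian chart."""
    return (2 * x * y, x * x)

def to_cart(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(r, theta):
    """J[j][i] = d x^j / d x'^i with x = (x, y) and x' = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def covariant_transform(v, jac):
    """v'_i = (d x^j / d x'^i) v_j, summed over j."""
    return tuple(sum(jac[j][i] * v[j] for j in range(2)) for i in range(2))

def grad_polar_direct(r, theta, h=1e-6):
    """Central-difference derivatives of f with respect to r and theta."""
    def f_polar(rr, tt):
        return f_cart(*to_cart(rr, tt))
    return ((f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h),
            (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h))

r0, theta0 = 1.5, 0.4
by_law = covariant_transform(grad_cart(*to_cart(r0, theta0)), jacobian(r0, theta0))
direct = grad_polar_direct(r0, theta0)
print(by_law)
print(direct)
```

The two printed pairs agree to within the finite-difference error, which is what it means for the gradient components to constitute a covariant vector field in the overlap of the two charts.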
Comments. (1) A vector field is not necessarily the gradient of a scalar, as was
the case in the foregoing example. (2) The two charts may cover exactly the
same portion U of $\mathfrak{M}$, in which case the transformation law (26.1-10) refers to a
"change of independent variables," in the ordinary sense.
(26.1-11)
k = 1, ... , n,
are called its generalized velocity components. [They are the Cartesian velocity
components of the representing point x(t) in the coordinate space $\mathbb{R}^n$.] If
$x'^k(t)$ and $v'^k(t)$ are the corresponding coordinates and velocity components
relative to a second chart {U', $\varphi'$, N'}, then
$$v'^k(t) = \frac{\partial \hat x'^k}{\partial x^j}\, v^j(t).$$
If, to describe the flow of an entire fluid, not just one particle, the velocity
components of the fluid particle which is at point x at some instant t are called
$v^i(x)$, then the transformation law is
$$v'^k(x') = \frac{\partial \hat x'^k}{\partial x^j}(x)\, v^j(x).$$
Like (26.1-1) and (26.1-9), this is an identity in the overlap of the charts if both
sides are expressed in terms of the $x^i$ or both in terms of the $x'^i$. Again, the
circumflex will be dropped.
178 Metric and Geodesics on a Manifold
(26.1-12)
holds for any two such sets in the overlap of the corresponding charts. Contrast
this with (26.1-10) by noting where the prime appears in the derivative.
This transformation law is also transitive.
EXERCISES
$$w^j = u^k \frac{\partial v^j}{\partial x^k} - v^k \frac{\partial u^j}{\partial x^k}. \tag{26.1-13}$$
Show that the quantities $\{w^j\}$ transform according to the law (26.1-12), hence
constitute another vector field. We write
w = [u, v],
and we call w the Lie bracket of u and v. Clearly [v, u] = -[u, v]. Show that if $u^j$, $v^j$,
and $w^j$ are any given smooth vector fields, then
[[u, v], w] + [[v, w], u] + [[w, u], v] = 0 (Jacobi identity).
It follows that if we start with two or more $C^\infty$ vector fields in a $C^\infty$ manifold, and form all
possible vector fields by repeated construction of linear combinations (with constant
coefficients) and Lie brackets, the result is a Lie algebra (possibly infinite-dimensional).
Scalar and Vector Fields on a Manifold 179
2. Let $u^i$ and $v^i$ be smooth contravariant vector fields whose Lie product vanishes,
so that
$$\frac{\partial}{\partial t} x^i(s, t) = v^i(x(s, t)),$$
Figure 26.1
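$$\frac{\partial}{\partial t}\left(\frac{\partial}{\partial s} x^i(s, t)\right) = \frac{\partial v^i}{\partial x^k}(x(s, t)) \left(\frac{\partial}{\partial s} x^k(s, t)\right),$$
that is,
$$\frac{\partial}{\partial t}\bigl(u^j(x(s, t))\bigr) = \frac{\partial v^j}{\partial x^k}(x(s, t)) \bigl(u^k(x(s, t))\bigr)$$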
The summation convention applies to all the repeated indices r, s, ... on the
right; a multiple sum results. Covariant tensors $T_{jk}$, $T_{jkl}$, etc. transform
according to the law
$$T'_{jk\ldots} = \frac{\partial x^r}{\partial x'^j} \frac{\partial x^s}{\partial x'^k} \cdots T_{rs\ldots}, \tag{26.2-2}$$
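As a concrete instance of the law for a covariant tensor of rank 2, the sketch below transforms the Euclidean tensor $\delta_{rs}$ of the plane from Cartesian coordinates (unprimed) to polar coordinates (primed). The choice of example and the function names are assumptions made for illustration; the expected outcome, the familiar components 1, 0, 0, r², is the standard polar form of the plane metric.

```python
import math

def jacobian_polar(r, theta):
    """J[a][j] = d x^a / d x'^j for x = (r cos(theta), r sin(theta)), x' = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def transform_covariant_rank2(tensor, jac):
    """T'_{jk} = (d x^a / d x'^j)(d x^b / d x'^k) T_{ab}, summed over a and b."""
    n = len(tensor)
    return [[sum(jac[a][j] * jac[b][k] * tensor[a][b]
                 for a in range(n) for b in range(n))
             for k in range(n)]
            for j in range(n)]

EUCLIDEAN = [[1.0, 0.0], [0.0, 1.0]]   # components in Cartesian coordinates

radius, angle = 2.0, 0.7
g_polar = transform_covariant_rank2(EUCLIDEAN, jacobian_polar(radius, angle))
for row in g_polar:
    print([round(entry, 12) for entry in row])
```

The double sum over a and b is the "multiple sum" referred to in the text; for a tensor of rank p the same pattern involves p Jacobian factors.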
whose contravariant and covariant ranks are the sums of the corresponding
ranks of T and S. The process just described is called outer multiplication of
vectors and tensors.
Tensor Fields 181
If a tensor carries the same symbol for a superscript as for a subscript, then
the summation convention applies to that symbol, and the result is a tensor of
lower rank; e.g., given a tensor $R^j{}_{klm}$, a tensor $R_{kl}$ can be defined as
$R_{kl} = R^j{}_{klj}$. Similarly, from the tensor $S^{ik}{}_{lm}$ there can be formed the scalars $S^{ik}{}_{ik}$ and
$S^{ik}{}_{ki}$. This process is called contraction. Outer multiplication followed by
contraction is called inner multiplication; for example, if $v^j$ and $w_j$ are vectors,
then $v^j w_j$ is a scalar.
An easily verified converse to the last result is the quotient law which says
that if a collection of sets of quantities $\{v^i\}$ is given, one set associated with
each coordinate chart that contains some point $P_0$, and if for every covariant
vector $\{w_i\}$ defined at $P_0$ the quantity $v^i w_i$ is a scalar (an invariant under
coordinate changes), then the sets $\{v^i\}$ define a contravariant vector (at $P_0$).
The roles of the co- and contravariant vectors can be interchanged. More
generally, if, for example, sets of $n^3$ quantities $\{T_{ik}{}^{l}\}$ are given and are such that
the quantities
$$S_i = T_{ik}{}^{l}\, v^k w_l$$
transform according to the law (26.1-10) for a covariant vector, for arbitrary
vectors $\{v^k\}$ and $\{w_l\}$ defined at $P_0$, then the sets $\{T_{ik}{}^{l}\}$ transform according to
the law for a tensor of the indicated kind, namely covariant of rank 2 and
contravariant of rank 1. One says more briefly that $T_{ik}{}^{l}$ is such a tensor.
$$\left.\frac{\partial f}{\partial x^j}\right|_P = \left.\frac{\partial g}{\partial x^j}\right|_P \qquad (j = 1, \ldots, n) \tag{26.2-5}$$
holds in any chart (hence in every chart containing P), then Lf and Lg must
agree at P. (It turns out that if L is any linear operator that satisfies this last
condition, it is necessarily a derivation.) To construct the field $v^j$ at any point P
in a given chart, we define functions $f_{(j)}(x)$, j = 1, ..., n, by the requirement
that
$$Lf|_P = Lg|_P = v^j \left.\frac{\partial f}{\partial x^j}\right|_P. \tag{26.2-6}$$
Since the left member here is a scalar, it is seen from the quotient law that the
$v^j$ transform as the components of a contravariant vector field under changes
of coordinates, and it is seen also that L is a derivation. Hence there is a
one-to-one correspondence between local derivations and vector fields. Because
of this correspondence, some authors define a vector field as a local derivation
in the algebra of $C^\infty$ functions on a manifold. We shall continue to use the older
definition, which has been traditional in most branches of physics.
It is clear that the same functions $g_{jk}(\cdots)$ are obtained, for the given coordinate
system $x^1, \ldots, x^n$, if the $X^j$ are replaced by other Cartesian coordinates, say
$X'^j$, obtained from the $X^j$ by a rotation of the axes (or by a rigid motion,
generally). Now, let $x^i$ and $x^i + \Delta x^i$ ($i = 1, \ldots, n$) be the coordinates of two
points $P_1$ and $P_2$, and let $X^i$ and $X^i + \Delta X^i$ denote the Cartesian coordinates
of the same two points, i.e.,
    $X^i = X^i(x^1, \ldots, x^n),$
    $X^i + \Delta X^i = X^i(x^1 + \Delta x^1, \ldots, x^n + \Delta x^n).$    (26.3-2)
The square of the distance from $P_1$ to $P_2$ is given by
    $(d(P_1, P_2))^2 = (\Delta X^1)^2 + \cdots + (\Delta X^n)^2;$    (26.3-3)
if the $\Delta x^i$ are regarded as small quantities, then, by expanding (26.3-2) in a
Taylor's series, it is seen that
    $(d(P_1, P_2))^2 = g_{jk}\, \Delta x^j \Delta x^k + O(|\Delta x|^3),$    (26.3-4)
where $|\Delta x|$ stands for $\max_j |\Delta x^j|$. This equation is usually paraphrased by
writing
    $ds^2 = g_{jk}\, dx^j dx^k;$    (26.3-5)
$ds$ is called the line element. From (26.3-1) it follows that if a transformation is
made from $x^1, \ldots, x^n$ to new coordinates $x'^1, \ldots, x'^n$, then the functions
$g_{jk}(\cdots)$ transform like the components of a second-rank covariant tensor, as
the notation indicates. Furthermore, this tensor is symmetric: $g_{jk} = g_{kj}$.
From (26.3-4) it follows that if the $g_{jk}$ are regarded as the elements of a matrix,
then this matrix is positive definite. A similar metric tensor will be assumed to
exist in any Riemannian space, but will not generally be expressible in the
form (26.3-1) because Cartesian coordinates do not exist if the space is
non-Euclidean.
rank tensor $g_{jk}$ is defined, which, in all $\mathfrak{M}$, is (1) symmetric, i.e., $g_{jk} = g_{kj}$, and
(2) positive definite, i.e., $g_{jk} v^j v^k > 0$ for any nonvanishing vector $\{v^j\}$. The
eigenvalues of the matrix $(g_{jk})$ are all positive. Note that if $g_{jk}$ is symmetric,
then
    $g'_{jk} = g_{rs}\, \frac{\partial x^r}{\partial x'^j} \frac{\partial x^s}{\partial x'^k}$    (26.4-1)
is also symmetric. Note also that the positive definiteness is compatible with
this transformation law, because $g_{jk} v^j v^k$ is a scalar.
The determinant of the matrix $(g_{jk})$ is denoted by $g$ or $g(x^1, \ldots, x^n)$;
it is not a scalar, since its value at a given point in the manifold depends on the
coordinate system.
The transformation law (26.4-1) can be written in matrix notation as
    $G' = J G J^T,$    (26.4-2)
where $G$ is the matrix $(g_{jk})$ and where $J$ is the Jacobian matrix.
In a pseudo-Riemannian manifold, the matrix $G$ is not required to be positive
definite, but only nonsingular and symmetric. Each eigenvalue of $G$ is then
either $> 0$ or $< 0$, and the signature $s$ of $G$ is defined as the number of positive
eigenvalues minus the number of negative ones. According to Sylvester's law
of inertia of quadratic forms, the signature of the matrix $G' = J G J^T$ is the same
as that of $G$ (a proof can be found in Bôcher 1922); hence, the signature $s$ is
independent of the choice of coordinate system, at each point. The eigenvalues
of $G$ are continuous functions of its elements $g_{jk}$, hence of the coordinates
$x^1, \ldots, x^n$, and no eigenvalue is ever zero; hence no eigenvalue can ever switch
its sign as the $x^i$ vary. Therefore, since $\mathfrak{M}$ is connected, the signature $s$ is
constant throughout $\mathfrak{M}$. In general relativity, $\mathfrak{M}$ has dimension 4 and
signature 2, so that $G$ has three positive eigenvalues and one negative one
throughout $\mathfrak{M}$.
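The invariance of the signature under $G \to J G J^T$ can be checked numerically in the simplest case. The sketch below is restricted to $2 \times 2$ symmetric matrices, where the eigenvalues have a closed form; the matrices `G`, `G_pos`, and `J` and the helper names are assumptions made for the illustration.

```python
import math

def eigenvalues_sym2(G):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = G[0][0], G[0][1], G[1][1]
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def signature(G):
    """Number of positive minus number of negative eigenvalues.
    G is assumed nonsingular, so no eigenvalue is zero."""
    return sum(1 if lam > 0 else -1 for lam in eigenvalues_sym2(G))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def congruence(J, G):
    """The transformed matrix G' = J G J^T."""
    return matmul(matmul(J, G), transpose(J))

G = [[1.0, 0.0], [0.0, -1.0]]       # indefinite example
G_pos = [[2.0, 1.0], [1.0, 3.0]]    # positive definite example
J = [[2.0, 1.0], [0.5, 3.0]]        # nonsingular (det = 5.5)

G_new = congruence(J, G)
G_pos_new = congruence(J, G_pos)
print(signature(G), signature(G_new))
print(signature(G_pos), signature(G_pos_new))
```

In both cases the signature before and after the transformation is the same, as Sylvester's law requires for any nonsingular $J$.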
Since $\det G \neq 0$, $G$ has an inverse; the elements of the inverse are denoted
by $g^{jk} = g^{jk}(x^1, \ldots, x^n)$; these functions transform according to the law
    $g'^{jk} = \frac{\partial x'^j}{\partial x^r} \frac{\partial x'^k}{\partial x^s}\, g^{rs}$    (26.4-3)
[this equation can be obtained by taking matrix inverses in (26.4-2) and by
noting that the Jacobian matrices of inverse transformations are inverse
matrices]; hence $g^{jk}$ is a contravariant tensor of rank 2.
If $\mathcal{S}$ is an $n$-dimensional hypersurface in a Euclidean space $E^N$, where
$N > n$, then $\mathcal{S}$ may be regarded as a Riemannian manifold with the metric
that $\mathcal{S}$ inherits from $E^N$. If $x^1, \ldots, x^n$ are intrinsic coordinates in a portion $U$
of $\mathcal{S}$, if $X^1, \ldots, X^N$ are Cartesian coordinates in $E^N$, and if, in $U$,
    $X^i = X^i(x^1, \ldots, x^n), \quad i = 1, \ldots, N,$
then, as in Section 26.3, the distance $d$ (in $E^N$) between two points $x^j$, $x^j + \Delta x^j$
in $\mathcal{S}$ is given by
    $d^2 = \sum_{i=1}^{N} [X^i(x^1, \ldots, x^n) - X^i(x^1 + \Delta x^1, \ldots, x^n + \Delta x^n)]^2 = g_{jk}\, \Delta x^j \Delta x^k + O(|\Delta x|^3),$
where
    $g_{jk} = \sum_{i=1}^{N} \frac{\partial X^i}{\partial x^j} \frac{\partial X^i}{\partial x^k}.$
One says that $\mathcal{S}$ is immersed in $E^N$ and that $g_{jk}$ is the inherited metric tensor.
EXAMPLE
If $x^1 = \theta$ and $x^2 = \varphi$ are polar coordinates on the unit sphere, where $0 < \theta < \pi$,
$-\pi < \varphi < \pi$, and if $X$, $Y$, $Z$ are Cartesian coordinates in $E^3$, then
    $X = \sin x^1 \cos x^2, \quad Y = \sin x^1 \sin x^2, \quad Z = \cos x^1,$
and
    $g_{11} = 1, \quad g_{12} = g_{21} = 0, \quad g_{22} = \sin^2 x^1;$
this result is usually written as $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$; it can also be obtained from
the line element given by $ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2$ for spherical polar
coordinates in $E^3$ by setting $r = 1$ and $dr = 0$.
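The inherited metric of the unit sphere can be checked numerically from the sum-of-products formula $g_{jk} = \sum_i (\partial X^i/\partial x^j)(\partial X^i/\partial x^k)$. The sketch below approximates the partial derivatives by central differences; the step size `h`, the sample point, and the function names are assumptions made for the illustration.

```python
import math

def embedding(theta, phi):
    """Cartesian coordinates (X, Y, Z) of the point (theta, phi) on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def inherited_metric(x, h=1e-6):
    """g_jk = sum_i dX^i/dx^j * dX^i/dx^k, with derivatives by central differences."""
    n = len(x)
    columns = []
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        X_plus, X_minus = embedding(*x_plus), embedding(*x_minus)
        columns.append([(p - m) / (2.0 * h) for p, m in zip(X_plus, X_minus)])
    return [[sum(a * b for a, b in zip(columns[j], columns[k])) for k in range(n)]
            for j in range(n)]

theta, phi = 0.7, 1.3           # hypothetical sample point
g = inherited_metric([theta, phi])
print(g)
print(math.sin(theta) ** 2)     # expected value of g_22
```

Up to discretization error, the computed matrix is diagonal with entries $1$ and $\sin^2\theta$.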
EXERCISES
and where $\xi$ and $\eta$ are allowed to vary over the entire $\xi$, $\eta$ plane. Show that $\mathcal{S}$ can be
immersed in $E^3$ as a hemisphere.
2. Find similarly an immersion in $E^3$ of the manifold with metric tensor
note that the horizontal ordering of the indices must be maintained, unless
the tensor is symmetric. Clearly $g^{jk}$ is simply the result of raising both indices
of $g_{jk}$; the mixed form of the metric tensor is
    $g^j{}_k = g^{jl} g_{lk} = \begin{cases} 1 & \text{for } j = k \\ 0 & \text{for } j \neq k \end{cases} = \delta^j_k;$
in this special case, it is customary to write the indices without horizontal
separation ($g^j_k$ rather than $g^j{}_k$ or $g_k{}^j$), which is permissible because of
symmetry.
The rest of this chapter, except for the last section, deals with geodesics. It is,
however, not quite yet geometry, because we shall be concerned mainly
with analytic tools and relations. The geometry proper starts in the next
chapter with notions like parallel transport along a curve.
Let $\mathcal{C}$ be a smooth curve in a Riemannian manifold $\mathfrak{M}$ with initial and
terminal points $P_1$ and $P_2$. We assume first that $\mathcal{C}$ lies in a single coordinate
chart and is described in that chart by the functions $x^i(w)$, for $a \le w \le b$,
which are assumed to be of class $C^2$. Transformations from one chart to
another will be considered later. The quantity
    $L = \int_a^b \sqrt{g_{kl}\, \dot x^k \dot x^l}\; dw = \int ds$    (26.6-1)
is the length of $\mathcal{C}$; its integrand is $\Phi^{1/2}$, with
    $\Phi = g_{kl}\, \dot x^k \dot x^l,$    (26.6-2)
where the dot denotes $d/dw$ and where each argument of $\Phi(\ldots)$ is understood
as the corresponding function of $w$.
We note in passing that if the curve $\mathcal{C}$ lies in the intersection of two charts,
the same length $L$ is obtained from either coordinate system, because the
expression (26.6-2) is a scalar; for any $w$, the value of $\Phi$ is independent of the
coordinate system; hence, the value of $L$ given by (26.6-1) is an invariant. If $\mathcal{C}$
is piecewise smooth, its total length is understood as the sum of the lengths of
its pieces.
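The length integral $L = \int \sqrt{g_{kl}\, \dot x^k \dot x^l}\, dw$ can be approximated directly. The following sketch uses the midpoint rule on the unit sphere with metric $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ and evaluates the same arc of a circle of latitude under two different parametrizations; the curves, step counts, and function names are assumptions chosen for the illustration.

```python
import math

def metric_sphere(x):
    """Metric tensor of the unit sphere in coordinates x = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

def curve_length(curve, a, b, metric, steps=2000, h=1e-6):
    """Midpoint-rule approximation of L = int_a^b sqrt(g_kl xdot^k xdot^l) dw."""
    dw = (b - a) / steps
    total = 0.0
    for s in range(steps):
        w = a + (s + 0.5) * dw
        x = curve(w)
        # tangent components xdot^k by central differences
        xdot = [(p - m) / (2.0 * h) for p, m in zip(curve(w + h), curve(w - h))]
        g = metric(x)
        Phi = sum(g[k][l] * xdot[k] * xdot[l] for k in range(2) for l in range(2))
        total += math.sqrt(Phi) * dw
    return total

# The same arc (theta = 1, phi from 0 to pi/2) with two parametrizations.
def curve_a(w):
    return [1.0, w]

def curve_b(w):
    return [1.0, 0.5 * math.pi * w * w]

length_a = curve_length(curve_a, 0.0, 0.5 * math.pi, metric_sphere)
length_b = curve_length(curve_b, 0.0, 1.0, metric_sphere)
print(length_a, length_b, 0.5 * math.pi * math.sin(1.0))
```

Both parametrizations give the same value, $(\pi/2)\sin 1$, which illustrates that $L$ depends on the curve and not on the choice of parameter.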
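If a smooth curve $\mathcal{C}_0$ from $P_1$ to $P_2$, namely
    $\mathcal{C}_0$: $x^k = y^k(w)$ (given),    (26.6-3a)
where $a \le w \le b$, can be found such that the integral $L$ is smaller for $\mathcal{C}_0$ than
for any other curve from $P_1$ to $P_2$, then $\mathcal{C}_0$ is clearly the shortest such curve.
To compare $\mathcal{C}_0$ with neighboring curves, consider curves $\mathcal{C}$ of the form
    $\mathcal{C}$: $x^k = y^k(w) + \epsilon z^k(w),$    (26.6-3b)
where $\epsilon$ is a small parameter and the $z^k(\cdot)$ are arbitrary class $C^2$ functions such
that $z^k(a) = z^k(b) = 0$ ($k = 1, \ldots, n$); then
    $L = L_0 + \frac{\epsilon}{2} \int_a^b \Phi^{-1/2} \left( \frac{\partial \Phi}{\partial x^k} z^k + \frac{\partial \Phi}{\partial \dot x^k} \dot z^k \right) dw + O(\epsilon^2),$    (26.6-4)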
where $L_0$ is the length of $\mathcal{C}_0$. In $\Phi$ and its derivatives, the arguments $x^k$ and $\dot x^k$
are understood to be given the values on $\mathcal{C}_0$, namely $y^k(w)$ and $\dot y^k(w)$. For $L$
to be a minimum, the above integral must vanish for all choices of the
functions $z^k(w)$, and this leads to differential equations for the functions $y^k(w)$ that
describe the curve $\mathcal{C}_0$. If the integral vanishes for all $z^k$, i.e., if the $y^k$ satisfy
these differential equations, then the curve $\mathcal{C}_0$ is called a geodesic (or geodesic
curve) in $\mathfrak{M}$; $\mathcal{C}_0$ may or may not be the shortest curve from $P_1$ to $P_2$ (see
examples, below), but in any case the value of $L$ is stationary on $\mathcal{C}_0$.
In particular, if $\mathfrak{M}$ is a Euclidean $n$-space, and the variables $x^1, \ldots, x^n$ are
curvilinear coordinates, so that the metric tensor is given by (26.3-1), then $\mathcal{C}_0$
is a segment of a straight line, expressed in the curvilinear coordinates.
To simplify the following steps, it is convenient to choose the parameter $w$
on the curve $\mathcal{C}_0$ in such a way that $\Phi$ is constant on $\mathcal{C}_0$ (but not necessarily on
the neighboring curves $\mathcal{C}$; in fact, $w$ cannot be chosen so that $\Phi$ is the same
constant on $\mathcal{C}_0$ and on the neighboring curves, because $w = a$ and $w = b$ at
$P_1$ and $P_2$, respectively, on these curves; hence, if $w$ could be so chosen, all the
curves would have the same length). This can be done by introducing a new
parameter $\lambda = \lambda(w)$ on $\mathcal{C}_0$ according to the equation
(26.6-5)
for any $w_1$ between $a$ and $b$ ($\lambda$ is arclength along $\mathcal{C}_0$); using $\lambda$ rather than $w$ as
the variable of integration, the above equation is
    $\sqrt{g_{kl}\, \dot y^k \dot y^l} = \sqrt{\Phi} \equiv 1$ on $\mathcal{C}_0$.
The factor $\Phi^{-1/2}$ can now be dropped from the integral (26.6-4). That is,
when the parameter $\lambda$ is chosen as arclength on $\mathcal{C}_0$, the variational problems
have the same solutions. Although $\Phi = 1$ on $\mathcal{C}_0$, the partial derivatives of $\Phi$ in
(26.6-4) do not vanish, because they involve differentiations in other
directions than merely along $\mathcal{C}_0$.
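Integrating by parts in the second term in (26.6-4) after deleting $\Phi^{-1/2}$ (the
integrated parts vanish because $z^k = 0$ at $P_1$ and $P_2$) and equating the
integral to zero give
    $\int_a^b \left( \frac{\partial \Phi}{\partial x^k} - \frac{d}{d\lambda} \frac{\partial \Phi}{\partial \dot x^k} \right) z^k(\lambda)\, d\lambda = 0.$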
"on $\mathcal{C}_0$" means that the functions $x^k(\lambda)$ that appear in $\Phi$ (see 26.6-2) are to be
taken as the functions $y^k(\lambda)$ that describe the curve $\mathcal{C}_0$. Equations (26.6-7) are
the Euler variational equations of the problem
(26.6-8)
From the expression (26.6-2) for $\Phi$, it is seen that the Euler equations are
specifically [$k$ is replaced by $m$ in (26.6-7)]
(26.6-10)
and
    $\left\{ {r \atop k\,l} \right\} = g^{rm} [kl, m]$ (summed on $m$),    (26.6-11)
[Figure 26.2. Panels: SPHERE, CYLINDER. Labels: minimum-length geodesic from A to B; maximum-length geodesic from A to B.]
without further consideration (see examples in Figure 26.2, and note also
that there are infinitely many geodesics on a sphere from a given point to its
antipode, all having the same length). It will be seen below, however, that
given any point $A$ of the manifold, there is a neighborhood $\mathfrak{N}_A$ of $A$ such that
if $B$ is in $\mathfrak{N}_A$, then there is only one solution $\mathcal{C}_0$ of (26.6-12) from $A$ to $B$ lying
in $\mathfrak{N}_A$, and $L$ is a minimum on this curve $\mathcal{C}_0$.
Notes. The quantities $[kl, m]$ are not the components of a 3rd rank tensor;
neither are the quantities $\left\{ {r \atop k\,l} \right\}$, for they don't satisfy the appropriate
transformation laws; for example, in a Euclidean space, these quantities are all
identically zero in Cartesian coordinates, but not in curvilinear coordinates.
The quantities $\dot x^k$ are the components of a contravariant vector, but the
quantities $\ddot x^k$ are not. Nevertheless, equation (26.6-12) has a certain invariant
character; namely, if it is satisfied, for a given curve $y^k(\lambda)$, in one coordinate
system, then it is satisfied in any other, because it was derived from the
invariant equation (26.6-8). "Geodesic" and "natural parameter" are
invariant concepts. The $n$ quantities in the left members of (26.6-12), whether
evaluated for a geodesic or not, are, at any point of a curve $\mathcal{C}$, the components
of a contravariant vector (obtained by the so-called absolute differentiation of
the vector $\dot x^k$; see Section 27.6), although the individual terms of the left
members of (26.6-12) are not, by themselves, the components of a vector. If a
curve $\mathcal{C}$ runs through several coordinate charts and satisfies (26.6-12) in each
of them, then $\mathcal{C}$ is also called a geodesic in the manifold.
In this case $\Phi$ can be negative; hence $L$, defined as $\int_a^b \Phi^{1/2}\, dw$, is meaningless as
a length. Even if $L$ is redefined as $\int_a^b |\Phi|^{1/2}\, dw$, it is still meaningless as a length
in the ordinary sense, because, given any two points $P$ and $Q$, a (piecewise
smooth) curve can always be found from $P$ to $Q$ for which $L = 0$. Nevertheless,
a curve $\mathcal{C}$: $x^k = x^k(\lambda)$ that satisfies (26.6-12) is still called a geodesic, and $\lambda$ is
called a natural parameter. The quantity $\Phi = g_{ik}\, \dot x^i \dot x^k$ is constant on $\mathcal{C}$, and
three cases arise:
    If $\Phi > 0$, $\mathcal{C}$ is called a spacelike geodesic;
    if $\Phi = 0$, $\mathcal{C}$ is called a null geodesic;    (26.7-1)
    if $\Phi < 0$, $\mathcal{C}$ is called a timelike geodesic.
Since $\Phi$ is quadratic in the $\dot x^k$, this classification is independent of the choice of
the natural parameter. The parameter $\lambda$ can be so chosen that $\Phi = 1$ in the
first case and $\Phi = -1$ in the third; then, $\lambda$ is called distance and proper time,
respectively, along $\mathcal{C}$. Geodesics play a role in general relativity.
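The classification by the sign of $\Phi = g_{ik}\, \dot x^i \dot x^k$ can be written as a short routine. The sketch below assumes, for illustration only, a constant diagonal metric with three positive eigenvalues and one negative one; the tolerance `tol`, the sample tangent vectors, and the function names are likewise assumptions.

```python
def quadratic_form(g, xdot):
    """Phi = g_ik xdot^i xdot^k for a tangent vector xdot."""
    n = len(xdot)
    return sum(g[i][k] * xdot[i] * xdot[k] for i in range(n) for k in range(n))

def classify(g, xdot, tol=1e-12):
    """Return 'spacelike', 'null', or 'timelike' according to the sign of Phi."""
    value = quadratic_form(g, xdot)
    if value > tol:
        return "spacelike"
    if value < -tol:
        return "timelike"
    return "null"

# Illustrative flat metric with signature 2 (three +1 entries, one -1 entry).
eta = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

kinds = [classify(eta, [1.0, 0.0, 0.0, 0.0]),
         classify(eta, [1.0, 0.0, 0.0, 1.0]),
         classify(eta, [0.0, 0.0, 0.0, 1.0])]
# Rescaling the tangent vector (a change of natural parameter) keeps the class.
rescaled = classify(eta, [0.0, 0.0, 0.0, 3.0])
print(kinds, rescaled)
```

Because $\Phi$ is quadratic in the tangent components, multiplying the tangent vector by any nonzero constant multiplies $\Phi$ by a positive factor and leaves the classification unchanged.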
of a geodesic through $P_0$ with tangent vector $\xi^1, \ldots, \xi^n$ at $P_0$, and with
natural parameter $\lambda$, is the following:
    diff. eq.:  $\frac{dx^j}{d\lambda} = p^j, \quad \frac{dp^j}{d\lambda} = -\left\{ {j \atop k\,l} \right\} p^k p^l$  ($j = 1, \ldots, n$),    (26.8-1)
    initial cond.:  $x^j(0) = a^j, \quad p^j(0) = \xi^j$  ($j = 1, \ldots, n$).    (26.8-2)
It will be proved in the next section that this initial-value problem always has
a unique solution for $\lambda$ in some interval $[-\lambda_0, \lambda_0]$.
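As a numerical illustration of the initial-value problem, the sketch below integrates the first-order system $dx^j/d\lambda = p^j$, $dp^j/d\lambda = -\left\{ {j \atop k\,l} \right\} p^k p^l$ on the unit sphere with a classical fourth-order Runge-Kutta step. The Christoffel symbols used are the standard ones for $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$; the integrator, step count, initial data, and function names are assumptions chosen for the example and are not the method of the text.

```python
import math

def rhs(state):
    """Right side of the geodesic system on the unit sphere.
    state = (theta, phi, p_theta, p_phi); the nonzero Christoffel symbols are
    {theta; phi phi} = -sin(theta) cos(theta) and {phi; theta phi} = cot(theta)."""
    theta, phi, p_theta, p_phi = state
    return [p_theta,
            p_phi,
            math.sin(theta) * math.cos(theta) * p_phi ** 2,
            -2.0 * (math.cos(theta) / math.sin(theta)) * p_theta * p_phi]

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(a, xi, lam_end, steps=1000):
    """Integrate from lambda = 0 to lam_end with x(0) = a and p(0) = xi."""
    state = list(a) + list(xi)
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def Phi(state):
    """Phi = g_jk p^j p^k, which should stay constant along the geodesic."""
    theta, _, p_theta, p_phi = state
    return p_theta ** 2 + math.sin(theta) ** 2 * p_phi ** 2

# Start on the equator with a unit tangent vector (hypothetical initial data).
final = integrate([0.5 * math.pi, 0.0], [0.6, 0.8], 1.0)
print(final, Phi(final))
```

For these initial data the computed curve follows the great circle through the starting point, and $\Phi$ remains equal to 1 to within the integration error, so $\lambda$ is arclength.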
It is convenient to call
    $y^k = x^k - a^k$  ($k = 1, \ldots, n$),
    $y^{n+k} = p^k$  ($k = 1, \ldots, n$),
and to rewrite the differential equations as
    $\frac{dy^k}{d\lambda} = f^k(y^1, \ldots, y^{2n})$  ($k = 1, \ldots, 2n$),    (26.8-3)
where $f^k$ denotes the function on the right side of the $k$th differential equation
of the set (26.8-1), for $k = 1, \ldots, 2n$, the Christoffel symbols now being
regarded as functions of $y^1, \ldots, y^{2n}$. The functions $f^k$ are defined for all
$y^{n+1}, \ldots, y^{2n}$, and for all $y^1, \ldots, y^n$ such that the corresponding point
$x^1, \ldots, x^n$ lies in the given chart. It is assumed that the $\left\{ {j \atop k\,l} \right\}$ and their first
partial derivatives are continuous throughout the chart, and it is asserted
that the functions $f^k$ are Lipschitz continuous in any compact region of the
$y^1, \ldots, y^{2n}$ space in which they are defined. That is, suppose that $\delta$ is a
constant such that the $f^k$ are defined for all $y^1, \ldots, y^{2n}$ in the cube $W$ determined
by $|y^i| \le \delta$ ($i = 1, \ldots, 2n$). Then there is a constant $L = L(\delta)$ such that if
$\{y^i\}$ and $\{\bar y^i\}$ are two points in $W$, then
    $\frac{dy^k}{d\lambda} = f^k(y^1, \ldots, y^N), \quad y^k(0) = y_0^k$ (given)  ($k = 1, \ldots, N = 2n$).    (26.8-5)
It will be proved in the next section that this problem has a unique solution
near $\lambda = 0$, hence we have the following:
vector is changed from $\{\xi^k\}$ to $\{a \xi^k\}$. If $a$ is taken $= \lambda_0$, it follows that, given any
direction, there is always a solution, valid for all $\lambda$ in $[0, 1]$, which starts in the given
direction, provided that the components of the initial tangent vector are all less than
some positive constant. It can also be proved that the constant depends continuously
on the direction of $\xi^k$, and hence has a positive lower bound or minimum. Therefore,
where $y$ denotes the vector with components $y^1, \ldots, y^N$. It is convenient to
introduce the norm
    $\|x\| = \max_{(j)} |x^j|;$
[i.e., suppose that $y(0)$ lies not merely in the cube $W$ but in the central part of
$W$] and suppose that $|\lambda| \le (\log 2)/L$, where $L$ is the Lipschitz constant in
(26.9-2). Then, for $q = 1, 2, \ldots$
    $\|\Delta y(\lambda, q)\| \le \frac{\delta}{2}\, \frac{1}{q!}\, (L|\lambda|)^q$    (26.9-6)
and
    $\|y(\lambda, q)\| < \delta.$    (26.9-7)
PROOF. The case $q = 1$ follows from (26.9-4) and (26.9-5). The other cases follow
by induction on $q$, because $\int_0^\lambda \lambda^q\, d\lambda = \lambda^{q+1}/(q + 1)$ and
This last permits the use of the Lipschitz conditions for each $q$.
It is clearly the denominator $q!$ in (26.9-6) that gives the method its great
power. Since the sum in (26.9-8) is majorized by the power series for $e^{L|\lambda|}$, the
sum converges absolutely and uniformly with respect to $\lambda$ in any finite
interval. Hence the sum can be integrated term by term, and we see that the
function
    $y(\lambda) = \lim_{q \to \infty} y(\lambda, q)$
satisfies the integral equation (26.9-1); hence it satisfies the initial value
problem (26.8-5). Theorem 1 of the preceding section is thereby proved.
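The role of the factor $1/q!$ can be seen in the simplest possible case. The sketch below applies Picard iteration to the toy problem $dy/d\lambda = y$, $y(0) = 1$ (chosen here only for illustration; it is not the geodesic system), for which the right side is Lipschitz continuous with constant 1. Iterates are stored as exact polynomial coefficients; the function names are assumptions.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Coefficients of the integral from 0 to lambda of a polynomial,
    where coeffs[k] multiplies lambda**k."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def picard_iterates(y0, count):
    """Picard iterates for dy/dlambda = y, y(0) = y0:
    each new iterate is y0 + integral from 0 to lambda of the previous one."""
    iterates = [[Fraction(y0)]]
    for _ in range(count):
        new = integrate_poly(iterates[-1])
        new[0] += Fraction(y0)
        iterates.append(new)
    return iterates

iterates = picard_iterates(1, 5)
for poly in iterates:
    print([str(c) for c in poly])
```

Each iteration adds one term $\lambda^q/q!$, so the increments decay factorially and the iterates are the partial sums of the series for $e^\lambda$, in line with the majorization by $e^{L|\lambda|}$.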
Theorem. Any point $P_0$ in $\mathfrak{M}$ has a neighborhood $\mathfrak{N}$ such that if $x^j(0)$ and
$x^j(1)$ ($j = 1, \ldots, n$) are the coordinates of any two points $Q_0$ and $Q_1$ in $\mathfrak{N}$,
then there is a unique geodesic segment joining $Q_0$ to $Q_1$ and lying in $\mathfrak{N}$.
Corollary 1. If the curve $x^j(\lambda)$ is continuous for $a \le \lambda \le b$ and satisfies the
geodesic equation for $a < \lambda < b$, then it satisfies this equation also for $\lambda = a$
and $\lambda = b$.
The existence proof for the two-point problem is similar to the one for the
initial-value problem, in that the system (26.10-1) is converted into an
integral equation, which is then solved by the Picard iterative scheme. The
procedure is somewhat more complicated, because the integral equation is of
the Fredholm type (the upper limit of the integrals is 1, not $\lambda$), so that one has
neither the factor $1/q!$ nor the explicit dependence on $\lambda$ that appeared in
(26.9-6). For details, see Whitehead 1932.
It can be proved that, in a Riemannian manifold, if the curve $x^j(\lambda)$ satisfies the
geodesic equation for $a < \lambda < b$ and lies in a compact region of the manifold,
then the limits of $x^j(\lambda)$ exist for $\lambda \to a$ and $\lambda \to b$; hence, in particular,
Corollary 1 above applies. In a pseudo-Riemannian manifold, that does not hold, as
the following example shows: Let $\mathfrak{M}$ be the surface of a cylinder, let $x^1 = z$
and $x^2 = \theta$ be cylindrical coordinates, and let the metric be given by
    $z = \lambda, \quad (0 < \lambda < 1)$
Note. This does not imply that $\lambda \to \infty$ on the geodesic; also it does not
imply that the geodesic may not have a beginning or end in some space in
which the manifold is immersed.
Since the transformation laws for tensors are known, and $g_{ik}$ is a tensor, it is
easy to find the transformation law for the three-index symbol $\left\{ {i \atop j\,k} \right\}$ from
equations (26.6-10, 11). If $\left\{ {i \atop j\,k} \right\}'$ refers to coordinates $x'^1, \ldots, x'^n$, and $\left\{ {i \atop j\,k} \right\}$ to
the unprimed coordinates, then the transformation law is
    $\left\{ {i \atop j\,k} \right\}' = \left[ \left\{ {r \atop s\,t} \right\} \frac{\partial x^s}{\partial x'^j} \frac{\partial x^t}{\partial x'^k} + \frac{\partial^2 x^r}{\partial x'^j \partial x'^k} \right] \frac{\partial x'^i}{\partial x^r}.$    (26.12-1)
Since only the $\left\{ {i \atop j\,k} \right\}$ appear in the geodesic equation (26.6-12), and not the
$g_{jk}$ directly, a more general kind of geometry, called affine geometry, is obtained
if one does not assume the existence of a metric tensor $g_{jk}$ at all but only the
existence of a set of quantities which transform like the $\left\{ {i \atop j\,k} \right\}$ and which
appear in place of the $\left\{ {i \atop j\,k} \right\}$ in the geodesic equation. Then there may or may
not exist a tensor $g_{jk}$ from which these quantities can be obtained via
equations (26.6-10, 11).
An affine connection of a manifold $\mathfrak{M}$ is defined in analogy with a tensor as a
collection of sets of $n^3$ functions each, $\Gamma^i_{jk} = \Gamma^i_{jk}(x^1, \ldots, x^n)$, one set associated
with each chart on $\mathfrak{M}$, such that in the overlap of two charts the two sets of
functions are related by the transformation law
    $\Gamma'^i_{jk} = \left[ \Gamma^r_{st} \frac{\partial x^s}{\partial x'^j} \frac{\partial x^t}{\partial x'^k} + \frac{\partial^2 x^r}{\partial x'^j \partial x'^k} \right] \frac{\partial x'^i}{\partial x^r},$    (26.12-2)
which is the same as (26.12-1). Since the $\Gamma^i_{jk}$ are not assumed to be derived
from a metric tensor as the $\left\{ {i \atop j\,k} \right\}$ were, it is now necessary to verify directly that
this transformation law is transitive (see Section 26.1), in order to be sure that
the definition of a connection is self-consistent. Verification is left as an
exercise.
In an affinely connected manifold $\mathfrak{M}$ (i.e., a manifold with an affine
connection defined on it), a smooth curve $\mathcal{C}$: $x^i = x^i(\lambda)$ ($i = 1, \ldots, n$) is called a
geodesic, and $\lambda$ is called a natural parameter on $\mathcal{C}$, if the equations
    $\ddot x^r + \Gamma^r_{jk}\, \dot x^j \dot x^k = 0$  ($r = 1, \ldots, n$)    (26.12-3)
are satisfied on $\mathcal{C}$; the dot denotes differentiation with respect to $\lambda$. Compare
with (26.6-12).
As in Riemannian geometry, the geodesic equations (26.12-3) are invariant
under coordinate changes in the sense that when a curve $\mathcal{C}$ lies in the overlap
of two charts, it satisfies the equations in one of the coordinate systems if and
only if it satisfies them in the other. The equations are also invariant under a
transformation $\lambda \to a\lambda + b$ ($a \neq 0$) of the natural parameter on a given
geodesic.
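The invariance under a linear change of the natural parameter can be verified in one step. The following short derivation assumes the geodesic equation in the standard form $\ddot x^r + \Gamma^r_{jk}\, \dot x^j \dot x^k = 0$ and writes $\mu = a\lambda + b$ for the new parameter; the symbol $\mu$ is introduced here only for the illustration.

```latex
\mu = a\lambda + b \quad (a \neq 0)
\quad\Longrightarrow\quad
\frac{d}{d\lambda} = a\,\frac{d}{d\mu},
\qquad
\frac{d^2 x^r}{d\lambda^2}
  + \Gamma^r_{jk}\,\frac{dx^j}{d\lambda}\,\frac{dx^k}{d\lambda}
= a^2 \left(
    \frac{d^2 x^r}{d\mu^2}
  + \Gamma^r_{jk}\,\frac{dx^j}{d\mu}\,\frac{dx^k}{d\mu}
  \right).
```

Since $a^2 \neq 0$, the left member vanishes if and only if the expression in parentheses does, so $\mu$ is again a natural parameter. A nonlinear change of parameter would produce an extra term proportional to $dx^r/d\mu$, which is why only linear changes are admitted.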
The theorems of Section 26.8 on the initial-value problem of geodesics
continue to hold, provided that the components $\Gamma^i_{jk}$ of the connection are
functions of class $C^1$, which, according to (26.12-2), requires that the manifold
be of class $C^3$.
Whitehead's theorem also continues to hold (in fact, Whitehead stated
and proved it originally for an affinely connected space): each point $P$ has a
neighborhood $\mathfrak{N}$ such that any two points in $\mathfrak{N}$ can be connected by a unique
geodesic in $\mathfrak{N}$.
The question whether, given an affine connection $\Gamma^i_{jk}$ on a manifold, a
metric tensor $g_{jk}$ can be found which is consistent with the connection, i.e., which
is such that $\Gamma^i_{jk} = \left\{ {i \atop j\,k} \right\}$, is discussed briefly at the end of Section 27.10.
The geometric structure based on the geodesics in an affinely connected
manifold is called the geometry of paths; it is discussed in some detail in the
next chapter. It is clear from (26.12-3) that for that purpose the connection
may be assumed to be symmetric in the subscripts:
    $\Gamma^i_{jk} = \Gamma^i_{kj}.$    (26.12-4)
More generally, a further geometric structure is sometimes introduced, called
torsion, based on the antisymmetric part $\tfrac{1}{2}(\Gamma^i_{jk} - \Gamma^i_{kj})$ of the connection (see
Flanders 1963). Since torsion has no effect on the geodesics, it has to be
regarded as something outside of and superposed on the geometry of the
manifold as determined by its geodesics.
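That the antisymmetric part of the connection drops out of the geodesic equation can be seen from a one-line computation. In the sketch below, the names $S^r_{jk}$ and $T^r_{jk}$ for the symmetric and antisymmetric parts are notation introduced here for the illustration only.

```latex
\begin{align*}
\Gamma^r_{jk} &= S^r_{jk} + T^r_{jk}, \qquad
S^r_{jk} = \tfrac12\left(\Gamma^r_{jk} + \Gamma^r_{kj}\right), \qquad
T^r_{jk} = \tfrac12\left(\Gamma^r_{jk} - \Gamma^r_{kj}\right), \\
T^r_{jk}\,\dot x^j \dot x^k
  &= -\,T^r_{kj}\,\dot x^j \dot x^k
   = -\,T^r_{jk}\,\dot x^k \dot x^j
   = -\,T^r_{jk}\,\dot x^j \dot x^k
  \quad\Longrightarrow\quad
  T^r_{jk}\,\dot x^j \dot x^k = 0, \\
\ddot x^r + \Gamma^r_{jk}\,\dot x^j \dot x^k
  &= \ddot x^r + S^r_{jk}\,\dot x^j \dot x^k .
\end{align*}
```

The second line uses the antisymmetry of $T^r_{jk}$ and then a renaming of the summation indices; only the symmetric part therefore enters the geodesic equation.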
Riemannian and Pseudo-Riemannian Covering Manifolds 197
Let $\mathfrak{M}$ be a covering manifold of $\mathfrak{N}$, and $\psi$ a projection of $\mathfrak{M}$ onto $\mathfrak{N}$. If $f(P)$ is
any function (say, of class $C^k$) in $\mathfrak{N}$, then the function $\tilde f(Q) \stackrel{\mathrm{def}}{=} f(\psi(Q))$, defined
in $\mathfrak{M}$, may be said to be lifted from $\mathfrak{N}$ to $\mathfrak{M}$, in analogy with the lifting of curves
and surfaces, discussed in Section 24.2.
Now suppose that $\mathfrak{N}$ is a Riemannian manifold. Let $\varphi^i(P)$ be the coordinates
in a good neighborhood $\mathfrak{V}$ in $\mathfrak{N}$, and let $g_{jk}$ be the components of the
metric tensor in these coordinates. The $\varphi^i$ and the $g_{jk}$ are all functions in $\mathfrak{N}$,
which can be lifted to $\mathfrak{M}$ as functions $\tilde\varphi^i$ and $\tilde g_{jk}$. Each connected component
$U_j$ of $\psi^{-1}(\mathfrak{V})$ thus becomes a coordinate chart with a metric tensor in it, and it
is easily established that $\mathfrak{M}$ is thereby made into a Riemannian manifold,
called a Riemannian covering manifold of $\mathfrak{N}$. If $\mathfrak{M}$ is the universal covering
manifold of $\mathfrak{N}$, then $\mathfrak{M}$ is called its universal Riemannian covering manifold.
Pseudo-Riemannian covering manifolds are similarly defined.
Consider now the problem of constructing manifolds covered by a given
Riemannian manifold $\mathfrak{M}$. For a general manifold $\mathfrak{M}$, such an $\mathfrak{N}$ was
constructed in Section 24.6 by means of a homeomorphism $\sigma$ of $\mathfrak{M}$ onto itself such
that if $P$ is any point of $\mathfrak{M}$, then the set of points $P, \sigma(P), \sigma(\sigma(P)), \ldots, \sigma^{-1}(P), \ldots$
is discrete (has no limit point in $\mathfrak{M}$); it was constructed by identifying all such
points, for each $P$, i.e., by regarding the set of these points as a single point of
the manifold $\mathfrak{N}$ being constructed. In order for this process to give a
Riemannian manifold $\mathfrak{N}$, it is necessary only to require that $\sigma$ be an isometric
homeomorphism, that is, one that preserves the metric. That is, let $\{U, \varphi, N\}$ be any
chart in $\mathfrak{M}$, and let $U'$ be the set of all $Q = \sigma(P)$, for $P$ in $U$. Define $\varphi'(Q)$ as
$\varphi(\sigma^{-1}(Q))$. Clearly, $\{U', \varphi', N\}$ is a possible chart in $\mathfrak{M}$ (i.e., is compatible with
the charts already there), by definition of a homeomorphism. Then, the $g'_{jk}$ are
required to be the same functions of the coordinates $x'^i = \varphi'^i$ in the second
chart as the $g_{jk}$ are of the coordinates $x^i = \varphi^i$ in the first chart. In this case, all
the charts obtained from $\{U, \varphi, N\}$ by means of $\sigma$ and its iterates have
identical metric tensors in them and can be identified as being a single chart
of $\mathfrak{N}$.
These ideas are used in Chapter 28 in the study of the global properties of
Einstein manifolds.
CHAPTER 27
Riemannian, Pseudo-Riemannian,
and Affinely Connected Manifolds
The subject of this chapter is the geometry of a manifold that has a metric, a
pseudo metric, or an affine connection defined on it. The dividing line between
the preceding chapter and this may seem rather fine and arbitrary, since the
last topic in that chapter was geodesics, and the first in this is geodesic
coordinates. However, there is a fundamental difference between the two. The
preceding chapter was mainly analytic, the only geometric notion being that
of the distance between two points, whereas this one is mainly geometric, and
even in the sense of Euclid, except that the concepts are somewhat extended
and the formulation is analytic. The fundamental concepts, such as paral-
lelism, length, curvature, and angle, are really geometric, and must be so
regarded. The use of analytic methods does not detract from the geometric
nature of those concepts any more than did the introduction of numerical
coordinates into Euclidean geometry by Descartes. From that point of view,
one of the main results of the preceding chapter, Whitehead's theorem, serves
the same purpose as Euclid's postulate that through any two distinct points
there can be drawn one and only one straight line, even though, in Whitehead's
theorem, the two points must not be too far apart.
In modern mathematics, where all branches have merged to some extent,
the question what is geometry and what is analysis is somewhat abstruse, but
surely those notions that can be traced directly back to Euclid ought to be
called geometric. In Euclid's time, geometry was regarded as the science that
dealt with the physical space around us. If general relativity is correct, then
Riemannian, pseudo-Riemannian, and affine geometry do also; even if
relativity has to be modified, its primary ideas, which have stood for over
60 years, will surely continue to be basic for physical space-time.
The best understanding of the subject is obtained if one keeps geometric
ideas (like parallel transport of a vector along a curve) in the foreground of
one's thinking, and detailed analytic formulas a little bit in the background.
(27.2-2)
    $\frac{dx^k}{d\lambda}(0) = y^k,$    (27.2-3)
where $y^1, \ldots, y^n$ are real numbers, not all zero, which specify the direction in
which the geodesic starts out from $P_0$. According to Theorem 2 of Section
26.8, these equations have a solution $x^k(\lambda; y^1, \ldots, y^n)$, $k = 1, \ldots, n$, valid for
$0 \le \lambda \le 1$, for all vectors $y^1, \ldots, y^n$ in some neighborhood of the origin. If this
solution is expanded in powers of $\lambda$ and the coefficients of the first three terms
are obtained from the above equations, we find
where $|y|$ denotes $\max_{(k)} |y^k|$. The subscript 0 indicates that the connection $\Gamma$
is to be evaluated at $P_0$. The above may equally well be regarded as a power
series expansion in the small quantities $\lambda y^1, \ldots, \lambda y^n$; hence, without loss of
generality, we may simply set $\lambda = 1$, if the $y^k$ are themselves sufficiently small.
That is,
is equal to $\delta^k_l$ at $P_0$, i.e., when all the $y^l$ are zero, and it follows that the
equation
    $x^k = x^k(1; y^1, \ldots, y^n)$
can be inverted, in some neighborhood of $P_0$, to give the $y$'s as functions of the
$x$'s. The first few terms of the corresponding expansion are
    $y^k = x^k - x^k(0) + \tfrac{1}{2} \Gamma^k_{lm}\big|_0\, (x^l - x^l(0))(x^m - x^m(0)) + \cdots.$    (27.2-5)
The $y^1, \ldots, y^n$ are called geodesic (or Riemannian) coordinates about $P_0$.
The connection components $\Gamma^k_{lm}$ in these equations refer to the original
coordinates. The corresponding components in the geodesic coordinate
system will be denoted by $\hat\Gamma^k_{lm}$; we can find their values as follows. A geodesic
is given in the geodesic coordinates by $y^k = y^k(\lambda) = \lambda \xi^k$, where $\xi^1, \ldots, \xi^n$ are
constants; hence, $d^2 y^k/d\lambda^2 = 0$, and by comparison with the general equation
(27.2-1) with $x^k$ replaced by $y^k$ and $\Gamma^k_{lm}$ by $\hat\Gamma^k_{lm}$, we see that $\hat\Gamma^k_{lm}\, \xi^l \xi^m$ is $= 0$ on the
geodesic, for all $\lambda$ and all $\xi^1, \ldots, \xi^n$. Therefore
    $\hat\Gamma^k_{lm} = 0$ for all $k$, $l$, $m$, at $P_0$,    (27.2-6)
and generally
(27.2-7)
throughout the neighborhood of $P_0$ in which the geodesic coordinates are
defined.
Note. Except in flat space, geodesic coordinates about $P_0$ are not in general
geodesic coordinates about a neighboring point $Q_0$. Stated differently, a
straight line $y^k = \lambda \xi^k + a^k$ in the coordinate space $\mathbb{R}^n$ of $y^1, \ldots, y^n$ does not
generally represent a geodesic unless the constants $a^1, \ldots, a^n$ are all zero, i.e.,
unless that straight line passes through the origin of $\mathbb{R}^n$.
The new coordinates about $P_0$ are still not unique, and the remaining degree
of nonuniqueness is described by the following theorem:
hence for any given point $P(\lambda)$ on a geodesic through $P_0$, we have
    $y'^j = \left.\frac{\partial x'^j}{\partial x^k}\right|_{P_0} y^k,$    (27.2-8)
which is the required linear transformation.
EXERCISE
Show that the next term in the series (27.2-4) can be written as
(27.2-9)
where
(27.2-10)
the summation being over the cyclic permutations of the subscripts $l$, $m$, and $r$. Still
higher-order terms, with coefficients $\Gamma^k_{lm \ldots v}$, are discussed in Eisenhart 1926.
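    $\begin{pmatrix} +1 & & & & & \\ & \ddots & & & (0) & \\ & & +1 & & & \\ & & & -1 & & \\ & (0) & & & \ddots & \\ & & & & & -1 \end{pmatrix}$    (27.3-1)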
(in the Riemannian case, all the signs are +). That can be done as follows:
First, the $y^i$ can be transformed by an orthogonal matrix such that the matrix
$(g_{jk})$ becomes diagonal at $P_0$. Let the diagonal elements be called $d_1, \ldots, d_n$,
where the positive ones come first (they are all $\neq 0$). Then, by a further
transformation $y'^i = \sqrt{|d_i|}\, y^i$ (here the summation convention is suspended), the
matrix $(g_{jk})$ is reduced to the above standard form. Then, the $y^i$ are unique,
except for an orthogonal transformation (a rotation with possibly an inversion)
in the Riemannian case, and except for a Lorentz transformation in the
pseudo-Riemannian case. The $y^i$ are then called normal or normal geodesic or normal
Riemannian coordinates.
If a Riemannian manifold is immersed as an $n$-dimensional surface $S$ in a
Euclidean space $E^N$ of higher dimension, as described in Section 26.4, then the
normal coordinates are Cartesian coordinates in the $n$-dimensional hyperplane
tangent to $S$ at $P_0$, the projection from the surface to the hyperplane
being such that geodesics through $P_0$ go into tangent lines, and distances
along the geodesics go into distances along the tangent lines.
(In some of the older literature, any coordinates in which the first partial
derivatives of the $g_{jk}$ vanish at $P_0$ are called "geodesic coordinates" about
$P_0$.)
Geometry deals with points and lines and objects constructed from them. It
deals specifically with spaces of points in which certain sets of points called
"straight lines" or "geodesics" are singled out. In the geometries considered
here, it is also assumed that the space is an $n$-dimensional continuum, that is,
that it has the topological properties that make it an $n$-dimensional manifold.
A basic geometric notion is that of congruence. There is generally a group
of transformations in the space, called the congruence group, under which
geometric relations are preserved. Examples are the rigid motion group in
Euclidean space and the Poincare group (nonhomogeneous Lorentz group)
in Minkowski space. Then, two figures (point sets) are said to be congruent if
one can be transformed into the other by a transformation of the group. (In
Felix Klein's "Erlangen Program" of 1872, the point of view was reversed:
When a group of transformations is given in a space, it is taken to determine a
geometry consisting of those relations that are preserved under the
transformations of the group.)
By analogy with Euclidean and Minkowskian geometries, one might
suppose that, in a space with a metric tensor $g_{jk}$, the group ought to consist of
those transformations under which the tensor $g_{jk}$ is mapped into itself.
However, there are generally no such transformations (except the identity), unless
the space is flat or has constant curvature. Hence, generally, the ordinary
concept of congruence is lost.
An approximate concept of congruence can be defined for very small
figures. We consider first a Riemannian manifold. Let $y^1, \ldots, y^n$ be normal
coordinates about a point $P_0$ and $w^1, \ldots, w^n$ normal coordinates about
another point $Q_0$. Let $\psi$ denote the mapping of a neighborhood of $P_0$ onto a
neighborhood of $Q_0$ obtained by equating the normal coordinates. That is, a
point $P$ with coordinates $y^1, \ldots, y^n$ is mapped onto a point $Q$ with coordinates
$w^1, \ldots, w^n$ if $y^i = w^i$, $i = 1, \ldots, n$. A small figure near $P_0$ is said to be
approximately congruent to a small figure near $Q_0$ if $\psi$ carries the first figure into
the second. Since a normal coordinate system is unique only modulo
orthogonal transformations, the result is an approximate Euclidean geometry.
Geodesics through $P_0$ are carried by $\psi$ into geodesics through $Q_0$, and it is
easily seen that the angle between two geodesics (see formula below) is
preserved under the mapping. (Geodesics not through $P_0$ generally do not go
into geodesics at all, unless the space is flat.) If three or more smooth curves
intersect at $P_0$, the star formed by their tangent vectors at $P_0$ is mapped into a
similar star at $Q_0$ with preservation of all angles. If $P_0 AB$ is a small triangle
with one vertex at $P_0$, i.e., if $P_0 A$, $P_0 B$, and $AB$ are short geodesics, and its
image under $\psi$ is a figure $Q_0 CD$, then $Q_0 C$ and $Q_0 D$ are geodesics having the
same length as $P_0 A$ and $P_0 B$; $CD$ is nearly a geodesic, if the triangle is small,
and has nearly the same length as $AB$. If $P_0$ and $Q_0$ are the same point and the
$y^k$ and $w^k$ are two normal coordinate systems about that point, then the
congruences are rotations and reflections that keep $P_0$ fixed.
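Angles are defined by the formula that gives angles in curvilinear coordinates
in Euclidean space. If two smooth curves, given by $x^i = x^i(\lambda)$ and $x^i = x^i(\mu)$,
intersect at a point with coordinates $x^i(\lambda_0) = x^i(\mu_0)$, their respective
directions at that point are given by the tangent vectors
\[ \xi^i = \frac{dx^i}{d\lambda}\Big|_{\lambda_0}, \qquad \eta^i = \frac{dx^i}{d\mu}\Big|_{\mu_0}, \]
and the angle $\theta$ between the curves is given by
\[ \cos\theta = \frac{g_{jk}\,\xi^j\eta^k}{\|\xi\|\,\|\eta\|}, \tag{27.4-1} \]
where
\[ \|\xi\| = \left| g_{jk}\,\xi^j\xi^k \right|^{1/2}. \tag{27.4-2} \]
The absolute value signs in (27.4-2) are unnecessary here but are included for
later use in the pseudo-Riemannian case.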
Similar considerations apply to a pseudo-Riemannian manifold. As before,
let $y^1, \dots, y^n$ and $w^1, \dots, w^n$ be normal coordinates about points $P_0$ and $Q_0$,
respectively, i.e., geodesic coordinates such that the metric tensor takes the
standard form (27.3-1). These coordinates are unique only modulo Lorentz
transformations; hence the geometry is approximately Minkowskian rather
than Euclidean. Under the mapping $\psi$ given as before by $w^i = y^i$, geodesics
through $P_0$ are mapped into geodesics through $Q_0$ with preservation of type
(spacelike, null, or timelike), and the angle between two geodesics, given by
(27.4-1), is preserved. Now, however, $\cos\theta$ can be $> 1$ or $< -1$; hence $\theta$ can
be imaginary. If either $\xi$ or $\eta$ is a null vector, $\cos\theta$ is infinite or undefined,
according as $g_{jk}\,\xi^j\eta^k$ is $\neq 0$ or $= 0$.
It should be noted that, even if $\xi$ and $\eta$ are both spacelike or both timelike,
the value of $\cos\theta$ given by (27.4-1) may lie outside the interval $[-1, +1]$.
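The behavior of (27.4-1) and (27.4-2) can be checked numerically. The following is a minimal sketch, not taken from the text: the helper names (`inner`, `norm`, `cos_angle`) are hypothetical, and the two-dimensional Minkowski metric is assumed here to be diag(1, -1), which may differ from the sign convention of the standard form (27.3-1).

```python
import math

def inner(g, a, b):
    """Return g_jk a^j b^k for a metric g given as a nested list."""
    n = len(a)
    return sum(g[j][k] * a[j] * b[k] for j in range(n) for k in range(n))

def norm(g, a):
    """Norm of a vector, with the absolute value sign of (27.4-2)."""
    return math.sqrt(abs(inner(g, a, a)))

def cos_angle(g, a, b):
    """cos(theta) as in (27.4-1); a null vector gives a ZeroDivisionError."""
    return inner(g, a, b) / (norm(g, a) * norm(g, b))

# Euclidean plane in polar coordinates (r, theta) at r = 2: g = diag(1, r^2).
g_polar = [[1.0, 0.0], [0.0, 4.0]]
c_polar = cos_angle(g_polar, [1.0, 0.0], [0.0, 1.0])  # coordinate directions

# Assumed two-dimensional Minkowski metric diag(1, -1).
g_mink = [[1.0, 0.0], [0.0, -1.0]]
xi = [1.0, 0.0]
eta = [math.cosh(1.0), math.sinh(1.0)]  # same type as xi: g(eta, eta) = 1
c_mink = cos_angle(g_mink, xi, eta)     # equals cosh(1), which exceeds 1
```

In the Euclidean case the two coordinate directions come out orthogonal, while in the assumed Minkowski case two vectors of the same type give a value of $\cos\theta$ outside $[-1, +1]$, as noted above.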
Geometric Concepts; Principle of Equivalence 205
EXERCISE
Show that a necessary and sufficient condition for $|\cos\theta|$ to be $< 1$ is that all linear
combinations of $\xi$ and $\eta$ be spacelike, or all timelike. If $\cos\theta = 1$, there is a null
vector of the form $a\xi + b\eta$ with $a$ and $b$ not both zero. (In the Riemannian case, that null
vector is the zero vector; hence $\xi$ is a scalar multiple of $\eta$.)
For figures in an affinely connected manifold, angles and lengths are not
defined; nevertheless, certain geometric relations are invariant under the
mapping $\psi$ given by $y^i = w^i$ as above. Here, $y^i$ and $w^i$ are arbitrary geodesic
coordinates; since there is no notion of normal coordinates, they are defined
only modulo nonsingular linear or affine transformations. For example, if
$\xi$, $\eta$, and $\zeta$ are the tangent vectors at $P_0$ to three curves, and if $\zeta$ lies in the plane
determined by $\xi$ and $\eta$, i.e., if the three vectors are linearly dependent, then
the same is true after the mapping. In fact, a relation of the form $\zeta = a\xi + b\eta$
is preserved under $\psi$, for given $a$ and $b$. More generally, any set of such vectors
is linearly dependent if and only if the set of their images under $\psi$ is linearly
dependent.
The further geometric concept of parallel displacement or parallel transport
along a curve, which will be described in Section 27.7, was introduced by
Levi-Civita in 1917. The idea is this: In flat geometries, Euclidean, Minkowskian,
or affine, where the congruence group, when referred to Cartesian
coordinates, consists of transformations of the form
\[ x \to Mx + a, \]
$M$ being an orthogonal, Lorentz, or general nonsingular matrix, the subgroup
of the pure translations
\[ x \to x + a \]
plays a special role. (Note that in the Minkowski case, no relative motion is
involved, only a displacement.) Namely, if one figure can be mapped into a
second by a pure translation, the figures are said to be not only congruent but
also to have the same orientation in space, or to be obtainable from each other
by parallel displacement. The corresponding concept in a curved space is the
parallel transport of a small figure along a curve, with generally different
results, however, depending on what curve is used to connect the initial and
final points.
Our approach to these questions will be in analogy with the "equivalence
principle" of general relativity, according to which certain laws can be
formulated by simply asserting that the corresponding laws of special
relativity hold in an inertial or "freely falling" reference frame. Geodesic
coordinate systems about a point play the role of inertial frames. For example,
the parallel transport of a vector along a curve will be so defined that at the
instant when the vector is passing a point $P$ of the curve, if $y^1, \dots, y^n$ are
geodesic coordinates about $P$, then the vector's components relative to the $y^i$
system are undergoing no change at that instant, i.e., have zero derivatives
with respect to a parameter on the curve.
In this and the following two sections, three closely related concepts are
discussed (covariant differentiation, absolute differentiation, and parallel
transport), any one of which might be taken as fundamental and the others
derived from it. We consider a general affinely connected manifold.
The partial derivatives of the components of a tensor transform like the
components of a tensor of one-higher rank under linear transformations, but
not under more general ones. For example, in the Euclidean plane, a vector
field with constant Cartesian components has variable polar-coordinate
components; hence the nonvanishing of the partials with respect to $r$ and $\theta$ of
the latter is a peculiarity of the $r, \theta$ coordinate system. To eliminate effects of
this kind, we define a tensor of higher rank by giving its components at any
point P in a geodesic coordinate system about P as the appropriate partial
derivatives; then, to find its components in other coordinate systems,
we must use the transformation laws.
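The polar-coordinate remark can be made concrete with a short worked case. This is an illustration under stated assumptions, not part of the original text: the field is taken to be the constant unit vector along the $x$-axis, and its contravariant components are used.

```latex
% Constant Cartesian field: v^x = 1, v^y = 0, with x = r cos(theta), y = r sin(theta).
\[
  v^r = \frac{\partial r}{\partial x}\,v^x = \cos\theta, \qquad
  v^\theta = \frac{\partial \theta}{\partial x}\,v^x = -\frac{\sin\theta}{r}.
\]
% The partial derivatives of these components do not vanish,
% although the field itself is constant:
\[
  \frac{\partial v^r}{\partial \theta} = -\sin\theta, \qquad
  \frac{\partial v^\theta}{\partial r} = \frac{\sin\theta}{r^2}, \qquad
  \frac{\partial v^\theta}{\partial \theta} = -\frac{\cos\theta}{r}.
\]
```

The nonzero partials reflect only the curvilinear coordinates; this is the kind of effect that the definition by way of geodesic coordinates is meant to remove.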
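Let $P\colon x^i = a^i$ be any point, and let $y^1, \dots, y^n$ be the corresponding geodesic
coordinates. As in Section 27.2,
\[ x^i = a^i + y^i - \tfrac12 \Gamma^i_{jk}\, y^j y^k + \cdots, \tag{27.5-1} \]
\[ y^i = x^i - a^i + \tfrac12 \Gamma^i_{jk}\,(x^j - a^j)(x^k - a^k) + \cdots. \tag{27.5-2} \]
In these equations, the connection coefficients refer to the $x^i$ coordinate system
and are to be taken at the point $P$. If $v_i(x^1, \dots, x^n)$ is any covariant vector
field, we denote by $\bar v_i(y^1, \dots, y^n)$ its components in the geodesic system. A
second rank covariant tensor $v_{i;j}$, the covariant derivative of $v_i$, is defined at $P$
by giving its components in the $y^i$ system as
\[ \bar v_{i;j}\big|_P = \frac{\partial \bar v_i}{\partial y^j}\Big|_P; \tag{27.5-3} \]
hence
\[ v_{i;j}\big|_P = \bar v_{i;j}\big|_P
   = \left( \frac{\partial}{\partial y^j}\Big( v_k \frac{\partial x^k}{\partial y^i} \Big) \right)_P
   = \left( \frac{\partial v_k}{\partial x^l}\frac{\partial x^l}{\partial y^j}\frac{\partial x^k}{\partial y^i} \right)_P
   + v_k \frac{\partial^2 x^k}{\partial y^j\,\partial y^i}\Big|_P. \]
At $P$ the $x$'s and the $y$'s agree to first order, and the second derivative,
according to (27.5-1), is $-\Gamma^k_{ij}$; therefore
\[ v_{i;j} = \frac{\partial v_i}{\partial x^j} - \Gamma^k_{ij}\, v_k. \tag{27.5-4} \]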
Covariant Differentiation 207
because this quantity is a scalar (an invariant) and is obviously equal to the
Laplacian of f in Cartesian coordinates.
EXERCISE
1. (a) Show that if $v^i = v^i(x^1, \dots, x^n)$ is a contravariant vector field, then the
formula
(27.5-5)
defines a second-rank mixed tensor field. (b) Show that if $T_{ij}$ is a second-rank covariant
tensor field, then the formula
(27.5-6)
The covariant derivative of a general tensor has one additional term (i.e.,
in addition to the partial derivative) as in (27.5-4) for each covariant index
and one additional term as in (27.5-5) for each contravariant index. The
covariant derivative of a scalar field $f$ is simply its gradient: $f_{;k} = \partial f/\partial x^k$.
The operation of covariant differentiation provides the basis for the
invariant formulation of field theories and generally of physical theories
where partial differential equations occur.
EXERCISE
2. Show that the operation satisfies the product rule of differentiation. First, if
$T_{ij}$ in (27.5-6) is a product $v_i w_j$, then
(27.5-7)
More generally, if $T^{\cdots}_{\cdots}$ and $S^{\cdots}_{\cdots}$ are arbitrary tensors, then
(27.5-8)
where the indicated product may be an outer or arbitrary inner product, i.e., contracted
any number of times.
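\[ \bar w_i(\lambda_0) = \frac{d\bar v_i(\lambda)}{d\lambda}\Big|_{\lambda = \lambda_0}. \]
We now transform this equation back to the original coordinates:
\[ w_i(\lambda) = \frac{d}{d\lambda}\Big( v_k(\lambda)\,\frac{\partial x^k}{\partial y^i} \Big) \quad \text{at } \lambda = \lambda_0. \]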
Parallel Transport 209
At $\lambda = \lambda_0$, the $x$'s and the $y$'s agree to first order, and the second derivative,
according to (27.5-1), is $= -\Gamma^k_{ij}$. That is,
\[ w_i(\lambda) = \frac{dv_i(\lambda)}{d\lambda} - v_k(\lambda)\,\Gamma^k_{ij}\,\frac{dx^j(\lambda)}{d\lambda}. \]
Since the geodesic coordinates no longer appear, this equation is valid at any
point of $\mathcal{C}$, and in any coordinate system. It is customary to denote $w_i$ by
$\delta v_i/\delta\lambda$, called the absolute derivative of $v_i$ along $\mathcal{C}$; hence, along $\mathcal{C}$,
\[ \frac{\delta v_i}{\delta\lambda} = \frac{dv_i}{d\lambda} - \Gamma^k_{ij}\, v_k\, \frac{dx^j}{d\lambda}. \tag{27.6-1} \]
If $v_i$ is given not only on $\mathcal{C}$ but as a vector field defined in a region containing
$\mathcal{C}$, then
\[ \frac{\delta v_i}{\delta\lambda} = v_{i;j}\,\frac{dx^j}{d\lambda}. \tag{27.6-2} \]
The absolute derivatives along $\mathcal{C}$ of other tensors are similarly defined, in
complete analogy with covariant derivatives. For example, if a contravariant
vector $v^i$ is given on $\mathcal{C}$, then
(27.6-3)
For absolute, as for covariant differentiation, the product rule holds, the
metric tensor behaves like a constant, and the absolute derivative of a scalar $f$
is its ordinary derivative $df/d\lambda$ (another scalar).
In particular, if $v^i$ is the tangent vector to $\mathcal{C}$, given by
\[ v^i(\lambda) = \frac{dx^i}{d\lambda}, \]
are the components of the given vector at $P_0$, the transported vector is given
at any point $P(\lambda)$ of the curve as the solution $v_i(\lambda)$ of the initial value problem
\[ \frac{\delta v_i}{\delta\lambda} = 0 \ \text{ on } \mathcal{C}, \qquad (i = 1, \dots, n). \tag{27.7-1} \]
where the $\bar\xi_i$ are obtained from the $\xi_i$ by the transformation law for a vector at
$P_0$. That is because $\delta v_i/\delta\lambda$ is a vector, so that if all its components vanish in one
coordinate system they also vanish in any other.
Parallel transport of a contravariant vector or a general tensor along a
curve is similarly defined.
Now consider, in particular, a Riemannian or pseudo-Riemannian manifold.
If $v^i(\lambda)$ and $w^i(\lambda)$ are any smooth vector-valued functions on $\mathcal{C}$, then,
since $\delta g_{jk}/\delta\lambda = 0$,
\[ \frac{d}{d\lambda}\big( g_{jk}\, v^j w^k \big)
   = g_{jk}\Big( \frac{\delta v^j}{\delta\lambda}\, w^k + v^j\, \frac{\delta w^k}{\delta\lambda} \Big). \]
Therefore, if $v^j(\lambda)$ and $w^k(\lambda)$ are the result of parallel transport of two given
vectors along $\mathcal{C}$, it follows that $g_{jk}\, v^j w^k$ is constant on $\mathcal{C}$. That is, in a Riemannian
or pseudo-Riemannian manifold, magnitudes of vectors and angles between
vectors are preserved under parallel transport.
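The conservation of magnitude can be observed numerically. The sketch below is an illustration under stated assumptions rather than the author's construction: it takes the unit 2-sphere with coordinates $(\theta, \varphi)$ and metric diag(1, sin²θ), uses the standard contravariant transport equation $dv^i/d\lambda = -\Gamma^i_{jk} v^j\, dx^k/d\lambda$, and integrates once around the circle $\theta = \theta_0$ with a classical Runge-Kutta step. All function names are hypothetical.

```python
import math

def transport_rhs(v, theta0):
    """Right side of dv^i/dlam = -Gamma^i_{jk} v^j dx^k/dlam on the circle
    theta = theta0, phi = lam of the unit sphere, where dx/dlam = (0, 1)."""
    v_theta, v_phi = v
    s, c = math.sin(theta0), math.cos(theta0)
    # Nonzero Christoffel symbols of the unit sphere:
    # Gamma^theta_{phi phi} = -s*c and Gamma^phi_{theta phi} = c/s.
    return (s * c * v_phi, -(c / s) * v_theta)

def transport_around_latitude(v0, theta0, steps=2000):
    """Integrate the transport equation once around the circle with RK4."""
    h = 2.0 * math.pi / steps
    v = v0
    for _ in range(steps):
        k1 = transport_rhs(v, theta0)
        k2 = transport_rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]), theta0)
        k3 = transport_rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]), theta0)
        k4 = transport_rhs((v[0] + h * k3[0], v[1] + h * k3[1]), theta0)
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v

def norm_squared(v, theta0):
    """g_jk v^j v^k with g = diag(1, sin^2 theta) on the unit sphere."""
    return v[0] ** 2 + math.sin(theta0) ** 2 * v[1] ** 2

theta0 = math.pi / 3.0          # colatitude of the circle (an arbitrary choice)
v_start = (1.0, 0.0)            # unit vector along the theta direction
v_end = transport_around_latitude(v_start, theta0)
```

In this sketch the squared norm stays equal to 1 up to integration error, while the vector returns rotated through the angle $2\pi\cos\theta_0$ relative to its starting direction, which illustrates how the result of transport depends on the curve used.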
EXERCISE
A Möbius strip made of flat paper without stretching can be regarded as a 2-dimensional
Riemannian manifold, which can be covered by two or more coordinate
charts, in each of which $g_{jk} = \delta_{jk}$ throughout. Show that this manifold is not orientable.
The Riemann Tensor, General; Laplacian and d'Alembertian 211
Since $v_i$ is arbitrary, the quotient law, applied to (27.9-1), shows that the $n^4$
quantities $R^i{}_{jkl}$ are the components of a tensor of rank 4, which is called the
Riemann tensor or the Riemann curvature tensor.
In Euclidean space, the Riemann tensor vanishes identically in any co-
ordinate system, because it clearly vanishes in Cartesian coordinates, and if
all components of a tensor vanish in one coordinate system, then they all
vanish in any other. It will be seen in Section 27.12 that the vanishing of the
Riemann tensor is also sufficient for a Riemannian manifold to be Euclidean,
for a pseudo-Riemannian one to be Minkowskian, and for an affinely con-
nected one to be flat. In each case, this statement refers just to the metric, but
if the manifold is simply connected, it can be extended to a complete Euclidean,
Minkowskian, or flat space.
In the special case of a Riemannian or pseudo-Riemannian manifold, where
there is a metric and where indices can be raised and lowered, the Riemann
tensor can be expressed in other forms-see next section.
If $R^i{}_{jkl}$ is contracted with respect to its first and fourth indices, we obtain the
Ricci tensor
\[ R_{jk} = R^i{}_{jki}, \tag{27.9-3} \]
which plays a role in relativity.
The remainder of this section is devoted to the Laplacian and d'Alembertian
operators. In n-dimensional Euclidean space, in Cartesian coordinates, the
Laplace operator is given by
\[ \nabla^2 = \partial_1^2 + \partial_2^2 + \cdots + \partial_n^2, \]
where
\[ \partial_k = \frac{\partial}{\partial x^k}, \]
where $\bar v_j(y^1, \dots, y^n)$ are the components of the vector field relative to the
geodesic coordinates. According to Exercise 4 below, this is not in general
equal to the result of symmetrizing the second covariant derivative $v_{j;k;l}$ with
respect to the indices $k$ and $l$. According to the principle of equivalence, we
therefore define the Laplacian or d'Alembertian operator, as applied to the
vector field, as
(27.9-5)
EXERCISES
(27.9-6)
(27.9-7)
where the superscript indicates that the connection components are those that refer
0
(27.9-9)
The last shows that the Riemann tensor gives all the intrinsic information
about a space in the immediate neighborhood of a point $P$ that is given by the
connection components and their first derivatives at $P$. Namely, by a suitable
choice of coordinates (geodesic coordinates), the $\Gamma^i_{kl}$ can all be made $= 0$ at $P$;
hence they give no intrinsic or coordinate-free information, and then their
first derivatives are all determined by the Riemann tensor as in (27.9-9).
In the next section it will be seen in a similar fashion that when there is a
metric tensor $g_{jk}$, the Riemann tensor gives all the intrinsic information that is
conveyed by the $g_{jk}$ and their first and second partial derivatives at $P$.
We now suppose that a metric tensor $g_{kl}$ is defined in the manifold. Then, the
connection components $\Gamma^i_{kl}$, etc., that appear in the definition (27.9-2) of the
Riemann tensor are to be identified with the Christoffel symbols of the second
kind, $\{{}^{\,i}_{kl}\}$, etc., defined by (26.6-10, 11). A straightforward calculation then
shows that the Riemann tensor $R_{ijkl}$ (the first index has been lowered) can be
expressed in terms of $g_{kl}$ and its derivatives as follows:
where the square brackets denote the Christoffel symbols of the first kind,
given by (26.6-10).
The number of independent components of this tensor is less than $n^4$
because of the following symmetry relations, which follow from the above
equation:
(27.10-2)
\[ R_{ijkl} + R_{iklj} + R_{iljk} = 0. \tag{27.10-3} \]
We shall now show that, in consequence of these relations, the number of
independent components of the Riemann tensor is
\[ \frac{n^2(n^2 - 1)}{12}, \tag{27.10-4} \]
which is = 1, 6, and 20, respectively, in 2, 3, and 4 dimensions. First, it follows
from the three relations (27.10-2) that any nonvanishing component can be
written (by change of sign, if necessary) as $R_{ijkl}$, where $i < j$ and $k < l$, and
where, if $(ij)$ and $(kl)$ are regarded as two-digit base-$n$ integers, $(ij) \le (kl)$. The
number of possible values of $(ij)$ or $(kl)$ is then $\tfrac12 n(n - 1) = m$, and the number
of possible pairs $(ij)$, $(kl)$ is
\[ \tfrac12 m(m + 1) = \frac{n(n - 1)(n^2 - n + 2)}{8}. \tag{27.10-5} \]
Next, unless $i$, $j$, $k$, and $l$ are all different, (27.10-3) reduces to a combination of
the preceding identities (27.10-2), and if they are different, it can be assumed,
without loss of generality, that $i < j < k < l$, for if $i', j', k', l'$ is any permutation
of $i, j, k, l$, the identity (27.10-3) for $i, j, k, l$ can be obtained from the same
one for $i', j', k', l'$ by use of the preceding identities (27.10-2). Therefore the
number of independent identities of type (27.10-3) is $n(n - 1)(n - 2)(n - 3)/4!$,
and subtraction of this from (27.10-5) gives (27.10-4). It can be proved that the
$R_{ijkl}$ satisfy no further algebraic identities independent of (27.10-2, 3).
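The counting argument above can be checked mechanically. The following sketch simply repeats the three steps of the argument in integer arithmetic; the function name is hypothetical and introduced only for illustration.

```python
from math import comb

def riemann_independent_components(n):
    """Count independent components of R_ijkl by the argument in the text."""
    m = n * (n - 1) // 2        # index pairs (ij) with i < j
    pairs = m * (m + 1) // 2    # pairs (ij) <= (kl), as in (27.10-5)
    cyclic = comb(n, 4)         # identities (27.10-3) with i < j < k < l
    return pairs - cyclic

# Compare with the closed form (27.10-4) for the first few dimensions.
counts = {n: riemann_independent_components(n) for n in range(1, 7)}
closed_form = {n: n * n * (n * n - 1) // 12 for n in range(1, 7)}
```

For n = 2, 3, and 4 this gives 1, 6, and 20, in agreement with (27.10-4).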
The Riemann Tensor in a Riemannian or Pseudo-Riemannian Manifold 215
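\[ \Gamma^r_{kl} = \tfrac12\, g^{rm}\Big( -\frac{\partial g_{kl}}{\partial x^m}
   + \frac{\partial g_{ml}}{\partial x^k} + \frac{\partial g_{km}}{\partial x^l} \Big), \tag{27.10-11} \]
where, as usual, the matrix $(g^{rm})$ is the inverse of the matrix $(g_{kl})$. This equation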
can be solved to give
(27.10-12)
EXERCISES
1. Show from the compatibility condition of the system (27.10-12) that the solution
$g_{kl}$, if it exists, must satisfy the equation
(27.10-13)
which is just the first of the relations (27.10-2). Derive further conditions from this
equation by covariant differentiation.
2. Consider a manifold in which the affine connection is given by
otherwise $\Gamma^i_{jk} = 0$.
Show that, in this manifold, (27.10-13) has no symmetric solution at the origin. Show
that the Ricci tensor is not symmetric: $R_{12} \neq R_{21}$.
3. Show that in a Riemannian or pseudo-Riemannian manifold, if the Ricci
tensor Rij vanishes, as in empty space-time according to general relativity, the two
natural definitions of the Laplacian or d'Alembertian operator, as applied to a vector
field, agree:
Otherwise, as noted in the preceding section, it is the expression on the left that gives the
correct form at the origin of geodesic coordinates.
4. Show that in a Riemannian or pseudo-Riemannian manifold, at the origin of
geodesic coordinates,
a2
ayl aym gjk = -W~.jklm + Rk1jm )
Figure 27.1
vertices); one measures the area $A$ of the triangle and the sum $\Sigma$ of its angles.
Then the radius $r$ of the earth is given by the formula
\[ \frac{A}{r^2} = \Sigma - \pi. \tag{27.11-1} \]
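As a quick check of (27.11-1), consider a triangle covering one octant of a sphere, which has three right angles and one eighth of the total area. The sketch below is illustrative only: the function name is hypothetical and the radius value of 6371 km is an assumed figure, not a measurement from the text.

```python
import math

def radius_from_triangle(area, angle_sum):
    """Solve (27.11-1), A / r^2 = Sigma - pi, for the radius r."""
    excess = angle_sum - math.pi
    if excess <= 0.0:
        raise ValueError("the angle sum of a spherical triangle must exceed pi")
    return math.sqrt(area / excess)

assumed_radius = 6371.0                                     # km, for illustration
octant_area = 4.0 * math.pi * assumed_radius ** 2 / 8.0    # one eighth of the sphere
octant_angle_sum = 3.0 * math.pi / 2.0                      # three right angles
recovered_radius = radius_from_triangle(octant_area, octant_angle_sum)
```

The recovered radius agrees with the assumed one, since the area is $\pi r^2/2$ and the angular excess is $\pi/2$.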
Figure 27.2
It will be shown that the vanishing of the Riemann tensor is sufficient as well
as necessary for a space to be flat. Consequently, it is now assumed that
$R^i{}_{jkl}$ vanishes everywhere in an affinely connected manifold, and it will be
shown that coordinates $y^1, \dots, y^n$ can be so chosen in any simply connected
region that the corresponding connection coefficients $\Gamma^i_{jk}$ vanish identically.
If there is a metric tensor, the $y$'s can be so chosen that the $g_{jk}$ assume the
standard constant values $\delta_{jk}$ throughout.
Let Xl, ... , xn be coordinates in a simply connected chart and consider the
following initial-value problem for a covariant vector field $v_i(x^1, \dots, x^n)$:
\[ \text{DE:}\quad v_{i;j} = 0, \ \text{ that is, }\ \frac{\partial v_i}{\partial x^j} - \Gamma^k_{ij}\, v_k = 0, \tag{27.12-1} \]
o k 0
OXZ (rijVk) - OXj (ri/Vk) =
k
(27.12-3)
that is
Vi;j - Vj;i =
(the other terms in the covariant derivatives cancel), and hence is satisfied,
because the covariant derivatives are zero. We denote the solution of the new
initial-value problem by $y^p(x^1, \dots, x^n)$, for each $p$, and we choose new coordinates
by setting $y^p = y^p(x^1, \dots, x^n)$, $p = 1, \dots, n$, which is possible
because the Jacobian of the $y$'s with respect to the $x$'s at the origin is $\neq 0$
[because it is the determinant whose columns are the vectors $v_i(0, \dots, 0; p)$,
which were chosen to be linearly independent]; according to the implicit
function theorem, we can solve for the $x$'s in terms of the $y$'s in some neighborhood
$N_0$ of $\mathbb{R}^n$, so that the $y$'s become new independent variables in $N_0$.
With respect to the new coordinates, the vector fields $v_j$ have the components
\[ \bar v_j(y^1, \dots, y^n; p) = v_i(x^1, \dots, x^n; p)\,\frac{\partial x^i}{\partial y^j}
   = \frac{\partial y^p}{\partial x^i}\,\frac{\partial x^i}{\partial y^j} = \delta^p_j. \]
Since these vector fields were so constructed as to have vanishing covariant
derivatives, we have
\[ 0 = \bar v_{j;k} = 0 - \bar\Gamma^i_{jk}\,\bar v_i = -\bar\Gamma^i_{jk}\,\delta^p_i = -\bar\Gamma^p_{jk}. \]
Therefore the connection coefficients are all zero in the new coordinates, as
was claimed.
Now suppose the manifold is Riemannian or pseudo-Riemannian, and
hence has a metric tensor. The initial vectors $v_i(0, \dots, 0)$ in (27.12-2) can be
chosen orthonormal, so that
\[ g^{jk}(x^1, \dots, x^n)\, v_j(x^1, \dots, x^n; p)\, v_k(x^1, \dots, x^n; q) = \delta_{pq} \]
at $x = 0$; but $g^{jk}$ and the $v_i$ all have vanishing covariant derivatives; hence this
equation holds for all $x$. The metric tensor (contravariant form) in the $y^j$
coordinate system has components
\[ \bar g^{pq} = g^{jk}\,\frac{\partial y^p}{\partial x^j}\,\frac{\partial y^q}{\partial x^k} \]
\[ \Big( \nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \Big)\Psi + V\Psi = 0, \tag{27.13-1} \]
\[ g^{ij}\Big[ \frac{\partial^2\psi}{\partial x^i\,\partial x^j} - \Gamma^k_{ij}\,\frac{\partial\psi}{\partial x^k} \Big]
   + (\lambda + V)\psi = 0. \tag{27.13-4} \]
Whether this equation for $\psi$ can be solved by the method of separation of
variables depends on the forms of the functions $g^{ij}(\cdot)$ and $V(\cdot)$.
In the separation procedure (see, for example, Morse and Feshbach 1953),
one starts by looking for special solutions in the form of a product
When this product is substituted into (27.13-4), it turns out that, for certain
choices of the coordinate system (hence of the $g^{ij}$) and of the function $V$, one
obtains a system of second-order ordinary differential equations, one for each
of the functions $X_i$, depending on $n$ arbitrary so-called separation constants,
of which $\lambda$ in the above equations is the first. In this way one obtains a large
enough family of special solutions (depending on the separation constants
and constants of integration that appear in the solution of the ordinary
differential equations) to serve as a complete set for the expansion of an
arbitrary function of $x^1, \dots, x^n$.
In brief outline, the investigations of various authors showed that three
conditions on the metric are necessary for separability, i.e., for success of the
procedure just described. First, the coordinates must be orthogonal; that is,
the matrix $(g^{ij})$ must be diagonal. The other two conditions on the $g^{ij}$ are the
or
(28.2-2)
The Einstein Gravitational Field Equations 225
where
(h'") ~ ~ r1 1(O~ 11
(h,,) (0;
(2S.2-3)
x
Il + hllv 8<ppu
8 pu -
xx -
0,
XV
where the dot stands for differentiation with respect to proper time $\tau$ along
the trajectory and the $x^\mu(\tau)$ are the coordinates of the particle. The equation
resembles that of a geodesic, but of course the second term comes from the
gravitational field and has nothing to do with geometry.
Such a theory may be called the special relativity theory of gravitation. It
was never considered seriously, to any extent, because Einstein discovered the
general theory before observational tests of relativistic effects became feasible.
In the special theory, when there is no gravitational field, so that $\varphi_{\mu\nu} \equiv 0$, free
bodies move along straight lines, i.e., geodesics, in the 4-dimensional Minkowski
space, whereas, when $\varphi_{\mu\nu} \not\equiv 0$, the trajectories depart from straight lines,
and relative accelerations are observed.
In Einstein's general theory, the trajectories are assumed to be always
geodesics (in the absence of nongravitational forces), and the relative accelerations
are attributed to the curvature of space-time. That assumption
simplified physics in that it then became unnecessary to consider how a
gravitational field might modify the laws of electromagnetism, quantum
theory, and so on, for those laws were assumed to take their field-free form in
a local inertial or "freely falling" frame of reference, in which the gravitational
field has been "transformed away." A study of the equations of the geodesics
shows that, when space-time is almost flat (almost Minkowskian), a coordinate
system can be found in which the tensor $g_{\mu\nu}$ differs only slightly from the $h_{\mu\nu}$
and differs in such a way that the motion is the same as though space-time
were exactly flat and there were a gravitational potential given by $\varphi_{\mu\nu} =
(c^2/2)(g_{\mu\nu} - h_{\mu\nu})$. In this limit, the Einstein field equations must reduce to
pu 8 8 _ SnG
9 8x P 8xU gil V - 7 T/l v (2S.2-4)
226 The Extension of Einstein Manifolds
where $W_{\mu\nu}$ is a tensor containing the components of the metric tensor and
their first and second derivatives with respect to the $x^\mu$ and where $W_{\mu\nu}$ was
required to have the following properties: (1) It must be symmetric and
divergence-free, i.e.,
\[ g^{\nu\sigma} W_{\mu\nu;\sigma} = 0; \]
this is necessary because the stress-energy tensor $T_{\mu\nu}$ of the matter has these
properties. (2) When the fields are weak (i.e., space-time is nearly flat), it must
reduce to the left member of (28.2-4) in the coordinates referred to. Einstein
found that the only choice of the $W_{\mu\nu}$ that has all these properties is the left
member of the following equation, which is called the Einstein field equation:
\[ R_{\mu\nu} - \tfrac12 R\, g_{\mu\nu} - \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}; \tag{28.2-6} \]
here, $R_{\mu\nu}$ is the Ricci tensor defined in Section 27.9, and $\Lambda$ is a constant. The
second term in (28.2-6) is necessary to make the left member divergence-free,
according to equations (27.10-9, 10).
In applications to small systems like the solar system or a single galaxy, one
would like to suppose that space-time is asymptotically flat at large distances;
this requires that the so-called cosmological constant $\Lambda$ be zero. Einstein
assumed, however, that $\Lambda$ is not exactly zero but very small. He believed that
that was necessary in order to obtain closed (i.e., finite) models of the universe
and thus avoid the various forms of Olbers's paradox that appear in an
infinite and asymptotically uniformly populated universe. A. A. Friedman
showed in 1922, however, that closed models could also be obtained for
$\Lambda = 0$. Since then, $\Lambda$ has usually been taken as zero.
In the remainder of this chapter, $\Lambda$ will be taken as zero, and the discussion
will be further restricted to empty regions of space-time, where $T_{\mu\nu} = 0$. In
this case, since $g^{\mu\nu} g_{\mu\nu} = 4$, contraction of (28.2-6) shows that the scalar
curvature $R$ is zero; hence, the gravitational equation reduces to $R_{\mu\nu} = 0$. An
Einstein manifold is therefore defined as a 4-dimensional manifold of signature
2 in which the Ricci tensor $R_{\mu\nu}$ is zero throughout. More general definitions
are clearly possible and are sometimes found, for example in Petrov 1969.
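The contraction step mentioned above can be written out in one line. This is a worked restatement under the stated conditions ($\Lambda = 0$, $T_{\mu\nu} = 0$), with the scalar curvature taken as $R = g^{\mu\nu} R_{\mu\nu}$:

```latex
% Contract the empty-space form of (28.2-6) with g^{\mu\nu}:
\[
  g^{\mu\nu}\Big( R_{\mu\nu} - \tfrac12 R\, g_{\mu\nu} \Big)
  = R - \tfrac12 R \cdot 4 = -R = 0 .
\]
% With R = 0, the field equation reduces to
\[
  R_{\mu\nu} = 0 .
\]
```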
The problem considered in this chapter is how to extend a given Einstein
manifold (usually given by a single chart) so as to obtain a larger Einstein
manifold, in fact, to obtain a maximal extension, in some sense, of the given
manifold.
An extension of that kind played an important role in the development of
relativity. The famous Schwarzschild solution for the field around a spherical
mass, which will be discussed in the next section, appeared to indicate a
singularity of space-time at a certain distance (the so-called "Schwarzschild
radius") from the center. The nature of this "singularity" was much discussed
The Schwarzschild Charts 227
used to distinguish one of the invariant spheres from another. Then we take
$r$, $\theta$, $\varphi$ as coordinates in the chart, and then the metric has spherical symmetry
with respect to those coordinates in the usual sense.
We now have a 4-dimensional chart in $\mathfrak{M}$, whose coordinate domain $N$ in
$\mathbb{R}^4$ is given by
\[ 0 < \theta < \pi, \qquad -\pi < \varphi < \pi, \qquad -\infty < x^4 < \infty, \tag{28.3-2} \]
(28.3-3)
\[ x'^4 = x^4 + \int \frac{g_{14}}{g_{44}}\, dr, \]
Ricci tensor $R_{\mu\nu}$. These are all expressions in $\alpha$, $\beta$, and their first derivatives, so
that setting $R_{\mu\nu} = 0$ gives differential equations for $\alpha(r)$ and $\beta(r)$. For the
details of this somewhat lengthy calculation, the reader is referred to any
book on general relativity (for example, Tolman 1934 or Weber 1961). It is
found that
different metric, the Schwarzschild interior metric, must be used, which takes
into account the nonvanishing of the stress-energy tensor $T_{\mu\nu}$. Since $r_0 \ll R_\odot$,
space-time is very nearly flat (Minkowskian) at all points outside the sun, i.e.,
for $r > R_\odot$ (also at all points inside, it turns out), and the gravitational field is
very nearly a Coulomb field. In fact, as the reader is doubtless aware, the
observational tests of general relativity require the measurement of exceedingly
small effects.
This chapter is concerned with a primarily mathematical problem that
comes out of the theory, namely that of finding the maximal extensions of the
empty space solutions, such as the Schwarzschild exterior solution or the
Kerr solution for the field around a rotating mass, into further regions of
space-time devoid of matter. The astronomical interpretation of these solutions
in terms of "black holes in space" or cosmological models is outside
the scope of the present discussion.
Henceforth, units of length and time will be used such that $r_0 = c = 1$, and
$t$ will be written for $x^4$. Then, the Schwarzschild line element is
Three coordinate charts can be constructed, using this metric. For the present,
each of them is to be thought of as a separate Einstein manifold. To define a
coordinate chart, it is only necessary to specify the region N in the coordinate
space $\mathbb{R}^4$ in which $r$, $\theta$, $\varphi$, and $t$ vary. Clearly, singularities of the $g_{\mu\nu}$ must be
avoided; hence there are three possibilities:
\[ N_{\mathrm{I}}\colon\quad 1 < r < \infty, \]
The resulting charts will be called the (Schwarzschild) Charts I, II, and III,
respectively.
If r is replaced by - r, it is seen that Chart III is simply the solution around
a negative point mass. This is an Einstein manifold, according to the definition
adopted here, and it is even geodesically complete, in the sense of Section 28.6,
below. By itself, it is uninteresting, since negative masses presumably do not
exist. However, it will be seen in Section 28.8, that a metric very similar to that
of Chart III appears in part of the Kerr manifold, which represents the field
around a rotating mass.
The following points concerning the time dependence are noted in passing:
It was shown in 1923 by G. D. Birkhoff that the metric of Chart I is obtained
The Finkelstein Extensions of the Schwarzschild Charts 231
even if one drops the assumption of stationarity. That is, the metric of Chart I
is the only spherically symmetric one that is asymptotically flat at infinite
distances. This comes about as follows: If the functions $\alpha$ and $\beta$ that appear
above are allowed to depend on t as well as on r, then a more complicated
general solution appears. However, this solution can always be transformed
into the stationary one (28.3-9) by a pure transformation of coordinates. It
follows, for example, that the gravitational field around a radially pulsating
star is static. In electromagnetic terminology, there is no monopole radiation
of gravitational waves. There is no dipole radiation, either, because there are
no negative masses; however, quadrupole radiation can occur, and this is
believed to provide one mechanism for loss of energy by pulsars. Finally, it
is noted that the metric of Chart II is not stationary in the sense of the definition
at the beginning of this section; the timelike variable is $r$, not $t$.
The definition of spherical symmetry adopted above needs one comment.
If $P$ is any point in the manifold $\mathfrak{M}_3$ ($x^4$ being constant), and if $S(P)$ denotes
the set of all points into which $P$ is carried by the transformations of the group
$G$, then $S(P)$ is a surface $r = \text{const.}$, which is 2-dimensional and is a sphere
(a 2-sphere) by any ordinary criterion. That need not be true in a general
spherically symmetric manifold. Let $\mathfrak{M}$ be the manifold of the group SO(3),
and let $G$ be the group of left translations $\varphi(h)\colon g \to hg$ ($\forall g$) in $\mathfrak{M}$. Then, the
mapping $h \to \varphi(h)$ is an isomorphism of SO(3) onto $G$. However, if $P$ is any
point of $\mathfrak{M}$, then the set $S(P)$ of all points into which $P$ is carried by transformations
of $G$ is not 2-dimensional; it is all of $\mathfrak{M}$, hence 3-dimensional. In
the Schwarzschild Chart I, the asymptotic flatness at infinite distances
evidently puts an additional restriction on the effect of the group G on the
manifold, so that we have only 2-dimensional invariant sets. The manifold of
SO(3) is compact, so no such restriction can be applied to it.
Figure 28.2 (panels: SCHWARZSCHILD I and FINKELSTEIN I; axes labeled $r$ and $t'$)
The Kruskal Extension 233
Figure 28.3 (panels: FINKELSTEIN I and FINKELSTEIN II)
lll
$< r < \infty$) is called the Finkelstein Chart II. In this case, the curve corresponding
to $\mathcal{C}$ of Figure 28.4(2) escapes to $t = -\infty$ instead of $t = +\infty$;
hence, this extension of the Schwarzschild Chart I leads to a still different
part of space-time, as indicated schematically in Figure 28.3. Indefinite further
extensions can be obtained by alternate successive use of transformations of
the type $t \to t \pm \log(r - 1)$ and $t \to t \pm \log(1 - r)$ in the intervals $(1, \infty)$ and
$(0, 1)$ of $r$. Throughout the manifolds described so far, the Einstein field equations
are satisfied in the form $R_{\mu\nu} = 0$, and the signature is 2.
$$v = \sqrt{r - 1}\; e^{r/2} \sinh\frac{t}{2}$$
[Figure: diagram with axis u, showing the regions I, II, I', II' and the curves r = 0, r = 1, r = 2]
A manifold based on (28.7-2), which will be called Sl', is given by the chart in
which the ranges of the variables are:
$$N':\quad -1 < \xi < \infty,\quad -\infty < \eta < \infty \quad (\text{except } \xi = \eta = 0),\qquad \theta,\ \varphi \text{ as usual.}$$
This manifold contains one copy each of the Schwarzschild manifolds I and II;
they are in the regions $0 < \xi < \infty$ and $-1 < \xi < 0$, respectively. It will be
seen below that this manifold Sl' is maximal, i.e., it cannot be extended to any
larger one; it might therefore possibly seem superior to the Kruskal manifold,
since the formulas are simpler. However, the singularity at $\xi = \eta = 0$ is in a
sense a singularity of the coordinate system and not a genuine singularity; it
will be seen that all curvature invariants have finite limits as the point
$\xi = \eta = 0$ is approached.
The manifold Sl' has the further property of time reversal. That is, if $\tan\alpha$
denotes the slope $d\xi/d\eta$ of a null geodesic in the $\xi, \eta$ plane (a geodesic on which
$ds^2 = 0$ while $\theta$ and $\varphi$ are constant), and if $\tan\beta$ denotes $\xi/\eta$, it is seen from
(28.7-2) that
$$\tan\beta\,(1 - \tan^2\alpha) = 2\tan\alpha,$$
which is equivalent to the equation $\tan\beta = \tan 2\alpha$; hence, if a point in the
$\xi, \eta$ plane encircles the origin once clockwise, $\beta$ increases by $2\pi$ while $\alpha$
increases by $\pi$; i.e., the directions of the null geodesics have been reversed.
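The reversal of the null directions can be checked numerically. The following is only an illustrative sketch: the function names and the starting angle `beta0` are hypothetical, and the relation is tested in a form multiplied through by $\cos\beta\cos^2\alpha$, so that no tangent becomes infinite. The continuous branch $\alpha = \beta/2$ is assumed.

```python
import math

def residual(alpha, beta):
    # tan(beta) * (1 - tan(alpha)**2) - 2 * tan(alpha), multiplied through by
    # cos(beta) * cos(alpha)**2 so that the expression stays finite everywhere.
    return (math.sin(beta) * (math.cos(alpha) ** 2 - math.sin(alpha) ** 2)
            - 2.0 * math.sin(alpha) * math.cos(alpha) * math.cos(beta))

def null_direction(beta):
    # Continuous choice alpha = beta / 2, which solves tan(beta) = tan(2 * alpha).
    alpha = beta / 2.0
    return (math.cos(alpha), math.sin(alpha))

beta0 = 0.3   # hypothetical starting angle
steps = 360

# Largest violation of the relation along one full circuit of the origin.
worst = max(
    abs(residual((beta0 + 2.0 * math.pi * k / steps) / 2.0,
                 beta0 + 2.0 * math.pi * k / steps))
    for k in range(steps + 1)
)

start = null_direction(beta0)
end = null_direction(beta0 + 2.0 * math.pi)  # beta has increased by 2*pi
```

After one circuit, `end` is the negative of `start`: the angle $\alpha$ has advanced by only $\pi$, so the direction vector has reversed.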
Clearly, the type of singularity exhibited by the manifold Sl' at $\xi = \eta = 0$ is
physically unacceptable; generally, a maximal extension of a given Einstein
manifold is not physically reasonable unless it is geodesically complete.
The nature of the singularity at $\xi = \eta = 0$ is further explored in the
following exercises:
EXERCISES
(28.7-4)
The Kerr Manifolds 237
of the coordinate space. Show that $\mathfrak{M}$ is flat (i.e., that $R_{\alpha\beta\gamma\delta} \equiv 0$), and is of signature zero,
hence is locally Minkowskian. Show that $\mathfrak{M}$ exhibits time reversal.
2. Find the geodesics on the manifold $\mathfrak{M}$ of the preceding exercise. Show that,
when a geodesic $\mathcal{C}$ goes to $\infty$ in the $\xi, \eta$ plane, the natural parameter $\lambda$ on $\mathcal{C}$ goes to $\infty$,
whereas, when a point approaches the origin on a geodesic, $\lambda$ tends to a finite value. The
geodesics on which the latter happens are the half-lines $\xi = r\cos\alpha$, $\eta = r\sin\alpha$, $\alpha$ a
constant and $r > 0$.
Hints: These exercises become trivial if one transforms to new variables x, t such
that
$\eta = 2xt$, (28.7-5)
where
$$k_\mu\,dx^\mu = \frac{z}{\rho}\,dz + \frac{\rho}{\rho^2 + a^2}(x\,dx + y\,dy) + \frac{a}{\rho^2 + a^2}(x\,dy - y\,dx) - dt, \tag{28.8-2}$$
(28.8-3)
The axis of rotation is the z axis; a is a constant equal to twice the ratio of the
angular momentum to the mass in the units used. For rapidly rotating stars,
$a \gg 1$, while a can be of the order of 1 or smaller for slowly rotating ones.
Various charts can be based on this solution. There is a singularity on the
circle $x^2 + y^2 = a^2$, $z = 0$, where $R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} \to \infty$. This corresponds to the
singularity at r = 0 in the Kruskal solution, and in fact the circle contracts to
the origin as $a \to 0$.
Equation (28.8-3) has two roots $\rho(x, y, z)$ of opposite sign at every point
x, y, z except on the disk $x^2 + y^2 < a^2$, $z = 0$, where $\rho = 0$. As the point x, y, z
passes through this disk, it is necessary to switch from the one solution to the
other, in order to make the derivatives of the $g_{\mu\nu}$ continuous. This is achieved
by introducing charts $\mathfrak{M}_1, \ldots, \mathfrak{M}_4$ as follows. In all of them, $ds^2$ is given by
(28.8-1), but with the sign of $\rho(x, y, z)$ specified in each case.
For $\mathfrak{M}_1$:
$$N = \{\text{all } x, y, z, t\} - \{x^2 + y^2 \le a^2,\ z = 0\}, \qquad \rho(x, y, z) > 0. \tag{28.8-4}$$
The equation for N means that the closed central disk is excluded from N.
In analogy with function theory, this disk is called a branch cut.
The next two charts serve to connect $\mathfrak{M}_1$ and $\mathfrak{M}_2$ across the disk. For them, N
could be any simply connected region containing the open disk (but not the
circle, of course). It could be some sort of thin wafer, but for simplicity we take
it to be all space with the region of the x, y plane exterior to the disk excluded
as a branch cut.
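For $\mathfrak{M}_3$:
$$N = \{\text{all } x, y, z, t\} - \{x^2 + y^2 \ge a^2,\ z = 0\}, \qquad \operatorname{sgn}\rho(x, y, z) = \operatorname{sgn} z. \tag{28.8-6}$$
For $\mathfrak{M}_4$:
$$N = \text{same as for } \mathfrak{M}_3, \qquad \operatorname{sgn}\rho(x, y, z) = -\operatorname{sgn} z. \tag{28.8-7}$$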
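The chart $\mathfrak{M}_3$ agrees with $\mathfrak{M}_1$ for z > 0 and with $\mathfrak{M}_2$ for z < 0, while $\mathfrak{M}_4$ agrees
with $\mathfrak{M}_1$ for z < 0 and with $\mathfrak{M}_2$ for z > 0. A doubly connected manifold $\mathfrak{M}_0$
can now be constructed by the mappings
$$\begin{array}{ccccc}
 & & \mathfrak{M}_1 & & \\
 & {\scriptstyle (z>0)}\nearrow & & \nwarrow{\scriptstyle (z<0)} & \\
\mathfrak{M}_3 & & & & \mathfrak{M}_4 \\
 & {\scriptstyle (z<0)}\searrow & & \swarrow{\scriptstyle (z>0)} & \\
 & & \mathfrak{M}_2 & &
\end{array} \tag{28.8-8}$$
where, in each case, the mapping is the identity mapping
$$t \to t, \quad x \to x, \quad y \to y, \quad z \to z.$$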
which is the metric (28.4-2) of the Finkelstein Chart I. That is, in the limit
a = 0, (28.8-1) is exactly the Finkelstein Chart I; hence, it is expected that, for
small a, it will be necessary to supplement this chart with another copy of it
and two copies of a chart that reduces to the Finkelstein Chart II for a = 0,
joined together as the Finkelstein Charts are in the Kruskal manifold. These
latter charts are obtained by changing the sign of dt in equation (28.8-2) for
the quantity $k_\mu\,dx^\mu$.
There are therefore two versions of the manifold $\mathfrak{M}_0$ defined by the schema
(28.8-8), according to the sign of dt in (28.8-2), say $\mathfrak{M}_0^{+}$ and $\mathfrak{M}_0^{-}$; they are both
contained in the geodesically complete manifold constructed by Boyer and
Lindquist.
(these are six equations when written in component form), that the divergence
conditions (28.9-1) are then automatically satisfied for all $t \ge 0$. Hence, the
full system of eight equations contains enough redundancy so that only the
six equations (28.9-2) in six unknowns need to be regarded as the equations of
evolution.
The Einstein field equations
$$R_{\beta\gamma} = 0 \tag{28.9-3}$$
have similar properties, but with a somewhat different consequence. First,
since the tensors $g_{\beta\gamma}$ and $R_{\beta\gamma}$ are both symmetric, we may take the $g_{\beta\gamma}$ with
$\beta \le \gamma$ as the unknowns, and we only need to consider the equations (28.9-3)
with $\beta \le \gamma$. Then there are ten equations in ten unknowns. It will be seen that
these equations impose four conditions or constraints on the initial data and
are such that, if these conditions are satisfied at $t = x^4 = 0$, then they are
automatically satisfied later. Therefore, there are only six independent
equations of evolution for ten unknowns; the solution is hence underdetermined
and contains four arbitrary functions. That is just as it ought to be.
The initial-value problem, if properly formulated, ought to determine the
geometry of space-time for $x^4 > 0$, or at least for $x^4$ in some interval (0, T),
but the geometry does not uniquely determine the $g_{\mu\nu}$, owing to the possibility
of coordinate changes. Any solution of the initial-value problem can be
The Cauchy Problem 241
$$g_{\mu\nu}(x^1, x^2, x^3), \quad \frac{\partial g_{\mu\nu}}{\partial x^4}(x^1, x^2, x^3) \quad \text{for } x^4 = 0. \tag{28.9-4}$$
The choice $x^4 = 0$ of the initial hypersurface $\mathcal{S}$ does not imply that $\mathcal{S}$ is flat,
because the metric tensor $g_{jk}$ in $\mathcal{S}$ is arbitrary (it is recalled that Latin indices
take values from 1 to 3), but the initial data are assumed to be such that the
3 x 3 matrix $(g_{jk})$ is positive definite on $\mathcal{S}$. Also, $(g_{\mu\nu})$ must be nonsingular
and of signature 2 on $\mathcal{S}$. It follows that $\mathcal{S}$ is spacelike, and, by use of the
formula for matrix inverse, that $g^{44} < 0$ on $\mathcal{S}$. To simplify the present
discussion, the functions (28.9-4) are assumed to be analytic, so that the power-
series method may be used to solve the Cauchy problem. All partial derivatives
of the $g_{\mu\nu}$ that do not involve differentiation with respect to $x^4$ more than once
are then determined on $\mathcal{S}$ by the functions (28.9-4).
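The statement that $g^{44} < 0$ follows from the cofactor formula $g^{44} = \det(g_{jk})/\det(g_{\mu\nu})$: the numerator is positive because $(g_{jk})$ is positive definite, and the denominator is negative because the signature is 2. A minimal sketch with exact rational arithmetic illustrates this; the sample matrix is hypothetical, chosen only so that its 3 x 3 block is positive definite and its determinant is negative.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

# Hypothetical values of g_{mu nu} at one point of the hypersurface x^4 = 0;
# list index 3 plays the role of the tensor index 4.
g = [[Fraction(v) for v in row] for row in (
    (2, 1, 0, 1),
    (1, 2, 0, 0),
    (0, 0, 1, 1),
    (1, 0, 1, -1),
)]

spatial = [row[:3] for row in g[:3]]                 # the 3 x 3 block (g_jk)
leading_minors = [det([row[:n] for row in spatial[:n]]) for n in (1, 2, 3)]
det_g = det(g)                                       # negative for signature 2
g44_upper = det(spatial) / det_g                     # (4,4) entry of the inverse matrix
```

Here all leading minors of the spatial block are positive, `det_g` is negative, and `g44_upper` is therefore negative.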
The components $R_{\beta\gamma}$ of the Ricci tensor are obtained from equation
(27.10-1) for $R_{\alpha\beta\gamma\delta}$ by contracting with respect to $\alpha$ and $\delta$, that is, by multiplying
by $g^{\alpha\delta}$ and then summing on $\alpha$ and $\delta$ from 1 to 4. The result is
(28.9-5)
(28.9-6)
(28.9-7)
where only those terms containing second derivatives with respect to $x^4$ have
been written; the dots stand for terms containing quantities that have been
differentiated at most once with respect to $x^4$. [The summation convention
applies in (28.9-6, 7), but the Latin indices take only the values 1, 2, 3.] Since
$g^{44} \ne 0$, the differential equations $R_{jk} = 0$ and the initial data determine the
second derivatives of the $g_{jk}$ on $\mathcal{S}$. When these second derivatives are
substituted into the other four equations $R_{\beta 4} = 0$, four conditions on the initial
data are obtained, while the second derivatives of the $g_{\beta 4}$ with respect to $x^4$
are undetermined.
The equations of evolution can be separated from the auxiliary condition
by a device due to Lichnerowicz (see Adler, Bazin, and Schiffer 1965, where
there is an excellent discussion of the Cauchy problem). The mixed form of the
Einstein tensor is
(28.9-8)
242 The Extension of Einstein Manifolds
where $R^\alpha{}_\beta = g^{\alpha\gamma}R_{\gamma\beta}$ is the mixed form of the Ricci tensor and $R = g^{\mu\nu}R_{\mu\nu}$ is
the curvature scalar. It is asserted that the system of differential equations
$$R_{jk} = 0 \quad (j, k = 1, 2, 3,\ j \le k), \tag{28.9-9}$$
$$G^4{}_\beta = 0 \quad (\beta = 1, \ldots, 4) \tag{28.9-10}$$
is equivalent to the original system (28.9-3).
Note. For any solution $g_{\mu\nu}$ of these equations, R has the value zero, so that
$G^\alpha{}_\beta$ and $R^\alpha{}_\beta$ have the same value (namely zero), but, as expressions containing
the dependent variables $g_{\mu\nu}$ and their derivatives, they are different; hence
(28.9-9, 10) is a set of differential equations different from the set (28.9-3), but
equivalent to it. To prove the assertion, it suffices to note that, according to
(28.9-8), if the $R_{jk}$ are set equal to zero, then
$$G^4{}_k = g^{4\gamma}R_{\gamma k} = g^{44}R_{4k}, \qquad G^4{}_4 = \tfrac{1}{2}g^{44}R_{44}; \tag{28.9-11}$$
hence, since $g^{44} \ne 0$, the system (28.9-10) implies the vanishing of all $R_{\beta\gamma}$, and
this in turn implies the vanishing of all the $G^\alpha{}_\beta$.
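The identities (28.9-11) can be verified at a single point with exact arithmetic. The sketch below assumes the usual mixed Einstein tensor $G^\alpha{}_\beta = R^\alpha{}_\beta - \tfrac{1}{2}\delta^\alpha_\beta R$ for (28.9-8), which is not legible in this copy; the metric and the Ricci components are hypothetical sample values with $R_{jk} = 0$.

```python
from fractions import Fraction

def inverse(m):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical metric at one point; list index 3 plays the role of tensor index 4.
g_lower = [[Fraction(v) for v in row] for row in (
    (2, 1, 0, 1), (1, 2, 0, 0), (0, 0, 1, 1), (1, 0, 1, -1))]
g_upper = inverse(g_lower)

# Symmetric Ricci tensor with R_jk = 0 (j, k = 1, 2, 3) and arbitrary R_{4 beta}.
ricci = [[Fraction(0)] * 4 for _ in range(4)]
for beta, value in enumerate((5, -2, 7, 3)):
    ricci[3][beta] = ricci[beta][3] = Fraction(value)

# Mixed Ricci tensor, curvature scalar, and the row G^4_beta of the Einstein tensor.
mixed_ricci = [[sum(g_upper[a][c] * ricci[c][b] for c in range(4))
                for b in range(4)] for a in range(4)]
scalar = sum(mixed_ricci[a][a] for a in range(4))
einstein_4 = [mixed_ricci[3][b] - (scalar / 2 if b == 3 else 0) for b in range(4)]

# Right-hand sides of (28.9-11).
g44 = g_upper[3][3]
expected = [g44 * ricci[3][k] for k in range(3)] + [g44 * ricci[3][3] / 2]
```

With these inputs `einstein_4` and `expected` agree exactly, whatever values are chosen for the components $R_{4\beta}$.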
If the expressions (28.9-5, 6, 7) for the components of the Ricci tensor are
substituted into (28.9-8), it is seen that the differential equations $G^4{}_\beta = 0$ do
not contain any second derivatives with respect to $x^4$. Hence, these equations
play the role of auxiliary conditions; in particular they are conditions that
must be satisfied by the initial data.
An important property of the Einstein tensor (which has already played a
role in Section 28.2) is that it is divergence-free, according to equation
(27.10-10); that is,
(28.9-12)
This equation will be used to show that if the functions $g_{\mu\nu}$ satisfy the
differential equations (28.9-9) for all $x^4$ in some interval [0, T] and satisfy the
auxiliary condition (28.9-10) for $x^4 = 0$, then they satisfy the auxiliary
condition also for all $x^4$ in [0, T]. The formulas for covariant differentiation in
Section 27.5 show that (28.9-12) can be written as
(28.9-13)
where the coefficients A depend only on the $g_{\mu\nu}$ and their first derivatives.
Since the functions $g_{\mu\nu}$ are such that all $R_{jk}$ are = 0, equations (28.9-8) and
(28.9-11) show that each $G^j{}_\beta$ (as a function of the $g_{\mu\nu}$ and their first derivatives)
can be expressed in terms of the $G^4{}_\gamma$ ($\gamma = 1, \ldots, 4$). Equation (28.9-13) then
takes the form
$$\frac{\partial G^4{}_\beta}{\partial x^4} = B^{\gamma j}{}_\beta\,\frac{\partial G^4{}_\gamma}{\partial x^j} + C^\gamma{}_\beta\,G^4{}_\gamma,$$
Concluding Remarks 243
where the coefficients $B^{\gamma j}{}_\beta$ and $C^\gamma{}_\beta$ depend only on the $g_{\mu\nu}$ and their first
derivatives. This is a linear system of differential equations for the $G^4{}_\beta$ (when
the $g_{\mu\nu}$ are given), in which the time derivatives are given explicitly in terms of
the spatial derivatives. According to the Cauchy-Kovalevski theorem, the
solution is unique; hence if all four $G^4{}_\beta$ vanish on $\mathcal{S}$ ($x^4 = 0$), then they vanish
also for $x^4 > 0$, as was to be proved.
Lastly, the equations $R_{jk} = 0$ (28.9-9) can be regarded as differential
equations for the functions $g_{jk}$ ($j, k = 1, 2, 3$, $j \le k$). However, they also
contain the functions $g_{\alpha 4}$ [in the terms indicated by dots in (28.9-5)]. These
four functions can be specified arbitrarily (but smoothly) in all space-time,
provided only that they match the values $g_{\alpha 4}$ and $\partial g_{\alpha 4}/\partial x^4$ given on the surface
$\mathcal{S}$ ($x^4 = 0$). Then, since $g^{44} \ne 0$, (28.9-5) shows that the equations $R_{jk} = 0$
(28.9-9) determine the second time derivatives of all the $g_{jk}$. By the Cauchy-
Kovalevski theorem, again, these equations have a unique solution, in some
interval $0 \le x^4 \le T$, for the given initial data.
The general problem of the extension of Einstein manifolds is far from solved.
Almost nothing is known about the existence or uniqueness of geodesically
complete extensions. The simple manifolds in Section 28.7 have no such
extensions, and the following one has many. Consider a single chart in which
the $g_{\mu\nu}$ are defined to be Minkowskian
Bifurcations in Hydrodynamic
Stability Problems
The work of Lorenz 1963 and of Ruelle and Takens 1971 initiated the
introduction into hydrodynamic stability theory of concepts and principles
from the currently active mathematical field of topological dynamical
systems. It was immediately clear that some of the concepts, for example the
concept of generic properties of systems, have many applications elsewhere
in physics. Also the new concepts and principles put new light on old things,
such as bifurcation phenomena. The idea of strange attractors and their
connection with continuous power spectra gave a new understanding of
chaotic behavior generally.
In this chapter and the next two, the new ideas are presented in the setting
of the study of the onset of turbulence.
Figure 29.3 Wavy vortices in the Taylor problem.
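In terms of the fluid's velocity field u(x, t) and its pressure field p(x, t) the
Navier-Stokes equations are
$$\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} + \nabla p - \nu\nabla^2\mathbf{u} = 0, \tag{29.3-1}$$
$$\nabla \cdot \mathbf{u} = 0, \tag{29.3-2}$$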
in a region flA of physical space, together with the boundary condition that
and a suitable initial condition. The density has been taken = 1 by suitable
choice of units.
Let $\bar{\mathbf{u}}(\mathbf{x})$ and $\bar{p}(\mathbf{x})$ represent a steady solution of these equations, for
example the Couette flow in the Taylor problem. For study of the effect of
perturbations (either finite or infinitesimal) on that solution, it is convenient
to write the total fields as
$$\bar{\mathbf{u}}(\mathbf{x}) + \mathbf{u}(\mathbf{x}, t), \qquad \bar{p}(\mathbf{x}) + p(\mathbf{x}, t),$$
where now u and p represent the departure from the steady solution. The
equations are then
The dominant terms of the Navier-Stokes equation (29.3-1) are the first
and last terms on the left, which give the equation the character of a diffusion
equation. With suitable boundary conditions, the corresponding diffusion
The Normal Modes 249
equation has a unique solution, for initial u in a set dense in $\mathfrak{H}$, and the solution
depends continuously on the initial u. Because of this continuous dependence,
generalized solutions can be defined for arbitrary initial u and
they too depend continuously on the initial u. See Richtmyer and Morton
1967. The solution cannot in general be continued backward for negative t,
although certain solutions can; in particular, normal mode solutions can.
The general behavior of the full nonlinear Navier-Stokes equation is
similar (see Ladyzhenskaya 1969, 1975 and Marsden and McCracken 1976,
Section 9), but the proofs are more difficult and the theory is less complete.
We shall assume that (29.4-1) or (29.4-3) has a unique solution u(t) in
$\mathfrak{H}$, for $t \ge 0$, for arbitrary u(0) in $\mathfrak{H}$. For given initial u, we call the solution
$\varphi(u, t)$, so that
$$u(t) = \varphi(u(0), t). \tag{29.5-1}$$
For fixed u, $\varphi(u, t)$ ($t \ge 0$) is called a motion in $\mathfrak{H}$; for fixed $t \ge 0$, the
correspondence $u \to \varphi(u, t)$ is a mapping in $\mathfrak{H}$, which is assumed continuous;
for t = 0, it is the identity mapping, because $\varphi(u, 0) = u$. The function
$\varphi(\cdot, \cdot)$ is called a semiflow in $\mathfrak{H}$.
Although the motions cannot generally be continued backward in time,
they are unique, insofar as they can be. Stated differently, two distinct
motions never coalesce, at a finite time, so as to be identical thereafter.
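A finite-dimensional toy model may make the notion of a semiflow concrete. The sketch below is not the Navier-Stokes semiflow; it assumes the linear diffusion equation $u_t = \nu u_{xx}$ restricted to a few sine modes, with a hypothetical viscosity `NU`, and shows the identity property at t = 0 and the composition rule that one expects of any semiflow.

```python
import math

NU = 0.1  # hypothetical viscosity for the toy model

def semiflow(u, t):
    """phi(u, t) for u_t = NU * u_xx acting on coefficients of sin(k x), k = 1, 2, ..."""
    if t < 0:
        raise ValueError("the semiflow is defined only for t >= 0")
    # enumerate() starts at 0, so list position k holds the mode number k + 1.
    return [coeff * math.exp(-NU * (k + 1) ** 2 * t) for k, coeff in enumerate(u)]

u0 = [1.0, -0.5, 0.25]                        # initial coefficients of three modes
same = semiflow(u0, 0.0)                      # phi(u, 0) = u
two_steps = semiflow(semiflow(u0, 0.3), 0.7)  # advance by 0.3, then by 0.7
one_step = semiflow(u0, 1.0)                  # advance by 1.0 at once
```

In this toy model distinct initial data never coalesce, since each coefficient is multiplied by a nonzero factor; negative t is rejected because the general motion is not assumed to be continuable backward.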
in the form
(29.6-2)
hence we look for eigenfunctions $\psi$ and eigenvalues $\lambda$ of L, given by
$$L\psi = \lambda\psi, \tag{29.6-3}$$
where, of course, $\psi \ne 0$. [We could also start from (29.4-3) and seek solutions
of $L\psi = \lambda M\psi$.]
In the hydrodynamical problems, L is not self-adjoint; hence the usual
spectral theory does not apply. Nevertheless, in most cases, L has a pure
point spectrum, and there are denumerably many eigenvalues, so we write
(j = 1, 2, ...). (29.6-4)
The completeness of the set $\{\psi_j\}$ of eigenfunctions for the expansion of an
arbitrary u in $\mathfrak{H}$ has been discussed for certain hydrodynamical problems by
DiPrima and Habetler 1969, using a theorem of Naimark on operators in a
Hilbert space, and more generally by Sattinger 1970, using a theorem of
To study the onset of turbulence, it is not necessary to know about all orbits
of the system (29.4-1) in the Hilbert space $\mathfrak{H}$. It suffices to know about a
special family of orbits, which, according to the physical argument given
below, lie in the so-called unstable manifold that emerges from the origin in $\mathfrak{H}$.
Reduction to a Finite-Dimensional Dynamical System 251
(29.7-1)
where the $a_k$ are constants. We assume that this disturbance is still sufficiently
small that these modes are growing exponentially and independently.
At a still later time, the nonlinear regime (which we think of as including the
"present" instant t = 0), the solution (29.7-1) has continued to grow until,
owing to the nonlinearities, it is no longer of that simple form (although it
still depends on the parameters $a_1, \ldots, a_K$), but may for example begin to
spiral toward a closed orbit or exhibit other complicated nonlinear behavior.
Figure 29.4
where the vectors $\{\chi_k\}$ are eigenfunctions of the adjoint problem and form a
biorthogonal system with the $\{\psi_k\}$. Hence, for any u in the unstable manifold
$\mathfrak{M}$, we take the coordinates as
For orbits lying in $\mathfrak{M}$, the equations of motion (29.4-1) take the form
k = 1, ..., K. (29.7-4)
For points near the origin, we have
$$F_k(x_1, \ldots, x_K) = \lambda_k x_k + \text{higher order terms}. \tag{29.7-5}$$
The calculation of the functions $F_k$ is described in the next chapter. It is
based on the idea that if $U(x_1, \ldots, x_K)$ is the point of $\mathfrak{M}$ (a point in $\mathfrak{H}$)
corresponding to coordinate values $x_1, \ldots, x_K$, and if $x_k(t)$ ($k = 1, \ldots, K$) is any
solution of (29.7-4), then the quantity
$$u(t) = U(x_1(t), \ldots, x_K(t)) \tag{29.7-6}$$
must satisfy the equation (29.4-1) of evolution in $\mathfrak{H}$. That requirement suffices
to determine both the dependence of the $x_k(\cdot)$ on t and of $u(\cdot)$ on the $x_k$.
The computational procedure assumes analyticity throughout, so that
$U(x_1, \ldots, x_K)$ can be expanded in a power series in the $x_k$ with coefficients
that are elements of $\mathfrak{H}$ and the $F_k(x_1, \ldots, x_K)$ as ordinary power series. That
assumption must be regarded as tentative, although the Navier-Stokes flow
in $\mathfrak{H}$ is known to have at least $C^\infty$ smoothness (see Marsden and McCracken
1976).
For the n-dimensional reversible systems of interest to celestial mechanics,
one defines also the stable manifold that emerges from the origin (similarly
from any other fixed point); it is tangent at the origin to the linear manifold
spanned by the remaining eigenvectors $\varphi_{K+1}, \ldots, \varphi_n$. It can be characterized
as consisting of motions u(t) such that $u(t) \to 0$ as $t \to \infty$. In fact, the stable
manifold is usually discussed first, and then the unstable manifold is defined
as the stable manifold that would result from replacing t by -t. Although, in
hydrodynamics, most motions cannot be reversed in time, the particular
motions that lie on the unstable manifold $\mathfrak{M}$ can be, and $\mathfrak{M}$ can be characterized
as consisting of those motions such that $u(t) \to 0$ as $t \to -\infty$.
For use in Section 29.10, we mention another version of the unstable
manifold, which refers to mappings, rather than flows. In place of the family
of mappings $u \to \varphi(u, t)$ in a Hilbert space depending on a continuous
parameter t we consider a family of mappings $x \to \Phi^m(x)$ in an n-dimensional
manifold $\mathfrak{M}$ depending on a discrete parameter m, given by iterating a
mapping $\Phi$:
$$\Phi^m(x) = \Phi(\Phi(\cdots\Phi(x)\cdots)) \quad (m \text{ iterations}).$$
Figure 29.5 [panels (a), (b), (c): loci in the x, R plane, with the critical value $R_c$ marked on the R axis]
Bifurcation to a Periodic Orbit 255
Figure 29.6
expanded as
$$\dot{x} = \beta(R - R_c)x + \text{higher order terms}. \tag{29.8-3}$$
The stationary orbits $\dot{x} = 0$ are represented by the points on the locus
F(x; R) = 0 in the x, R plane. The locus consists of the R axis and a curve
passing through the point x = 0, $R = R_c$, as shown in three cases in Figures
29.5a, b, c.
If the next lowest term in (29.8-3) is $ax^2$, there is an unsymmetric bifurcation;
if it is $ax^3$, there is a symmetric one, which is supercritical if a < 0 and
subcritical if a > 0.
Stability is determined by the sign of $\dot{x}$ at points near the curves. For
example, in the case of the unsymmetric bifurcation, the motion of points in
the x, R plane is indicated by the arrows in Figure 29.6. In all cases, upturning
branches are stable and downturning ones unstable, while the solution x = 0
is always unstable for $R > R_c$.
In the subcritical bifurcation illustrated in Figure 29.5c, there is no stable
equilibrium in the neighborhood of x = 0 for $R > R_c$. In this case, if R is
increased very slowly past $R_c$, a typical orbit takes the system from $x \approx 0$
to distant points of the configuration space in a relatively short time as soon
as R exceeds $R_c$. This phenomenon is called an explosive transition and
contrasts with the adiabatic sequence of stable states which the orbit follows
in the other cases.
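The stability statements for the symmetric bifurcation can be illustrated with a truncated model. The sketch below assumes the cubic truncation $\dot{x} = s\,(R - R_c)\,x + a x^3$ of (29.8-3), with a positive coefficient `slope` standing for the linear coefficient; the function names and sample values are hypothetical.

```python
import math

def equilibria(a, slope, r_minus_rc):
    """Real roots of slope * r_minus_rc * x + a * x**3 = 0."""
    roots = [0.0]
    ratio = -slope * r_minus_rc / a
    if ratio > 0:
        roots += [math.sqrt(ratio), -math.sqrt(ratio)]
    return roots

def is_stable(x, a, slope, r_minus_rc):
    """An equilibrium is stable when the derivative of xdot with respect to x is negative."""
    return slope * r_minus_rc + 3.0 * a * x * x < 0

# Supercritical case (a < 0), slightly above the critical value.
supercritical = [(x, is_stable(x, -1.0, 1.0, 0.25))
                 for x in equilibria(-1.0, 1.0, 0.25)]

# Subcritical case (a > 0), slightly above the critical value.
subcritical = [(x, is_stable(x, 1.0, 1.0, 0.25))
               for x in equilibria(1.0, 1.0, 0.25)]
```

In the supercritical case two stable branches accompany the unstable solution x = 0. In the subcritical case, x = 0 is the only equilibrium of the truncated model above the critical value and it is unstable, which is the situation that leads to an explosive transition.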
We write
$$\lambda_1, \lambda_2 = \sigma \pm i\omega = \sigma(R) \pm i\omega(R), \tag{29.9-1}$$
where
$$\sigma(R_c) = 0, \qquad \sigma'(R_c) > 0, \qquad \omega(R_c) \ne 0. \tag{29.9-2}$$
The manifold $\mathfrak{M}$ has two dimensions. In place of the complex conjugate
coordinates $x_1$ and $x_2$ in $\mathfrak{M}$, we take real coordinates x and y such that
$$x_1 = x + iy, \qquad x_2 = x - iy.$$
To lowest order, a motion in $\mathfrak{M}$ is given, according to (29.7-4, 5), by
$$\frac{d}{dt}(x + iy) = \lambda_1(x + iy) = (\sigma + i\omega)(x + iy).$$
Close to the origin, the orbits are approximately the spirals
$$x + iy \approx \text{const.}\; e^{\sigma t}(\cos\omega t + i\sin\omega t).$$
In polar coordinates,
$$\dot{r} = \sigma r + O(r^2), \qquad \dot{\theta} = \omega + O(r). \tag{29.9-3}$$
It follows that in some neighborhood of the origin, $\theta$ is always increasing,
on any orbit, and r always positive; r may increase or decrease; close to the
origin, r increases if $\sigma > 0$ and decreases if $\sigma < 0$. We now investigate what
happens a little farther out.
We define the Poincaré mapping of the problem as a mapping $x \to \Phi(x)$
of the x axis in $\mathfrak{M}$, by saying that if an orbit has coordinates x, 0, for some t,
then it has coordinates $\Phi(x)$, 0, when $\theta$ has increased by $2\pi$. Note that x and
$\Phi(x)$ can be either both positive or both negative. We write
$$\Phi(x) = x[1 + g(x)]. \tag{29.9-4}$$
The orbit structure depends on the properties of the function g(x). From the
spirals near the origin, we see that
$$1 + g(0) = e^{2\pi\sigma/\omega}; \tag{29.9-5}$$
in particular, g(0) = 0 for $R = R_c$, because then $\sigma = 0$. Hence, if we expand
g(x) in a Taylor series in x and $R - R_c$, we have
The coefficient a must be = 0, for otherwise $g(x, R_c)$ would have opposite
signs for x > 0 and x < 0 near x = 0, and that would imply that an orbit
crosses itself, i.e., its second turn would be farther from the origin than its
first turn on one side and closer to the origin on the other. The coefficient b
is positive, because, under the assumption (29.9-2) about $\sigma$, orbits near the
origin spiral out for $R > R_c$ and in for $R < R_c$.
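The value (29.9-5) of the Poincaré mapping near the origin can be reproduced by direct integration of the linearized motion. The sketch below is only illustrative: it integrates $d(x + iy)/dt = (\sigma + i\omega)(x + iy)$ with a classical Runge-Kutta scheme over one turn, $T = 2\pi/\omega$, and the sample values of `sigma`, `omega`, and `x0` are hypothetical.

```python
import math

def return_map(x0, sigma, omega, steps=2000):
    """Follow the linearized orbit from (x0, 0) until theta has increased by 2*pi."""
    lam = complex(sigma, omega)
    z = complex(x0, 0.0)
    h = (2.0 * math.pi / omega) / steps   # time step over one full turn
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for dz/dt = lam * z.
        k1 = lam * z
        k2 = lam * (z + 0.5 * h * k1)
        k3 = lam * (z + 0.5 * h * k2)
        k4 = lam * (z + h * k3)
        z += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return z

sigma, omega, x0 = 0.05, 1.0, 0.01
image = return_map(x0, sigma, omega)
g0 = image.real / x0 - 1.0                                # numerical estimate of g(0)
g0_expected = math.exp(2.0 * math.pi * sigma / omega) - 1.0
```

For positive `sigma` the estimate `g0` is positive, so successive turns of the spiral lie farther out, in agreement with the discussion of the coefficient b.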
Bifurcation from a Periodic Orbit to an Invariant Torus 257
The next bifurcation, after one that results in a closed orbit, hence a periodic
motion, can result in an invariant 2-dimensional torus, as shown by the
example of Hopf 1948. Theorems on such a bifurcation have been given by
various authors, including Naimark 1959, Sacker 1964, Ruelle and Takens
1971, and Lanford 1973. The theorems have been based largely on Floquet
theory, but we shall take a more intuitive approach, based on the notion of
a Poincaré mapping.
Let $R_1$ be the critical value of the Reynolds number R for first appearance
of the periodic orbits in a supercritical bifurcation, as discussed in the
preceding sections. We assume that for some $R > R_1$ the unstable manifold
$\mathfrak{M}$ has dimension K. $\mathfrak{M}$ contains the 2-dimensional manifold discussed in
the preceding section, and we suppose the coordinates in $\mathfrak{M}$ so chosen that
the first two of them are the coordinates x, y of the preceding section; the
remainder are called $x_3, \ldots, x_K$. Then, if R is not too much above $R_1$, the
closed orbit encircles the origin in the x, y subspace and cuts the positive
and negative x axes once each. Let V be the (K - 1)-dimensional hypersurface
in $\mathfrak{M}$ given by y = 0; then, V is intersected twice by the closed orbit,
as shown schematically in Figure 29.7, and we denote one of the intersections
Figure 29.7 [schematic: the hypersurface V, a point x in V, and its image $\Phi(x)$]
while the other eigenvalues of M all lie inside the unit circle. As R is increased,
the closed orbit loses stability at $R > R_2$; hence, a new bifurcation occurs at
$R = R_2$.
For $R > R_2$, according to Section 29.7, the mapping $\Phi$ in V has a 2-dimensional
unstable manifold or surface S in V tangent at $\boldsymbol{\xi}$ to the linear
manifold $S_0$ spanned by $\mathbf{v}$ and $\bar{\mathbf{v}}$. To simplify visualization, regard V as 2-dimensional;
then S = V. S is invariant under $\Phi$; hence we may think of $\Phi$
as a mapping in S. We let u and v be real coordinates in S such that the
projection of $\mathbf{x} - \boldsymbol{\xi}$ from S onto $S_0$ is $(u + iv)\mathbf{v} + (u - iv)\bar{\mathbf{v}}$, and we call $z = u + iv$.
Then the Poincaré mapping becomes $z \to z' = \alpha z$ + higher order
terms, or, more explicitly,
where the $q_{jk}$ are coefficients (generally complex) and where the summation
is over all nonnegative integers j and k such that j + k > 1.
It will be shown, under certain further assumptions, that for $R > R_2$ there
is a nearly circular invariant closed curve $\mathcal{C}$ in S encircling the point $\boldsymbol{\xi}$. Then,
if we let $\mathcal{C}$ be carried along by the flow in $\mathfrak{M}$, it leaves the hypersurface V and
traces out an invariant tube in $\mathfrak{M}$ which is then closed, forming a torus, when
$\mathcal{C}$ passes through V again near $\boldsymbol{\xi}$.
To facilitate analysis of the Poincaré mapping, it is convenient to introduce
new coordinates $\xi$ and $\eta$ in S in place of u and v, so chosen as to make
the Poincaré mapping $\Phi$ take as simple a form as possible (a normal form),
namely to eliminate some of the nonlinear terms in (29.10-4). If $|\alpha|$ were $\ne 1$,
we could eliminate as many of the nonlinear terms as we wish (see Siegel,
Himmelsmechanik, 1956, Section 21), but in order to let R vary through the
value $R_2$, where $|\alpha(R)| = 1$, without a singularity of the transformation, we
must retain the term $\alpha q_{21} z^2\bar{z}$ in (29.10-4); otherwise, we can eliminate all
We shall choose the coefficients $\varphi_{lm}$ so that when these equations are
substituted into (29.10-4), the Poincaré mapping takes the normal form
(29.10-7)
where $O(\zeta^5)$ contains terms of degree 5 and higher in $\zeta$ and $\bar{\zeta}$ and where $\beta$ is
a new constant. To determine the coefficients $\varphi_{lm}$ in terms of the given
coefficients $q_{jk}$, we substitute the right member of (29.10-5) for z into the
right member of (29.10-4); then we substitute the right member of (29.10-7) for
$\zeta'$ into the right member of (29.10-6) and take the result as z' for the left
member of (29.10-4). Equation (29.10-4) thereby becomes an identity in $\zeta$ and
$\bar{\zeta}$, and the net coefficient of $\zeta^p\bar{\zeta}^q$ can be equated to zero. The resulting
equations, if taken in the right order, can be solved for the coefficients $\varphi_{lm}$
provided that
(29.10-8)
as explained in the Appendix to this chapter. (The equation for the case
p = 2, q = 1, cannot be solved for $\varphi_{21}$ when $|\alpha| = 1$, but can always be
solved for $\beta$, and we simply set $\varphi_{21} = 0$.)
For this method of analysis to succeed, we must assume, according to
(29.10-8), that as R is increased and the point $\alpha(R)$ crosses the unit circle
$|\alpha| = 1$ in the complex plane, it crosses at a point which is not a root of
unity of any order less than 6. What happens if it crosses at such a point is
discussed briefly in the next section.
If we let $\mu$ be a new dimensionless parameter in place of R given by $\mu =
|\alpha(R)| - 1$, and if r and $\theta$ are polar coordinates, given by $\zeta = re^{i\theta}$, the Poincaré
mapping takes the form
$$\Phi:\quad r' = (1 + \mu)r + c_1 r^3 + f(r, \theta)\,r^5, \tag{29.10-9}$$
$$\phantom{\Phi:\quad} \theta' = \theta + c_2 + c_3 r^2 + g(r, \theta)\,r^4, \tag{29.10-10}$$
where f and g are smooth functions and where $c_1$, $c_2$, $c_3$, f, and g all depend
smoothly on $\mu$ in some neighborhood of $\mu = 0$.
The constant $c_1$ plays the role of the Landau constant; if $c_1 < 0$, which
we shall assume, we have a supercritical bifurcation.
If the higher order term containing $f(r, \theta)$ in (29.10-9) were missing, the
circle $\mathcal{C}_0$: $r = r_0$, where
(29.10-11)
that suggests that $\Phi$ is a contracting mapping. It is also shown that if the
maximum radial displacement of $\mathcal{C}_n$ under $\Phi$ is denoted by
$$\Delta_n = \max_{\theta}\,|r_{n+1}(\theta) - r_n(\theta)|,$$
then
(29.10-14)
If, in the notation of the preceding section, $\alpha(R)$ passes through the unit
circle $|\alpha| = 1$ at a root of unity of degree less than 6, as R increases, the
argument given for the existence of an invariant torus breaks down. In that
case, the bifurcation may lead to one or more further periodic orbits.
We consider only the simplest case in which the higher terms of the
mapping (29.10-4) are mostly already missing, and hence do not have to be
eliminated. We assume $\alpha = \exp\{2\pi ip/q\}$ ($q \le 5$) and we take the mapping
to be
($\beta$ real, < 0)
or, in polar coordinates,
$$r' = (1 + \mu)r + c_1 r^3 \quad (c_1 < 0),$$
$$\theta' = \theta + 2\pi\frac{p}{q}. \tag{29.11-1}$$
In this case there are new orbits, at a distance $r_0 = \sqrt{\mu/(-c_1)}$ (in the surface
S) from the old orbit, which are closed, because each point of the circle $r = r_0$
is transformed into itself after q-fold iteration of the Poincaré mapping.
For small positive $\mu$ the old orbit is unstable, if $c_1 < 0$, and the new ones
stable. As $\mu$ is increased past 0, the period of the observed orbit is suddenly
increased by a factor q.
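The behavior of the model mapping (29.11-1) is easy to follow by iteration. In the sketch below the values of `mu`, `c1`, `p`, and `q` are hypothetical samples; the code iterates the mapping from a point near the old orbit and then applies it q times to a point of the circle $r = r_0$.

```python
import math

def poincare_map(r, theta, mu, c1, p, q):
    """One application of the model mapping (29.11-1)."""
    r_next = (1.0 + mu) * r + c1 * r ** 3
    theta_next = (theta + 2.0 * math.pi * p / q) % (2.0 * math.pi)
    return r_next, theta_next

mu, c1, p, q = 0.05, -1.0, 1, 4      # hypothetical sample values, c1 < 0
r0 = math.sqrt(mu / (-c1))           # radius of the invariant circle

# Start close to the old orbit (r = 0) and iterate; r is attracted to r0.
r, theta = 0.01, 0.3
for _ in range(400):
    r, theta = poincare_map(r, theta, mu, c1, p, q)

# A point of the circle r = r0 returns to itself after q iterations.
point = (r0, 0.3)
for _ in range(q):
    point = poincare_map(*point, mu, c1, p, q)
```

The first loop shows the instability of the old orbit and the stability of the circle; the second shows that the new orbits close only after q circuits, which is the sudden increase of the period by the factor q.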
In this case there may still be an invariant torus as a result of the bifurcation,
although our method of finding it breaks down. In the above example, there
is such a torus, because the circle $r(\theta) = r_0$ is invariant.
Conversely, even when the invariant torus is established, there may be
points on the curve $\mathcal{C}_\infty$ that are invariant under a certain number of iterations
of the Poincaré mapping, and hence lead to closed orbits on the torus.
and are solved for the coefficients $\varphi_{pq}$ in that order; then $\varphi_{pq}$ is the only
unknown in the equation in which it first appears, and it appears in the
following way:
$$\alpha^p\bar{\alpha}^q\varphi_{pq} + \cdots = \alpha\varphi_{pq} + \cdots.$$
We can solve for $\varphi_{pq}$ except when
$$\alpha^p\bar{\alpha}^q = \alpha.$$
Since p + q > 1, this can happen only if $|\alpha| = 1$ and then only when
$$\alpha^{|p - q - 1|} = 1.$$
Unless p - q - 1 = 0, this last equation holds only when $\alpha$ is a |p - q - 1|th
root of unity, and in this way the restrictions (29.10-8) are obtained. The case
p - q - 1 = 0 occurs only for (p, q) = (2, 1); that equation cannot be
solved for $\varphi_{21}$ if $|\alpha| = 1$, but it can be solved for $\beta$; hence we merely set
$\varphi_{21} = 0$.
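The bookkeeping behind these restrictions can be tabulated. The sketch below assumes that terms up to degree 4 are to be eliminated (the normal form leaves a remainder of degree 5) and that $|\alpha| = 1$, so that $\bar{\alpha} = \alpha^{-1}$; the function names are hypothetical.

```python
import cmath

def resonance_orders(max_degree=4):
    """Map each (p, q) with 2 <= p + q <= max_degree to the exponent |p - q - 1|."""
    return {(p, n - p): abs(p - (n - p) - 1)
            for n in range(2, max_degree + 1)
            for p in range(n + 1)}

def is_resonant(alpha, p, q):
    """True when alpha**p * conj(alpha)**q equals alpha, so the coefficient cannot be solved for."""
    return abs(alpha ** p * alpha.conjugate() ** q - alpha) < 1e-12

orders = resonance_orders()
# Terms that can never be removed when |alpha| = 1 (exponent zero).
unsolvable = [pq for pq, order in orders.items() if order == 0]
# Orders of the roots of unity that must be avoided.
excluded_roots = sorted({order for order in orders.values() if order > 0})

cube_root = cmath.exp(2j * cmath.pi / 3)   # a root of unity of order 3
```

Only the term with (p, q) = (2, 1) is unremovable for every $\alpha$ on the unit circle, and the excluded orders run from 1 to 5, which matches the requirement that $\alpha$ not be a root of unity of order less than 6. For the cube root of unity, for example, the term (p, q) = (0, 2) is resonant.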
We also record, without derivation, some bounds on $\mu$, which suffice to
ensure the existence of the invariant limiting curve $\mathcal{C}_\infty$ discussed in the text.
First, the annulus (29.10-11) is mapped into the annulus (29.10-12) if
maxlflfl < fl < 133,
1t;~4d and
where max |f| means the maximum of $|f(r, \theta, \mu)|$ in some region $r \le r_1$,
$|\mu| \le \mu_1$ in which the normal form (29.10-9, 10) holds. Second, the bound
(29.10-14) on $dr_n/d\theta$ holds if also
4
= .
C4
3J - C1
Let rl and r2 be the radii of the inner and outer cylinders (assumed infinitely
long) that confine the flow, and let Q 1 and Q 2 be their angular velocities.
The unperturbed flow has angular velocity at radius r (rl ::; r ::; r2) given by
(30.1-1)
where A and B are determined by the no-slip conditions at the walls, namely
Q 1 = A + Bid and Q 2 = A + Bid. The problem is characterized by
various dimensionless parameters, for example rt/r2 (which is fixed in a
given apparatus) and the two Reynolds numbers
(30.1-2)
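As a small illustration of how the no-slip conditions fix the basic flow, the two linear equations $\Omega_1 = A + B/r_1^2$ and $\Omega_2 = A + B/r_2^2$ can be solved directly. This is a sketch only; the function names and the numerical values are hypothetical.

```python
def couette_coefficients(r1, r2, omega1, omega2):
    """Solve omega1 = A + B/r1**2 and omega2 = A + B/r2**2 for (A, B)."""
    B = (omega1 - omega2) / (1.0 / r1**2 - 1.0 / r2**2)
    A = omega1 - B / r1**2
    return A, B

def angular_velocity(r, A, B):
    """Angular velocity of the basic Couette flow at radius r."""
    return A + B / r**2

# Hypothetical apparatus with radius ratio 0.880 (the ratio quoted for Taylor's experiments).
A, B = couette_coefficients(0.88, 1.0, 2.0, 0.5)
print(A, B, angular_velocity(0.94, A, B))
```

When the two cylinders rotate at the same rate, $B = 0$ and the fluid rotates rigidly.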
264 Invariant Manifolds in the Taylor Problem
The stability of the basic laminar Couette flow given by (30.1-1) against
small axisymmetric disturbances was studied by G. I. Taylor 1923, both
theoretically and experimentally. He found stability for values of $R_1$ and $R_2$
represented by points lying below the curve in Figure 30.1, for the case
$r_1/r_2 = 0.880$. Rayleigh had shown earlier, by a simple argument, that the
flow is stable for $0 < R_1 < R_2$, that is, to the right of the dashed line in the
figure.
Taylor's calculations showed that, as $R_1$ is increased beyond its critical
value (the value on the curve) for fixed $R_2$, the basic flow becomes unstable
with respect to a normal mode whose velocity field is of the form
$$u = \mathrm{Re}\,[f(r)\,e^{i\alpha z}] \tag{30.1-3}$$
in cylindrical coordinates $r, \theta, z$. This disturbance has the structure of a
series of circular vortices, uniformly spaced in the $z$ direction, as sketched in
Figure 29.1 in the preceding chapter. Taylor found that the experimentally
observed vortices (which can be made visible by means of particles suspended
in the fluid) were in qualitative agreement with the calculations; in particular,
their axial separation was in agreement with the value of $\alpha$ in (30.1-3) for
which the disturbance first becomes unstable (i.e., becomes unstable for the
smallest value of $R_1$).
For R1 not too far above its critical value, the vortices (now called Taylor
vortices) are stable; together with the basic flow, on which they are super-
posed, they form a new steady flow (at each point in space, the fluid velocity
vector is independent of time), which persists indefinitely, so long as the
cylinders are kept rotating.
Taylor observed experimentally that when a still higher critical value of
$R_1$ is reached, the vortices become wavy, and the waves rotate around the
axis at approximately the mean angular velocity $(\Omega_1 + \Omega_2)/2$.
Taylor's analysis, being linear, said nothing about the flow except below
and immediately above the first critical value of R1 and said nothing about
Figure 30.1 Taylor's stability diagram (schematic). Horizontal axis: $R_2$.
Calculation of Invariant Manifolds 265
the stability or instability of the Taylor vortices, once they have formed.
Evidently, for $R_1$ in the range where the Taylor vortices are observed, the
exponential growth of the normal mode (30.1-3) levels off at a finite ampli-
tude, owing to the effect of nonlinearities.
Subsequent improved theoretical and experimental work has generally
confirmed Taylor's results. However, Taylor considered only axisymmetric
disturbances, and it was shown by Krueger, Gross, and DiPrima 1966 that,
for counter-rotating cylinders with $\Omega_2/\Omega_1$ sufficiently negative (beyond the
range studied experimentally by Taylor), the mode that first becomes
unstable is not axisymmetric, but has an angular dependence of the form $e^{im\theta}$;
$m$ increases through the values 1, 2, 3, ... as $\Omega_2/\Omega_1$ is made more and more
negative. Hence the leftmost part of the curve in Figure 30.1 must be revised
downward, but only by a quite small amount, because the stability of the
mode depends only very slightly on $m$, in the case of the rather narrow gap
($r_1/r_2 = 0.880$) studied by Taylor.
These predictions were confirmed experimentally by Snyder 1970, who
showed also that the nonaxisymmetric modes, when they appear, are helical.
(The linear theory cannot distinguish helical vortices from wavy ring vortices;
the four normal modes containing $e^{\pm i\alpha z} e^{\pm im\theta}$ are all equally likely and can be
combined to give real dependence either as $\sin$ or $\cos$ of $\alpha z$ times $\sin$ or $\cos$ of $m\theta$,
or as $\sin$ or $\cos$ of $(\alpha z \pm m\theta)$; only
a nonlinear theory can say which of these is preferred.)
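The equivalence of the two real forms within the linear theory can be seen from the product-to-sum identities. The following lines are a short worked illustration, not part of the original argument:

```latex
\begin{aligned}
\cos\alpha z\,\cos m\theta &= \tfrac12\bigl[\cos(\alpha z + m\theta) + \cos(\alpha z - m\theta)\bigr],\\
\sin\alpha z\,\sin m\theta &= \tfrac12\bigl[\cos(\alpha z - m\theta) - \cos(\alpha z + m\theta)\bigr].
\end{aligned}
```

A product pattern (wavy rings) is thus a superposition, with equal weights, of two helical patterns of opposite handedness, and conversely each helical pattern is a combination of product patterns.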
In recent years, the nonlinear theory developed by Davey 1962, Davey,
DiPrima, and Stuart 1968, and Eagles 1971 has led to an understanding of the
structure and stability of the finite-amplitude Taylor vortices, the second
bifurcation to the wavy vortices, and the structure and stability of the wavy
vortices, as discussed below.
In a problem like this, involving a sequence of bifurcations, the theory
consists ideally of a sequence of alternately linear and nonlinear investiga-
tions. After each bifurcation the structure and amplitude of the new flow is
found by a nonlinear calculation. Its stability is then investigated by linear-
izing the equations about the new flow and studying the growth of
infinitesimal disturbances, in order to find the next bifurcation, and so on.
(30.2-10)
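hence,
$$\dot U = \sum_{(p,\,q \in \mathscr{L})} \sum_{j=1}^{K} q_j a_{jp}\, x^{p+q-e_j} u_q
= \sum_{(s \in \mathscr{L})} x^s \sum_{j=1}^{K} {\sum_{(q)}}' q_j a_{j,\,s+e_j-q}\, u_q, \tag{30.2-11}$$
where $\sum'$ denotes the sum over all $q$ in $\mathscr{L}$ such that $s + e_j - q$ is also in $\mathscr{L}$;
it is a finite sum, for given $s$.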
It is convenient to introduce a norm for vectors in the lattice $\mathscr{L}$: $|q| = \sum_{j=1}^{K} q_j$;
$|q|$ is a positive integer. If $|q| = 1$, $q$ is one of the vectors $e_j$.
We now substitute $U$ from (30.2-7) and $\dot U$ from (30.2-11) into the equation
of evolution (29.4-3) and equate the net coefficients of $x^s$ on the two sides,
for each $s$ in $\mathscr{L}$. First, if $s$ is one of the vectors $e_l$, the quadratic terms contribute
nothing, and we have
$$\sum_{j=1}^{K} a_{j e_l} M u_{e_j} - L u_{e_l} = 0 \qquad (l = 1, \ldots, K). \tag{30.2-12}$$
But we have already seen that $u_{e_j} = \psi_j$; hence [from (30.2-1)]
$$a_{j e_l} = \begin{cases} \lambda_l & \text{for } j = l, \\ 0 & \text{for } j \ne l. \end{cases} \tag{30.2-13}$$
To lowest order for small $x_1, \ldots, x_K$, (30.2-9) then gives $x_j = \text{const.}\,\exp(\lambda_j t)$,
as expected.
Second, if $s$ is not one of the $e_l$, that is, if $|s| > 1$, we find, using (30.2-13),
$$\Bigl(\sum_{j=1}^{K} s_j \lambda_j M - L\Bigr) u_s = -\sum_{j=1}^{K} {\sum_{(q)}}'' q_j a_{j,\,s+e_j-q} M u_q + {\sum_{(q)}}''' B(u_q, u_{s-q}), \tag{30.2-14}$$
where $\sum''$ denotes the sum over all $q$ in $\mathscr{L}$ such that $s + e_j - q$ is also in $\mathscr{L}$
and $q \ne s$, and $\sum'''$ denotes the sum over all $q$ in $\mathscr{L}$ such that $s - q$ is also in
$\mathscr{L}$; these are finite sums.
We now show that these equations, if taken in a suitable order for the
various $s$ in $\mathscr{L}$, determine the unknown functions $u_s$ and unknown coefficients
$a_{js}$ inductively. We assume the equations (30.2-14) so ordered that all
equations with a given value of $|s|$ appear earlier than the equations with one
higher value of $|s|$, and we assume that when any of the equations is
encountered, all functions and coefficients appearing in previous equations
have been determined. (The ordering of the equations with a given value of
$|s|$ is irrelevant.) We claim that then all the $u_q$ appearing on the right of
(30.2-14) are known. In fact, for most of the terms there, $|q|$ is less than
$|s|$; the only exceptions are terms in which $q$ is of the form $s + e_j - e_l$,
for some $l \ne j$ (recall that $q \ne s$); however, the term with that value of $q$
contains the coefficient $a_{j e_l}$, which is $= 0$ by (30.2-13); hence, all the $u_q$ that
appear on the right of the equation may be regarded as known. The
unknowns in that equation are therefore the function $u_s$ and the coefficients
$a_{ls}$ ($l = 1, \ldots, K$).
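The counting argument behind this ordering can be checked by brute force. The sketch below assumes that the lattice consists of all $K$-vectors of non-negative integers with $|q| \ge 1$ (an assumption about the lattice, which is defined elsewhere in the text); the function names are hypothetical. It lists the index pairs in the first sum on the right of (30.2-14) and picks out those with $|q| \ge |s|$.

```python
from itertools import product

def lattice(K, max_norm):
    """Assumed lattice: K-vectors of non-negative integers with 1 <= |q| <= max_norm."""
    return [q for q in product(range(max_norm + 1), repeat=K) if 1 <= sum(q) <= max_norm]

def unit(K, j):
    """The vector e_j (0-based index j)."""
    return tuple(1 if k == j else 0 for k in range(K))

def rhs_terms(s):
    """Triples (j, q, p) with p = s + e_j - q in the lattice, q != s, and q_j != 0."""
    K = len(s)
    terms = []
    for j in range(K):
        e_j = unit(K, j)
        for q in lattice(K, sum(s) + 1):
            p = tuple(s[k] + e_j[k] - q[k] for k in range(K))
            if q != s and min(p) >= 0 and sum(p) >= 1 and q[j] > 0:
                terms.append((j, q, p))
    return terms

def exceptional(s):
    """Terms whose u_q is not of lower norm than u_s."""
    return [(j, q, p) for (j, q, p) in rhs_terms(s) if sum(q) >= sum(s)]

for j, q, p in exceptional((2, 1)):
    print("j =", j, " q =", q, " coefficient index p =", p)
```

For every exceptional term the coefficient index $p$ is a unit vector $e_l$ with $l \ne j$, so the term carries a coefficient that vanishes by (30.2-13), in agreement with the argument in the text.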
To determine the coefficient $a_{ls}$, for any $l = 1, \ldots, K$, we take the inner
product with $\chi_l$ throughout (30.2-14). The left member gives zero, and all
the terms in the first sum on the right give zero, except when $q = e_l$ and $j = l$;
hence,
With these coefficients known for $l = 1, \ldots, K$, equation (30.2-14) can then
be solved for $u_s$. In this way, the invariant $K$-dimensional manifold $\mathfrak{M}$ is
determined in that neighborhood of the origin in $\mathfrak{H}$, in which the series
(30.2-7, 9) converge.
This is the method of Davey, of Davey, DiPrima, and Stuart, and of
Eagles. Although those authors don't describe it in those terms, its main
feature is a calculation of the unstable manifold of the origin in $\mathfrak{H}$; then the
equations (30.2-9) represent a finite-dimensional dynamical system in that
manifold and can be studied by any of the standard methods, by analytical
search for fixed points and cycles, or by calculations of orbits by numerical
methods for ordinary differential equations.
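To illustrate the last option, here is a minimal sketch of a numerical orbit calculation for a reduced system of the form $\dot x_j = \sum_p a_{jp} x^p$. The coefficient values are hypothetical and are not taken from any of the cited calculations; with $K = 1$ they give the amplitude equation $\dot x = \lambda x - c x^3$, whose growth levels off at the finite amplitude $\sqrt{\lambda/c}$. The classical fourth-order Runge-Kutta method is used as one standard choice.

```python
def reduced_rhs(x, coeffs):
    """dx_j/dt = sum over p of a_{jp} x^p, with coeffs[(j, p)] = a_{jp}, p a tuple of exponents."""
    dx = [0.0] * len(x)
    for (j, p), a in coeffs.items():
        term = a
        for xk, pk in zip(x, p):
            term *= xk ** pk
        dx[j] += term
    return dx

def rk4_step(x, dt, coeffs):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]
    k1 = reduced_rhs(x, coeffs)
    k2 = reduced_rhs(shifted(k1, 0.5 * dt), coeffs)
    k3 = reduced_rhs(shifted(k2, 0.5 * dt), coeffs)
    k4 = reduced_rhs(shifted(k3, dt), coeffs)
    return [xi + dt * (a + 2 * b + 2 * c + d) / 6.0
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def integrate(x0, dt, steps, coeffs):
    x = list(x0)
    for _ in range(steps):
        x = rk4_step(x, dt, coeffs)
    return x

# Hypothetical coefficients: dx/dt = lam*x - c*x**3.
lam, c = 0.5, 2.0
coeffs = {(0, (1,)): lam, (0, (3,)): -c}
x_final = integrate([0.01], 0.01, 6000, coeffs)
print(x_final[0], (lam / c) ** 0.5)
```

The small initial disturbance first grows like $\exp(\lambda t)$ and then settles at the fixed point of the reduced system.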
An important feature of their method is the assumed orthogonality
of all the $u_q$, except for $|q| = 1$, to the adjoint functions $\chi_1, \ldots, \chi_K$. We
have presented that assumption by equations (30.2-5, 6) as simply a
convenient way of prescribing the coordinates $x_1, \ldots, x_K$ in the unstable
manifold, but it has a deeper significance, and is the fundamental idea that
makes the method successful. Each $u_s$ is determined in terms of known
quantities by an inhomogeneous equation (30.2-14), in practice a differential
equation with boundary conditions. For some of those equations, when the
Reynolds number is close to one of the critical values, the operator on the
left is nearly singular, and in fact is singular when the Reynolds number is
equal to that value. According to the alternative theorem of linear algebra,
which applies also to linear equations in a Hilbert space, singular
inhomogeneous linear equations have no solution unless the right member is
orthogonal to all solutions of the corresponding transposed homogeneous
equation, which are just $\chi_1, \ldots, \chi_K$; if it is, the solutions are nonunique, but
they can be made unique by requiring that they also be orthogonal to
$\chi_1, \ldots, \chi_K$.
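A two-dimensional example may make the alternative theorem concrete. The matrix and vectors below are hypothetical, chosen only for illustration, with $\chi$ playing the role of the adjoint functions:

```latex
A = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}, \qquad
A^{T}\chi = 0 \ \text{ for } \ \chi = \begin{pmatrix} 3 \\ -1 \end{pmatrix}, \qquad
A x = b \ \text{ is solvable only if } \ \chi \cdot b = 3b_1 - b_2 = 0 .
```

```latex
b = \begin{pmatrix} 1 \\ 3 \end{pmatrix}: \qquad
x = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} 2 \\ -1 \end{pmatrix}, \qquad
\chi \cdot x = 3 + 7t = 0 \ \Rightarrow \ t = -\tfrac{3}{7}, \quad
x = \begin{pmatrix} 1/7 \\ 3/7 \end{pmatrix}.
```

The right member satisfies the solvability condition, the solution is determined only up to a multiple of the null vector of $A$, and the extra requirement $\chi \cdot x = 0$ selects a single solution.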
If this assumption, or something similar, is not made, some of the uq
can be unreasonably large, for Reynolds numbers of interest, presumably
preventing the convergence of the series (30.2-7).
$$\nabla \cdot u = 0, \tag{30.3-2}$$
Cylindrical Coordinates 269
where u(x) is the basic laminar flow (Couette flow for the Taylor Problem).
We introduce cylindrical coordinates r, e, z and corresponding velocity
components u, v, w, so that
(30.3-4)
When these operators are applied to a vector field in the form (30.3-3),
the dependence of the unit vectors $k_r$ and $k_\theta$ on $\theta$ must be taken into account:
$$\frac{\partial}{\partial\theta} k_\theta = -k_r$$
For the Taylor problem, the basic laminar flow is given, according to
(30.1-1), by $u = 0$, $w = 0$, and
For the Taylor problem, theory and experiment agree in indicating that,
once the Taylor vortices (or wavy vortices, or helical vortices) have been
established, the entire flow is periodic in the z direction with a period of the
order of twice the separation of the cylinders, at least when the cylinders are
very long, and in the approximation in which end effects are neglected.
Here, we shall simply assume such periodicity, and we shall assume that the
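wave number $\alpha$ is known, so that the period is $2\pi/\alpha$. We call $\mathscr{R}$ the region
given by
$$\mathscr{R}:\quad r_1 \le r \le r_2, \quad 0 \le \theta \le 2\pi, \quad 0 \le z \le \frac{2\pi}{\alpha}, \tag{30.4-1}$$
and we take $\mathfrak{H}$ as the Hilbert space $L^2(\mathscr{R})^6$ with the inner product
$$(U, V) = \iiint_{\mathscr{R}} \bar U \cdot V \, dr\, d\theta\, dz. \tag{30.4-2}$$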
$$m(q) = \sum_{k=1}^{K} q_k m_k,$$
and $e_j$ is that vector in the lattice $\mathscr{L}$ having its $j$th component $= 1$ and its
other components $= 0$.
Figure 30.2
Taylor vortices
Figure 30.3 Inaccessible stable helical vortices.
helical vortices are unstable when they first appear, but then become stable
at slightly higher values of T, as indicated by the solid curves in the figure.
Then, they are stable modes that are inaccessible in the sense that they cannot
be reached from the basic flow by a continuous sequence of stable modes.
The experimental work of Gollub and Swinney 1975 indicates that,
at a Taylor number of the order of 200T 10 a strange attractor should appear,
because they observe a continuous power spectrum. It is not possible to
extend the calculations to such high values of T, because the number of
dimensions of the unstable manifold becomes unmanageably large. It is of
course possible to continue calculating with the manifold of smaller
dimension, determined by the eigenvalues farthest to the right in the complex
plane. Such a manifold is invariant, but not attracting. Calculations of this
sort suggest that the wavy vortices may be stable to quite high values of $T$,
thus delaying the appearance of a strange attractor or even of further
bifurcations.
0
v
- -Of] -vo z B
2V v
---Of] 0
r r r2
1
-Of]
vr
1
r
0 !v (V' + V)
r
- ~r' Of] 1
--B+-
v
1
r2
0
1 1
!o 0 0 0 --B
v z r v
1 1
0 0 0 - rOo -oz
r
0 1 0 0 0 0
0 0 1 0 0 0
B = C2+ 2)
v r2 Of] Oz V
- --;:Of];
,,
, -1 0 0
(0) ,, 0 l/v 0
,,
M= ,, 0 0 l/v ,
1__ - _______________________
(0) (0)
Appendix to Chapter 30-The Matrices in Eagles' Formulation 275
u V
- - - 88 - w8 z -V + -8
u
8 u8 z
r r r r
K(U) =
(0) ~v (~+ 8V)
r 8r ~ (~88 + W8 z) 0
l8w
v 8r
0 ~ (~88 + W8 z)
(0) (0)
CHAPTER 31
The Landau-Hopf Model 277
where $g(\cdot, \cdot, \ldots, \cdot)$ is periodic in each of its arguments with period $2\pi$, and
the frequencies $\omega_i$ are incommensurable, which means there is no vanishing
linear combination $c_1\omega_1 + \cdots + c_m\omega_m$ with rational coefficients $c_1, \ldots, c_m$ that are not all zero.
If $\omega_1, \ldots, \omega_m$ are commensurable, the number of independent frequencies is
less than $m$. Suppose, for example, that $m = 2$ and $\omega_2/\omega_1 = p/q$, where $p$
and $q$ are integers. Then, for
$$t_0 = \frac{4\pi q}{\omega_1}, \tag{31.1-2}$$
we find that $\omega_1 t_0 = 4\pi q$ and $\omega_2 t_0 = 4\pi p$; hence $f(t)$ is periodic (not merely
quasi-periodic) with period given by (31.1-2).
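The periodicity in the commensurable case is easy to confirm numerically. The sketch below takes $t_0 = 4\pi q/\omega_1$, the value consistent with the statement $\omega_1 t_0 = 4\pi q$, and a hypothetical test function with two frequencies; it checks only that $t_0$ is a period, not that it is the smallest one.

```python
import math

def period_for_commensurable(omega1, p, q):
    """Return (omega2, t0) with omega2/omega1 = p/q and t0 = 4*pi*q/omega1."""
    omega2 = omega1 * p / q
    t0 = 4.0 * math.pi * q / omega1
    return omega2, t0

def f(t, omega1, omega2):
    """A hypothetical function of the two phases omega1*t and omega2*t."""
    return math.cos(omega1 * t) + math.sin(omega2 * t)

omega1 = 1.3
omega2, t0 = period_for_commensurable(omega1, p=3, q=2)
samples = [0.1 * k for k in range(200)]
max_defect = max(abs(f(t + t0, omega1, omega2) - f(t, omega1, omega2)) for t in samples)
print(t0, max_defect)
```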
It was shown in the last section of the preceding chapter that if the first
bifurcation leads to a closed orbit, the second can lead to an attracting
invariant torus in the phase space $\mathfrak{H}$. If, furthermore, the motion is such
that its orbit covers the torus densely, then a resulting function of time, such
as one of the coordinates in the phase space, is quasi-periodic with two periods.
Specifically, one can define two intrinsic angle-coordinates $\theta$ and $\varphi$ on the
torus such that $\theta = \omega_1 t + \text{const.}$, $\varphi = \omega_2 t + \text{const.}$, and the orbit is dense
on the torus if and only if $\omega_1$ and $\omega_2$ are incommensurable. After the next
bifurcation there may be motion on a 3-torus, and so on.
278 The Early Onset of Turbulence
Exactly which branch of the tree in Figure 31.1 is followed depends on the
structure of the infinitesimal perturbation that caused departure from the
basic or laminar flow when the first critical value of the Reynolds number was
reached. More generally, the phases associated with the various frequencies
depend in a random manner on that perturbation, so that (31.1-1) might
better be written as
(31.1-3)
The idea behind the Landau-Hopf model was that as soon as there are
many independent frequencies, the motion is so irregular in appearance that
it must be regarded for practical purposes as chaotic.
There are various ways in which this model can be inappropriate:
1. One of the bifurcations in the tree of Figure 31.1 may be subcritical;
then, as soon as the corresponding critical value of the Reynolds number is
exceeded, there is no nearby stable motion for the system to follow, and there
is a so-called explosive transition to a motion involving more or less remote
parts of the phase space.
2. In some problems, such as flow in a circular pipe, the basic flow is
stable to infinitesimal disturbances at all Reynolds numbers, but is unstable
to finite disturbances of rather small amplitude, and the critical amplitude for
instability decreases toward zero as the Reynolds number increases, so
that stable flow cannot be achieved in practice at high Reynolds number,
owing to the presence of small but finite disturbances.
3. Although an invariant torus generally appears at the second bifurcation,
the orbit need not be dense on it; it may return to its starting point after
winding finitely many times around; then the orbit is closed and the motion is
periodic, as mentioned in Section 29.11. In fact it is now believed, on the basis
of Peixoto's theorem (see Appendix) that closed orbits on the torus are more
likely than dense ones. This may lead to the Feigenbaum model (see
Section 31.19).
4. A possibility discussed by Ruelle and Takens 1971 is that, after a few
bifurcations, there appears an invariant point set in the phase space, which is
not a torus but a so-called strange attractor; then, as explained below, the
motion is not quasi-periodic, but aperiodic.
(31.2-1)
$$\frac{\partial z}{\partial t} = z \circ u + z \circ F + \mu \frac{\partial^2 z}{\partial x^2},$$
The Ruelle~Takens Model 279
then it is assumed that infinitely many of the $a_n$ are positive, that the $b_n$ are not
rationally related (any finite set of them is linearly independent over the
rationals), and that no two of the quantities $a_n/n^2$ are equal.
The critical values of $\mu$ are the numbers $a_n/n^2$; they can be arranged in a
sequence $\mu_1 > \mu_2 > \mu_3 > \cdots \to 0$. The general solution represents a moving
point in an infinite-dimensional space $\Pi$ with coordinates $u_n(t)$, $z_n(t)$, $n =
0, 1, 2, \ldots$. Hopf proved that, for $\mu > \mu_1$, the fixed point at the origin
of $\Pi$ attracts all other solutions, so that $u_n \to 0$, $z_n \to 0$, as $t \to \infty$; as $\mu$ decreases
past $\mu_1$, that solution becomes unstable, and there is a bifurcation to an
attracting periodic orbit that grows out of the origin; when $\mu$ decreases
past $\mu_2$, that orbit also becomes unstable, and there is a bifurcation to an
attracting torus (2-torus) that grows out of the orbit, and so on. After the
$k$th bifurcation there is an attracting $k$-dimensional torus, and the orbits on
the torus are dense on it.
For fixed $x$, $\varphi(x, t)$ is a motion, while for fixed $t \ge 0$, a one-to-one mapping
of $\mathfrak{M}$ into itself is given by $x \to \varphi(x, t)$. The function $\varphi(x, t)$ is called a
semiflow; it obviously has the semigroup property that if $t$ and $s$ are $\ge 0$, then
$$\varphi(\varphi(x, t), s) = \varphi(x, t + s). \tag{31.4-4}$$
For $t = 0$, $\varphi$ is the identity mapping: $\varphi(x, 0) \equiv x$. We assume that $\varphi$ is
continuous in $x$ and $t$.
A point $\xi$ of $\mathfrak{M}$ is called an $\omega$-limit point of a motion $x(t)$ if $x(t)$ comes
arbitrarily close to $\xi$ at times arbitrarily far in the future, that is, if there is a
sequence $\{t_n\}_1^\infty$ such that
$$t_n \to \infty, \qquad |x(t_n) - \xi| \to 0 \qquad \text{as } n \to \infty. \tag{31.4-5}$$
The set of all $\omega$-limit points of a motion is called its $\omega$-limit set and is denoted
by $\Omega_x$, where $x$ is the initial point of the motion; it is a closed point set. If $y$
is any other point of the same orbit, then $\Omega_y = \Omega_x$.
As an example, if a motion tends to a fixed point, as $t \to \infty$, then that point
is the $\omega$-limit set of the motion. If a motion in a plane spirals outward toward
a closed curve, then that curve is the $\omega$-limit set of the motion. If the orbit of a
motion on a torus is dense on the torus, then the entire torus is the $\omega$-limit
set of every point of that orbit.
The symbol $\omega$ refers to future time. When a motion exists for all $t$, $\alpha$-limit
points and $\alpha$-limit sets are similarly defined by letting $t \to -\infty$.
If a motion $x(t)$ lies in a bounded region of $\mathfrak{M}$, for $t \ge 0$, then its $\omega$-limit
set $\Omega$ is nonempty, and $x(t)$ gets closer to $\Omega$ as time goes on; that is,
$$\text{distance}\,\{x(t), \Omega\} \to 0, \qquad \text{as } t \to \infty. \tag{31.4-6}$$
(See Nemytskii and Stepanov 1960, Chapter V, Section 3.) However, $\Omega$
need not be stable or attracting; other nearby motions may move away
from it and never return.
An important property of $\omega$-limit sets has to do with the reversibility of the
motion. A solution of (31.4-1) cannot generally be continued to all negative $t$.
For example, the solutions of $\dot x = -x^3$ on $\mathbb{R}$ cannot be [except the solution
$x(t) \equiv 0$], and solutions of the Navier-Stokes equations generally cannot
be, because of the parabolic nature of the equations. However, certain
special motions can be.
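The example $\dot x = -x^3$ can be worked out in closed form, which makes both the semigroup property and the failure of backward continuation explicit. The sketch below uses the explicit solution $\varphi(x, t) = x/\sqrt{1 + 2x^2 t}$ (obtained by separating variables; the function names are hypothetical), which exists for $t > -1/(2x^2)$ only.

```python
import math

def phi(x, t):
    """Solution at time t of dx/dt = -x**3 starting from x at time 0."""
    denom = 1.0 + 2.0 * x * x * t
    if denom <= 0.0:
        raise ValueError("the solution does not exist at this (negative) time")
    return x / math.sqrt(denom)

def backward_blowup_time(x):
    """Negative time at which the solution through x (x != 0) becomes infinite."""
    return -1.0 / (2.0 * x * x)

# Semigroup property: phi(phi(x, t), s) equals phi(x, t + s).
print(phi(phi(0.8, 1.5), 2.0), phi(0.8, 3.5))
print(backward_blowup_time(0.8))
```

Every solution other than $x \equiv 0$ tends to the origin as $t \to \infty$, so the origin is its $\omega$-limit set, while followed backward in time it becomes infinite at a finite negative time.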
From the continuity and semigroup property of $\varphi$, it follows that if the
initial point $x(0)$ is an $\omega$-limit point of $x(t)$, then every subsequent point
$x(t_0)$ is also an $\omega$-limit point of $x(t)$. In other words, if a motion starts in its
own $\omega$-limit set, it stays there. It can be proved that such motions can also
be continued indefinitely backward in time, and hence lie in $\Omega_{x(0)}$ for all $t$.
See Sell 1971, Theorem 11.8. This last result may seem paradoxical from the
point of view of physical observation or numerical simulation, where there
is only a finite accuracy. According to (31.4-6), any bounded motion $x(t)$
lies for practical purposes in its own $\omega$-limit set after a finite lapse of time.
That doesn't imply, of course, that finite-difference methods can be used to
31.5 Attractors
An attractor is roughly a set in $\mathfrak{M}$ such that any sufficiently nearby motion
gets closer and closer to it as time goes on. Specifically, we shall call a
connected closed bounded set $S$ in $\mathfrak{M}$ an attractor if
1. $S$ is contained in an open set $\mathscr{R}_0$ such that for any $x$ in $\mathscr{R}_0$ the motion
$\varphi(x, t)$ is in $\mathscr{R}_0$ for all $t > 0$;
2. If $\mathscr{R}$ is any open set containing $S$ (see Figure 31.2), then for any $x$
in $\mathscr{R}_0$ there is a time $T$ such that $\varphi(x, t)$ is in $\mathscr{R}$ for all $t > T$.
3. For the given region $\mathscr{R}_0$, $S$ is the smallest set having the above properties
in the sense that if $\mathscr{R}(t)$ is the image of $\mathscr{R}_0$ under the flow $\varphi$, then, as
$t \to \infty$, $\mathscr{R}(t)$ shrinks down onto $S$ but no further:
$$S = \bigcap_{t \ge 0} \mathscr{R}(t).$$
For a given attractor $S$, the largest open set $\mathscr{R}_0$ having the properties stated
is called the region of attraction of $S$. We require $S$ to be connected, for if it
consisted of two disconnected pieces $S_1$ and $S_2$, then each of these could be
enclosed in a suitable region $\mathscr{R}_1$ or $\mathscr{R}_2$ and be an attractor in its own right.
Smale 1967 imposes the further requirement that there be an orbit dense
in S. An example in which that requirement is not met, while the others are,
is provided by the Taylor problem of flow between rotating cylinders. After
the first bifurcation, there is a closed curve (in fact a circle) in phase space
(Hilbert space) consisting of fixed points, and any nearby motion comes
Figure 31.2
The Power Spectrum for Motions in $\mathbb{R}^n$ 283
$$S(\omega) = \int_{-\infty}^{\infty} R(\tau)\, \frac{e^{i\omega\tau} - 1}{2\pi i \tau}\, d\tau. \tag{31.6-2}$$
Comments
1. It has been tacitly assumed that all components of the vector x(t)
contribute equally to the energy of the motion. A generalization is to replace
the scalar product by
$$x(t + \tau) \cdot B x(t), \tag{31.6-3}$$
$$R(\tau) \approx \frac{1}{b - a} \int_a^b x(t + \tau) \cdot x(t)\, dt, \tag{31.6-5}$$
where $(a, b)$ is a long interval. As noted at the end of Section 31.4, although
$x(t)$ is regarded as an approximation to a motion on its own $\omega$-limit set,
hence defined for all $t$, it is known in practice only for $t \ge 0$; hence in the
above expression, we have $0 < a < b$. Still, that expression should be a good
approximation if all transients have decayed sufficiently by the time $t = a$.
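The finite-interval average can be tried on a signal whose autocovariance is known. The sketch below assumes a scalar motion $x(t) = \cos \omega t$, for which the long-time average of $x(t + \tau)\,x(t)$ is $\tfrac12 \cos \omega\tau$, and approximates the integral by the midpoint rule; the interval, the frequency, and the function name are hypothetical choices.

```python
import math

def autocovariance(x, tau, a, b, n=20000):
    """Midpoint-rule estimate of (1/(b - a)) * integral from a to b of x(t + tau) * x(t) dt."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += x(t + tau) * x(t)
    return total * h / (b - a)

omega = 1.7
signal = lambda t: math.cos(omega * t)
estimate = autocovariance(signal, 0.4, 50.0, 1050.0)
print(estimate, 0.5 * math.cos(omega * 0.4))
```

The discrepancy decreases like $1/(b - a)$ for this periodic signal, which is why the interval must be long.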
difference $f(t + \tau) - f(t)$ is arbitrarily small. Not only that, but given
$\varepsilon > 0$, there is a $T = T(\varepsilon) > 0$ such that every $t$-interval of length $T$ contains
at least one value $\tau$ such that
$$|f(t + \tau) - f(t)| < \varepsilon \qquad \text{for all } t.$$
Any continuous function having this last property is called almost periodic
in the sense of H. Bohr and can be expanded in a series of the form (31.7-2),
which then converges in a certain $L^2$ sense; see Riesz and Sz.-Nagy 1953,
Chapter VI. A quasi-periodic function, as we have defined it, is an almost
periodic function in which only a finite number of the frequencies $\omega_j$ are
linearly independent over the rationals.
A vector-valued almost periodic function $x(t)$ is similarly defined.
As shown earlier (Section 4.6 of Volume I), the power spectrum of an
almost-periodic function is a pure line spectrum. Hence, in the Landau-Hopf
model of the transition to turbulence, the power spectrum is a pure line
spectrum after any finite number of bifurcations, although the number of
lines in a given frequency interval might increase considerably as the Reynolds
number increases.
The Ruelle-Takens model, as we shall see, predicts that a continuous
spectrum will appear after a rather small number of bifurcations.
The almost periodic character of the motion in the Landau-Hopf model
seems implausible on intuitive grounds. If the motion is in any sense random,
it should not be able to "remember" its past behavior so precisely as to
reproduce that behavior, to any desired accuracy, at times arbitrarily far in
the future, as would be the case if it were almost periodic. This can also be
expressed in terms of the autocovariance $R(\tau)$, because $R(\tau)/R(0)$ is the
autocorrelation, that is, the correlation of the functions $f(t)$ and $f(t + \tau)$. A
theorem on almost periodic functions says that the convolution of two such
functions, in the sense of (31.6-1), is also almost periodic (see Riesz-Nagy).
Hence $R(\tau)$ would be almost periodic if $f(t)$ were, and the correlation
coefficient of $f(t)$ and $f(t + \tau)$ would come arbitrarily close to 1.0 repeatedly
for certain arbitrarily large values of $\tau$.
By contrast, for typical motion on a strange attractor, $R(\tau)$ decreases
rapidly to zero as $\tau \to \infty$, and then the power spectrum is purely continuous
and is given by (31.6-4). For the case of the Lorenz attractor, see Section 31.11.
If the motion is not stable in this sense, there is a positive $\varepsilon$ (not necessarily
very small) such that no matter how small $\delta$ is, there is a perturbation of
initial size $< \delta$ that grows to a size $\ge \varepsilon$ at some later time. One of the
motivations for Lorenz's work, described below, was to show that a simple
prototype of atmospheric motion is Lyapunov unstable, with obvious implication
for the problem of weather forecasting.
The first strange attractor in a problem arising from fluid dynamics was
discovered by E. N. Lorenz in 1963. Lorenz expanded the Bénard equations
of thermal convection for a horizontal layer of fluid heated from below in a
triple Fourier series with respect to the space variables, then truncated the
resulting system of ordinary differential equations for the time dependence
of the Fourier coefficients to three equations. If the Fourier coefficients in
those equations are denoted by $X(t)$, $Y(t)$, and $Z(t)$, the equations are
$$\begin{aligned}
\dot X &= -\sigma X + \sigma Y, \\
\dot Y &= rX - Y - XZ, \\
\dot Z &= -bZ + XY,
\end{aligned} \tag{31.9-1}$$
or, more briefly,
$$\dot{\mathbf{X}} = F(\mathbf{X}). \tag{31.9-2}$$
The constants $\sigma$, $r$, and $b$ are dimensionless; for the physical system
considered by Lorenz, they have the values
$$\sigma = 10, \qquad b = \tfrac{8}{3}, \qquad 0 < r < \infty; \tag{31.9-3}$$
$r$ is proportional to the Rayleigh number and is a measure of the intensity of
the heating.
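For readers who wish to experiment, the system (31.9-1) is easily integrated numerically. The following is a minimal sketch using the classical fourth-order Runge-Kutta method with a fixed step; the step size, the initial point, and the function names are hypothetical choices, and $\sigma = 10$, $b = 8/3$, $r = 28$ are the parameter values studied by Lorenz.

```python
def lorenz_rhs(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side F(X) of the Lorenz system (31.9-1)."""
    X, Y, Z = state
    return (-sigma * X + sigma * Y, r * X - Y - X * Z, -b * Z + X * Y)

def rk4_step(state, dt, rhs=lorenz_rhs):
    """One classical Runge-Kutta step of size dt."""
    def add(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(add(state, k1, 0.5 * dt))
    k3 = rhs(add(state, k2, 0.5 * dt))
    k4 = rhs(add(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def trajectory(state, dt, steps):
    """List of states at times 0, dt, 2*dt, ..., steps*dt."""
    out = [state]
    for _ in range(steps):
        state = rk4_step(state, dt)
        out.append(state)
    return out

traj = trajectory((1.0, 1.0, 1.0), 0.01, 2000)
print(traj[-1])
```

Because nearby solutions separate rapidly, the computed orbit should be regarded as typical of the motion rather than as an accurate continuation of the particular initial point.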
Lorenz was interested in exhibiting the general kind of instability found
in atmospheric physics and did not intend the above system to be a realistic
model of the atmosphere or of thermal convection. Later Curry 1978 studied
a more realistic model of the Benard convection equations by truncating to
14 rather than three equations. He found a more complicated sequence of
bifurcations, as the Rayleigh number r is increased, but the strange attractor
was still present for certain values of r.
Lorenz showed there is a constant R, depending on (J, r, and b, such that
any solution X(t) of (31.9-1) is eventually trapped in the ball
(31.9-4)
Furthermore, it follows from (31.9-1) that the divergence of the vector field
$F(\mathbf{X})$ has the constant value
$$\nabla \cdot F = -(\sigma + b + 1) = -13\tfrac{2}{3}, \tag{31.9-5}$$
The Lorenz System; the Bifurcations 287
so that the volume of a region carried along by the flow (31.9-1) in $\mathbb{R}^3$
decreases with time as $\exp(-13.67t)$. Therefore there is at least one attractor
in the ball (31.9-4), and any such attractor occupies zero volume in $\mathbb{R}^3$.
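The steps behind this statement can be written out briefly. In the lines below, $D(t)$ denotes a region carried along by the flow and $V(t)$ its volume (symbols introduced here for illustration only):

```latex
\nabla \cdot F
= \frac{\partial}{\partial X}(-\sigma X + \sigma Y)
+ \frac{\partial}{\partial Y}(rX - Y - XZ)
+ \frac{\partial}{\partial Z}(-bZ + XY)
= -\sigma - 1 - b ,
```

```latex
\frac{dV}{dt} = \int_{D(t)} \nabla \cdot F \; dX\, dY\, dZ = -(\sigma + b + 1)\, V
\quad\Longrightarrow\quad
V(t) = V(0)\, e^{-(\sigma + b + 1)\, t} .
```

With $\sigma = 10$ and $b = 8/3$ the rate is $13\tfrac{2}{3} \approx 13.67$.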
The fixed points or stationary solutions of the system (31.9-1) are as follows:
1. For any r, the origin, X = Y = Z = 0 is a fixed point. For 0 < r < 1
it is stable (in fact attracting). For r > 1 it is unstable; the linearized problem
has one positive eigenvalue and two negative ones. There is a one-dimensional
unstable manifold with a horizontal tangent vector at the origin (a vector
parallel to the plane Z = 0) and a two-dimensional stable manifold with a
vertical tangent plane.
2. For any $r > 1$ there are two more fixed points, called $P_1$ and $P_2$:
$$\begin{aligned}
P_2&: \quad X = Y = \sqrt{b(r - 1)}, \quad Z = r - 1, \\
P_1&: \quad X = Y = -\sqrt{b(r - 1)}, \quad Z = r - 1.
\end{aligned} \tag{31.9-6}$$
Hence, there is a first bifurcation, of the type discussed in Section 29.9, at $r = 1$.
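To determine the stability of the new fixed points, we write $\mathbf{X} = \mathbf{X}_0 + \mathbf{X}_1$,
where $\mathbf{X}_0$ is given by (31.9-6), and we linearize with respect to $\mathbf{X}_1$; we find
$$\begin{aligned}
\dot X_1 &= -\sigma X_1 + \sigma Y_1, \\
\dot Y_1 &= (r - Z_0) X_1 - Y_1 - X_0 Z_1, \\
\dot Z_1 &= Y_0 X_1 + X_0 Y_1 - b Z_1.
\end{aligned} \tag{31.9-7}$$
When $X_0$, $Y_0$, and $Z_0$ are substituted from (31.9-6), the matrix of this system
becomes (for $P_2$; for $P_1$ the square roots change sign)
$$\begin{pmatrix}
-\sigma & \sigma & 0 \\
1 & -1 & -\sqrt{b(r - 1)} \\
\sqrt{b(r - 1)} & \sqrt{b(r - 1)} & -b
\end{pmatrix}.$$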
It has one negative real eigenvalue and two complex conjugate ones. The
complex eigenvalues are in the left half-plane (hence the new fixed points
are stable) if $r < r_0$, where
$$r_0 = \sigma(\sigma + b + 3)/(\sigma - b - 1) = 24.74. \tag{31.9-8}$$
Hence there is a second bifurcation at $r = r_0$, and this one is of the kind
discussed in Section 29.10, leading to periodic solutions. However, this
bifurcation is subcritical, as shown by the calculations of Marsden and
McCracken 1976. Hence the periodic solutions are present only for $r < r_0$
and are unstable, while for $r > r_0$ an explosive transition to something else
must be expected. It turns out, as discussed below, that the transition is not
really "explosive," because of the presence of another attractor (in fact a
strange attractor) in the near vicinity in $\mathbb{R}^3$; see Section 31.17.
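The critical value $r_0$ can be checked without computing eigenvalues. The sketch below assumes that the characteristic polynomial of the linearization about $P_1$ or $P_2$ is $\lambda^3 + (\sigma + b + 1)\lambda^2 + b(\sigma + r)\lambda + 2\sigma b(r - 1)$ (worked out by hand from the linearized Lorenz equations, so it should be re-derived before being relied on) and applies the Routh-Hurwitz criterion for a cubic: with all coefficients positive, the roots lie in the left half-plane exactly when the product of the two middle coefficients exceeds the constant term. The function names are hypothetical.

```python
def critical_r(sigma=10.0, b=8.0 / 3.0):
    """r_0 = sigma*(sigma + b + 3)/(sigma - b - 1), as in (31.9-8)."""
    return sigma * (sigma + b + 3.0) / (sigma - b - 1.0)

def char_poly(r, sigma=10.0, b=8.0 / 3.0):
    """Coefficients (c2, c1, c0) of lambda**3 + c2*lambda**2 + c1*lambda + c0 (assumed form)."""
    return (sigma + b + 1.0, b * (sigma + r), 2.0 * sigma * b * (r - 1.0))

def is_stable(r, sigma=10.0, b=8.0 / 3.0):
    """Routh-Hurwitz test for the cubic: all coefficients positive and c2*c1 > c0."""
    c2, c1, c0 = char_poly(r, sigma, b)
    return c2 > 0.0 and c1 > 0.0 and c0 > 0.0 and c2 * c1 > c0

print(critical_r())
print(is_stable(24.0), is_stable(25.0))
```

Setting $c_2 c_1 = c_0$ and solving for $r$ reproduces the formula for $r_0$.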
With each of the points $P_1$ and $P_2$ is associated, for $r > r_0$, a one-dimensional
stable manifold and a two-dimensional unstable one. In the latter,
solutions spiral outward from the fixed point. Nearby solutions also spiral
outward and, at the same time, are drawn rapidly toward the unstable
manifold, because of the large negative eigenvalue associated with the
stable one.
Figure 31.3a The branched surface.
The Lorenz Attractor; General Description 289
I/Io(S)
See Figure 31.3b. Numerical orbit calculations show that this mapping has
the property called locally eventually onto by Williams 1977: If $I$ is any
interval $s_0 < s < s_0 + \varepsilon$, no matter how narrow, then for some $n$, the $n$-times
iterated mapping of $I$ is all of $BB$. In other words, no matter how close
together two orbits on $L_0$ are initially, they eventually become completely
separated as time goes on. This follows, for example, from showing that if the
parameter $s$ is suitably chosen, then $\psi_0'(s) \ge \text{const.} > 1$ for all $s$.
An orbit that goes down across the branch line $BB$ to the left of center encircles
$P_1$ clockwise before returning to $BB$, and one that goes down to the right of
center encircles $P_2$ counterclockwise. The number of times an orbit encircles
one of the points $P_1$ or $P_2$ before moving to the other depends on how rapidly
it spirals outward from that point and also, in a critical way, on how far from
the center it first cut $BB$ after coming from the other side. It is the essence of
Lorenz's discovery that the successive numbers of circuits round those points
vary in a pseudorandom way so that the motion is aperiodic.
As Lorenz pointed out, the picture based on the branched surface $L_0$
cannot be precise, because two orbits that go down across the branch line $BB$
at the same point of $BB$ would then coincide subsequently, and that would
contradict the unique reversibility of the orbits. Hence, the two sheets of the
surface can't merge into a single sheet, but must remain separated by a
possibly very small distance. If we follow the orbits round again, the two
sheets become four, and so on. We conclude that the attractor must contain
infinitely many sheets, probably somehow connected together into a single
structure in $\mathbb{R}^3$, which will be called the Lorenz attractor and denoted by $L$.
However, at least for the parameter values studied by Lorenz ($\sigma = 10$,
$b = \tfrac{8}{3}$, $r = 28$), the fine structure just described is really quite fine; the
separation of the sheets is very small, so that, to something like four-decimal
accuracy, the branched surface $L_0$, with the motions on it as sketched,
describes the attractor fully.
Figure 31.4 The Lorenz graph ($Z_{n+1}$ plotted against $Z_n$).
The Lorenz Attractor; Aperiodic Motions 291
$$g(W) = \begin{cases} 2W, & \text{if } 0 \le W \le \tfrac12, \\ 2 - 2W, & \text{if } \tfrac12 \le W \le 1. \end{cases} \tag{31.11-2}$$
Lorenz considered the mapping $g$ based on this function as a sort of model for
the mapping $f$ that arose in his calculations, in order to predict qualitatively
the statistical properties of the motion. We show how to compute the
transformation (31.11-1) under certain assumptions about the function $f(Z)$.
First, we assume that a linear change of the coordinate $Z$ has been made so
that the least and greatest possible values of $Z_n$ are 0 and 1. Then the function
$f(Z)$ maps the interval $[0, 1]$ onto itself. We assume that $f(0) = f(1) = 0$.
[Actually, $f(0)$ is $\approx 0.0035$, but we shall ignore this difference along with the
fine structure of the Lorenz graph.] We also assume that $f(Z)$ is differentiable,
except at the cusp, and that $|f'(Z)|$ is greater than some constant $\alpha > 1$ for
all $Z$. Then, it can be shown (see Rüssmann and Zehnder 1980 or Richtmyer
1981) that there is a unique continuous increasing function $\zeta(W)$ that transforms
the mapping $Z \to f(Z)$ into the mapping $W \to g(W)$.
To calculate ζ(W), we denote by φ_1(Z) and φ_2(Z) the inverses of the rising
and falling parts of f(Z), as shown in Figure 31.5. Then, it is seen from
(31.11-2) that ζ(W) must satisfy the equations
\[
\zeta(W) = \varphi_1(\zeta(2W)) \quad \text{for } 0 \le W \le \tfrac12,
\]
\[
\zeta(W) = \varphi_2(\zeta(2 - 2W)) \quad \text{for } \tfrac12 \le W \le 1.
\]
From these equations, ζ(W) is calculated in succession for dyadic values of
W in the order W = 0, 1, 1/2, 1/4, 3/4, 1/8, ..., starting with ζ(0) = 0 and ζ(1) = 1,
and then for other values by the requirement of continuity. Figure 31.6a
shows the result, and Figure 31.6b shows the result of applying the
transformation ζ(W) to the Lorenz graph.
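The dyadic construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the map f used here is a hypothetical piecewise-linear stand-in with its cusp at C = 2/5 (it is not Lorenz's computed graph), and the names `zeta`, `phi1`, `phi2` and `zeta_table` are chosen for this sketch. The conjugating function is obtained at W = k/2^n by applying the two functional equations recursively, which always leads back to the starting values at W = 0 and W = 1.

```python
from fractions import Fraction

C = Fraction(2, 5)      # hypothetical cusp position of the stand-in map f
HALF = Fraction(1, 2)

def f(z):
    """Stand-in map: rises from 0 to 1 on [0, C], falls from 1 to 0 on [C, 1]."""
    return z / C if z <= C else (1 - z) / (1 - C)

def g(w):
    """Symmetric tent map of (31.11-2)."""
    return 2 * w if w <= HALF else 2 - 2 * w

def phi1(y):
    """Inverse of the rising part of f."""
    return C * y

def phi2(y):
    """Inverse of the falling part of f."""
    return 1 - (1 - C) * y

def zeta(w):
    """Conjugating function at a dyadic rational w in [0, 1]."""
    if w == 0:
        return Fraction(0)
    if w == 1:
        return Fraction(1)
    if w <= HALF:
        return phi1(zeta(2 * w))       # first functional equation
    return phi2(zeta(2 - 2 * w))       # second functional equation

def zeta_table(n):
    """Pairs (W, zeta(W)) for W = k / 2**n, k = 0, ..., 2**n."""
    return [(Fraction(k, 2**n), zeta(Fraction(k, 2**n))) for k in range(2**n + 1)]
```

For this stand-in, one can check on the dyadic grid that the computed values are strictly increasing and that f(zeta(W)) equals zeta(g(W)), which is the conjugacy property the construction is meant to deliver. Values between grid points would follow by continuity, for instance by refining n.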
Figure 31.5 [graphs of φ_1 and φ_2 against Z]
Figure 31.6a [graph of ζ(W) against W]
Figure 31.6b The Lorenz graph in W [W_{n+1} plotted against W_n]
Statistics of the Mappings f and g 293
The function ζ(W) is Hölder continuous with Hölder exponent log_2 α,
where α is the greatest lower bound of |f'(Z)|, as above, but it is not absolutely
continuous; hence what is true of the mapping g: W → g(W) for almost all
W is not necessarily true of the mapping f: Z → f(Z) for almost all Z.
A consequence of the continuity of ζ(W) (which does not require absolute
continuity) is that the mapping f is locally eventually onto, in the terminology
of Williams, referred to in Section 31.10; if I is any interval a < Z < a + ε,
then for some finite number n of iterations, f^n(I) is the entire interval [0, 1].
Clearly f has that property if and only if g also has it, but for g it is nearly
obvious. Namely, the length of any interval I in W is doubled under g unless
g(I) contains the point W = 1, in which case the length is at least not
decreased. Hence, under each pair of iterations, the length is at least doubled,
unless, for some I, g(I) and g(g(I)) both contain the point W = 1, but in that
case g(g(I)) contains all of [1/2, 1], so that g(g(g(I))) is [0, 1].
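The interval argument for g can be followed concretely with a short sketch. The helper names `g_image` and `iterations_to_cover` are hypothetical and introduced only for this illustration; exact rational arithmetic is used so that the test for having reached [0, 1] is not affected by rounding.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def g_image(a, b):
    """Image under the tent map g of the closed interval [a, b] in [0, 1]."""
    if b <= HALF:                       # entirely on the rising branch
        return 2 * a, 2 * b
    if a >= HALF:                       # entirely on the falling branch
        return 2 - 2 * b, 2 - 2 * a
    # the interval straddles the cusp W = 1/2, so its image ends at W = 1
    return min(2 * a, 2 - 2 * b), Fraction(1)

def iterations_to_cover(a, b, max_iter=1000):
    """Number of applications of g needed before [a, b] is mapped onto [0, 1]."""
    a, b = Fraction(a), Fraction(b)
    for n in range(max_iter + 1):
        if (a, b) == (0, 1):
            return n
        a, b = g_image(a, b)
    raise RuntimeError("interval did not cover [0, 1] within max_iter steps")
```

For example, the interval [0.3, 0.301] of length 0.001 is spread over all of [0, 1] after a modest number of iterations, as `iterations_to_cover(Fraction(3, 10), Fraction(301, 1000))` shows.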
It follows that iteration of the mapping f is unstable in the sense of
Lyapounov, for no matter how close together two points are initially, they
will eventually be separated by a finite amount (for example at least 1).
Hence, if we can neglect the fine structure of the Lorenz graph, the motion on
the Lorenz attractor is also Lyapounov unstable, and hence has a purely
continuous power spectrum.
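(31.11-2), we represent W in binary form as W = .a_0 a_1 a_2 ..., where each
a_i is 0 or 1. Then g(W) is simply
\[
g(W) = \begin{cases} .a_1 a_2 a_3 \ldots, & \text{if } a_0 = 0, \\ .\bar a_1 \bar a_2 \bar a_3 \ldots, & \text{if } a_0 = 1, \end{cases}
\]
where \bar a_i = 1 - a_i is the complemented binary digit.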
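The action of g on binary digits (drop the leading digit, and complement the remaining digits when the leading digit is 1) can be checked numerically. The sketch below works with finite digit lists, which is an assumption of the illustration: a finite list stands for an expansion ending in zeros, so after complementing, the truncated result can fall short of the exact value by at most one unit in the last retained place. The names `value` and `g_bits` are hypothetical.

```python
from fractions import Fraction

def g(w):
    """Tent map of (31.11-2)."""
    return 2 * w if w <= Fraction(1, 2) else 2 - 2 * w

def value(bits):
    """Number .b0 b1 b2 ... represented by a finite list of binary digits."""
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))

def g_bits(bits):
    """Tent map on digits: drop a0, and complement the rest if a0 = 1."""
    head, tail = bits[0], bits[1:]
    return tail if head == 0 else [1 - b for b in tail]
```

With 8-digit inputs, `value(g_bits(bits))` agrees with `g(value(bits))` exactly when the leading digit is 0 and to within 2^-7 when it is 1.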
According to Section 31.10, the attractor L contains infinitely many sheets (in
fact, uncountably many), all lying close to the idealized branched surface L_0,
and somehow connected together to form a many-sheeted structure in ℝ^3.
The structure of L was investigated by R. F. Williams 1977, and we shall
describe his main results in somewhat intuitive geometrical terms, ignoring
certain topological difficulties such as arise from the presence of infinitely
many vertices of the cell complex F in a bounded region of the strip S (see
below).
The attractor is of course completely determined by the differential
equations (31.9-1). At present, however, there is no known method for
determining it precisely, starting from those equations, or even for determin-
ing its general topological properties. Furthermore, the equations are
somewhat specialized and artificial; hence there is more interest in the general
kinds of attractor that can result from equations generally similar to (31.9-1).
The objective of the work of Williams was to find all attractors that are
related to a branched manifold L_0 of the kind described above in the way
the attractor L appears to be. He calls them Lorenz attractors generally.
Williams used an abstract topological construction known as the inverse
limit. The topological properties of the resulting attractor and the flow on it
are completely determined by the given semiflow on L_0, and in fact by the
so-called kneading sequences of the orbits W_l^u and W_r^u. Williams showed
that there are uncountably many different such attractors, i.e., topologically
different ones.
The attractors found by Williams can be embedded in ℝ^3, but the question
which of them would result from a given differential equation system of the
type (31.9-1) remains open.
To investigate the structure, following Williams, we consider a piece of the
surface and then continue it by following the orbits in it both forward and
backward in time. If an orbit encircles P_1 it moves to a sheet of L behind the
one it was on before (from the point of view of an observer looking at Figure
31.3), whereas, if it encircles P_2, it moves to a sheet in front. If it encircles first
one, then the other, we don't know whether it is then in front of the starting
point or behind; it might even return to its starting point, and then we
should have a periodic orbit. One of Williams's results is that the periodic
orbits are dense in L.
The unstable manifold W^u(O) plays a role. As stated in Section 31.10, it
consists of two orbits that start horizontally in two opposed directions from
O, form the boundary of L_0, and then continue into the interior. Conceivably
either or both of them might eventually hit the exact center of BB and then
come to rest asymptotically at O. We shall ignore that possibility as being
unlikely and assume that they continue winding indefinitely round in L_0.
Following Williams, we denote by W_l^u(O) the orbit that leaves O to the right,
hence first meets BB at its left end, and by W_r^u(O) the orbit that leaves O to the
left, hence first meets BB at its right end. We define so-called kneading
sequences for these orbits as z_1 z_2 z_3 ..., where z_k is 1 or 2 according as the
kth circuit of the orbit is around P_1 or P_2. Below, it will be assumed that these
sequences start as 2111... and 1222..., respectively, for W_l^u(O) and W_r^u(O).
It will be seen that they completely determine the topology of the attractor L.
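A sequence of circuits of the kind used in the kneading sequences can be read off a numerically integrated orbit. The sketch below is a rough illustration, not the procedure of Williams or Lorenz. It assumes that (31.9-1) is the standard Lorenz system x' = σ(y − x), y' = rx − y − xz, z' = xy − bz with σ = 10, b = 8/3, r = 28; it assumes that each circuit produces one local maximum of z; and it assumes, as a labeling convention only, that a circuit with x < 0 at that maximum is around P_1 and one with x > 0 is around P_2. The function names are hypothetical.

```python
def lorenz_rhs(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the (assumed) standard Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def nudge(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(nudge(k1, dt / 2))
    k3 = lorenz_rhs(nudge(k2, dt / 2))
    k4 = lorenz_rhs(nudge(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * p + 2 * q + d)
                 for s, a, p, q, d in zip(state, k1, k2, k3, k4))

def circuit_sequence(start, t_end=50.0, dt=0.01):
    """Label each local maximum of z by 1 (x < 0) or 2 (x > 0)."""
    prev, cur = start, rk4_step(start, dt)
    seq = []
    for _ in range(int(t_end / dt)):
        nxt = rk4_step(cur, dt)
        if prev[2] < cur[2] >= nxt[2]:          # z has a local maximum at cur
            seq.append(1 if cur[0] < 0 else 2)
        prev, cur = cur, nxt
    return seq
```

To approximate one of the two orbits of W^u(O), one could start from a point displaced very slightly from the origin along the unstable direction; because of the instability of the motion, only the first several symbols obtained in this way should be trusted.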
In order to carry out the program, we have to make the following
assumptions. [That is, we have to conjecture that these things are consequences
of the differential equations (31.9-1); numerical calculations give considerable
support for most of them.]
1. The branched surface L_0 and the semiflow on it are as described in
Section 31.10. [We say "semiflow," not "flow," because an orbit x(t) in L_0,
in contrast with those in L, cannot be uniquely determined, for t < 0, from
x(0), owing to the branching.]
2. The Poincaré map ψ_0 of BB is locally eventually onto.
3. There is a continuous mapping p (projection) of L onto L_0 which
carries orbits of L onto orbits of L_0; if x(t) is a motion in L, then p(x(t)) is a
motion in L_0.
4. Every motion in L_0 is the image under p of a unique motion in L.
(It would seem to be difficult to get supporting evidence for this assumption
from the numerical work.) Furthermore, the motion in L depends
continuously on the motion in L_0; if orbits x_0(t) and x_0'(t) are close together in
L_0 for a long interval (−T, T), then they are the projections of orbits x(t)
and x'(t) that are close together in space (i.e., in L) for a long interval.
5. The kneading sequences of W_l^u(O) and W_r^u(O) start as 2111... and
1222..., respectively. That is, W_l^u(O), after initially encircling P_2, then
encircles P_1 at least three times (in Lorenz's system it does so 25 times) before
again encircling P_2, and similarly for W_r^u(O).
Figure 31.8 A surface element in the Lorenz attractor.
meets S at the point l_1; hence the final curve C' of the surface element Σ
connects l_1 to r_2 in S, a little in front of C (toward the viewer).
Before we can carry the points of C' round again, we must note that since
l_1 and r_2 are on opposite sides of O by assumption 5 above, we must divide
C' into two 1-cells, say C'_1 and C'_2, one connecting l_1 to O and the second
connecting O to r_2. Then, C'_1 will be carried clockwise round P_1 and C'_2
counterclockwise round P_2, thus creating two new surface elements within
L, say Σ'_1 and Σ'_2. This process can be continued indefinitely.
Generally, given two points l_i and r_j of the intersection of W_l^u(O) and
W_r^u(O), respectively, with S, then, if they lie on the same side of O, there may
(or may not) be a 1-cell of F connecting l_i to r_j (i.e., a curve in S connecting
l_i to r_j and consisting of points of L; if there is one, there are infinitely many).
If that curve is carried along by the flow, as described above, it sweeps out
a surface element Σ of L.
The origin O is also denoted by l_0 and r_0. We denote it by r_0 if it is then to
be carried along W_r^u(O) to r_1 and by l_0 if it is then to be carried along W_l^u(O)
to l_1. The general surface element Σ of L is then obtained by carrying a 1-cell
connecting l_i to r_j in S (where now we permit i = 0 or j = 0) round P_1
or round P_2 to a curve connecting l_{i+1} to r_{j+1}. (Then, if l_{i+1} and r_{j+1} lie on
opposite sides of O, it is necessary to divide the new curve into two 1-cells,
as before.)
For given i and j, if l_i and r_j are connected by a 1-cell of F (hence, by infinitely
many, as we shall see), we denote that fact by saying that a symbol [i, j] is
defined or exists. We now state the rules for the existence of these symbols,
The rules are so chosen that [i, j] precedes [i', j'] if and only if there is a
surface element Σ whose initial curve connects l_i to r_j and whose final curve
connects l_{i'} to r_{j'}. [This describes only a part of the final curve in case 2(a).]
The following are immediate consequences of the rules (some of them
require an induction in the proof):
1. If [i, j] is a symbol, then i ≠ j.
2. Each symbol has either one or two successors, depending on whether
case (a) or case (b) holds.
3. A symbol [i, j] with i and j both > 0 has exactly one predecessor,
namely [i − 1, j − 1].
4. If [i, j] is a symbol, then either l_i < r_j ≤ 0 or 0 ≤ l_i < r_j, where <
denotes the ordering obtained by projecting the strip S onto the branch line
BB of L_0.
5. Since, by assumption 5 of the preceding section, the orbit W_l^u(O), after
coming to l_1, goes at least twice more round the fixed point P_1 before crossing
over the center into the right half of the strip S, we see that the first few
symbols, starting with [1, 0], are

              [1, 0]
             /      \
        [2, 0]      [0, 1]
        /    \      /    \
   [3, 0]  [0, 1] [1, 0]  [0, 2].

In particular [0, 1] has at least two different predecessors, [1, 0] and [2, 0].
6. Each symbol is an ultimate predecessor of any other, in the sense that
if σ = [i, j] and τ = [i', j'] are given, then there is a finite sequence σ_0 = σ,
σ_1, σ_2, ..., σ_n = τ such that σ_k always precedes σ_{k+1}. That is clear if σ_0 = [1, 0]
(or [0, 1]), since, by the rules, all other symbols follow from [1, 0] or [0, 1],
and we have just seen that [1, 0] and [0, 1] follow from each other. On the
other hand, if τ = [1, 0] (or [0, 1]), and σ is arbitrary, the sequence can be
found by appeal to what is essentially the locally eventually onto property
of the Poincaré mapping ψ_0(s) of the branch line BB, which says roughly
that the sequence can be so chosen that the width of the interval (l_i, r_j)
constantly increases. Williams proved that [1, 0] (hence also [0, 1]) can
be reached from arbitrary σ in a finite number of steps if the derivative
ψ_0'(s) is > √2 for all s. In the case studied by Lorenz, the minimum of ψ_0'(s)
is more like 1.05; however, we can replace the arclength s by a new parameter
on BB, by the method discussed in Section 31.11, so as to convert the graph
of ψ_0(s) into two straight lines, and then ψ_0'(s) is indeed > √2 (in fact about
1.9).
7. For each i > 0 (and each j > 0) there is at least one j (one i) such that
[i, j] is a symbol.
31.15 Prehistories
eventually onto character of the Poincaré map in BB. If we follow two orbits
forward in time, then no matter how close together they are at t = 0, they
will eventually separate so as to lie in different surface elements Σ at some
time t > 0.
We show first that if [i, j] is a symbol, there are uncountably many 1-cells
in F connecting l_i to r_j. According to the preceding section, any such 1-cell
corresponds to a unique sequence of symbols
(31.15-1)
According to (7) of Section 31.14, each point l_i and each point r_j is the
terminus of uncountably many 1-cells in F. If we move those 1-cells along
by the flow, we see that the unstable manifold W^u(O) is, throughout its entire
length, the spine of a Cantor book, in the terminology of Williams (see Figure
31.9).
A sequence (31.15-1), if periodic, corresponds to a periodic orbit on L.
Given any sequence (31.15-1), we can clearly find a periodic sequence that
agrees with the given one for say −K < k < K, where K is large. Hence,
given any orbit, we can find a periodic orbit arbitrarily close to it; the periodic
orbits are dense in L. (In a physical realization or numerical simulation, of
course, the notion of a strictly periodic orbit is an empty concept, owing to
the finite accuracy and the Lyapounov instability.)
contact with the fixed points P_1 and P_2 only at r = r_0. For r only slightly
above r_0, an orbit emerging from P_1 or P_2 is immediately an orbit on L.
In this sense the bifurcation at r_0 results in an abrupt transition from a
stationary orbit at P_1 or P_2 to motion on the attractor L.
for large n. As n → ∞, at least in the cases studied, the power spectrum of the
motion approaches a continuous spectrum with certain universal features.
At μ = μ_∞, the motion is presumably aperiodic on a strange attractor.
There is evidence (Lorenz 1981) for an example of this behavior in the
Lorenz system at considerably higher values of the dimensionless parameter
r than values studied by Lorenz. Namely, the strange attractor that appears
at r = 24.74 persists up to a value r = r* (≈ 250). For r considerably greater
than r*, there is a periodic orbit, and as r is decreased toward r*, there is a
sequence of doublings at values r_n of r that converge to r* from above, with
\[
\frac{r_{n+1} - r_n}{r_n - r_{n-1}} \approx 0.214.
\]
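If successive doubling values really do converge geometrically with the ratio quoted above, two successive values are enough to estimate the accumulation point r*. The sketch below illustrates that arithmetic only. The sequence `r_values` is synthetic, generated from the hypothetical constants `R_STAR` and `C0`, and is not data from Lorenz's calculations; the function names are likewise introduced here.

```python
Q = 0.214  # ratio of successive differences quoted in the text

def successive_ratios(r):
    """Ratios (r[n+1] - r[n]) / (r[n] - r[n-1]) for a list of doubling values."""
    return [(r[n + 1] - r[n]) / (r[n] - r[n - 1]) for n in range(1, len(r) - 1)]

def extrapolate_limit(r_prev, r_next, q=Q):
    """Limit of a geometrically converging sequence, from two successive terms."""
    # the remaining differences form a geometric series with ratio q
    return r_next + (r_next - r_prev) * q / (1.0 - q)

# Synthetic illustration: values approaching R_STAR from above with ratio Q.
R_STAR, C0 = 250.0, 10.0          # hypothetical constants for this sketch
r_values = [R_STAR + C0 * Q ** n for n in range(6)]
```

Applied to the synthetic list, `successive_ratios(r_values)` returns values equal to Q up to rounding, and `extrapolate_limit(r_values[-2], r_values[-1])` recovers R_STAR. With real doubling values the ratio is only approached asymptotically, so the estimate improves as later terms are used.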
even Lipschitz continuity at a single point; see Boas 1960, also Exercise 1
below.
The next two examples are part of a more complete version of Peixoto's
theorem than the one given above. As stated, it is generic for a vector field
on a compact 2-manifold to have a finite number of fixed points and periodic
orbits. In a coordinate system with a fixed point at the origin, the vector field
is
F(x) = Ax + ⋯,
where A is a 2 x 2 matrix. The theorem goes on to say that generically each
such fixed point is hyperbolic, which means that no eigenvalue of A is pure
imaginary; hence there are three possibilities: If both eigenvalues have
negative real parts, the fixed point is attracting; if both have positive real
parts, it is repelling; and if one has a positive real part and the other a negative
one, it is a saddle point.
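The three-way classification of a hyperbolic fixed point can be written out directly from the trace and determinant of A. This is a minimal sketch; the function name `classify_fixed_point` and the tolerance `tol` are assumptions of the illustration, and an eigenvalue whose real part vanishes (including a zero eigenvalue) is reported as nonhyperbolic.

```python
import cmath

def classify_fixed_point(a11, a12, a21, a22, tol=1e-12):
    """Classify the fixed point of x' = Ax + ... from the 2 x 2 matrix A."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # eigenvalues are the roots of lambda^2 - trace * lambda + det = 0
    root = cmath.sqrt(trace * trace - 4.0 * det)
    real_parts = [((trace + root) / 2).real, ((trace - root) / 2).real]
    if any(abs(re) <= tol for re in real_parts):
        return "nonhyperbolic"
    if all(re < 0 for re in real_parts):
        return "attracting"
    if all(re > 0 for re in real_parts):
        return "repelling"
    return "saddle"
```

For instance, the matrix with rows (−1, 5) and (−5, −1) has eigenvalues −1 ± 5i and is classified as attracting, while the rotation generator with rows (0, 1) and (−1, 0) has eigenvalues ±i and is nonhyperbolic.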
A further conclusion of Peixoto's theorem is that the existence of an orbit
going from one saddle point to another is nongeneric; if there is such an
orbit, for some vector field, the smallest perturbation can make it miss the
second saddle point.
A conjecture that has not been proved, or even fully formulated, was made
in Section 29.6. As was stated there, the existing completeness theorems on
systems of eigenfunctions for hydrodynamic problems require that the
generalized eigenfunctions, if any, be included; see equation (29.6-5). It is
conjectured that in some suitable space of hydrodynamic systems, the
existence of generalized eigenfunctions is nongeneric. An eigenvalue λ is said to
have index 1 if there are no corresponding generalized eigenfunctions. It
seems likely to be strongly generic for any eigenvalue λ_k to have index 1, but
then only generic for them all to have index 1, since the intersection of a
countable collection of dense open sets is not necessarily open, but is in any
case a Baire set.
EXERCISE
Show first that E_n is a closed subset of C[0, 1] by showing that if f_k(x) → f(x)
uniformly, as k → ∞, and if each f_k satisfies (31.F-1), then so does f. Show next that the
complement of E_n is dense; show in fact that if f is any function in C[0, 1], it can be made
differentiable by a small perturbation, using a mollifier (see Volume I), then can be made
to violate (31.F-1) by adding to it, if necessary, a further small perturbation having a
sufficiently large derivative at x = x_0. Hence each E_n is a nowhere dense set, hence
⋃_{n=1}^∞ E_n is a meager set, and hence it is nongeneric for a function in C[0, 1] to be
Lipschitz continuous on the right at any point.
EXERCISES
1. Let X be the unit interval on the real line, as a metric space with the distance
d(x, y) = |x − y|. Let n be a positive integer. With each rational number p/q, where
0 < p < q, associate the interval
\[
\frac{p}{q} - \frac{1}{q\,nq^2} < x < \frac{p}{q} + \frac{1}{q\,nq^2},
\]
and let S_n be the union of those intervals. Show that S_n is open and dense in X and has
Lebesgue measure ≤ 4/n. Conclude that a Baire set in X can have zero Lebesgue measure,
and a meager set can have measure 1.
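The measure bound in Exercise 1 can be made plausible numerically. The sketch below assumes that the interval attached to p/q has half-width 1/(n q^3), which is one reading of the displayed inequality, and it sums the lengths of all intervals with denominator up to a cutoff. Since the intervals overlap, this sum can only overestimate the measure of their union. The name `total_length` is hypothetical.

```python
from fractions import Fraction

def total_length(n, q_max):
    """Sum of the lengths 2/(n q^3) over all p/q with 0 < p < q <= q_max."""
    # for each denominator q there are q - 1 admissible numerators p
    return sum((q - 1) * Fraction(2, n * q ** 3) for q in range(2, q_max + 1))
```

Each term is smaller than 2/(n q^2), so the partial sums stay below (2/n)(π²/6 − 1) however large the cutoff is taken, comfortably inside the bound 4/n asked for in the exercise.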
2. Let S be the intersection of the sets S_n. According to Exercise 1, S is a Baire set.
Show that it is uncountable by considering numbers x in [0, 1] having a binary
representation
x = .a_1 0 a_2 00 ... 0 a_3 00 ... 0 a_4 00 ...,
where each a_l is 0 or 1. Show that if the number ν_l of zeros between a_l and a_{l+1} increases
rapidly enough with l, then x is in all the sets S_n. Show that the set of all such x is
uncountable.
EXERCISES
1. Show that differentiation is nongeneric also in the Hilbert space ℌ = L^2[0, 1],
by filling in the steps of the following outline of a proof: First define a linear manifold
in ℌ by
(31.H-1)
where the derivative is meant in the distribution sense (see Chapter 5 in Volume I).
For any ψ in ℌ, write
00
(31.H-3)
(31.H-4)
for each M = 1, 2, .... If it can be shown that each 𝒟_M is nowhere dense, i.e., that its
complement C𝒟_M is dense and open, it will follow that 𝒟 = ⋃_{M=1}^∞ 𝒟_M is a meager set
in ℌ. Show first that C𝒟_M is dense by showing that any ψ ∈ L^2 that does not already
violate the condition ‖ψ'‖ ≤ M can be made to do so by adding to it an arbitrarily small
function with a large derivative. Then show that C𝒟_M is open as follows: Consider any
ψ in C𝒟_M, that is, any ψ such that either ‖ψ'‖ = ∞ or ‖ψ'‖ = M + b for some b > 0.
It must be shown that there is a neighborhood of that ψ that is contained in C𝒟_M.
Choose K such that
and show that if Xis any function in L2 such that I xii is < b/4K, then IW + x'il > M + ib,
so that C'!lM is open.
2. Show that the same is true in the real Hilbert space L 2 [0, 1] by using the same
argument as above but assuming throughout that ~-k = ~.
~o = XO,
j:
c.,k -
_ Xk +
fiiX- k (k
= 1, 2, ...), (31.H-5)
\[
A\psi(x) = \sum_{k=-\infty}^{\infty} a_k x_k \varphi_k,
\]
where ψ(x) is given by (31.H-7), and where the a_k are positive and
Σ (1/a_k) < ∞, so that B = A^{-1} is compact. We consider the cylinder sets
\[
Z_{M,K} = \Bigl\{ \psi \in L^2 : 4\pi^2 \sum_{k=-K}^{K} k^2 |x_k|^2 \le M^2 \Bigr\},
\]
so that
(31.H-10)
Appendix to Chapter 31-Generic Properties of Systems 311
(31.H-12)
EXERCISES
M~
P(ZM K) < ~ .
, y2n 3 K
4. Now take a_k = |k|^r, where r > 1 (to make Σ (1/a_k) converge). Show that if
1 < r < 2, then P(Z_{M,K}) → 0 as K → ∞, so that from (31.H-10), P(𝒟_M) = 0, and hence
by countable additivity,
Abraham, R., and Robbin, J. (1967): Transversal Mappings and Flows. W. A. Benjamin,
New York.
Adler, R., Bazin, M., and Schiffer, M. (1965): Introduction to General Relativity.
McGraw-Hill, New York.
Barut, A. O., and Rączka, R. (1977): The Theory of Group Representations and
Applications. PWN Polish Scientific Publishers, Warsaw.
Behrends, R. E., Dreitlein, J., Fronsdal, C., and Lee, W. (1962): Simple groups and
strong interaction symmetries. Rev. Mod. Phys., vol. 34, pp. 1-40.
Bôcher, M. (1922): Introduction to Higher Algebra. The MacMillan Co., New York.
Boerner, H. (1955): Darstellungen von Gruppen. Springer-Verlag, Berlin, Heidelberg,
New York.
Chevalley, C. (1946): Theory of Lie Groups I. Princeton Univ. Press, Princeton.
Curry, J. H. (1978): A generalized Lorenz system. Comm. Math. Phys., vol. 60, pp.
193-204.
Davey, A. (1962): The growth of Taylor vortices in flow between rotating cylinders. J.
Fluid Mech., vol. 14, pp. 336-368.
Davey, A., DiPrima, R. C., and Stuart, J. T. (1968): On the instability of Taylor vortices.
J. Fluid Mech., vol. 31, pp. 17-52.
Dirac, P. A. M. (1928): The quantum theory of the electron. Proc. Roy. Soc. A, vol.
117, pp. 610-624.
Dirac, P. A. M. (1958): The Principles of Quantum Mechanics. Clarendon Press, Oxford.
Eagles, P. M. (1971): On stability of Taylor vortices by fifth-order amplitude expansion.
J. Fluid Mech., vol. 49, pp. 529-550.
Eisenhart, L. P. (1926): Riemannian Geometry. Princeton Univ. Press, Princeton.
Eisenhart, L. P. (1933): Continuous Groups of Transformations. Princeton Univ. Press,
Princeton.
Eisenhart, L. P. (1934): Separable systems of Staeckel. Ann. Math., vol. 35, pp. 284 ff.
Ellis, H. G. (1972): Ether flow through a drainhole: a particle model in general relativity.
J. Math. Phys., vol. 14, pp. 104-118.
Feigenbaum, M. (1980): Universal behavior in nonlinear systems. Los Alamos Science,
vol. 1, pp. 4-27.
Finkelstein, D. (1958): Past-future asymmetry of the gravitational field of a point
particle. Phys. Rev., vol. 110, pp. 965-967.
Miller, W. (1973): Symmetry Groups and Their Applications. Academic Press, New York.
Montgomery, D., and Zippin, L. (1952): see next item.
Montgomery, D., and Zippin, L. (1955): Topological Transformation Groups. Inter-
science Publishers, New York.
Morse, P. M., and Feshbach, H. (1953): Methods of Theoretical Physics I, II. McGraw-
Hill, New York.
Moser, J. (1973): Stable and Random Motions in Dynamical Systems. Princeton Univ.
Press, Princeton.
Nachbin, L. (1965): The Haar Integral. Van Nostrand, Princeton.
Naimark, M. A. (1976): The Theory of Group Representations. Nauka, Moscow.
Naimark, M. A. (1959): On some cases of periodic motions depending on parameters.
Dokl. Akad. Nauk SSSR, vol. 129, pp. 736-739.
Pauli, W. (1927): Zur Quantenmechanik des magnetischen Elektrons. Zeits. f. Physik,
vol. 43, pp. 601-623.
Peixoto, M. (1962): Structural stability on two-dimensional manifolds. Topology,
vol. 1, pp. 101-120.
Peter, F., and Weyl, H. (1927): Die Vollständigkeit der primitiven Darstellungen einer
geschlossenen kontinuierlichen Gruppe. Math. Ann., vol. 97, pp. 737-755.
Rédei, L. (1959): Algebra. Akademische Verlagsgesellschaft, Leipzig.
Richtmyer, R. D., and Morton, K. W. (1967): Difference Methods for Initial-Value
Problems. Wiley-Interscience, New York.
Richtmyer, R. D. (1981): A study of the Lorenz attractor. Advances in Mathematics,
to appear.
Riesz, F., and Sz. Nagy, B. (1953): Leçons d'Analyse Fonctionnelle. Akadémiai Kiadó,
Budapest.
Robertson, H. P. (1927): Bemerkung über separierbare Systeme in der Wellenmechanik.
Math. Annalen, vol. 98, pp. 749 ff.
Ruelle, D., and Takens, F. (1971): On the nature of turbulence. Comm. Math. Phys.,
vol. 20, pp. 167-192.
Rüssmann, H., and Zehnder, E. (1980): On a normal form of symmetric maps of [0, 1].
Comm. Math. Phys., vol. 72, pp. 49-53.
Sacker, R. (1964): (a bifurcation theorem) Thesis (unpublished), New York University.
Schiff, L. (1955): Quantum Mechanics. McGraw-Hill, New York.
Schur, I. (1905): Neue Begründung der Theorie der Gruppencharaktere. Sitzungsber.
preuss. Akad. Wiss., vol. 1905, pp. 406-432.
Sell, G. R. (1971): Topological Dynamics and Ordinary Differential Equations. Van
Nostrand-Reinhold, London.
Siegel, C. L. (1956): Vorlesungen über Himmelsmechanik. Springer-Verlag, Berlin,
Heidelberg, New York.
Siegel, C. L. and Moser, J. (1971): Lectures on Celestial Mechanics. Springer-Verlag,
Berlin, Heidelberg, New York.
Smale, S. (1967): Differentiable dynamical systems. Bull. AMS, vol. 73, pp. 747-817.
Snyder, H. A. (1970): Waveforms in rotating Couette flow. Intl. Jour. Nonlinear Mech.,
vol. 5, pp. 659-685.
Texts and Monographs in Physics includes books from any field of physics that might be used as
basic texts for advanced training and higher education in physics, especially for lectures and seminars
at the graduate level.