
Basics of Differential Geometry

& Group Theory

Francesco Benini

SISSA – PhD course Fall 2019

Contents
1 Introduction
2 Topology
3 Differentiable manifolds
4 Lie groups and Lie algebras
5 Riemannian geometry
6 Fibre bundles
7 Connections on fibre bundles
8 Lie algebras
9 Low-rank examples
10 Highest-weight representations
11 Real forms and compact Lie groups
12 Subalgebras

Last update: October 25, 2019

1 Introduction
This course is divided into two parts. The first part is about differential geometry and fibre
bundles. The material is very standard, and is mainly taken from the book of M. Nakahara.
The second part is about Lie algebras, Lie groups and their representations. A good concise
exposition is in Chapter 13 of the book of P. Di Francesco. Another good and simple reference
is the book of R. N. Cahn. More technical details and proofs can be found in the lecture
notes by V. Kac. For spinors in various dimensions, a good reference is Appendix B of the
book of J. Polchinski, Volume 2.
We will first introduce Lie groups in terms of their action on differentiable manifolds,
which is a concrete way they appear in physics, and then move on to study their formal
properties.

Suggested readings:

• M. Nakahara, “Geometry, Topology and Physics,” Taylor & Francis (2003).

• C. J. Isham, “Modern Differential Geometry for Physicists,” World Scientific (1999).

• C. Nash, S. Sen, “Topology and Geometry for Physicists,” Elsevier (1988).

• P. Di Francesco, P. Mathieu, D. Senechal, “Conformal Field Theory,” Springer (1996). Chapter 13.

• V. Kac, “Introduction to Lie Algebras,” lecture notes (2010), available online: http://math.mit.edu/classes/18.745/index

• R. N. Cahn, “Semi-Simple Lie Algebras and Their Representations,” Benjamin/Cummings Publishing Company (1984). (Available online)

• J. Polchinski, “String Theory Volume II,” Cambridge University Press (1998). Appendix B.

2 Topology
One of the simplest structures we can define on a set X is a topology.
Definition 2.1. Let X be a set and I = {Ui | i ∈ I, Ui ⊂ X} a certain collection of subsets
of X, called open sets. Then (X, I) is called a topological space if I satisfies the following
properties:
(i) ∅, X ∈ I.
(ii) If {Uj | j ∈ J} is any (maybe infinite) subcollection in I, then ⋃_{j∈J} Uj ∈ I.
(iii) If {Uk | k ∈ K} is any finite subcollection in I, then ⋂_{k∈K} Uk ∈ I.
The collection I defines a topology on X.
Example 2.2.
(a) If X is a set and I is the collection of all subsets in X, then (i)–(iii) are satisfied. This
topology is called the discrete topology.1
(b) If X is a set and I = {∅, X}, then (i)–(iii) are satisfied. This topology is called the
trivial topology.2
(c) Let X be the real line R, and I the set of all open intervals (a, b) and their unions,
where a can be −∞ and b can be +∞. This topology is called the usual topology.
The usual topology on Rn is constructed in the same way from products of open sets
(a1 , b1 ) × . . . × (an , bn ).
Let (X, I) be a topological space and A a subset of X. Then I induces the relative topology on A given by I′ = {Ui ∩ A | Ui ∈ I}.

A topology on sets is useful because it allows us to define continuity.


Definition 2.3. Let X and Y be topological spaces. A function f : X → Y is continuous
if the inverse image of an open set in Y is an open set in X.
The inverse image of a subset C ⊂ Y is given by
f −1 (C) = {x ∈ X | f (x) ∈ C} .
¶ Exercise 1. Show that a discontinuous function, for instance

f(x) = x for x ≤ 0 ,    f(x) = x + 1 for x > 0 ,

fails to be continuous by that definition. On the other hand, show that a continuous function can send an open set to a set which is not open, for instance f(x) = x^2.
¹ In this topology each point is an open set, and is thus its own neighbourhood. Hence points are disconnected.
² In this topology the only neighbourhood of a point is the whole X, therefore all points are “infinitely close” to each other. If X contains more than one point, (X, I) is not Hausdorff (see footnote 3).
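The preimage criterion of Definition 2.3 can be tested numerically on the step function of Exercise 1. The sketch below is not part of the notes; it only assumes a standard Python interpreter. The preimage of the open interval (−1/2, 1/2) is (−1/2, 0], which contains x = 0 but no point x = ε > 0 close to it, hence is not open.

```python
def f(x):
    # the step function from Exercise 1
    return x if x <= 0 else x + 1

# x = 0 lies in the preimage of (-1/2, 1/2) ...
assert -0.5 < f(0) < 0.5
# ... but arbitrarily close points x = eps > 0 do not, so the preimage is not open
for eps in [1e-1, 1e-3, 1e-6]:
    assert not (-0.5 < f(eps) < 0.5)

# Conversely, the continuous g(x) = x^2 maps the open set (-1, 1) to [0, 1),
# which is not open: the image attains its minimum 0 at an interior point.
g = lambda x: x ** 2
assert min(g(k / 100) for k in range(-99, 100)) == 0.0
```

This is exactly the asymmetry in the definition: continuity constrains preimages of open sets, not images.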

A special class of topological spaces is given by metric spaces on which there is an extra
structure—a metric—which induces a topology.

Definition 2.4. A metric d : X × X → R is a function that satisfies the conditions:


(i) It is symmetric, i.e. d(x, y) = d(y, x)
(ii) d(x, y) ≥ 0 and the equality holds if and only if x = y
(iii) d(x, y) + d(y, z) ≥ d(x, z).
If X is endowed with a metric d, there is an induced metric topology on X in which the open sets are the “open discs”

U_ε(x) = {y ∈ X | d(x, y) < ε} ,    ε > 0 ,

and all their unions. The topological space (X, I) is called a metric space.³
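The three axioms of Definition 2.4 can be spot-checked numerically for the Euclidean metric on R^3, which induces the usual topology. A quick sketch, not part of the notes, assuming Python's standard library:

```python
import math
import random

def d(x, y):
    # Euclidean metric on R^3
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

random.seed(0)
pts = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(30)]
for x in pts:
    for y in pts:
        assert abs(d(x, y) - d(y, x)) < 1e-12                # (i) symmetry
        assert d(x, y) >= 0 and (d(x, y) == 0) == (x == y)   # (ii) positivity
        for z in pts:
            assert d(x, y) + d(y, z) >= d(x, z) - 1e-12      # (iii) triangle inequality
```

The triangle inequality (iii) is what makes the open discs U_ε(x) a basis of a topology: any point of a disc sits inside a smaller disc contained in it.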

A subset C ⊂ X is closed if its complement in X is open, i.e. if X \ C ∈ I. From Definition 2.1, ∅ and X are both open and closed. Moreover the intersection of any (possibly infinite) collection of closed sets is closed, and the union of a finite number of closed sets is closed.
The closure of A is the smallest closed set that contains A, and is denoted Ā. The interior of A is the largest open subset of A, and is denoted A°. Notice that A is closed if and only if Ā = A, and is open if and only if A° = A. The boundary of A is the complement of A° in Ā, and is denoted ∂A = Ā \ A°.
A family {Ai} of subsets of X is called a covering of X if

⋃_{i∈I} Ai = X .

If all Ai are open, the covering is called an open covering.

Definition 2.5. A set X is compact if, for every open covering {Ui | i ∈ I}, there exists a
finite subset J of I such that {Uj | j ∈ J} is also a covering of X.

Theorem 2.6. A subset A of Rn is compact (with respect to the usual topology) if and only
if it is closed and bounded.

Definition 2.7.
(a) A topological space X is connected if it cannot be written as X = X1 ∪ X2 where
X1,2 are both open, non-empty, and X1 ∩ X2 = ∅. Otherwise X is disconnected.
(b) A topological space X is arcwise connected if, for any points x, y ∈ X, there exists
a continuous function f : [0, 1] → X such that f (0) = x and f (1) = y. With a few
pathological exceptions, arcwise connectedness is equivalent to connectedness.
³ All metric spaces are Hausdorff. A topological space (X, I) is Hausdorff, or separated, if for any two distinct points x, y there exist a neighbourhood U of x and a neighbourhood V of y that are disjoint.

(c) A loop is a continuous map f : [0, 1] → X such that f (0) = f (1). If any loop in X can
be continuously shrunk to a point,4 X is called simply connected.

Example 2.8. R \ {0} is disconnected, R^2 \ {0} is connected but not simply connected, and R^n \ {0} for n ≥ 3 is connected and simply connected.

Definition 2.9. Let X1 and X2 be topological spaces. A map f : X1 → X2 is a homeomorphism if it is continuous and has an inverse f^{-1} : X2 → X1 which is also continuous. If there exists a homeomorphism between X1 and X2, they are said to be homeomorphic.

Quantities that are preserved under homeomorphisms are called topological invariants. Clearly, if two spaces differ in at least one topological invariant, they cannot be homeomorphic. Unfortunately, no complete list of topological invariants characterizing all different topologies is known.

A coarser equivalence relation than homeomorphism is homotopy type.

Definition 2.10. Two continuous functions f, g : X → Y from one topological space to


another are homotopic if one can be continuously deformed into the other, namely if there
exists a continuous function H : X × [0, 1] → Y such that H( · , 0) = f and H( · , 1) = g.
The two spaces X, Y are homotopy equivalent, or of the same homotopy type, if there
exist continuous maps f : X → Y and g : Y → X such that g ◦ f is homotopic to idX and
f ◦ g is homotopic to idY .

Two spaces that are homeomorphic are also of the same homotopy type, but the converse
is not true.
¶ Exercise 2. Show that the circle S 1 and the infinite cylinder S 1 × R are of the same
homotopy type.
Spaces that are homotopy equivalent to a point are called contractible.

3 Differentiable manifolds
Adding “topology” to a space X allows us to talk about continuous functions, i.e. it introduces a concept of continuity. A much richer structure is the concept of differentiability, namely a “smooth (C^∞) structure”. A manifold is an object that is locally homeomorphic to R^m, and thus can inherit a differentiable structure.

Definition 3.1. M is an m-dimensional differentiable manifold if:


(i) M is a topological space
⁴ This can be stated as the existence of a continuous map g : [0, 1] × [0, 1] → X such that g(x, 0) = f(x) and g(x, 1) = y for some point y ∈ X. Alternatively, we demand the existence of a disk in X whose boundary is the loop.

(ii) M is provided with a collection of pairs {(Ui, ϕi)}: {Ui} is a collection of open sets which covers M, and ϕi is a homeomorphism from Ui onto an open subset Ui′ of R^m.
(iii) Given Ui and Uj with Ui ∩ Uj ≠ ∅, the map ψij = ϕi ◦ ϕj^{-1} from ϕj(Ui ∩ Uj) to ϕi(Ui ∩ Uj) is C^∞ (we call it smooth).

Each pair (Ui , ϕi ) is called a chart (see Figure), while the whole family {(Ui , ϕi )} is called
an atlas. The homeomorphism ϕi is represented by m functions

ϕi (p) = {x1 (p), . . . , xm (p)} = {xµ (p)}

called coordinates. If Ui and Uj overlap, two coordinate systems are assigned to a point.
The functions ψij = ϕi ◦ ϕj^{-1}, called coordinate transition functions, map an open set in R^m to another open set in R^m. We write them as

x^µ(y^ν)

and they are required to be C^∞ with respect to the standard definition in R^m. If the union of
two atlases {(Ui , ϕi )}, {(Vj , ψj )} is again an atlas, the two atlases are said to be compati-
ble. Compatibility is an equivalence relation, and each possible equivalence class is called a
differentiable structure on M .
Manifolds with boundaries can be defined in a similar way. The topological space
M must be covered by a family of open sets {Ui } each homeomorphic to an open set in

Hm = {(x1 , . . . , xm ) | xm ≥ 0}. The set of points that are mapped to points with xm = 0 is
the boundary of M , denoted by ∂M (see Figure).5
The boundary ∂M is itself a manifold, with an atlas induced from M .
Example 3.2. These are examples of manifolds.
(a) The Euclidean space Rm is covered by a single chart and ϕ is the identity map.
(b) The n-dimensional sphere S^n, realized in R^{n+1} as

∑_{i=0}^{n} (x^i)^2 = 1 .

It can be covered with 2(n + 1) patches:⁶

U_{i±} = {(x^0, . . . , x^n) ∈ S^n | x^i ≷ 0} .

The coordinate maps ϕ_{i±} : U_{i±} → R^n are given by

ϕ_{i±}(x^0, . . . , x^n) = (x^0, . . . , x^{i-1}, x^{i+1}, . . . , x^n)

where we omit x^i. The coordinate transition functions are

ψ_{is,jt} = ϕ_{is} ◦ ϕ_{jt}^{-1} = ( x^0, . . . , x^j = t √(1 − ∑_{k≠j} (x^k)^2), . . . , x̂^i, . . . , x^n )
⁵ The boundary of a manifold M is not the same as the boundary of M in the topological sense. In fact, if X is a topological space then X̄ = X° = X and thus ∂X = ∅.
⁶ Two patches are enough to cover S^n: just remove one point of S^n in each case. However, writing down the coordinates in terms of those in R^{n+1} requires the stereographic projection, which is not completely trivial. Our parametrization here is simpler.

where s, t ∈ {±1} and the hat means that we omit x^i from the list.
In fact, since S^n \ {pt} is homeomorphic to R^n, we can cover the sphere with two patches obtained by removing antipodal points.
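The transition functions ψ_{is,jt} can be verified numerically for S^2. In the sketch below (not part of the notes; `chart` and `chart_inv` are hypothetical helper names, assuming Python), the inverse chart reinserts x^j = t √(1 − Σ_{k≠j}(x^k)^2) and the outgoing chart drops x^i:

```python
import math

def chart(i, x):
    # phi_{i,s}: drop the i-th coordinate of a point on the sphere
    return x[:i] + x[i + 1:]

def chart_inv(j, t, xi):
    # inverse of phi_{j,t}: reinsert x^j = t * sqrt(1 - |xi|^2), t = +-1
    xj = t * math.sqrt(1 - sum(v * v for v in xi))
    return xi[:j] + (xj,) + xi[j:]

x = (0.6, 0.48, 0.64)                             # a point on S^2 with x^0, x^1 > 0
psi = chart(0, chart_inv(1, +1, chart(1, x)))     # transition psi_{0+,1+}
assert all(abs(a - b) < 1e-12 for a, b in zip(psi, chart(0, x)))
```

The composition recovers the coordinates of the same point in the other patch, as the formula in the text states.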
(c) The real projective space RP^n is the set of lines through the origin in R^{n+1}. If x = (x^0, . . . , x^n) ≠ 0 is a point in R^{n+1}, it defines a line through the origin. Another non-zero point y defines the same line if and only if there exists a ∈ R \ {0} such that y = ax. This defines an equivalence relation x ∼ y, and we define

RP^n = ( R^{n+1} \ {0} ) / ∼ .


The n + 1 numbers x^0, . . . , x^n are called homogeneous coordinates, but they are redundant. Instead we define patches

Ui = {lines in RP^n | x^i ≠ 0}

and inhomogeneous coordinates on Ui given by

ξ^j_{(i)} = x^j / x^i ,

where the entry ξ^i_{(i)} = 1 is omitted. For x ∈ Ui ∩ Uj the coordinate transition functions are

ψij : ξ^k_{(j)} → ξ^k_{(i)} = x^k / x^i = (x^k / x^j)(x^j / x^i) = ξ^k_{(j)} / ξ^i_{(j)} .
Let f : M → N be a map from an m-dimensional manifold M to an n-dimensional
manifold N . A point p ∈ M is mapped to a point f (p) ∈ N . Take charts (U, ϕ) on M
and (V, ψ) on N such that p ∈ U and f (p) ∈ V . Then f has the following coordinate
presentation:
ψ ◦ f ◦ ϕ−1 : Rm → Rn
(the function is defined on ϕ(U ) ⊂ Rm ). If we write ϕ(p) = {xµ } and ψ(f (p)) = {y α }, then
ψ ◦ f ◦ ϕ−1 is given by n functions of m variables, and with some abuse of notation we write
y α = f α (xµ )
(really we should write y α = (ψf ϕ−1 )α (xµ )). We say that f is smooth (C ∞ ) if ψ ◦ f ◦ ϕ−1 is,
as a function from Rm to Rn . The property of being C ∞ does not depend on the coordinates
used, since coordinate transformations are C ∞ .
Definition 3.3. If f is invertible and both y = ψf ϕ−1 (x) and x = ϕf −1 ψ −1 (y) are C ∞ for
all charts, f is called a diffeomorphism and M is diffeomorphic to N , denoted M ≡ N .
Clearly, if M ≡ N then dim M = dim N.⁷ Two spaces that are diffeomorphic are also homeomorphic; however, the opposite is not true. There can be multiple differentiable structures on the same topological space. For instance, S^7 admits 28 differentiable structures (Milnor, 1956).⁸
⁷ If f is a diffeomorphism, then by definition y(x) : R^m → R^n is smooth with smooth inverse. In particular, at any point, (∂y^α/∂x^µ)(∂x^µ/∂y^β) = δ^α_β and (∂x^µ/∂y^α)(∂y^α/∂x^ν) = δ^µ_ν. This implies that m = n.
⁸ The 28 exotic 7-spheres can be constructed as follows: intersect in C^5 the zero-locus of the complex
3.1 Vectors
An open curve on a manifold M is a map

c : (a, b) → M ,

where (a, b) is an open interval and we take a < 0 < b. We also assume that the curve does not intersect itself. On a chart (U, ϕ), the curve c(t) has presentation x = ϕ ◦ c : (a, b) → R^m.
A function on M is a smooth map f : M → R. On a chart (U, ϕ), the function has presentation f ◦ ϕ^{-1} : R^m → R, which is a real function of m variables. We denote the set of smooth functions on M by F(M).

Given a curve c : (a, b) → M, we define the tangent vector to c at c(0) as a directional derivative of functions f : M → R along the curve c(t) at t = 0. Such a derivative is

df(c(t))/dt |_{t=0} .

In terms of local coordinates, this becomes

(∂f/∂x^µ) ( dx^µ(c(t))/dt ) |_{t=0} .

We use the convention that repeated indices are summed over. [Note the abuse of notation! ∂f/∂x^µ means ∂(f ◦ ϕ^{-1}(x))/∂x^µ.] Thus, df(c(t))/dt at t = 0 is obtained by applying a first-order linear differential operator X to f, where

X ≡ X^µ ∂/∂x^µ    and    X^µ = dx^µ(c(t))/dt |_{t=0} ,

namely

df(c(t))/dt |_{t=0} = X^µ (∂f/∂x^µ) = X[f] .

We define X = X^µ ∂/∂x^µ = X^µ ∂_µ as the tangent vector to M at c(0) along the direction given by the curve c(t).
We could introduce an equivalence relation between curves in M . If two curves c1 (t) and
c2 (t) satisfy
(i) c1 (0) = c2 (0) = p
(ii) dx^µ(c1(t))/dt |_{t=0} = dx^µ(c2(t))/dt |_{t=0}
then they yield the same differential operator X at p. We declare c1 (t) ∼ c2 (t) and identify
the tangent vector X with the equivalence class of curves.
equation a^2 + b^2 + c^2 + d^3 + e^{6k-1} = 0 with a small sphere around the origin. For k = 1, . . . , 28 one obtains all differentiable structures on S^7 (Brieskorn, 1966). These are called Brieskorn spheres.
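The defining property X^µ = dx^µ(c(t))/dt|_{t=0} can be verified symbolically for a concrete curve and function: applying X to f reproduces df(c(t))/dt|_{t=0}. A sketch, not part of the notes, assuming Python with sympy installed:

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
f = x1**2 + 3*x2                    # a test function in local coordinates
c = (sp.cos(t), sp.sin(t))          # a curve through c(0) = (1, 0)

# components X^mu = dx^mu(c(t))/dt at t = 0
X = [sp.diff(ci, t).subs(t, 0) for ci in c]

# directional derivative X[f] = X^mu * df/dx^mu, evaluated at c(0)
Xf = sum(Xm * sp.diff(f, xm) for Xm, xm in zip(X, (x1, x2)))
Xf = Xf.subs({x1: 1, x2: 0})

# d f(c(t))/dt at t = 0, computed directly along the curve
dfc = sp.diff(f.subs({x1: c[0], x2: c[1]}), t).subs(t, 0)
assert sp.simplify(Xf - dfc) == 0
```

Two curves with the same velocity at p would give the same operator, illustrating the equivalence relation just introduced.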

The set of linear first-order differential operators at p ∈ M is a vector space called the tangent space of M at p, denoted by T_p M. The structure of vector space is natural in terms of linear differential operators.⁹ Evidently,

e_µ = ∂/∂x^µ ≡ ∂_µ ,    µ = 1, . . . , m

are basis vectors of T_p M and dim T_p M = m. The basis {e_µ} is called the coordinate basis. If a vector V ∈ T_p M is written as

V = V^µ e_µ ,

the numbers V^µ are called the components of V with respect to the basis {e_µ}. The transformation law of the components under change of coordinates follows from the fact that a vector V exists independently of any choice of coordinates:

V = V^µ ∂/∂x^µ = Ṽ^ν ∂/∂x̃^ν    ⇒    Ṽ^ν = V^µ ∂x̃^ν/∂x^µ .

Notice that indices are contracted in the natural way.
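The transformation law Ṽ^ν = V^µ ∂x̃^ν/∂x^µ can be checked with the Cartesian-to-polar change of coordinates on R^2: the invariant combination V^µ ∂_µ f must come out the same in both systems. A symbolic sketch, not part of the notes, assuming sympy:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
x = r * sp.cos(th)
y = r * sp.sin(th)                  # the change of coordinates (r, theta) -> (x, y)

# Jacobian J^mu_nu = dx^mu / dxt^nu, with xt = (r, theta)
J = sp.Matrix([[sp.diff(x, r), sp.diff(x, th)],
               [sp.diff(y, r), sp.diff(y, th)]])

V = sp.Matrix([1, 0])               # components of V = d/dx in Cartesian coordinates
Vt = J.inv() * V                    # transformed components Vt^nu = (dxt^nu/dx^mu) V^mu

# V = d/dx acting on f = x must give 1, also when computed in polar coordinates
f = x
assert sp.simplify(Vt[0] * sp.diff(f, r) + Vt[1] * sp.diff(f, th)) == 1
```

The inverse Jacobian appears because J was built as ∂x^µ/∂x̃^ν, while the components transform with ∂x̃^ν/∂x^µ.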

3.2 One-forms
Since Tp M is a vector space, there exists a dual vector space Tp∗ M given by linear functions
ω : Tp M → R and called the cotangent space at p. The elements of Tp∗ M are called
one-forms.
A simple example of one-form is the differential df of a function f ∈ F (M ). The action
of df ∈ Tp∗ M on V ∈ Tp M is defined as
hdf, V i ≡ V [f ] = V µ ∂µ f ∈ R .
Since df = (∂f/∂x^µ) dx^µ, we regard {dx^µ} as a basis of T*_p M. Notice that, indeed,

⟨dx^µ, ∂/∂x^ν⟩ = ∂x^µ/∂x^ν = δ^µ_ν .

Thus {dx^µ} is the dual basis to {∂_µ}. An arbitrary one-form is written as ω = ω_µ dx^µ and the action on a vector V = V^µ ∂_µ is

⟨ω, V⟩ = ω_µ V^µ .

The transformation law of the components under change of coordinates is easily obtained:

ω̃_ν = ω_µ ∂x^µ/∂x̃^ν .

It is such that intrinsic objects are coordinate-independent.
⁹ We can define a sum of curves through p as follows: use coordinates where p is the origin of R^m, then sum the coordinates of the two curves. We define multiplication of a curve by a real number in a similar way. These definitions depend on the coordinates chosen; however, the induced operations on the equivalence classes (namely on tangent vectors) do not.

3.3 Tensors
A tensor of type (q, r) can be defined as a multilinear object

T ∈ Hom( (T_p M)^{⊗r} , (T_p M)^{⊗q} ) = (T*_p M)^{⊗r} ⊗ (T_p M)^{⊗q} .

In components it is written as

T = T^{µ_1...µ_q}_{ν_1...ν_r} ∂_{µ_1} · · · ∂_{µ_q} dx^{ν_1} · · · dx^{ν_r} .

We denote the vector space of (q, r) tensors at p by T^q_{r,p} M.

3.4 Tensor fields


If a vector is assigned smoothly to each point of M , it is called a vector field on M . A
vector field is a map
V : F (M ) → F (M ) .
The set of vector fields on M is denoted as X (M ). A vector field V at p is denoted by
V |p ∈ Tp M .
Similarly, we define a tensor field of type (q, r) by a smooth assignment of an element of T^q_{r,p} M at each point p ∈ M, and we denote the set of tensor fields by T^q_r(M). Notice that

T^0_0(M) = F(M) = Ω^0(M) ,    T^1_0(M) = X(M) ,    T^0_1(M) = Ω^1(M) .

3.5 Pull-back and push-forward


Consider a smooth map f : M → N between manifolds. If we have a function g : N → R in
F (N ), then f can be used to pull it back to M :
f ∗g ≡ g ◦ f : M → R
is in F (M ).
The map f also induces a map f∗ called the differential map, or push-forward of a
vector,
f∗ : Tp M → Tf (p) N .
Since a vector in f (p) is a derivative, we define
(f∗ V )[g] = V [f ∗ g]
where V ∈ T_p M. Writing this definition in coordinates, with V = V^µ ∂/∂x^µ, f_*V = W^ν ∂/∂y^ν, and y^ν(x^µ) representing f in a chart, we find¹⁰

W^ν = (f_*V)^ν = V^µ ∂y^ν/∂x^µ .

¹⁰ Indeed (f_*V)[g] = W^ν ∂g/∂y^ν = V[f^*g] = V^µ ∂(g ◦ f)/∂x^µ = V^µ (∂g/∂y^ν)(∂y^ν/∂x^µ).

This is the same transformation law as for a coordinate transformation, but here it is more general. The map f_* is linear, and is given by the Jacobian (∂y^ν/∂x^µ) of f at p. The push-forward is naturally extended to tensors of type (q, 0).
In general the push-forward cannot be promoted to a map between vector fields: different points on M can be mapped by f to the same point on N, and f may fail to be surjective. However, if f is a diffeomorphism¹¹ then f_* : X(M) → X(N). [More generally, f could be a diffeomorphism from an open subset of M to an open subset of N.]
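The failure described in footnote 11 can be reproduced symbolically: for y = x^3 the pushed-forward component of V = ∂_x is 3y^{2/3}, which is continuous but not differentiable at y = 0. A sketch, not part of the notes, assuming sympy:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3                         # the map y = f(x), smooth and invertible, inverse only C^0
V = 1                            # the single component of the vector field V = d/dx on M = R

# push-forward component: W(y) = (dy/dx) V, re-expressed in terms of y = x^3
W = sp.diff(f, x) * V            # 3 x^2
W_of_y = W.subs(x, sp.cbrt(y))   # 3 y^(2/3)
assert sp.simplify(W_of_y - 3 * y**sp.Rational(2, 3)) == 0

# the component is only C^0: its derivative blows up as y -> 0+
dW = sp.diff(W_of_y, y)
assert sp.limit(dW, y, 0, '+') == sp.oo
```

So f_*V is defined pointwise on all of N but is not a smooth vector field, confirming that the diffeomorphism condition cannot be weakened.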

The map f also induces a map f^* called the pull-back,

f^* : T*_{f(p)} N → T*_p M .

Since a one-form is a linear function on the tangent space, we define

⟨f^*ω, V⟩ = ⟨ω, f_*V⟩ .

In components,

(f^*ω)_µ = ω_ν ∂y^ν/∂x^µ .
If ω is a one-form field on N , then f ∗ ω is well-defined at all points on M . Therefore the
pull-back f ∗ can be promoted to a map of one-form fields:

f ∗ : Ω1 (N ) → Ω1 (M ) .

The pull-back naturally extends to tensor fields of type (0, r).

3.6 Submanifolds
Definition 3.4. Let f : M → N be a smooth map and dim M ≤ dim N .
(a) f is an immersion of M into N if f∗ : Tp M → Tf (p) N has rank f∗ = dim M (f∗ is
injective) for all p ∈ M .
(b) f is an embedding if f is an immersion and a homeomorphism onto its image (in particular f is injective).
In this case f(M) is a submanifold of N. [In practice, f(M) is diffeomorphic to M.]

See Figure.
¹¹ This condition cannot be weakened. For example, consider the vector field V = ∂_x on R and the map f : M = R → N = R given by y = f(x) = x^3. This map is C^∞ and invertible, with f^{-1} ∈ C^0 but not smooth. The push-forward of the vector field is f_*V = 1 · 3x^2 ∂_y = 3y^{2/3} ∂_y. The vector f_*V is defined at all points of N, but does not form a vector field because it is not smooth at y = 0 (the component is only C^0).

3.7 Lie derivative
Let X be a vector field in M . An integral curve x(t) of X is a curve in M , whose tangent
vector at x(t) is X|x(t) for all t. Given a chart, the condition can be written in components
as¹²

dx^µ/dt = X^µ( x(t) ) .
This is an ODE, therefore given an initial condition x^µ_0 = x^µ(0), the solution exists and is unique.
Let σ(t, x_0) be an integral curve of X passing through x_0 at t = 0 and denote its coordinates by σ^µ(t, x_0). It satisfies

(d/dt) σ^µ(t, x_0) = X^µ( σ(t, x_0) )    with    σ^µ(0, x_0) = x^µ_0 .
The map
σ :R×M →M
is called a flow generated by X ∈ X (M ). Each point of M is “evolved” along the vector
field X, and in particular

σ( t, σ(s, x_0) ) = σ(t + s, x_0) .

[¶ The two sides solve the same ODE in t with the same initial condition at t = 0, hence they coincide by uniqueness.]
If we fix t ∈ R, the flow σt : M → M is a diffeomorphism. In fact, we have a one-parameter
commutative group of diffeomorphisms, satisfying
(i) σt ◦ σs = σt+s
(ii) σ0 = id
(iii) σ−t = (σt )−1 .
The last property guarantees that they are all diffeomorphisms.

¹² This definition does not depend on the coordinates on M. It can be written in a more invariant way as

x_* (d/dt) = X|_{x(t)} .

However, it does depend on the parametrization of the curve: an integral curve has more structure than its image, since from the parameter on R one obtains a group structure.

We would like to define derivatives, in the direction of a vector X, of the various objects
we have constructed on a manifold M . We have already defined the derivative of a function:
let us denote it by

L_X f = X[f] = (d/dt) f(σ_t) |_{t=0} = (d/dt) σ_t^* f |_{t=0} ,
where we have rewritten the definition of a vector X in terms of the flow σt it generates.
To compute the derivative of a vector field Y ∈ X (M ) we encounter a problem: how do
we compare vectors at different points of M , since they live in different vector spaces Tp M ?
Since the points are connected by the flow σt generated by X, we can use the push-forward
(or differential map) (σt )∗ to map—in a natural way—the vector space at a point into the one
at another (since σt is a diffeomorphism, the push-forward is a map between vector fields).
More precisely we use
(σ−t )∗ : Tσt (p) M → Tp M .
Thus we define the Lie derivative of the vector field Y along the vector field X as

L_X Y = lim_{t→0} (1/t) [ (σ_{-t})_* Y|_{σ_t(p)} − Y|_p ] = (d/dt) (σ_{-t})_* Y|_{σ_t(p)} |_{t=0} .

¶ Exercise 3. Compute the Lie derivative in components on a chart with coordinates x. Let X = X^µ ∂_µ and Y = Y^µ ∂_µ. The flow of X is

σ_t^µ(x) = x^µ + tX^µ + O(t^2)

and we can work at first order in t. We have

Y|_{σ_t(x)} ≃ Y^µ( σ_t(x) ) ∂_µ|_{x+tX} ≃ [ Y^µ(x) + tX^ρ ∂_ρ Y^µ(x) ] ∂_µ|_{x+tX} .

Then we use the formula for the push-forward:

(σ_{-t})_* Y|_{σ_t(x)} ≃ Y^µ|_{σ_t(x)} ∂_µ(x^ν − tX^ν) ∂_ν|_x .

Putting the pieces together, we find

L_X Y = ( X^µ ∂_µ Y^ν − Y^µ ∂_µ X^ν ) ∂_ν .


¶ Exercise 4. Define the Lie bracket [X, Y ] of two vector fields X, Y ∈ X (M ) by

[X, Y ]f = X[Y [f ]] − Y [X[f ]] ,

in terms of f ∈ F(M). Show that [X, Y] is a vector field (a field of linear first-order differential operators) given by

[X, Y] = ( X^µ ∂_µ Y^ν − Y^µ ∂_µ X^ν ) ∂_ν .

These two exercises show that the Lie derivative of Y along X is given by

LX Y = [X, Y ] .
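The component formula for the bracket can be verified symbolically against the operator definition X[Y[f]] − Y[X[f]]: the second derivatives of f cancel, leaving a first-order operator. A sketch, not part of the notes, assuming sympy, with arbitrary test fields:

```python
import sympy as sp

x1, x2 = coords = sp.symbols('x1 x2')
X = (x2, -x1)                     # components of X = x2 d_1 - x1 d_2
Y = (x1**2, sp.sin(x2))           # components of a second, generic vector field

# [X,Y]^nu = X^mu d_mu Y^nu - Y^mu d_mu X^nu
bracket = [sum(X[m] * sp.diff(Y[n], coords[m]) - Y[m] * sp.diff(X[n], coords[m])
               for m in range(2)) for n in range(2)]

# compare with X[Y[f]] - Y[X[f]] on an arbitrary function f
f = sp.Function('f')(x1, x2)
Xop = lambda g: sum(X[m] * sp.diff(g, coords[m]) for m in range(2))
Yop = lambda g: sum(Y[m] * sp.diff(g, coords[m]) for m in range(2))
lhs = sp.expand(Xop(Yop(f)) - Yop(Xop(f)))
rhs = sp.expand(sum(bracket[n] * sp.diff(f, coords[n]) for n in range(2)))
assert sp.simplify(lhs - rhs) == 0
```

The cancellation of the symmetric second-derivative terms is exactly why the commutator of two first-order operators is again first order.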

Proposition 3.5. The Lie bracket satisfies the following properties:


(a) Skew-symmetry
[X, Y ] = −[Y, X] .

(b) Bilinearity
[X, c1 Y1 + c2 Y2 ] = c1 [X, Y1 ] + c2 [X, Y2 ]
for constants c1 , c2 .
(c) Jacobi identity
[[X, Y ], Z] + [[Y, Z], X] + [[Z, X], Y ] = 0 .

(d) Leibniz rule

[X, f Y] = X[f] Y + f [X, Y] .

They are evident from the definition.


Remark.
(a) The properties above imply that the Lie derivative satisfies the Leibniz rule:

LX f Y = (LX f ) Y + f LX Y
LX [Y, Z] = [LX Y, Z] + [Y, LX Z] .

We also have
Lf X Y = f LX Y − Y [f ] X .

(b) Given X, Y ∈ X (M ) and a diffeomorphism f : M → N , one can show that

f∗ [X, Y ] = [f∗ X, f∗ Y ] .

In a similar way, we can define the Lie derivative of a one-form ω ∈ Ω^1(M) along the vector field X using the pull-back:

L_X ω = lim_{t→0} (1/t) [ (σ_t)^* ω|_{σ_t(p)} − ω|_p ] = (d/dt) (σ_t)^* ω|_{σ_t(p)} |_{t=0} .

¶ Exercise 5. With ω = ω_µ dx^µ, we compute the Lie derivative in components. At first order in t we find

(σ_t)^* ω|_{σ_t(x)} ≃ [ ω_ν(x) + tX^ρ ∂_ρ ω_ν(x) ] ∂_µ(x^ν + tX^ν) dx^µ|_x

which leads to

L_X ω = ( X^ν ∂_ν ω_µ + ∂_µ X^ν ω_ν ) dx^µ .

The Lie derivative of a tensor field is defined in a similar way. Given a tensor of type (q, r), we can map T^q_{r,σ_t(p)} M → T^q_{r,p} M using (σ_{-t})_*^{⊗q} ⊗ (σ_t)^{*⊗r}. However, it is more conveniently defined by the following proposition.

Proposition 3.6. The Lie derivative is completely specified by the following properties:
(a) For a function and a vector field,

LX f = X[f ] , LX Y = [X, Y ] .

(b) For two tensors t1,2 of the same type,

LX (t1 + t2 ) = LX t1 + LX t2 .

(c) For arbitrary tensors t_{1,2} the Leibniz rule holds:

L_X(t_1 ⊗ t_2) = (L_X t_1) ⊗ t_2 + t_1 ⊗ (L_X t_2) .

This holds even when the tensors are (partially) contracted.

Proof. It is enough to notice that (σ_{-t})_*^{⊗q} ⊗ (σ_t)^{*⊗r} T|_{σ_t(p)}, when expanded at first order in t, is a sum of terms, each coming from the action of a single factor. If T = t_1 ⊗ t_2, we get the sum of terms from t_1 and the sum of terms from t_2.

The formula for the Lie derivative of a one-form follows from contraction with an arbitrary vector field Y:

⟨L_X ω, Y⟩ = L_X ⟨ω, Y⟩ − ⟨ω, L_X Y⟩ = X[⟨ω, Y⟩] − ⟨ω, [X, Y]⟩ .

For an arbitrary tensor field, we write T = T^{µ_1...µ_q}_{ν_1...ν_r} ∂_{µ_1} · · · ∂_{µ_q} dx^{ν_1} · · · dx^{ν_r} and apply the Lie derivative to each factor separately. We find

(L_X T)^{µ_1...µ_q}_{ν_1...ν_r} = X^λ ∂_λ T^{µ_1...µ_q}_{ν_1...ν_r} − ∑_{s=1}^{q} ∂_λ X^{µ_s} T^{µ_1...λ...µ_q}_{ν_1...ν_r} + ∑_{s=1}^{r} ∂_{ν_s} X^λ T^{µ_1...µ_q}_{ν_1...λ...ν_r} .

This reproduces the special cases above.

Proposition 3.7. For an arbitrary tensor field t:

L[X,Y ] t = LX LY t − LY LX t .

Proof. The equality is valid when t is a function or a vector field. Moreover, both sides are linear in t and satisfy the Leibniz rule. It follows that they are equal.

3.8 Differential forms
Definition 3.8. A differential form of order r, or r-form, is a totally antisymmetric
tensor of type (0, r).

A (0, r)-tensor ω is a multi-linear function on (Tp M )⊗r . In the coordinate basis,

ω(∂µ1 , . . . , ∂µr ) = ωµ1 ...µr

gives its components. A totally antisymmetric tensor is such that

ω(VP (1) , . . . , VP (r) ) = sgn(P ) ω(V1 , . . . , Vr )

where P is a permutation of r elements and sgn(P ) = +1 for even permutations and −1 for
odd permutations.13

Definition 3.9. The wedge product ∧ of r one-forms is the totally antisymmetric tensor product

dx^{µ_1} ∧ . . . ∧ dx^{µ_r} = ∑_{P∈S_r} sgn(P) dx^{µ_{P(1)}} ⊗ . . . ⊗ dx^{µ_{P(r)}} .    (3.1)

For example,
dxµ ∧ dxν = dxµ ⊗ dxν − dxν ⊗ dxµ .
Clearly dxµ1 ∧ . . . ∧ dxµr = 0 if some index µ appears at least twice.
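For two one-forms the wedge product has components (α ∧ β)_{µν} = α_µ β_ν − α_ν β_µ, and the vanishing of dx^µ ∧ dx^µ shows up as a vanishing diagonal. A numerical sketch, not part of the notes, assuming numpy:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])     # components of a one-form alpha on R^3
b = np.array([5.0, 7.0, 11.0])    # components of a one-form beta

# (alpha ^ beta)_{mu nu} = a_mu b_nu - a_nu b_mu
wedge = np.outer(a, b) - np.outer(b, a)
assert np.allclose(wedge, -wedge.T)       # antisymmetry: alpha ^ beta = -(beta ^ alpha)
assert np.allclose(np.diag(wedge), 0)     # repeated index gives zero

# evaluated on two vectors it gives alpha(V) beta(W) - alpha(W) beta(V)
V, W = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
assert np.isclose(V @ wedge @ W, a[0] * b[1] - a[1] * b[0])
```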

We denote the vector space of r-forms at p ∈ M by Ω^r_p(M). A basis is given by the r-forms (3.1), and a general element is expanded as

ω = (1/r!) ω_{µ_1...µ_r} dx^{µ_1} ∧ . . . ∧ dx^{µ_r}

where the components ω_{µ_1...µ_r} are totally antisymmetric. We can introduce the antisymmetrization of indices:

T_{[µ_1...µ_r]} = (1/r!) ∑_{P∈S_r} sgn(P) T_{µ_{P(1)}...µ_{P(r)}} .

Then we can write

ω_{µ_1...µ_r} = ω_{[µ_1...µ_r]} .

The dimension of Ω^r_p(M) is

( m choose r ) = m! / ( r! (m − r)! ) .

In particular Ω^0_p(M) = R, Ω^1_p(M) = T*_p M, and there are no r-forms for r > m.
¹³ The sign of a permutation P can be defined as the determinant of the matrix that represents the permutation.

Definition 3.10. The exterior product of forms,

∧ : Ω^q_p(M) × Ω^r_p(M) → Ω^{q+r}_p(M) ,

is defined as the linear extension of the same object acting on products of one-forms.
It immediately follows that, if ξ ∈ Ω^q_p(M) and η ∈ Ω^r_p(M),

ξ ∧ η = (−1)^{qr} η ∧ ξ

and in particular ξ ∧ ξ = 0 if q is odd. With this product, we define an algebra

Ω^•_p(M) = Ω^0_p(M) ⊕ Ω^1_p(M) ⊕ . . . ⊕ Ω^m_p(M)

of all differential forms at p. The spaces of smooth r-forms on M are denoted as Ωr (M ),


and in particular Ω0 (M ) = F (M ) is the space of functions.
Definition 3.11. The exterior derivative is a map (more precisely, a collection of maps) d : Ω^r(M) → Ω^{r+1}(M) whose action on a form

ω = (1/r!) ω_{µ_1...µ_r} dx^{µ_1} ∧ . . . ∧ dx^{µ_r}

is

dω = (1/r!) ( ∂_ν ω_{µ_1...µ_r} ) dx^ν ∧ dx^{µ_1} ∧ . . . ∧ dx^{µ_r} .
Notice that only the antisymmetric part ∂[ν ωµ1 ...µr ] contributes to the sum.
Proposition 3.12. The exterior derivative satisfies the two following important properties:
(a) Leibniz rule: let ξ ∈ Ω^q(M) and ω ∈ Ω^r(M); then

d(ξ ∧ ω) = dξ ∧ ω + (−1)^q ξ ∧ dω .

(b) Nilpotency:

d^2 = 0 .

In fact, these two properties together with df = (∂f/∂x^µ) dx^µ give an alternative definition of d.
Proof. (a) can be proved as an exercise. (b) is proven by computing d^2 ω:

d^2 ω = (1/r!) ( ∂^2 ω_{µ_1...µ_r} / ∂x^λ ∂x^ν ) dx^λ ∧ dx^ν ∧ dx^{µ_1} ∧ . . . ∧ dx^{µ_r} .

This vanishes because the second derivative is symmetric in λ ↔ ν while the wedge product is antisymmetric. Our previous definition of d immediately follows from these properties.
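Nilpotency can also be checked in components: the cyclic (hence totally antisymmetrized) second derivative of a one-form vanishes by symmetry of mixed partials. A symbolic sketch, not part of the notes, assuming sympy:

```python
import sympy as sp

x1, x2, x3 = coords = sp.symbols('x1 x2 x3')
w = [x2 * x3, sp.exp(x1), x1 * sp.sin(x2)]    # components of a one-form omega on R^3

# (d omega)_{mu nu} = d_mu w_nu - d_nu w_mu
dw = [[sp.diff(w[n], coords[m]) - sp.diff(w[m], coords[n])
       for n in range(3)] for m in range(3)]

# the components of d^2 omega are the cyclic sum d_[lam (d omega)_{mu nu]}:
# they vanish identically because partial derivatives commute
for l in range(3):
    for m in range(3):
        for n in range(3):
            d2 = (sp.diff(dw[m][n], coords[l])
                  + sp.diff(dw[n][l], coords[m])
                  + sp.diff(dw[l][m], coords[n]))
            assert sp.simplify(d2) == 0
```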
One can write coordinate-free expressions for the exterior derivative of forms. For instance
for one-forms:
dω(X, Y ) = X[ω(Y )] − Y [ω(X)] − ω([X, Y ]) .
(¶ It can be proven in components.)
¶ Exercise 6. The formula can be generalized to n-forms.

Recall that a map f : M → N induces the pull-back f^* : T*_{f(p)} N → T*_p M, which is naturally extended to tensor fields of type (0, r) and thus also to forms.
¶ Exercise 7. Let ξ, ω ∈ Ω• (M ) and let f : M → N . Then verify that

d(f ∗ ω) = f ∗ (dω)
f ∗ (ξ ∧ ω) = (f ∗ ξ) ∧ (f ∗ ω) .

The exterior derivative induces the sequence

0 −i→ Ω^0(M) −d→ Ω^1(M) −d→ · · · −d→ Ω^{m−1}(M) −d→ Ω^m(M) −d→ 0 .

This is called the de Rham complex. Since d2 = 0, we have

im d ⊆ ker d .

An element of ker d is called a closed form, namely dω = 0. An element of im d is called


an exact form, namely ω = dψ for some ψ. The quotient space ker dr / im dr−1 is called the
r-th de Rham cohomology group.

Definition 3.13. The interior product i_X : Ω^r(M) → Ω^{r−1}(M), where X ∈ X(M) is a vector field, is defined as

(i_X ω)(X_1, . . . , X_{r−1}) = ω(X, X_1, . . . , X_{r−1})

when ω ∈ Ω^r(M). In components it takes the following form:

i_X ω = ( 1/(r − 1)! ) X^ν ω_{νµ_2...µ_r} dx^{µ_2} ∧ . . . ∧ dx^{µ_r} .

Proposition 3.14. The interior product i_X is an anti-derivation, meaning that it satisfies the graded Leibniz rule

i_X(ω ∧ η) = (i_X ω) ∧ η + (−1)^r ω ∧ i_X η

when ω ∈ Ω^r(M), and it is nilpotent:

i_X^2 = 0 ,

while mapping forms to forms of lower degree.

Proof. (¶) Show as an exercise.


Proposition 3.15. The Lie derivative of a form can be expressed as

LX ω = (d iX + iX d) ω

where ω ∈ Ωr (M ). In other words LX = {d, iX }.

Proof. A formal proof is to show that both LX and {d, iX } satisfy the four defining properties.
First, they agree on 0-forms (functions):

LX f = X[f ] = iX df = (iX d + diX )f .

Second, they agree on a coordinate basis of 1-forms:

LX dxµ = ∂ν X µ dxν = d(X µ ) = (d iX + iX d)dxµ .

Third, both L_X and {d, i_X} are linear maps, because both d and i_X are. Finally, one verifies that {d, i_X} satisfies the (ungraded) Leibniz rule.
One could check agreement on general one-forms ω in the following way. We evaluate the result on a generic vector Y:

⟨(d i_X + i_X d)ω, Y⟩ = ⟨d⟨ω, X⟩ + i_X dω, Y⟩ = Y[⟨ω, X⟩] + i_Y i_X dω = Y[ω(X)] + dω(X, Y)
= X[ω(Y)] − ω([X, Y]) = L_X ⟨ω, Y⟩ − ⟨ω, L_X Y⟩ = ⟨L_X ω, Y⟩ .

To go from the first to the second line, we used the coordinate-free expression for the differential of a one-form that we found before.
A more practical proof is in components. Consider the case of a 1-form ω = ω_µ dx^µ. One
computes

(d i_X + i_X d)ω = d(X^µ ω_µ) + i_X [ (1/2)(∂_µ ω_ν − ∂_ν ω_µ) dx^µ ∧ dx^ν ]
= (∂_ν X^µ ω_µ + X^µ ∂_ν ω_µ) dx^ν + X^µ (∂_µ ω_ν − ∂_ν ω_µ) dx^ν
= (∂_ν X^µ ω_µ + X^µ ∂_µ ω_ν) dx^ν = L_X ω ,

which agrees with our previous expression for L_X ω.
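The component computation above can be checked numerically. The sketch below (with an arbitrarily chosen vector field X and 1-form ω on R², both picks of mine purely for illustration) compares the component formula for L_X ω with (d i_X + i_X d)ω, using finite-difference derivatives:

```python
import math

# Arbitrary (illustrative) choices of a vector field X and a 1-form omega on R^2.
def X(p):
    return [p[1]**2, p[0] * p[1]]            # components (X^1, X^2)

def omega(p):
    return [math.sin(p[0]), p[0] * p[1]**2]  # components (w_1, w_2)

EPS = 1e-6

def partial(f, i, p):
    """Central finite difference of the scalar function f at p, along direction i."""
    q1, q2 = list(p), list(p)
    q1[i] += EPS
    q2[i] -= EPS
    return (f(q1) - f(q2)) / (2 * EPS)

def lie_derivative(p):
    """(L_X w)_nu = X^mu d_mu w_nu + (d_nu X^mu) w_mu -- the component formula."""
    res = []
    for nu in range(2):
        val = 0.0
        for mu in range(2):
            val += X(p)[mu] * partial(lambda q: omega(q)[nu], mu, p)
            val += partial(lambda q: X(q)[mu], nu, p) * omega(p)[mu]
        res.append(val)
    return res

def cartan(p):
    """((d i_X + i_X d) w)_nu = d_nu(X^mu w_mu) + X^mu (d_mu w_nu - d_nu w_mu)."""
    res = []
    for nu in range(2):
        val = partial(lambda q: sum(X(q)[m] * omega(q)[m] for m in range(2)), nu, p)
        for mu in range(2):
            val += X(p)[mu] * (partial(lambda q: omega(q)[nu], mu, p)
                               - partial(lambda q: omega(q)[mu], nu, p))
        res.append(val)
    return res

p0 = [0.7, -0.3]
assert all(abs(a - b) < 1e-5 for a, b in zip(lie_derivative(p0), cartan(p0)))
```

The two expressions are algebraically identical after the product rule, so they agree up to the finite-difference error.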


¶ Exercise 8. Verify the formula in the general case of an r-form.

¶ Exercise 9. Let X, Y ∈ X (M ). Show that

[LX , iY ] = i[X,Y ] .

The nilpotency of d and iX can be used to easily prove

[LX , iX ] = 0 and [LX , d] = 0 .

Proof. We can show that [L_X, i_Y] and i_{[X,Y]} agree on functions and one-forms, and are both
anti-derivations. When acting on functions, both operators give zero. When acting on one-forms:

(L_X i_Y − i_Y L_X) ω = L_X⟨ω, Y⟩ − ⟨L_X ω, Y⟩ = ⟨ω, L_X Y⟩ = i_{[X,Y]} ω .

The (graded) Leibniz rule can be proven easily.

3.9 Integration of forms
An integration of a differential m-form (top form) over an m-dimensional manifold M is
defined only when M is orientable. So we first define an orientation.
At a point p ∈ M, the tangent space T_p M is spanned by the basis {∂_µ ≡ ∂/∂x^µ} in terms
of the local coordinates x^µ on the chart U_i to which p belongs. Let U_j be another chart with
coordinates {x̃^ν}, such that p ∈ U_i ∩ U_j. Then T_p M is also spanned by {∂̃_ν = ∂/∂x̃^ν}. The
change of basis is

∂̃_ν = (∂x^µ/∂x̃^ν) ∂_µ .

If J = det(∂x^µ/∂x̃^ν) > 0 on U_i ∩ U_j, then {∂_µ} and {∂̃_ν} are said to define the same
orientation on U_i ∩ U_j, while if J < 0 they define opposite orientations.
Definition 3.16. Let M be a connected manifold covered by {Ui }. Then M is orientable
if there exist local coordinates {xµ(i) } on Ui such that, for any overlapping charts Ui and Uj ,
one has J = det ∂xµ(i) /∂xν(j) > 0.


Definition 3.17. An m-form (top form) η which nowhere vanishes on a manifold M is called
a volume form.
Proposition 3.18. A volume form η exists on a manifold M if and only if M is orientable,
and η defines an orientation.

Proof. Suppose M is orientable. Start with a chart U and take an m-form

η = h(p) dx1 ∧ . . . ∧ dxm

with h(p) a positive-definite function on U. Then extend η to the whole of M, in such a way
that it never vanishes. When passing from U_i to U_j, choose coordinates that preserve
the orientation. In the new coordinates

η = h(p) det(∂x^µ/∂y^ν) dy^1 ∧ . . . ∧ dy^m ,

namely the component of η transforms with the Jacobian J. The component is still positive-
definite on U_i ∩ U_j thanks to orientability, J > 0, and can be extended in a positive-definite
way on U_j. Proceeding in this way, one can consistently construct a volume form on M.
Conversely, suppose a volume form η on M is given. Since it is nowhere vanishing, its
component on any chart Ui is either positive or negative. On the charts where it is negative,
perform the change of coordinates x1 → −x1 which makes it positive. The resulting set of
charts is then oriented.

Recall that a manifold M with boundary is such that the open sets {U_i} are mapped to
open sets in H^m = { (x^1, . . . , x^m) | (−1)^m x^m ≥ 0 } (this choice avoids unpleasant signs in
Stokes' theorem), and the boundary ∂M consists of the points mapped to points with x^m = 0.

This automatically provides an atlas of the boundary ∂M, which is itself a manifold. If M
is oriented, the atlas for ∂M is also oriented. This is because the Jacobian evaluated at the
boundary is

∂x^µ/∂x̃^ν |_{x̃^m = 0} = ( ∂x^i/∂x̃^j   ∂x^i/∂x̃^m )
                          ( 0             ∂x^m/∂x̃^m )     with i, j = 1, . . . , m − 1 ,

whose determinant is positive; since ∂x^m/∂x̃^m > 0, it follows that det(∂x^i/∂x̃^j) > 0 as well.
This leads to
Proposition 3.19. Given an oriented manifold M with boundary ∂M , the boundary inherits
an orientation from M .
Consider an oriented manifold M. The integral of a top form ω on a coordinate neighbor-
hood U_i with coordinates x^µ is defined by

∫_{U_i} ω ≡ ∫_{ϕ_i(U_i)} ω_{1...m}(ϕ_i^{−1}(x)) dx^1 . . . dx^m .

This definition is coordinate-invariant, as long as we consider oriented changes of coordinates,
because ω_{1...m} transforms with the Jacobian while dx^1 . . . dx^m transforms with the absolute
value of the Jacobian.
Then we want to define the integral on M of a top-form ω with compact support. The
last assumption ensures that we can cover the support with a finite number of open patches.
Definition 3.20. Take an open covering {U_i} of M. A partition of unity subordinate to
the covering {U_i} is a family of differentiable functions {ε_i(p)} satisfying
(i) 0 ≤ ε_i(p) ≤ 1
(ii) ε_i(p) = 0 if p ∉ U_i
(iii) ε_1(p) + ε_2(p) + . . . = 1 for any point p ∈ M.
It follows that

ω(p) = Σ_i ε_i(p) ω(p) ,

and each term in the sum can be integrated on U_i because it vanishes outside. Thus we can
define

∫_M ω ≡ Σ_i ∫_{U_i} ε_i ω .

Notice that, since the support of ω is compact, we can cover it with a finite number of Ui
and thus the sum is finite.
Theorem 3.21. (Stokes' theorem) Let ω ∈ Ω^{m−1}(M). Then

∫_M dω = ∫_{∂M} ω .

If M has no boundary, the integral vanishes.

Proof. Using a partition of unity, we can reduce the formula to coordinate patches:

Σ_i ∫_{U_i} d(ε_i ω) = Σ_i ∫_{U_i ∩ ∂M} ε_i ω .

Given ε_i ω = (1/(m−1)!) (ε_i ω)_{µ_1...µ_{m−1}} dx^{µ_1} ∧ . . . ∧ dx^{µ_{m−1}}, its differential can be written as

d(ε_i ω) = Σ_{µ=1}^m (−1)^{µ−1} ∂_µ (ε_i ω)_{1...µ̂...m} dx^1 ∧ . . . ∧ dx^m ,

where µ̂ means that we omit that index. Recalling that coordinate neighborhoods U_i are
mapped by ϕ_i to open subsets in H^m = {x ∈ R^m | (−1)^m x^m ≥ 0}, we integrate by parts:

∫_{U_i} d(ε_i ω) = Σ_{µ=1}^m (−1)^{µ−1} ∫_{(−1)^m x^m ≥ 0} ∂_µ (ε_i ω)_{1...µ̂...m} dx^1 . . . dx^m
= (−1)^{m−1} ∫_{(−1)^m x^m ≥ 0} ∂_m (ε_i ω)_{1...m−1} dx^1 . . . dx^m
= ∫_{x^m = 0} (ε_i ω)_{1...m−1} dx^1 . . . dx^{m−1} = ∫_{U_i ∩ ∂M} ε_i ω .

In the second equality we have picked the only term that, after integration by parts, has a
chance to give a non-vanishing result.
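As a concrete sanity check of the theorem (in the guise of Green's theorem, with a hand-picked example not taken from the text): for ω = x dy on the unit disk, dω = dx ∧ dy, so both sides should equal the area π. A minimal numerical sketch:

```python
import math

N = 600

# Left side: integral of dω = dx ∧ dy over the disk, i.e. its area,
# approximated by a midpoint Riemann sum on a grid over [-1,1]^2.
h = 2.0 / N
lhs = sum(h * h
          for i in range(N) for j in range(N)
          if (-1 + (i + 0.5) * h)**2 + (-1 + (j + 0.5) * h)**2 <= 1.0)

# Right side: pull ω = x dy back to the boundary circle x = cos t, y = sin t,
# which gives cos(t) d(sin t) = cos^2(t) dt.
dt = 2 * math.pi / N
rhs = sum(math.cos(k * dt)**2 * dt for k in range(N))

assert abs(lhs - math.pi) < 5e-2 and abs(rhs - math.pi) < 5e-2
```

Both Riemann sums converge to π as N grows, as Stokes' theorem requires.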

4 Lie groups and Lie algebras
We will give a first look at Lie groups in the context of differential geometry.

4.1 Groups
A group is a set G with a binary operation (product)
· :G×G→G
that combines two elements a and b into another element, ab. The product must satisfy:
(i) Associativity: for all a, b, c ∈ G:
(a · b) · c = a · (b · c) .

(ii) Identity element: there exists an element 1 ∈ G such that


1·a=a·1=a
for all a ∈ G.
(iii) Inverse element: for each a ∈ G, there exists an element a−1 such that
a · a−1 = a−1 · a = 1 .

Notice that both the identity element and the inverse of a are unique. Suppose 1 and 1′ are
identity elements, then 1 = 1 · 1′ = 1′. Suppose that b and b′ are inverses of a, then
b = b · 1 = b · (a · b′) = (b · a) · b′ = 1 · b′ = b′ .
In general the order of the two factors in the product matters, namely a · b 6= b · a. If
a·b=b·a ∀ a, b ∈ G
then G is an Abelian group (and the group operation can be represented as +). Groups,
as sets, can be either finite, countably infinite or continuous.

Example 4.1. Consider the following examples of groups:


(a) The integers (Z, +) with addition (Abelian group).
(b) The non-zero rationals (Q \ {0}, ·) with multiplication (Abelian group).
(c) The positive real numbers (R+ , ·) with multiplication (Abelian).
(d) The integers modulo n (Zn ≡ Z/nZ, +) with addition (Abelian).
(e) The group of permutations of n objects (Sn , ◦) with composition.
(f) The group of phases eiα with α ∈ [0, 2π), (S 1 ≡ U (1), ·) with multiplication (Abelian).
(g) The group of n × n matrices with non-vanishing determinant, (GL(n), ·) with matrix
multiplication.
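For a finite example like (d), the group axioms can be checked by brute force. A minimal sketch (the choice n = 5 is just for illustration):

```python
# Brute-force check of the group axioms for Z_5 = {0,...,4} with addition mod 5.
n = 5
G = range(n)
op = lambda a, b: (a + b) % n

# (i) associativity
assert all(op(op(a, b), c) == op(a, op(b, c)) for a in G for b in G for c in G)
# (ii) identity element
e = 0
assert all(op(e, a) == a == op(a, e) for a in G)
# (iii) inverse element
inverse = {a: (-a) % n for a in G}
assert all(op(a, inverse[a]) == e == op(inverse[a], a) for a in G)
# the group is Abelian
assert all(op(a, b) == op(b, a) for a in G for b in G)
```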

4.2 Lie groups
Definition 4.2. A Lie group G is a differentiable manifold which is endowed with a group
structure such that the group operations
(i) · : G × G → G
−1
(ii) :G→G
are differentiable.
The unit element is written as e or 1. The dimension of a Lie group is the dimension of
G as a manifold.
Example 4.3. The following are examples of Lie groups.
(a) Let S^1 be the unit circle on the complex plane,

S^1 = { e^{iθ} | θ ∈ R (mod 2π) } ,

and take the group operation e^{iθ} e^{iϕ} = e^{i(θ+ϕ)} and (e^{iθ})^{−1} = e^{−iθ}, which are differen-
tiable. This makes S^1 into a Lie group, called U(1).
(b) The general linear group GL(n, R) of n × n real matrices with non-vanishing de-
terminant is a Lie group, with the operations of matrix multiplication and inverse. Its
dimension is n^2 and it is non-compact. Interesting Lie subgroups are:

orthogonal: O(n) = { M ∈ GL(n, R) | M M^T = 1_n }
special linear: SL(n, R) = { M ∈ GL(n, R) | det M = 1 }
special orthogonal: SO(n) = O(n) ∩ SL(n, R)
(real) symplectic: Sp(2n, R) = { M ∈ GL(2n, R) | M Ω M^T = Ω } ⊂ SL(2n, R)

where the symplectic unit is Ω = ( 0  1_n ; −1_n  0 ).^14
An interesting fact is Sp(2n, R) ∩ SO(2n) ≅ U(n).


(c) The Lorentz group is

O(3, 1) = { M ∈ GL(4, R) | M η M^T = η }

where η = diag(−1, 1, 1, 1) is the Minkowski metric. This is non-compact and has 4
connected components, distinguished by the sign of the determinant and the sign of
the (0, 0) entry.^15 ¶
14 The fact that a symplectic matrix has determinant 1 can be shown as follows. The Pfaffian of an
antisymmetric matrix A = −A^T is defined as Pf(A) = (1/(2^n n!)) ε^{a_1...a_{2n}} A_{a_1 a_2} . . . A_{a_{2n−1} a_{2n}}. It follows that

Pf(B A B^T) = (1/(2^n n!)) ε^{a_1...a_{2n}} B_{a_1 j_1} A_{j_1 j_2} B_{a_2 j_2} . . . A_{j_{2n−1} j_{2n}} B_{a_{2n} j_{2n}} = det(B) Pf(A) .

Therefore the defining equation implies det M = 1.
15 From the equation it follows det M = ±1 and M_{00} ≠ 0.

(d) The general linear group GL(n, C) of n × n complex matrices with non-vanishing
determinant has (real) dimension 2n^2. Interesting subgroups are:

unitary: U(n) = { M ∈ GL(n, C) | M M^† = 1_n }
special linear: SL(n, C) = { M ∈ GL(n, C) | det M = 1 }
special unitary: SU(n) = U(n) ∩ SL(n, C)
(complex) symplectic: Sp(2n, C) = { M ∈ GL(2n, C) | M Ω M^T = Ω } ⊂ SL(2n, C)
compact symplectic: USp(2n) = Sp(2n, C) ∩ U(2n) .
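The defining conditions above can be spot-checked on sample elements. The matrices below are arbitrary illustrative picks of mine: a rotation for O(2), a unit-determinant matrix for Sp(2, R) = SL(2, R), and a boost preserving a 2d Minkowski metric:

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

t = 0.3
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]   # rotation by angle t
assert close(matmul(R, transpose(R)), [[1, 0], [0, 1]])          # R R^T = 1: R ∈ O(2)

M = [[2.0, 3.0], [1.0, 2.0]]                                     # det M = 1, so M ∈ SL(2,R)
Omega = [[0, 1], [-1, 0]]
assert close(matmul(matmul(M, Omega), transpose(M)), Omega)      # M Ω M^T = Ω: Sp(2,R) = SL(2,R)

b = 0.5                                                          # rapidity of a sample boost
L = [[math.cosh(b), math.sinh(b)], [math.sinh(b), math.cosh(b)]] # 2d Lorentz-type block
eta = [[-1, 0], [0, 1]]
assert close(matmul(matmul(L, eta), transpose(L)), eta)          # L η L^T = η
```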

Let G be a Lie group and H ⊂ G a Lie subgroup of G. Define an equivalence relation
g ∼ g′ if there exists an element h ∈ H such that g′ = gh. An equivalence class [g] is a set
{gh | h ∈ H}. The coset space

G/H = { [g] | g ∈ G }

is the set of equivalence classes, and it is a manifold with dim G/H = dim G − dim H. H is
said to be a normal subgroup of G if

g h g^{−1} ∈ H for any g ∈ G and h ∈ H ,

i.e. if H is preserved under the adjoint action of G. When H is a normal subgroup of G,
then G/H is a (Lie) group. The group operations are simply defined as

[g] · [g′] = [gg′] , [g]^{−1} = [g^{−1}] .

Let us check that they are well-defined. If gh and g′h′ are representatives of [g] and [g′],
respectively, then

g h g′ h′ = g g′ (g′^{−1} h g′) h′ = g g′ h″ h′

which is in the same class [gg′]. Similarly

(gh)^{−1} = h^{−1} g^{−1} = g^{−1} (g h^{−1} g^{−1}) = g^{−1} h″

which is in the same class [g^{−1}].

4.3 Action of Lie groups on manifolds


Lie groups often appear as set of transformations acting on manifolds. We already discussed
one example: a vector field X defines a flow on M , which is a map σ : R × M → M in which
R acts as an additive group. We can generalize the idea.
Definition 4.4. Let G be a Lie group and M be a manifold. The action of G on M is a
differentiable map σ : G × M → M such that
(i) σ(1, p) = p for all p ∈ M


(ii) σ(g_1, σ(g_2, p)) = σ(g_1 g_2, p).
In other words the action respects the group structure. We use the notation gp for σ(g, p).
The action is said to be
(a) transitive if, for any p1 , p2 ∈ M , there exists an element g ∈ G such that g p1 = p2 ;
(b) free if every g 6= 1 of G has no fixed points on M , namely if the existence of a point
p ∈ M such that g p = p necessarily implies g = 1;
(c) faithful or effective if the unit element 1 is the only element that acts trivially on
M.

4.4 Lie algebras


Definition 4.5. Let a, g be elements of a Lie group G. The left-translation La : G → G
and the right-translation Ra : G → G are defined by

La g = ag , Ra g = ga .

Clearly La , Ra are diffeomorphisms from G to G, hence they induce differential maps


La∗ : Tg G → Tag G and Ra∗ : Tg G → Tga G. Since the two cases are equivalent, in the
following we discuss left-translations.
Given a Lie group G, there exists a special class of vector fields characterized by the
invariance under group action.
Definition 4.6. Let X be a vector field on a Lie group G. Then X is said to be a left-
invariant vector field if La∗ X|g = X|ag .
A vector V ∈ T_1 G defines a unique left-invariant vector field X_V on G by

X_V |_g = L_{g∗} V for g ∈ G .

In fact X_V |_{ag} = L_{ag∗} V = L_{a∗} L_{g∗} V = L_{a∗} X_V |_g, thus X_V is left-invariant. Conversely, a
left-invariant vector field X defines a unique vector V = X|_1 ∈ T_1 G. We denote the set
of left-invariant vector fields on G by g. The map T_1 G ↔ g defined by V ↔ X_V is an
isomorphism, in particular dim g = dim G.
Since g is a set of vector fields, it is a subset of X(G) and the Lie bracket is defined on g.
We show that g is closed under the Lie bracket. Let X, Y ∈ g, then

[X, Y]|_{ag} = [X|_{ag}, Y|_{ag}] = [L_{a∗} X|_g, L_{a∗} Y|_g] = L_{a∗} [X, Y]|_g ,

implying that [X, Y] ∈ g.


Definition 4.7. The set of left-invariant vector fields g with the Lie bracket [ , ] : g × g → g
is called the Lie algebra of a Lie group G.

Let the set of n vectors {V_1, . . . , V_n} be a basis of T_1 G, where n = dim G and we con-
sider n finite. The basis defines a set of n linearly independent left-invariant vector fields
{X_1, . . . , X_n} on G. Since [X_a, X_b] is again an element of g, it can be expanded in terms of
{X_a}:

[X_a, X_b] = C_{ab}^c X_c .

The coefficients C_{ab}^c are called the structure constants of the Lie group G.
¶ Exercise 10. Show that the structure constants satisfy
(a) skew-symmetry

C_{ab}^c = −C_{ba}^c

(b) the Jacobi identity

C_{ab}^f C_{fc}^g + C_{ca}^f C_{fb}^g + C_{bc}^f C_{fa}^g = 0 .

Let us introduce a dual basis to {X_a} and denote it by {θ^a}, in other words

⟨θ^a, X_b⟩ = δ^a_b at all points g ∈ G .

Then {θ^a} is a basis for the left-invariant one-forms. We can show that the dual basis satisfies
the Maurer–Cartan structure equations:

dθ^a = −(1/2) C_{bc}^a θ^b ∧ θ^c .

To prove it we use the coordinate-free expression for the differential of a one-form:^16

dθ^a(X_b, X_c) = X_b[θ^a(X_c)] − X_c[θ^a(X_b)] − θ^a([X_b, X_c])
= X_b[δ^a_c] − X_c[δ^a_b] − θ^a(C_{bc}^d X_d) = −C_{bc}^a .

We can define a Lie-algebra valued 1-form θ : T_g G → T_1 G as

θ : X ↦ (L_{g^{−1}})_∗ X = (L_g)_∗^{−1} X for X ∈ T_g G .

This is called the Maurer–Cartan form on G.

Theorem 4.8. The Maurer–Cartan form θ is

θ = θ^a ⊗ V_a

where {θ^a} is the dual basis of left-invariant one-forms and {V_a} is a basis of T_1 G. Then θ
satisfies

dθ + θ ∧ θ ≡ dθ^a ⊗ V_a + (1/2) θ^b ∧ θ^c ⊗ [V_b, V_c] = 0 ,

where the commutator is the one in the Lie algebra.
16 Such a formula is: dω(X, Y) = X[ω(Y)] − Y[ω(X)] − ω([X, Y]).
Proof. Take a vector Y ≡ Y^a X_a ∈ T_g G, where {X_a} are left-invariant vector fields with
X_a|_g = (L_g)_∗ V_a. From the definition of θ:

θ(Y) = Y^a θ(X_a) = Y^a (L_g)_∗^{−1} (L_g)_∗ V_a = Y^a V_a = θ^a(Y) ⊗ V_a ,

and in the last step we used that {θ^a} is the dual basis to {X_a}.
The Maurer–Cartan structure equation gives

dθ + θ ∧ θ = −(1/2) C_{bc}^a θ^b ∧ θ^c ⊗ V_a + (1/2) θ^b ∧ θ^c ⊗ C_{bc}^a V_a = 0 .

4.5 Exponential map


We saw that a vector field X ∈ X(M) generates a flow σ(t, x) in M. Now we want to
consider the flow σ(t, g) in a Lie group G generated by a left-invariant vector field X ∈ g.
The flow satisfies

σ(·, g)_∗ (d/dt)|_t = X|_{σ(t,g)} ,

where σ(·, g) is a map R → G. We define a one-parameter subgroup of G

φ_X(t) ≡ σ(t, 1) .

It is not obvious that the integral curve of X through the identity forms a subgroup of G
(with the product induced by G).^17 This is shown in the
Proposition 4.9. The one-parameter subgroup of G generated by X ∈ g satisfies

φ_X(s + t) = φ_X(s) φ_X(t) , φ_X(0) = 1 .

Proof. The second equation is obvious. We compare the two sides of the first equation as
functions of t (we keep X implicit). At t = 0 they agree. The function on the LHS satisfies

φ(s + ·)_∗ (d/dt)|_t = X|_{φ(s+t)} .

To compute the derivative of the RHS we use that X is left-invariant. We have

(φ(s) φ(·))_∗ (d/dt)|_t = (L_{φ(s)} σ(·, 1))_∗ (d/dt)|_t = L_{φ(s)∗} σ(·, 1)_∗ (d/dt)|_t = L_{φ(s)∗} X|_{φ(t)} = X|_{φ(s) φ(t)} .

Since the two sides of the equation solve the same ODE and have the same initial condition,
they agree for all t's.
Notice that the statement is different from σ(t, σ(s, 1)) = σ(t + s, 1), which we already
proved.
17
In fact, while the curve σ(R, 1) is an immersion of R into G for any vector field X, it actually forms a
subgroup of G only when X is a left-invariant vector field.

Definition 4.10. Let G be a Lie group and V ∈ g (equivalently, V ∈ T1 G). The exponen-
tial map exp : g → G is defined by

exp V = φV (1) ,

and it satisfies
exp(t V ) = φV (t) .
¶ Exercise 11. The last part should be proven, namely that φtV (1) = φV (t). Show that both
φλV (t) and φV (λt), where λ ∈ R is fixed, are one-parameter subgroups of G generated by
λV , hence they are equal. The claim follows by setting t = 1 and λ → t.

5 Riemannian geometry
Besides a smooth structure, a manifold may carry a further structure: a metric. This is an
inner product between vectors in the tangent space.
Definition 5.1. A Riemannian metric g on a differentiable manifold M is a type (0, 2)
tensor field on M such that, at each point p ∈ M :
(i) gp (U, V ) = gp (V, U ) ;
(ii) gp (U, U ) ≥ 0, where equality implies U = 0.
In short, gp is a symmetric positive-definite bilinear form.
A pseudo-Riemannian metric is such that the bilinear form is symmetric and non-
degenerate,18 but not necessarily positive-definite. A special case is a Lorentzian metric,
whose signature has only one negative entry.
In coordinates, we expand the metric as
gp = gµν (p) dxµ ⊗ dxν ≡ ds2 .
The notation ds2 is used in physics, because the metric is an infinitesimal distance squared.
We regard gµν at a given point as a matrix, and since it has maximal rank, it has an inverse
denoted by g µν . The determinant det(gµν ) is sometimes denoted by g.
The metric gp is a map Tp M ⊗ Tp M → R, and for each vector U ∈ Tp M it defines a linear
map
gp (U, ·) : Tp M → R
V → gp (U, V ) ,
which is a one-form ωU ∈ Tp∗ M . Since gp is non-degenerate, this gives an isomorphism
between Tp M and Tp∗ M . In components
ωµ = gµν U ν , U µ = g µν ων .

If a smooth manifold M is equipped with a Riemannian metric g, the pair (M, g) is called
a Riemannian manifold. In fact, all manifolds admit a Riemannian metric.19 This is not
so for Lorentzian or other pseudo-Riemannian metrics.

If f : M → N is an immersion of an m-dimensional manifold M into an n-dimensional
manifold N with Riemannian metric g_N, then the pullback map f^∗ induces the natural metric
g_M = f^∗ g_N on M, called the induced metric. In components:

g_{M µν}(x) = g_{N αβ}(f(x)) (∂f^α/∂x^µ)(∂f^β/∂x^ν) .
18
The metric is non-degenerate if gp (U, V ) = 0 for every V ∈ Tp M implies that U = 0.
19 Simply, on every patch U_i introduce a Riemannian metric g_p^{(i)} with compact support. Then sum all the
metrics: g_p = Σ_i g_p^{(i)} (one should use a locally-finite covering to avoid divergences). One uses that the sum
of Riemannian metrics is a Riemannian metric.

If N is pseudo-Riemannian, f ∗ gN is not guaranteed to be a metric.
¶ Exercise 12. Can you give an example?

¶ Exercise 13. Consider the unit sphere S^2 with coordinates (θ, φ) embedded into R^3 as

f : (θ, φ) ↦ (sin θ cos φ, sin θ sin φ, cos θ) .

Taking the Euclidean metric on R^3, compute the induced metric on S^2.
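As a warm-up for Exercise 13 (without spoiling it), the sketch below computes the induced metric for the embedding of the unit circle S^1 into Euclidean R^2, f(t) = (cos t, sin t), where the answer is g = dt^2 (this lower-dimensional example is my own, not from the text):

```python
import math

EPS = 1e-6

def f(t):
    """Embedding of S^1 into R^2."""
    return (math.cos(t), math.sin(t))

def induced_metric(t):
    """delta_ab (df^a/dt)(df^b/dt), with the Jacobian by central differences."""
    df = [(f(t + EPS)[a] - f(t - EPS)[a]) / (2 * EPS) for a in range(2)]
    return sum(df[a] * df[a] for a in range(2))

# The induced metric on the unit circle is g = dt^2, i.e. the single component is 1.
for t in (0.0, 0.7, 2.5):
    assert abs(induced_metric(t) - 1.0) < 1e-8
```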

5.1 Connections
A vector X is a “directional derivative” acting on functions f ∈ F (M ) as X : f → X[f ].
There is no natural directional derivative acting on tensor fields of type (p, q): the Lie
derivative LX Y = [X, Y ] is not a directional derivative, because it requires X to be a vector
field, not just a vector (it depends on the derivative of X). What we need is an extra
structure, a connection, which specifies how tensors are transported along a curve.
Definition 5.2. An affine connection ∇ is a map ∇ : X (M ) × X (M ) → X (M ), in short
∇X Y , such that:
∇X (Y + Z) = ∇X Y + ∇X Z
∇(X+Y ) Z = ∇X Z + ∇Y Z
∇f X Y = f ∇X Y
∇X (f Y ) = X[f ] Y + f ∇X Y .
The third property distinguishes the Lie derivative from an affine connection.20 ∇ is also
called a covariant derivative.
Given a coordinate basis {e_µ} = {∂/∂x^µ} on a chart, one can define the connection
coefficients Γ^λ_{µν} as

∇_µ e_ν ≡ ∇_{e_µ} e_ν = Γ^λ_{µν} e_λ .

They characterize the action of the connection ∇ on any vector field:

∇_V W = V^µ ∇_{e_µ}(W^ν e_ν) = V^µ (∂W^λ/∂x^µ + W^ν Γ^λ_{µν}) e_λ .

Let c : (a, b) → M be a curve in M, with tangent vector V = (dx^µ(c(t))/dt) e_µ|_{c(t)}. Let X
be a vector field, defined at least along c(t). If X satisfies the condition

∇V X = 0 ∀t ∈ (a, b) ,

X is said to be parallel transported along c(t). If the tangent vector V (t) itself is parallel
transported along c(t), namely if
∇V V = 0 ,
20
Recall that Lf X Y = f LX Y − Y [f ] X.

the curve c(t) is called a geodesic. In components the geodesic equation is

d^2x^µ/dt^2 + Γ^µ_{νλ} (dx^ν/dt)(dx^λ/dt) = 0 .

Notice that this equation is not invariant under reparametrizations.
¶ Exercise 14. Show that if a curve c(t) satisfies the weaker condition

∇V V = f V with f ∈ F (M ) ,

it is always possible to change parametrization t → t0 such that c(t0 ) is a geodesic.

It is natural to define the covariant derivative of a function f ∈ F(M) as the ordinary
directional derivative:

∇_X f = X[f] .

Then the fourth property above looks like the Leibniz rule. We require that this be true for
any product of tensors,

∇X (T1 ⊗ T2 ) = ∇X T1 ⊗ T2 + T1 ⊗ ∇X T2 ,

even when some of the indices are contracted. This fixes the action of ∇ on tensors of general
type. For instance, using

X[⟨ω, Y⟩] = ∇_X⟨ω, Y⟩ = ⟨∇_X ω, Y⟩ + ⟨ω, ∇_X Y⟩

where ω ∈ Ω^1(M) is a one-form, one obtains the action in components:

(∇_µ ω)_ν = ∂_µ ω_ν − Γ^λ_{µν} ω_λ .

The action on tensors of general type is:

(∇_µ T)^{ν_1...ν_p}_{λ_1...λ_q} = ∂_µ T^{ν_1...ν_p}_{λ_1...λ_q} + Σ_{s=1}^p Γ^{ν_s}_{µρ} T^{ν_1...ρ...ν_p}_{λ_1...λ_q} − Σ_{s=1}^q Γ^ρ_{µλ_s} T^{ν_1...ν_p}_{λ_1...ρ...λ_q} .

Consider two overlapping charts U, V with U ∩ V ≠ ∅, and with coordinates x^µ and
x̃^µ, respectively. Imposing that ∇_X Y transforms as a tensor imposes that the connection
coefficients transform as

Γ̃^γ_{αβ} = (∂x^µ/∂x̃^α)(∂x^ν/∂x̃^β)(∂x̃^γ/∂x^λ) Γ^λ_{µν} + (∂^2x^µ/∂x̃^α ∂x̃^β)(∂x̃^γ/∂x^µ) .

Because of the second, inhomogeneous term, they do not transform as a tensor. However,
the difference of two connections Γ^λ_{µν} − Γ̄^λ_{µν} is a tensor of type (1, 2).

5.2 Torsion and curvature
From the connection ∇ we can construct two intrinsic geometric objects: the torsion tensor
T : X (M )2 → X (M ) and the Riemann tensor R : X (M )3 → X (M ), as follows.

T (X, Y ) = ∇X Y − ∇Y X − [X, Y ]
R(X, Y )Z = ∇X ∇Y Z − ∇Y ∇X Z − ∇[X,Y ] Z .

They are both antisymmetric in X ↔ Y . Although not manifest, with a bit of algebra one
can show that these objects are tensors—as opposed to differential operators—namely they
only depend on X, Y, Z at the point p and not on their derivatives.

¶ Exercise 15. Verify that T (f X, gY ) = f g T (X, Y ) and R(f X, gY )hZ = f gh R(X, Y )Z.

Acting on a coordinate basis, we define the tensors in components:

T(e_µ, e_ν) = T^λ_{µν} e_λ , R(e_µ, e_ν) e_λ = R^σ_{λµν} e_σ

(notice the order of indices in R). They can be explicitly expressed in terms of the connection
coefficients:

T^λ_{µν} = 2Γ^λ_{[µν]}
R^σ_{λµν} = 2∂_{[µ}Γ^σ_{ν]λ} + 2Γ^σ_{[µ|ρ}Γ^ρ_{ν]λ} ,


where [µ, ν] means antisymmetrization of indices.
The curvature tensor R(X, Y )Z measures the difference between parallel transport of Z
along infinitesimal Xp and then Yp , as opposed to Yp and then Xp . The torsion tensor mea-
sures the difference between the point indicated by X parallel transported along infinitesimal
Yp , as opposed to Y parallel transported along infinitesimal Xp .

5.3 The Levi-Civita connection


So far we have left the connection Γ arbitrary. When the manifold M is endowed with a
metric, we can demand that gµν be covariantly constant. This means that if two vectors X
and Y are parallel transported along a curve, then their inner product remains constant.
Definition 5.3. An affine connection ∇ is said to be metric compatible, or simply a
metric connection, if
∇V g = 0
for any vector field V and at every point on M . In components:

(∇λ g)µν = 0 .

Writing this condition in terms of the connection coefficients, one finds an expression for the
symmetric part Γ^λ_{(µν)} in terms of the metric and the torsion tensor (¶). Recalling that the
antisymmetric part is itself the torsion tensor, one finds

Γ^λ_{µν} = (1/2) g^{λρ} (2∂_{(µ}g_{ν)ρ} − ∂_ρ g_{µν} + 2T^σ_{ρ(µ} g_{ν)σ}) + (1/2) T^λ_{µν} .
Definition 5.4. A connection ∇ is said to be symmetric if the torsion tensor vanishes, namely
if Γ^λ_{[µν]} = 0.
¶ Exercise 16. Let ∇ be a symmetric connection.
(a) Let f ∈ F(M). Show that ∇_µ∇_ν f = ∇_ν∇_µ f.
(b) Let ω ∈ Ω^1(M). Show that dω = (∇_µ ω)_ν dx^µ ∧ dx^ν.
Proposition 5.5. On a (pseudo-) Riemannian manifold (M, g) there exists a unique sym-
metric connection compatible with the metric g. This connection is called the Levi-Civita
connection.

Proof. The constructive proof is the explicit expression for the Levi-Civita connection:

Γ^λ_{µν} = (1/2) g^{λρ} (∂_µ g_{νρ} + ∂_ν g_{µρ} − ∂_ρ g_{µν}) .

¶ Exercise 17. Geodesics and minimal length


¶ Exercise 18. Geodesics on the upper half plane with Poincaré metric
Definition 5.6. The Ricci tensor is a (0, 2) tensor defined by

Ric(X, Y) = ⟨dx^µ, R(e_µ, Y) X⟩ , Ric_{µν} = R^λ_{µλν} .

This can be defined for any connection. In the presence of a metric, one can also define

R = g^{µν} Ric_{µν} ,

called the scalar curvature.
The Riemann tensor defined with respect to the Levi-Civita connection satisfies some
special properties.
(i) Symmetry. The object R_{λρµν} = g_{λσ} R^σ_{ρµν} satisfies^21

R_{(λρ)µν} = 0 , R_{λρµν} = R_{µνλρ} , R_{λρ(µν)} = 0 , Ric_{µν} = Ric_{νµ} .

21 To prove these symmetry properties, one can first write the Riemann tensor as

R_{λρµν} = (1/2) ( ∂^2g_{λν}/∂x^ρ∂x^µ + ∂^2g_{ρµ}/∂x^λ∂x^ν − ∂^2g_{λµ}/∂x^ρ∂x^ν − ∂^2g_{ρν}/∂x^λ∂x^µ ) + g_{ζη} ( Γ^ζ_{λν} Γ^η_{ρµ} − Γ^ζ_{λµ} Γ^η_{ρν} ) .
(ii) Bianchi identities:

0 = R(X, Y)Z + R(Z, X)Y + R(Y, Z)X
0 = (∇_X R)(Y, Z)V + (∇_Z R)(X, Y)V + (∇_Y R)(Z, X)V .

In components they take the simple form

R^λ_{[ρµν]} = 0 , ∇_{[σ} R^λ_{|ρ|µν]} = 0

in terms of antisymmetrization.
(iii) Conservation of the Einstein tensor:^22

∇_µ (Ric^{µν} − (1/2) g^{µν} R) ≡ ∇_µ G^{µν} = 0 .

Here G^{µν} is called the Einstein tensor. This relation shows that it is automatically
conserved.
Taking into account the symmetry properties and the first Bianchi identity, one finds^23
that in m dimensions the number of algebraically independent components of the Riemann
tensor is F(m) = m^2(m^2 − 1)/12. We find F(1) = 0: indeed every one-dimensional manifold is
flat. Then F(2) = 1: indeed in two dimensions the curvature is fixed by the scalar curvature.
Finally F(3) = 6: indeed in three dimensions the curvature is fixed by the Ricci tensor.
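The counting can be tabulated in a couple of lines, following the combinatorics spelled out in footnote 23:

```python
from math import comb

def F(m):
    """Independent Riemann components in m dimensions: pairs [mu nu] give
    N = m(m-1)/2 values, symmetry in the two pairs gives N(N+1)/2 components,
    and the first Bianchi identity removes comb(m, 4) of them."""
    N = m * (m - 1) // 2
    return N * (N + 1) // 2 - comb(m, 4)

assert [F(m) for m in (1, 2, 3, 4)] == [0, 1, 6, 20]
# Agreement with the closed form F(m) = m^2 (m^2 - 1) / 12:
assert all(F(m) == m**2 * (m**2 - 1) // 12 for m in range(1, 12))
```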

5.4 Isometries and conformal transformations


Definition 5.7. Let (M, g) be a (pseudo-) Riemannian manifold. We say that a diffeomor-
phism f : M → M is an isometry if it preserves the metric:

f^∗ g|_{f(p)} = g_p .

In components:

g_{αβ}(f(x)) (∂f^α/∂x^µ)(∂f^β/∂x^ν) = g_{µν}(x) .
The identity map, the inverse of an isometry and the composition of isometries are isometries,
therefore they form a group.
22 The relation simply follows from a double contraction of the second Bianchi identity with the compatible
metric.
23 Each antisymmetric pair [µν] has N = m(m−1)/2 components. The symmetrization of the two pairs gives
N(N+1)/2 components. The first Bianchi identity, R_{λρµν} + R_{λµνρ} + R_{λνρµ} = 0, is a totally antisymmetric tensor in four
indices, and imposes (m choose 4) constraints. The formula for F(m) follows.
In 2 dimensions: R_{λρµν} = R g_{λ[µ} g_{ν]ρ}. In 3 dimensions: R_{λρµν} = 2 (g_{λ[µ} Ric_{ν]ρ} − g_{ρ[µ} Ric_{ν]λ}) − R g_{λ[µ} g_{ν]ρ}.

Definition 5.8. Let (M, g) be a Riemannian manifold and X ∈ X(M) a vector field. If
X generates a one-parameter family of isometries, it is called a Killing vector field. The
infinitesimal isometries are f : x^µ → x^µ + εX^µ with ε infinitesimal. This leads to

X^λ ∂_λ g_{µν} + 2∂_{(µ}X^λ g_{ν)λ} = (L_X g)_{µν} = 0 ,

called the Killing equation.


¶ Exercise 19. Show that the Killing equation can be rewritten in terms of the Levi-Civita
connection ∇ as24
(∇µ X)ν + (∇ν X)µ = 0 .
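A quick check in flat R^2, where Γ = 0 and the Killing equation reduces to ∂_µ X_ν + ∂_ν X_µ = 0 (the two example fields below are standard choices, not taken from the text):

```python
EPS = 1e-6

def partial(f, i, p):
    """Central finite difference of f at p along direction i."""
    q1, q2 = list(p), list(p)
    q1[i] += EPS
    q2[i] -= EPS
    return (f(q1) - f(q2)) / (2 * EPS)

def killing_defect(X, p):
    """max over (mu, nu) of |d_mu X_nu + d_nu X_mu| at the point p."""
    return max(abs(partial(lambda q: X(q)[nu], mu, p) + partial(lambda q: X(q)[mu], nu, p))
               for mu in range(2) for nu in range(2))

rotation = lambda p: [-p[1], p[0]]   # generator of rotations about the origin
dilation = lambda p: [p[0], p[1]]    # generator of dilations

p = [1.3, -0.4]
assert killing_defect(rotation, p) < 1e-9   # rotations are isometries of the plane
assert killing_defect(dilation, p) > 1.0    # dilations rescale lengths: not Killing
```

The dilation field fails the Killing equation (its defect is 2δ_{µν}), though it does preserve angles, which anticipates the conformal transformations discussed next.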

Let X, Y be two Killing vector fields. We easily verify:


(a) any linear combination aX + bY (a, b ∈ R) is a Killing vector field;
(b) the Lie bracket [X, Y ] is a Killing vector field.
Thus Killing vector fields form a Lie algebra of symmetries (isometries) of the manifold M .

Definition 5.9. Let (M, g) be a (pseudo-) Riemannian manifold. We say that a diffeo-
morphism f : M → M is a conformal transformation if it preserves the metric up to a
scale:

f^∗ g|_{f(p)} = e^{2σ(p)} g_p , σ ∈ F(M) .

In components:

g_{αβ}(f(x)) (∂f^α/∂x^µ)(∂f^β/∂x^ν) = e^{2σ(x)} g_{µν}(x) .

Let us define the angle θ between two vectors X, Y ∈ T_p(M) as

cos θ = g_p(X, Y) / [ g_p(X, X) g_p(Y, Y) ]^{1/2} .

Conformal transformations change lengths but preserve angles. The set of conformal trans-
formations on M is a group, the conformal group Conf(M).
Definition 5.10. Let (M, g) be a Riemannian manifold and X ∈ X(M) a vector field. If
X generates a one-parameter family of conformal transformations, it is called a conformal
Killing vector field (CKV). The infinitesimal transformations are f : x^µ → x^µ + εX^µ with ε
infinitesimal. The scale factor must be proportional to ε, thus set σ = εψ/2. This leads to

(L_X g)_{µν} = ψ g_{µν} , ψ ∈ F(M) .

It is easy to find

ψ = (1/m) (X^λ g^{µν} ∂_λ g_{µν} + 2∂_λ X^λ) ,

where m = dim M.
CKVs form a Lie algebra of conformal transformations, closed under linear combinations
and Lie bracket (¶).
24 Notice that the Killing condition is weaker than the covariant constancy condition. A covariantly con-
stant vector field, ∇_µ X_ν = 0, is both Killing and of constant norm, ∂_µ(X_ν X^ν) = 0.

A concept related to conformal transformations is Weyl rescalings. Let g, ḡ be metrics on
a manifold M. ḡ is said to be conformally equivalent to g if

ḡ_p = e^{2σ(p)} g_p .

This is an equivalence relation between metrics on M, and each equivalence class is called
a conformal structure. The transformations g → e^{2σ} g, called Weyl rescalings, form an
infinite-dimensional group denoted by Weyl(M).
The Riemann tensor of ḡ is different from the Riemann tensor of g. However, it turns out
that the traceless part of the Riemann tensor is invariant under Weyl rescalings:

W_{λρµν} = R_{λρµν} − (2/(m−2)) (g_{λ[µ} Ric_{ν]ρ} − g_{ρ[µ} Ric_{ν]λ}) + (2/((m−1)(m−2))) R g_{λ[µ} g_{ν]ρ} .

This is called the Weyl tensor. It vanishes identically for m = 3.25


A (pseudo-) Riemannian manifold (M, g) such that in each patch gµν = e2σ ηµν (with the
suitable signature) is said to be conformally flat. Since the Weyl tensor vanishes for a flat
metric, it also vanishes for a conformally flat metric. If m ≥ 4, also the converse is true:
Theorem 5.11. If dim M ≥ 4, a (pseudo-) Riemannian manifold is conformally flat if and
only if its Weyl tensor vanishes.
If dim M = 3, a (pseudo-) Riemannian manifold is conformally flat if and only if its
Cotton tensor vanishes.
If dim M = 2, every (pseudo-) Riemannian manifold is conformally flat.

25
The expression given is not valid for m = 2. However in two dimensions the Riemann tensor is pure
trace, therefore its traceless part vanishes.

6 Fibre bundles
A manifold is a topological space that locally looks like Rn . A fibre bundle is a topological
space which locally looks like a direct product of Rn and another space.
Definition 6.1. A (differentiable) fibre bundle (E, π, M, F, G) consists of the following
elements:
(i) A manifold E called the total space.
(ii) A manifold M called the base space.
(iii) A manifold F called the fibre.
(iv) A surjective map
π:E→M
called the projection. The inverse image of any point, π −1 (p) = Fp , must be isomor-
phic to F and is the fibre at p. (See Figure.)
(v) A Lie group G called the structure group, acting on F from the left.
(vi) A set of open coverings {Ui } of M with diffeomorphisms φi : Ui × F → π −1 (Ui ) such
that π ◦ φi (p, f ) = p. The maps φi are called local trivializations. (See Figure.)
(vii) Writing φ_i(p, f) ≡ φ_{i,p}(f), the maps φ_{i,p} : F → F_p are diffeomorphisms. On U_i ∩ U_j ≠ ∅
we require that

t_{ij}(p) ≡ φ_{i,p}^{−1} ∘ φ_{j,p} : F → F

be an element of G. Then φ_i and φ_j are related by a smooth map t_{ij} : U_i ∩ U_j → G as

φ_j(p, f) = φ_i(p, t_{ij}(p) f) .

The maps t_{ij} are called transition functions.
The effect of the transition functions can be written in an alternative way. Consider the
diffeomorphisms φ_i^{−1} : π^{−1}(U_i) → U_i × F. Take a point u such that π(u) = p ∈ U_i ∩ U_j.
Then

φ_i^{−1}(u)|_F = t_{ij}(p) φ_j^{−1}(u)|_F .

On a well-defined bundle, the transition functions satisfy the following consistency conditions:

t_{ii}(p) = id (p ∈ U_i)
t_{ij}(p) = t_{ji}(p)^{−1} (p ∈ U_i ∩ U_j)
t_{ij}(p) · t_{jk}(p) = t_{ik}(p) (p ∈ U_i ∩ U_j ∩ U_k) .

If all transition functions can be taken to be the identity map, then the bundle is called a
trivial bundle and it is a direct product M × F .
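As a toy illustration of these consistency conditions, here is a minimal Python sketch (a hypothetical example, not from the notes): cover S 1 by two arcs U1 , U2 whose overlap has two connected components, with structure group {+1, −1} ⊂ GL(1, R). Choosing tij = +1 on both components gives the trivial bundle (a cylinder); choosing +1 on one and −1 on the other gives the Möbius bundle. Both choices satisfy the consistency conditions.

```python
# Toy model (hypothetical, not from the notes): S^1 covered by two arcs U1, U2
# whose intersection has two components A and B; structure group {+1, -1}.
# t[(i, j, c)] is the transition function t_ij on component c.

def check_cocycle(t):
    """Verify t_ii = id and t_ij = t_ji^{-1} on every overlap component."""
    for (_, _, c) in t:
        assert t[(1, 1, c)] == 1 and t[(2, 2, c)] == 1   # t_ii(p) = id
        assert t[(1, 2, c)] * t[(2, 1, c)] == 1          # t_ij(p) = t_ji(p)^{-1}
    return True   # with only two charts, t_ij t_jk = t_ik reduces to the above

base = {(1, 1, c): 1 for c in 'AB'} | {(2, 2, c): 1 for c in 'AB'}
cylinder = base | {(1, 2, 'A'): 1, (2, 1, 'A'): 1, (1, 2, 'B'): 1, (2, 1, 'B'): 1}
moebius  = base | {(1, 2, 'A'): 1, (2, 1, 'A'): 1, (1, 2, 'B'): -1, (2, 1, 'B'): -1}

print(check_cocycle(cylinder), check_cocycle(moebius))   # -> True True
```

Note that the consistency conditions alone do not distinguish the two bundles: both sets of transition functions are admissible, yet only the first can be brought to the identity by a change of trivialization.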
For a given bundle E → M with projection π, the set of possible transition functions is not unique. We can change the local trivializations {φi }, without changing the bundle, by choosing maps gi (p) : F → F at each point p ∈ M , required to be diffeomorphisms that belong to G, and then defining

φ̃i,p = φi,p ◦ gi (p) .
The transition functions for the new trivializations are

t̃ij (p) = gi (p)^{-1} ◦ tij (p) ◦ gj (p) .
Physically, the choice of trivializations {φi } is a choice of gauge and the maps {gi } are gauge
transformations, one on each covering patch Ui .

Definition 6.2. Let E → M be a fibre bundle with projection π. A section s : M → E is a smooth map which satisfies π ◦ s = idM . Clearly s(p) = s|p is an element of Fp = π −1 (p). The set of sections on M is denoted by Γ(M, E).
If U ⊂ M , we may talk of a local section, which is defined only on U ; Γ(U, E) denotes the set of local sections on U . Notice that not all fibre bundles admit global sections.

It turns out that a fibre bundle (E, π, M, F, G) can be reconstructed from the data
( M, {Ui }, tij (p), F, G ) .
This amounts to finding E and π from the given data. Construct the disjoint union

X = ⊔i ( Ui × F ) .

Introduce an equivalence relation ∼ between (p, f ) ∈ Ui × F and (q, f ′ ) ∈ Uj × F if p = q and f ′ = tij (p)f (this is an equivalence relation precisely because the transition functions satisfy the consistency relations).
A fibre bundle E is then defined as
E = X/ ∼ .
Denote an element of E as [(p, f )]. The projection is given by
π : [(p, f )] 7→ p .
The local trivializations φi : Ui × F → π −1 (Ui ) are given by
φi : (p, f ) 7→ [(p, f )] .
Let E → M and E ′ → M ′ be fibre bundles with projections π and π ′ . A smooth map f̄ : E → E ′ is called a bundle map if it maps each fibre Fp of E to a fibre Fq′ of E ′ . Then f̄ naturally induces a smooth map f : M → M ′ such that f (p) = q, and the diagram

E −−f̄−→ E ′
π ↓        ↓ π ′
M −−f−→ M ′

commutes.
Let E → M be a fibre bundle with typical fibre F and projection π. Given a map f : N → M , we can define a new fibre bundle f ∗ E over N with the same fibre F , called the pulled-back bundle. Consider the diagram

f ∗ E −−π2−→ E          (p, u) ↦ u
 π1 ↓          ↓ π          ↓        ↓
  N −−f−→ M             p ↦ f (p)

We define f ∗ E as the following subspace of N × E:

f ∗ E = { (p, u) ∈ N × E | f (p) = π(u) } .

We define the projections π1 : (p, u) ↦ p and π2 : (p, u) ↦ u. This makes the diagram above commute. The fibre F̃p of f ∗ E at p is equal to Ff (p) . Then the transition functions of f ∗ E are the pull-back of those of E:

t̃ij (p) = tij (f (p)) = f ∗ tij (p)
(¶ check as an exercise).

6.1 Vector bundles

A vector bundle E → M is a fibre bundle whose fibre is a vector space. Let F be Rk and M an m-dimensional manifold. We call k the fibre dimension. The transition functions belong to GL(k, R), since they map the vector space to itself isomorphically. If F is a complex vector space Ck , the structure group is GL(k, C).

Example 6.3. The tangent bundle T M over an m-dimensional manifold M is a vector bundle whose fibre is Rm . Let u be a point in T M such that π(u) = p ∈ Ui ∩ Uj . Let x^µ = ϕi (p) and x̃^ν = ϕj (p) be coordinate systems on Ui and Uj , respectively. The vector V corresponding to u is expressed as

V = V^µ ∂/∂x^µ |p = Ṽ^ν ∂/∂x̃^ν |p .

The local trivializations are

φi^{-1} (u) = ( p, {V^µ } ) ,    φj^{-1} (u) = ( p, {Ṽ^ν } ) .

The fibre coordinates {V^µ } and {Ṽ^ν } are related as

V^µ = Ṽ^ν ( ∂x^µ /∂x̃^ν )|p ,

therefore the transition function G^µ_ν (p) = ( ∂x^µ /∂x̃^ν )|p belongs to GL(m, R). Hence, the tangent bundle is ( T M, π, M, Rm , GL(m, R) ).
The sections of T M are vector fields on M , namely Γ(M, T M ) = X (M ).
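The transition functions of T S 2 can be checked explicitly with sympy (a sketch using the standard stereographic charts, which are not spelled out in the notes): the Jacobian of the coordinate change plays the role of G(p) ∈ GL(2, R), and the contraction of a one-form with a vector comes out chart-independent.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
r2 = x1**2 + x2**2

# Coordinate change between the two stereographic charts of S^2: xt = x / |x|^2
xt = sp.Matrix([x1 / r2, x2 / r2])

J = xt.jacobian(sp.Matrix([x1, x2]))   # Jacobian d(xt)/d(x)
G = J.inv()                            # transition function d(x)/d(xt) in GL(2,R)

# A tangent vector transforms as Vt = J V, a one-form as w = J^T wt,
# so the contraction <w, V> is the same in both charts:
V = sp.Matrix(sp.symbols('V1 V2', real=True))
wt = sp.Matrix(sp.symbols('w1 w2', real=True))
Vt, w = J * V, J.T * wt

print(sp.simplify((w.T * V - wt.T * Vt)[0]))   # -> 0
```

The determinant of J is −1/(x1² + x2²)², nonzero on the whole overlap, so G(p) is indeed invertible everywhere the two charts meet.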

Example 6.4. The cotangent bundle T ∗ M = ∪p∈M Tp∗ M is defined similarly to the tangent bundle. On a chart Ui whose coordinates are x^µ , a basis of Tp∗ M can be taken to be {dx^µ }, which is dual to {∂/∂x^µ }. If p ∈ Ui ∩ Uj , a one-form ω is represented as

ω = ωµ dx^µ = ω̃ν dx̃^ν .

Therefore the fibre coordinates {ωµ } and {ω̃ν } are related as

ωµ = ω̃ν ( ∂x̃^ν /∂x^µ ) = ω̃ν ( (∂x/∂x̃)^{-1} )^ν_µ ,

and the transition functions are G(p)^{-1} in terms of the transition functions G(p) of the tangent bundle, and the structure group is still GL(m, R). Notice that in the contraction

⟨ω, V ⟩ = ωµ V^µ

the transition functions cancel, namely the contraction takes values in the trivial bundle
with fibre R.
The sections of T ∗ M are the one-forms on M , namely Ω1 (M ) = Γ(M, T ∗ M ).

The construction is more general. Given a vector bundle E → M with fibre F , we may define its dual bundle E ∗ → M . The fibre F ∗ of E ∗ is the vector space of linear maps from F to R (or C). Given a basis {ea (p)} of Fp , we define the dual basis {θ^a (p)} of Fp∗ such that ⟨θ^a (p), eb (p)⟩ = δ^a_b . The transition functions of E ∗ are the inverse of those of E.

The set of sections of a vector bundle forms an infinite-dimensional vector space. Addition and scalar multiplication are defined pointwise:

(s + s′ )(p) = s(p) + s′ (p) ,    (f s)(p) = f (p) s(p) .

We see that the coefficients can be taken in the algebra F (M ) of functions on M , more general than just R (so the sections form a module over F (M )). Vector bundles always admit a special section called the null section s0 ∈ Γ(M, E), such that φi^{-1} ( s0 (p) ) = (p, 0) in any local trivialization. This is also the origin of the vector space of sections.
Given a metric hµν on the fibre, it defines an inner product

(s, s′ )p = hµν (p) s^µ (p) s′^ν (p) .

This is a function in F (M ), not a number, since F (M ) plays the role of the scalars for the space of sections.

6.2 Principal bundles

A principal bundle, or G-bundle, denoted P → M or P (M, G), is a fibre bundle whose fibre F is equal to the structure (Lie) group G.
The transition functions act on the fibres from the left as before. In addition, we can define an action of G on F from the right. Let φi : Ui × G → π −1 (Ui ) be a local trivialization given by φi^{-1} (u) = (p, gi ) where p = π(u). Then

φi^{-1} (ua) = (p, gi a)    ∀a ∈ G .

Since transition functions act from the left, and the left and right actions commute, this definition is independent of the local trivialization: if p ∈ Ui ∩ Uj ,

ua = φj (p, gj a) = φi ( p, tij (p) gj a ) = φi (p, gi a) .

Thus the right multiplication is a map P × G → P , (u, a) ↦ ua. Notice that π(ua) = π(u).
Moreover, the right multiplication on π −1 (p) = Fp ≅ G is:
(a) Transitive, i.e. for any u1 , u2 ∈ π −1 (p) there exists an element a ∈ G such that
u1 = u2 a. This means that we can reconstruct the whole fibre as π −1 (p) = {ua | a ∈ G}
starting from u.
(b) Free, i.e. if ua = u for some u ∈ P then a = 1.

Given a local section si (p) over Ui , it selects a preferred local trivialization φi as follows.
For u ∈ π −1 (p) and p ∈ Ui , there is a unique element gu ∈ G such that u = si (p)gu . Then
we define
φi^{-1} (u) = (p, gu ) .

In other words, this is the local trivialization in which the local section appears as

si (p) = φi (p, 1) .

The transition functions relate the various local sections:

sj (p) = si (p) tij (p) .

This follows from sj (p) = φj (p, 1) = φi ( p, tij (p) · 1 ) = φi (p, 1) tij (p) = si (p) tij (p).

Proposition 6.5. A principal bundle is trivial, or parallelizable, if and only if it admits a globally defined section s(p).

Proof. If there is a globally defined section, we can use it to define local trivializations on all patches Ui . The transition functions between those trivializations are trivial, tij (p) = 1, so the bundle is trivial. Conversely, a trivial bundle is P = M × G and any constant section is globally defined.
Proposition 6.6. Let G be a Lie group and H a closed Lie subgroup. Then G is a principal
bundle over M = G/H with fibre H.
The projection π : G → M = G/H is given by π : g ↦ [g] = {gh | h ∈ H}. By definition, π −1 (p) = Fp ≅ H. To construct local trivializations, we cover M with patches Ui and then
construct local sections si (p).

Example 6.7. Hopf map. A nice example is SU (2), which is a U (1) bundle over SU (2)/U (1).
First, SU (2) ≅ S 3 . Let M = ( a b ; c d ) with a, b, c, d ∈ C be a complex matrix. Then M is in SU (2) if M^{-1} = (1/ det M ) ( d −b ; −c a ) is equal to M † and det M = 1. We find that the matrices of SU (2) are M = ( a b ; −b∗ a∗ ) with |a|^2 + |b|^2 = 1. This is S 3 .
Second, SU (2)/U (1) ≅ S 2 . The subgroup U (1) is given by the matrices U = ( e^{iθ} 0 ; 0 e^{−iθ} ), which rotate the phase of a and b, and we should identify points related by a phase rotation. For a ≠ 0 we can use the phase rotation to set a ∈ R+ , and we are left with a^2 + b1^2 + b2^2 = 1 (where b = b1 + ib2 ), which is a hemisphere (a disk). However for a = 0 the phase rotation identifies all points at the boundary of the hemisphere, and we get S 2 .
We obtain that S 3 is a U (1) bundle over S 2 . This is called the Hopf fibration.
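A quick numerical check of the Hopf fibration (using a standard parametrization of the map, assumed here rather than taken from the notes): writing a point of S 3 as (a, b) ∈ C2 with |a|^2 + |b|^2 = 1, the map (a, b) ↦ ( 2 Re(a b̄), 2 Im(a b̄), |a|^2 − |b|^2 ) lands on S 2 , and is invariant under the common phase rotation (a, b) ↦ (e^{iθ} a, e^{iθ} b), i.e. along the U (1) fibre.

```python
import numpy as np

def hopf(a, b):
    """Hopf map S^3 -> S^2 for (a, b) in C^2 with |a|^2 + |b|^2 = 1."""
    w = a * np.conj(b)
    return np.array([2 * w.real, 2 * w.imag, abs(a)**2 - abs(b)**2])

rng = np.random.default_rng(0)
v = rng.normal(size=4)
v /= np.linalg.norm(v)                        # random point on S^3
a, b = v[0] + 1j * v[1], v[2] + 1j * v[3]

n = hopf(a, b)
print(abs(np.linalg.norm(n) - 1) < 1e-12)     # the image lies on the unit S^2

phase = np.exp(0.7j)                          # move along the U(1) fibre:
print(np.allclose(hopf(phase * a, phase * b), n))   # the image is unchanged
```

The first check works because |n|² = 4|a|²|b|² + (|a|² − |b|²)² = (|a|² + |b|²)² = 1; the second is the statement that the fibre is projected out.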

6.3 Associated bundles


Given a principal bundle P (M, G) and a manifold F on which G acts from the left, we can construct an associated bundle. We define an action of G on P × F as26

g : (u, f ) ↦ (ug, g −1 f ) .

The associated bundle (E, π, M, F, G) is

E = (P × F )/G ,    where    (u, f ) ∼ (ug, g −1 f ) .

In other words, the associated bundle has the same base space M , it has fibre F and
its transition functions are the same as those of the principal bundle. The projection is
πE ([(u, f )]) = π(u). The local trivializations are

ψi : Ui × F → πE^{-1} (Ui )    such that    ψi : (p, f ) ↦ [ ( φi (p, 1), f ) ] .

For example, we can take F to be a k-dimensional vector space V which provides a k-dimensional representation ρ of G. Thus from a principal G-bundle we can construct an associated Rk vector bundle. The transition functions are ρ( tij (p) ).

7 Connections on fibre bundles


Manifolds and bundles only have a topological and differentiable structure. We would like to introduce a new structure: the “parallel transport” of elements of the fibre F as we move along the base M . Such a structure is called a connection. We start by defining a connection on principal bundles. We use a geometric and coordinate-invariant definition.
26 This definition guarantees that G acts with a group action, because g2^{-1} g1^{-1} = (g1 g2 )^{-1} .

Definition 7.1. Let u be an element of a principal bundle P (M, G) and let Gp be the fibre
at p = π(u). The vertical subspace Vu P is the subspace of Tu P that is tangent to Gp at
u. [Warning: Tu P is the tangent space of P , not just of M .] That is

Vu P = ker π∗ .

To construct Vu P , take A ∈ g. The right-action

Rexp(tA) u = u exp(tA)

gives a curve through u in P , lying within Gp . Define a vector A♯ ∈ Tu P by

A♯ [f ] = (d/dt) f ( u exp(tA) ) |t=0 ,

where f ∈ F (P ). As we vary u ∈ P , this gives a vector field A♯ ∈ X (P ).
The map ♯ : g → Vu P given by A ↦ A♯ is an isomorphism of vector spaces.

¶ Exercise 20. We can explicitly check that π∗ A♯ = 0:

(π∗ A♯ )[f ] = A♯ [f ◦ π] = (d/dt) f ◦ π ( u exp(tA) ) |t=0 = (d/dt) f (π(u)) |t=0 = 0 ,

where f ∈ F (M ).
Definition 7.2. Let P (M, G) be a principal bundle. A connection on P is a separation
of the tangent space Tu P into the vertical subspace Vu P and a horizontal subspace Hu P
such that
(i) Tu P = Hu P ⊕ Vu P

46
(ii) The separation is smooth, meaning that any (smooth) vector field X on P is separated
into (smooth) vector fields X H ∈ Hu P and X V ∈ Vu P as X = X H + X V
(iii) Hug P = Rg∗ Hu P ,
i.e. the separation at a point on the fibre fixes the separation on the whole fibre. (See
Figure)
The separation between vertical and horizontal subspaces is specified by the connection one-form.
Definition 7.3. A connection one-form ω ∈ T ∗ P ⊗ g is a projection of Tu P onto the vertical subspace Vu P ≅ g:27
(i) ω(A♯ ) = A    ∀A ∈ g
(ii) R∗g ω = g −1 ωg .
Then ω defines the horizontal subspace Hu P as its kernel:

Hu P ≡ { X ∈ Tu P | ω(X) = 0 } .
Remark. We can check that this definition is consistent with the definition of Hu P . We have

Rg∗ Hu P = { Rg∗ X | X ∈ Tu P , ω(X) = 0 } .

Using the definition of ω we rewrite

0 = ω(X) = g (R∗g ω)(X) g −1 = g ω(Rg∗ X) g −1 .

Since the adjoint action is invertible, we conclude

Rg∗ Hu P = { Rg∗ X ∈ Tug P | ω(Rg∗ X) = 0 } = Hug P .

Let {Ui } be an open covering of M and let σi : M → P be local sections defined on each
Ui . Then we can represent ω by Lie-algebra-valued one-forms Ai on Ui :
Ai ≡ σi∗ ω ∈ Ω1 (Ui ) ⊗ g .
[It turns out that one can reconstruct ω from Ai .]28 In components we write
Ai = (Ai )aµ dxµ Ta ,
where {Ta } is a basis of g.
27 Equation (ii) can be written as (R∗g ω)u (X) ≡ ωug (Rg∗ X) = g −1 ωu (X) g. On the RHS is the adjoint action of G on g. To define it, construct the map adg : G → G that maps x ↦ gxg −1 for any g ∈ G. In particular it maps 1 ↦ 1. Its differential map at 1 is adg∗ ≡ Adg : T1 G → T1 G. Identifying T1 G ≅ g, we obtain the adjoint action on the algebra.
28 One reconstructs g-valued one-forms ωi on π −1 (Ui ) ⊂ P by
ωi = gi^{-1} π ∗ Ai gi + gi^{-1} dP gi .
If the Ai transform as a gauge potential from patch to patch, then the ωi ’s agree on intersections and give ω.

Lemma 7.4. Let σi and σj be local sections of a principal bundle P (M, G) on Ui ∩ Uj , related by σj (p) = σi (p) tij (p). Then for X ∈ Tp M (and p ∈ Ui ∩ Uj ):

σj∗ X = Rtij ∗ (σi∗ X) + ( tij^{-1} dtij (X) )♯ ,

which is a vector in Tσj (p) P .

Proof. We take a curve γ : [−1, 1] → M such that γ(0) = p and γ̇(0) = X. Recall that, in components, (σ∗ X)^ν = (d/dt) σ^ν (γ(t)) |t=0 . Keeping the index ν implicit and using the shorthand notation σ(t) for σ(γ(t)) and tij (t) for tij (γ(t)), we find:

σj∗ X = (d/dt) σj (t) |t=0 = (d/dt) ( σi (t) tij (t) ) |t=0
= ( (d/dt) σi (t) ) |t=0 · tij (p) + σi (p) · ( (d/dt) tij (t) ) |t=0
= Rtij ∗ (σi∗ X) + σj (p) · tij (p)^{-1} ( (d/dt) tij (t) ) |t=0 .

We should interpret the last term. Notice that

tij (p)^{-1} dtij (X) = tij (p)^{-1} (d/dt) tij (t) |t=0 ∈ T1 G ≅ g .

If we compare with the definition of A♯ at u ∈ P :

(A♯u )^ν = (d/dt) ( u (1 + tA + . . . ) )^ν |t=0 ,

we see that the last term is ( tij (p)^{-1} dtij (X) )♯ at σj (p).

We use the lemma to compare pull-backs of the connection one-form:

σj∗ ω(X) = ω(σj∗ X) = ω( Rtij ∗ σi∗ X ) + ω( ( tij^{-1} dtij (X) )♯ )
= (R∗tij ω)(σi∗ X) + tij^{-1} dtij (X) = tij^{-1} σi∗ ω(X) tij + tij^{-1} dtij (X) .

We used the properties of the connection one-form. Since this relation is valid for any X, we conclude

Aj = tij^{-1} Ai tij + tij^{-1} dtij .

This is precisely the way in which a gauge potential transforms under gauge transformations.
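For G = U (1) this patching rule reduces to the familiar abelian gauge transformation: with tij = e^{iλ} everything commutes, so Aj = Ai + i dλ. A one-variable sympy check (λ and the potential component are arbitrary placeholder functions, not from the notes):

```python
import sympy as sp

x = sp.symbols('x', real=True)
lam = sp.Function('lam')(x)               # gauge parameter λ(x) (placeholder)
Ai = sp.I * sp.Function('a')(x)           # u(1)-valued potential component

t = sp.exp(sp.I * lam)                    # transition function t_ij = e^{iλ}
Aj = t**-1 * Ai * t + t**-1 * sp.diff(t, x)   # general patching rule

# Reduces to the abelian gauge transformation A_j = A_i + i dλ:
print(sp.simplify(Aj - (Ai + sp.I * sp.diff(lam, x))))   # -> 0
```

The conjugation term drops out only because U (1) is abelian; for a non-abelian group both terms in the patching rule are nontrivial.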

7.1 Curvature
Definition 7.5. The curvature two-form Ω ∈ Ω2 (P ) ⊗ g is defined as

Ω = dP ω + ω ∧ ω ,

in terms of the connection one-form ω ∈ Ω1 (P ) ⊗ g.

Here dP is the differential on P . Let us define the last term. Let ζ, η be g-valued forms: ζ ∈ Ωp (M ) ⊗ g and η ∈ Ωq (M ) ⊗ g. This means that we can decompose

ζ = ζ^a ⊗ Ta ,    η = η^b ⊗ Tb

where {Ta } is a basis of g, while ζ^a ∈ Ωp (M ) and η^b ∈ Ωq (M ). Then

[ζ, η] ≡ ζ ∧ η − (−1)^{pq} η ∧ ζ
= ζ^a ∧ η^b Ta Tb − (−1)^{pq} η^b ∧ ζ^a Tb Ta
= ζ^a ∧ η^b ⊗ [Ta , Tb ] = C^c_{ab} ζ^a ∧ η^b ⊗ Tc .

In the special case that ζ = η and p = q is odd:

ζ ∧ ζ = (1/2) [ζ, ζ] = (1/2) ζ^a ∧ ζ^b ⊗ [Ta , Tb ] .
Lemma 7.6. The curvature two-form Ω satisfies

Rg∗ Ω = g −1 Ωg for g ∈ G .

Proof. It is enough to expand

Rg∗ Ω = Rg∗ (dP ω + ω ∧ ω) = dP Rg∗ ω + Rg∗ ω ∧ Rg∗ ω = dP (g −1 ωg) + g −1 ωg ∧ g −1 ωg

and use that g is a constant.


Proposition 7.7. Defining the local form F of the curvature Ω as

F ≡ σ∗Ω ,

where σ : U → P is a local section of the principal bundle P (M, G) on a chart U , it is


expressed in terms of the gauge potential A = σ ∗ ω as

F = dA + A ∧ A .

Proof. We have F = σ ∗ (dP ω + ω ∧ ω) = d σ ∗ ω + σ ∗ ω ∧ σ ∗ ω.

In components we can write

F = (1/2) Fµν dx^µ ∧ dx^ν = (1/2) F^a_{µν} dx^µ ∧ dx^ν Ta = F^a Ta ,

where we have expanded the form and/or algebra-valued part. Then

Fµν = ∂µ Aν − ∂ν Aµ + [Aµ , Aν ]
F^a_{µν} = ∂µ A^a_ν − ∂ν A^a_µ + C^a_{bc} A^b_µ A^c_ν .

The first expression is still g-valued.

Proposition 7.8. Let Ui and Uj be overlapping charts of M with local forms Fi and Fj of
the curvature. On Ui ∩ Uj they satisfy

Fj = tij^{-1} Fi tij

where tij is the transition function.

Proof. The simplest proof is to start from Fj = dAj + Aj ∧ Aj and substitute the relation Aj = tij^{-1} Ai tij + tij^{-1} dtij . One makes use of tij^{-1} dtij tij^{-1} = −d( tij^{-1} ). The details are left as an exercise ¶.
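The covariance of the curvature can also be verified symbolically. Below is a sympy sketch in two dimensions with an su(2)-valued potential and a diagonal transition function t = exp(if σ3 ); both the potential A and the function f are arbitrary placeholders, not taken from the notes:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
s1 = sp.Matrix([[0, 1], [1, 0]])
s2 = sp.Matrix([[0, -sp.I], [sp.I, 0]])
s3 = sp.Matrix([[1, 0], [0, -1]])

# Placeholder su(2)-valued potential A = Ax dx + Ay dy (anti-Hermitian):
Ax = sp.I * (x * s1 + y * s3)
Ay = sp.I * (x * y * s2)

def curvature(Ax, Ay):
    """The F_{xy} component of F = dA + A ∧ A."""
    return sp.diff(Ay, x) - sp.diff(Ax, y) + Ax * Ay - Ay * Ax

# Transition function t = exp(i f σ3), written out as a diagonal matrix:
f = x**2 - y
t = sp.diag(sp.exp(sp.I * f), sp.exp(-sp.I * f))
tinv = sp.diag(sp.exp(-sp.I * f), sp.exp(sp.I * f))

# Gauge-transformed potential A_j = t^{-1} A_i t + t^{-1} dt, per component:
Ajx = tinv * Ax * t + tinv * sp.diff(t, x)
Ajy = tinv * Ay * t + tinv * sp.diff(t, y)

Fi, Fj = curvature(Ax, Ay), curvature(Ajx, Ajy)

# The curvature transforms covariantly, F_j = t^{-1} F_i t:
print(sp.simplify(sp.expand(Fj - tinv * Fi * t)))   # zero 2x2 matrix
```

The inhomogeneous t^{-1} dt pieces of the two components cancel between dA and A ∧ A, which is exactly the content of the proposition.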

The curvature two-form satisfies a constraint known as the Bianchi identity. In a


coordinate-invariant form this is

dP Ω(X H , Y H , Z H ) = 0

for any X H , Y H , Z H ∈ Hu P . This follows from dP Ω = dP ω ∧ ω − ω ∧ dP ω and the fact that


ω is a projection to the vertical subspace.
More convenient is the local form of the Bianchi identity. We introduce a covariant
derivative D on g-valued p-forms on M :

Dη ≡ dη + [A, η] .

Then, expanding dF, we find

DF ≡ dF + [A, F] = 0 .

In components:
0 = ∂[µ Fνρ] + A[µ Fνρ] − F[µν Aρ] .

7.2 Parallel transport and covariant derivative


Definition 7.9. Let P (M, G) be a G-bundle and let γ : [0, 1] → M be a curve in M . A
curve γ̃ : [0, 1] → P is said to be a horizontal lift of γ if π ◦ γ̃ = γ and the tangent vector
to γ̃(t) always belongs to Hγ̃(t) P .
Theorem 7.10. Let γ : [0, 1] → M be a curve in M and let u0 ∈ π −1 (γ(0)). Then there exists a unique horizontal lift γ̃(t) in P such that γ̃(0) = u0 .


The proof is based on the observation that if X̃ is the tangent vector to γ̃, then the curve satisfies ω(X̃) = 0. This equation reduces to an ODE, and thus the solution exists and is unique. See the Figure.
Let us write the equation. We work in a trivialization given by a local section σi (p), such that φi (p, 1) = σi (p). Then let

γ̃(t) = σi (γ(t)) · gi (t)

for some function gi : [0, 1] → G that we want to determine. With a slight modification of Lemma 7.4, we find

X̃ = γ̃∗ (d/dt) = (d/dt) γ̃ = (d/dt) ( σi (γ(t)) · gi (t) )
= Rgi (t)∗ (σi ◦ γ)∗ (d/dt) + ( gi^{-1} (dgi /dt) )♯ .

Applying ω and using its properties, we get

0 = ω(X̃) = gi (t)^{-1} ω( σi∗ γ∗ (d/dt) ) gi (t) + gi (t)^{-1} (dgi (t)/dt) .

Multiplying by gi (t) we get

dgi (t)/dt = −ω( σi∗ γ∗ (d/dt) ) · gi (t) ,

which is an ODE. Noticing that γ∗ (d/dt) ≡ X is the vector tangent to γ in M , we can write

dgi (t)/dt = −Ai (X) gi (t)

in terms of the local form of the connection.
Remark. The solution can be formally written as a path-ordered exponential:

gi (t) = P exp ( − ∫_{γ(0)}^{γ(t)} Aiµ (γ) dγ^µ ) .
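The horizontal-lift ODE can be integrated numerically. A sketch (not from the notes): take a U (1) bundle over the circle with constant local connection A = ia dθ; integrating dg/dθ = −A(X) g once around the loop must reproduce the abelian holonomy e^{−2πia} , which is what the path-ordered exponential gives in the abelian case.

```python
import numpy as np

a = 0.37                        # placeholder coefficient: A = i*a dθ, u(1)-valued

def rhs(g):
    return -1j * a * g          # dg/dθ = -A_i(X) g  with  A_i(X) = i a

g, n = 1.0 + 0.0j, 2000
h = 2 * np.pi / n
for _ in range(n):              # classic 4th-order Runge-Kutta around the loop
    k1 = rhs(g)
    k2 = rhs(g + h / 2 * k1)
    k3 = rhs(g + h / 2 * k2)
    k4 = rhs(g + h * k3)
    g += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# The holonomy matches the abelian path-ordered exponential exp(-2πi a):
print(abs(g - np.exp(-2j * np.pi * a)))
```

For a non-abelian group the same integration works with matrix-valued g and A, and the result is the genuine path-ordered exponential, which no longer reduces to an ordinary exponential.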

Corollary 7.11. Let γ̃ ′ be another horizontal lift of γ, such that γ̃ ′ (0) = γ̃(0) g. Then γ̃ ′ (t) = γ̃(t) g for all t.

Proof. We have to show that if γ̃(t) is a horizontal lift of γ, then so is γ̃ ′ (t) ≡ γ̃(t) g = Rg γ̃(t). The horizontal lift γ̃ satisfies the equation

ω( γ̃∗ (d/dt) ) = 0    for all t ∈ [0, 1] .

Then, using the properties of the connection one-form,

ω( γ̃∗′ (d/dt) ) = ω( Rg∗ γ̃∗ (d/dt) ) = (R∗g ω)( γ̃∗ (d/dt) ) = g −1 ω( γ̃∗ (d/dt) ) g = 0 ,

which concludes the proof.

Given a curve γ : [0, 1] → M and a point u0 ∈ π −1 (γ(0)), there is a unique horizontal lift γ̃ of γ such that γ̃(0) = u0 , and hence a unique point

u1 = γ̃(1) ∈ π −1 (γ(1)) .

The point u1 is called the parallel transport of u0 along the curve γ.

In physics, we often need to differentiate sections of a vector bundle which is associated


with a certain principal bundle. For example, a charged scalar field in QED is regarded as a
section of a C vector bundle associated to a U (1) principal bundle P (M, U (1)). A connection
one-form ω on a principal bundle enables us to define a covariant derivative on the bundles associated to P .
Let P (M, G) be a G-bundle with projection πP , ρ a representation of G on the vector space V , and E = P ×ρ V the associated bundle whose elements are the classes

[(u, v)] = { (ug, ρ(g)^{-1} v) | g ∈ G } ,    u ∈ P , v ∈ V .

Given an element [(u0 , v)] ∈ πE^{-1} (p), it is natural to define its parallel transport along a curve γ in M as

[(γ̃(t), v)]

where γ̃(t) is the horizontal lift of γ in P with γ̃(0) = u0 .
To define the covariant derivative, we notice that given a section s ∈ Γ(M, E) and a curve γ : [−1, 1] → M , we can always represent the section along the curve as

s(γ(t)) = [(γ̃(t), η(t))]

for some η(t) ∈ V . We define the covariant derivative of s along γ(t) at p = γ(0) as

∇X s ≡ [ ( γ̃(0), (d/dt) η(t)|t=0 ) ] ,
where X is the tangent vector to γ(t) at p. By construction, the parallel transport is a
covariantly-constant transport.
The covariant derivative can be computed at any point of M , therefore if X is a vector
field on M then ∇X is a map Γ(M, E) → Γ(M, E). Since this map turns out to be point-wise
linear in X, we can think of ∇ as a map
∇ : Γ(M, E) → Ω1 (M ) ⊗ Γ(M, E) ,
in the sense that ⟨∇s, X⟩ = ∇X s.
Proposition 7.12. The covariant derivative satisfies the following properties:
∇(a1 s1 + a2 s2 ) = a1 ∇s1 + a2 ∇s2
∇(f s) = df ⊗ s + f ∇s
∇(a1 X1 +a2 X2 ) s = a1 ∇X1 s + a2 ∇X2 s
∇f X s = f ∇X s .
Here s ∈ Γ(M, E), a1,2 ∈ R and f ∈ F (M ).
¶ Exercise 21. Prove them.

Proof. The first one follows from the linear structure of V . The second one, when contracted with X, follows from (d/dt) ( f (γ(t)) η(t) ) = ( (d/dt) f (γ(t)) ) η(t) + f (d/dt) η(t) = ⟨df, X⟩ η(t) + f (d/dt) η(t). The last two, which we already used in the definition of ∇, are more easily proven in the local expression given below.

Take a local section σi (p) ∈ Γ(Ui , P ) and the local trivialization with φi (p, 1) = σi (p) on P . If {e0α } is a basis of V , we let eα (p) = [(σi (p), e0α )] be local sections of the associated bundle E = P ×ρ V . We compute the covariant derivative ∇X eα . Let γ : [−1, 1] → M be a curve tangent to X and

γ̃(t) ≡ σi (γ(t)) · gi (t)

its horizontal lift in P . Then

eα (t) = [ ( σi (t), e0α ) ] = [ ( γ̃(t) gi^{-1} (t), e0α ) ] ∼ [ ( γ̃(t), ρ( gi^{-1} (t) ) e0α ) ] .

The covariant derivative is

∇X eα = [ ( γ̃(0), (d/dt) ρ( gi^{-1} (t) ) e0α |t=0 ) ]
= [ ( γ̃(0), −ρ( gi^{-1} (0) (dgi (t)/dt)|t=0 gi^{-1} (0) ) e0α ) ]
= [ ( σi (0), ρ( Ai (X) ) e0α ) ] .

In representation ρ, the one-form Ai is expanded into generators Ta that act as

Ta e0α = (Ta )^β_α e0β .

Therefore, in a local trivialization of E we have

∇X eα = X^µ (Ai )µ^β_α eβ .

The covariant derivative of a generic section s(p) = ξi^α (p) eα follows from the properties listed above:

∇X s = X^µ ( ∂ξi^α /∂x^µ + (Ai )µ^α_β ξi^β ) eα .
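As a minimal sanity check of this local formula, one can verify the Leibniz property ∇X (f s) = X[f ] s + f ∇X s symbolically, for a rank-1 bundle in one dimension (all functions below are arbitrary placeholders, not from the notes):

```python
import sympy as sp

x = sp.symbols('x', real=True)
xi = sp.Function('xi')(x)      # fibre coordinate ξ of the section s = ξ e
f = sp.Function('f')(x)        # a function on M
A = sp.Function('A')(x)        # local connection form, A = A(x) dx

def nabla(xi):
    """Component of ∇_X s for X = d/dx in a rank-1 bundle: ∂ξ + A ξ."""
    return sp.diff(xi, x) + A * xi

lhs = nabla(f * xi)                         # ∇_X (f s)
rhs = sp.diff(f, x) * xi + f * nabla(xi)    # X[f] s + f ∇_X s

print(sp.simplify(lhs - rhs))   # -> 0
```

The same cancellation happens component by component in any rank k: the connection term is linear in ξ, so the extra derivative of f comes only from the ∂ term.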

¶ Exercise 22. Show that the covariant derivative is independent of the local trivialization
σi chosen, namely that σi , ξi and Ai transform correctly to keep ∇X s invariant.
¶ Exercise 23. Show that the last two properties in Proposition 7.12 are true.
Proposition 7.13. We can define the action of ∇ on fibre-valued p-forms s ⊗ η, where
s ∈ Γ(M, E) and η ∈ Ωp (M ), by

∇(s ⊗ η) = ∇s ∧ η + s ⊗ dη .

Then, if s(p) = ξiα (p) eα (p) is a section of E, we have

∇∇s = eα ⊗ (Fi )αβ ξiβ = ρ(Fi ) s .

Proof. Use that, in this notation, ∇eα = eβ ⊗ (Ai )β α .

8 Lie algebras
A Lie algebra g is a vector space equipped with an antisymmetric bilinear operation

[·,·] : g × g → g ,

called a Lie bracket or commutator, constrained to satisfy the Jacobi identity

[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y ]] = 0 ∀ X, Y, Z ∈ g .

One can study complex or real Lie algebras, that are vector spaces over C or R, respectively.
For the purpose of studying general properties, it is convenient to work over the complex
field. We will also restrict to finite-dimensional algebras.
A Lie algebra can be specified by a set of generators {J a } (in the sense that they are a basis of g as a vector space) and their commutation relations

[J a , J b ] = i Σ_c f abc J c ,

where f abc = −f bac . The number of generators is the dimension of the algebra. We will start by studying complex Lie algebras. The complex numbers f abc are the structure constants, and we have inserted a factor of i such that if the generators are Hermitian, (J a )† = J a , then the f abc are real.

8.1 Representations
Definition 8.1. A representation ρ of a Lie algebra g is a vector space V , together with
a homomorphism
ρ : g → End(V ) .
Thus, every element of g is “represented” by a linear operator on V . If V is finite-dimensional,
the linear operators are simply matrices. For ρ to be a homomorphism, we require that it
maps the Lie bracket to a commutator:
  
ρ( [X, Y ] ) = ρ(X) ρ(Y ) − ρ(Y ) ρ(X) ≡ [ ρ(X), ρ(Y ) ] .

Notice that in a representation we can multiply operators, while in the abstract algebra this
operation is not defined. The dimension of V is the dimension of the representation.
There is a representation that is intrinsically given once the Lie algebra g is given: it is
called the adjoint representation, and uses the algebra g itself as the vector space on which
the generators act:
ad(X) Y ≡ [X, Y ] .

  
The property ad( [X, Y ] ) = [ ad(X), ad(Y ) ] follows from the Jacobi identity.29 In the basis {J a }, the generators are given by the matrices (J a )bc = i f abc .
Two representations ρ1,2 on V are isomorphic if there exists a linear map R such that ρ1 = R ρ2 R^{-1} .
Given a representation ρ(J a ) = T a , the generators T a are matrices that satisfy

[T a , T b ] = i f abc T c .

Then the generators T̃ a = −(T a )^T satisfy the same relations, and thus also form a representation.
This is called the conjugate representation ρ∗ . If ρ∗ is isomorphic to ρ, we say that the
representation is self-conjugate.

8.2 Simple and semi-simple algebras


Definition 8.2.
(a) A subspace s ⊂ g that is closed under Lie bracket, [s, s] ⊆ s, is called a Lie subalgebra.
(b) A subspace i ⊂ g that is invariant under Lie bracket (this is a stronger condition),

[i, g] ⊆ i ,

is called an ideal.
If i is an ideal, then the quotient g/i, in the sense of a quotient of vector spaces, is also
a Lie algebra.30 The elements of g/i are equivalence classes of x ∼ x + a with x ∈ g
and a ∈ i. Then

[x + a, y + b] = [x, y] + [a, y] + [x, b] + [a, b] ∼ [x, y]

because the last three terms are in i. Thus the Lie bracket extends to the quotient.
A canonical example of an ideal is the derived subalgebra g′ of g:

g′ ≡ [g, g] = ⟨ [x, y] | x, y ∈ g ⟩ ,

where ⟨ ⟩ is the linear span over the field of the algebra, say C or R. It is clear that [g′ , g] ⊆ g′ . The quotient

g/g′ is an Abelian Lie algebra ,
29 The Jacobi identity implies
( ad(X) ad(Y ) − ad(Y ) ad(X) ) Z = [X, [Y, Z]] − [Y, [X, Z]] = [[X, Y ], Z] = ad( [X, Y ] ) Z .
30 This is similar to the fact that G/H is a group if H is a normal subgroup of G.

since any commutator is equivalent to zero.31
Abelian Lie algebras are not particularly interesting, as the Lie bracket is simply zero: [g, g] = 0. Lie algebras with a proper ideal, i.e. an ideal i ⊊ g, can be understood as extensions of the quotient g/i by i.32 This motivates the study of simple Lie algebras:
Definition 8.3. A non-Abelian33 Lie algebra that does not contain any proper ideal is called
simple. A direct sum of simple algebras is called semi-simple.

8.3 Killing form


We introduce a symmetric bilinear form on g, called the Killing form, defined as34

k(X, Y ) ≡ (1/2g̃) Tr( ad(X) ad(Y ) ) .

The normalization constant g̃ > 0 will be fixed in Section 10.2. The Killing form is invariant in the following sense:

k( [Z, X], Y ) + k( X, [Z, Y ] ) = 0 .
This simply follows from the definition.
Theorem 8.4. (Cartan’s criterion) A Lie algebra g is semi-simple if and only if the
Killing form k is non-degenerate.
We will not prove this theorem, which requires more machinery.

We can use the Killing form to construct the totally antisymmetric tensor

i f abc ≡ k( [J a , J b ], J c ) = i Σ_d f abd k dc ,

where

k dc ≡ k(J d , J c ) .
¶ Exercise 24. Show, using invariance of the Killing form, that f abc is totally antisymmetric.

If the Killing form is non-degenerate, one could similarly construct fabc lowering the indices
with kab , which is the inverse matrix of k ab .
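For su(2) all of this can be computed explicitly (a numerical sketch in the Hermitian basis J a = σ a /2, so that f abc = ε abc ; the normalization 1/2g̃ is dropped): the structure constants come out real and totally antisymmetric, and the Killing form is proportional to the identity, hence non-degenerate, consistent with Cartan's criterion.

```python
import numpy as np

# Hermitian generators of su(2): Pauli matrices over 2
J = [np.array([[0, 1], [1, 0]], dtype=complex) / 2,
     np.array([[0, -1j], [1j, 0]], dtype=complex) / 2,
     np.array([[1, 0], [0, -1]], dtype=complex) / 2]

def comm(A, B):
    return A @ B - B @ A

# Structure constants from [J^a, J^b] = i f^{abc} J^c, extracted using the
# trace orthogonality Tr(J^a J^b) = δ^{ab}/2:
f = np.zeros((3, 3, 3), dtype=complex)
for a in range(3):
    for b in range(3):
        for c in range(3):
            f[a, b, c] = -2j * np.trace(comm(J[a], J[b]) @ J[c])

ad = [1j * f[a] for a in range(3)]     # adjoint matrices (J^a)_{bc} = i f^{abc}

# Unnormalized Killing form k^{ab} = Tr(ad J^a ad J^b):
k = np.array([[np.trace(ad[a] @ ad[b]) for b in range(3)] for a in range(3)])
print(np.real_if_close(np.round(k, 12)))   # 2 x identity: non-degenerate
```

Since k^{ab} = 2δ^{ab} is invertible, indices can be raised and lowered freely, and f abc = fabc = ε abc in this basis.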
31 The quotient g/g′ is not the Cartan subalgebra. For instance, for g = su(2) one finds g′ = su(2).
32 An extension is a short exact sequence 0 → i → g → g/i → 0. Similarly, G is an H-bundle over G/H.
33 We should specify that the algebra is non-Abelian, otherwise the 1-dimensional algebra, which is necessarily Abelian, would turn out to be simple. Alternatively, we specify that the dimension must be ≥ 2.
34 The trace of a linear operator A on V is defined as follows. Take a basis {v i } of V , and express A v i = a^i_j v j . Then Tr A ≡ Tr(a), where the latter is the trace of a matrix. One can check that this definition is independent of the basis chosen.

8.4 The example of su(2)
Let us consider the algebra of SL(2, C) first. The group SL(2, C) is the group of complex 2 × 2 matrices with determinant 1:

M = ( a b ; c d ) ∈ C^{2×2}    with det M = 1 .

The algebra is the tangent space at 1. We use the exponential map:

M = exp(tA) = 1 + tA + O(t^2 ) .

The algebra is the set {A | M ∈ SL(2, C)}. We use the formula35

1 = det M = exp( t Tr A ) = 1 + t Tr A + O(t^2 ) .

We conclude that

sl(2, C) = { A ∈ C^{2×2} | Tr A = 0 } .


We can choose the following generators:

J3 = ( 1/2 0 ; 0 −1/2 ) ,    J+ = ( 0 1 ; 0 0 ) ,    J− = ( 0 0 ; 1 0 ) .

They generate sl(2, C) as a vector space over C.


Now consider the group SL(2, R) of real 2 × 2 matrices with determinant 1. Its algebra
sl(2) is the set of real traceless matrices, and we can take the same generators J3 , J± but
now in a vector space with real coefficients. In other words, with complex coefficients we
get the same algebra, but there is a restriction of sl(2, C) to real coefficients which gives the
(real) algebra sl(2). This restriction is called a real form.
As another example, consider the algebra of SU (2). This is the group of unitary matrices
with determinant 1:

M = ( a b ; c d ) ∈ C^{2×2}    with M † = M^{-1} , det M = 1 .

Setting

M = e^{itA} ,    M † = e^{−itA†} ,    M^{-1} = e^{−itA} ,

we find that the algebra is

su(2) = { A ∈ C^{2×2} | A = A† , Tr A = 0 } .
35 Suppose that A is diagonalizable. In a basis in which A is diagonal the formula is obvious. Since det M is a continuous function (of the entries) and since the set of diagonalizable matrices is dense in the space of all matrices, the formula follows for a generic matrix A.

A set of generators is

J3 = ( 1/2 0 ; 0 −1/2 ) = σ3 /2 ,    J1 = ( 0 1/2 ; 1/2 0 ) = σ1 /2 ,    J2 = ( 0 −i/2 ; i/2 0 ) = σ2 /2 ,

where the σi are the Pauli matrices, and we should use real coefficients. However over the field C we can also use the generators

J1 ± iJ2 = J± .

Therefore, over C we get the same algebra as sl(2, C), while over R, su(2) is another real form of sl(2, C). In the following we will indicate the complex algebra as su(2).

The commutation relations are

[J3 , J± ] = ±J± ,    [J+ , J− ] = 2J3 .

Suppose we have a finite-dimensional representation, and let |λ⟩ be an eigenvector of J3 :

J3 |λ⟩ = λ |λ⟩ .

Then J± increase/decrease the eigenvalue,

J3 J± |λ⟩ = (λ ± 1) J± |λ⟩ ,

provided J± |λ⟩ ≠ 0. All the states we get this way are linearly independent (since they have different eigenvalues), and since the representation is finite-dimensional, the process has to stop both going up and going down.
Thus, necessarily there exists a state |j⟩, with J3 |j⟩ = j |j⟩, such that

J+ |j⟩ = 0 .

This is called a highest weight state. We then define the following other states:

|j − k⟩ ≡ (J− )^k |j⟩ .

The action of the algebra generators on them is

J3 |j − k⟩ = (j − k) |j − k⟩
J− |j − k⟩ = |j − k − 1⟩
J+ |j − k⟩ = k(2j − k + 1) |j − k + 1⟩ .

The last one follows from the commutation relations (¶).36 As we said, there must be a kmax such that |j − kmax ⟩ ≠ 0 but |j − kmax − 1⟩ = 0. In particular J+ |j − kmax − 1⟩ = 0. Substituting above, this happens for kmax = 2j.
36 In fact
J+ |j − k⟩ = J+ (J− )^k |j⟩ = ( 2J3 (J− )^{k−1} + J− J+ (J− )^{k−1} ) |j⟩ = ( 2J3 (J− )^{k−1} + 2J− J3 (J− )^{k−2} + (J− )^2 J+ (J− )^{k−2} ) |j⟩ = . . .
= 2 ( J3 (J− )^{k−1} + J− J3 (J− )^{k−2} + . . . + (J− )^{k−1} J3 ) |j⟩ = 2 Σ_{s=0}^{k−1} (j − s) |j − k + 1⟩ .
The states above form an irreducible representation, because they are all connected by the action of su(2), and they are closed under that action. It is called a highest weight representation. Thus:
(i) In finite-dimensional representations, j ∈ (1/2) Z.
(ii) The irreducible representation has 2j + 1 states, with J3 = {−j, −j + 1, . . . , j − 1, j}.
We could easily write the generators in this basis (essentially we already did this).
On the other hand, we can write the generators in the adjoint representation (from the commutation relations), in the basis (J+ , J3 , J− ):

J3 = ( 1 0 0 ; 0 0 0 ; 0 0 −1 ) ,    J+ = ( 0 −1 0 ; 0 0 2 ; 0 0 0 ) ,    J− = ( 0 0 0 ; −2 0 0 ; 0 1 0 ) .

We see that the adjoint representation corresponds to j = 1.

8.5 Cartan-Weyl basis


We are concerned with simple Lie algebras. A convenient choice of basis is the Cartan-Weyl
basis. We first find a maximal set of commuting (Hermitian) generators H i , i = 1, . . . , r,
where r is the rank of the algebra:

[H i , H j ] = 0 .

These generators form the Cartan subalgebra h. Since the generators of the Cartan
subalgebra can be simultaneously diagonalized, we choose the remaining generators to be
combinations of the J a ’s that are common eigenvectors:

[H i , E α ] = αi E α .

The vector α = (α1 , . . . , αr ) is called a root and E α its ladder operator. Because h is the
maximal Abelian subalgebra, the roots are non-zero. One can also prove that they are all
different (see footnote 38). Since a root maps an element H i ∈ h to αi ∈ R,

α(H i ) = αi ,

roots are elements of the dual to the Cartan subalgebra: α ∈ h∗ .

Let us make use of the Killing form and the fact that it is non-degenerate.

• Applying invariance to k([H i , E α ], H j ) we obtain αi k(E α , H j ) = 0, therefore

k(E α , H j ) = 0 .


• Applying invariance to k([H i , E α ], E β ) we obtain (αi + β i ) k(E α , E β ) = 0, therefore

k(E α , E β ) = 0 whenever β ≠ −α .

Since k is non-degenerate, necessarily −α is a root whenever α is a root.37 Then necessarily k(E α , E −α ) ≠ 0.
[From here we can prove that roots do not have multiplicities.]38

• We also conclude that k restricted to the Cartan subalgebra must be non-degenerate.


We can then choose H i such that39

k(H i , H j ) = δ ij .

We denote the set of roots by ∆. Consider the adjoint representation:

H i ↦ |H i i , E α ↦ |E α i ≡ |αi .

The dimension of the adjoint representation is equal to the dimension of g, and we see that

dim g = #∆ + r .

We also see that the number of roots is even.


37
Notice that the same conclusion is reached if we assume that H i are Hermitian. Taking the Hermitian
conjugate of the eigenvalue equation, using H i = (H i )† and the fact that the eigenvalues are real, we find

[H i , (E α )† ] = −αi (E α )† meaning (E α )† = E −α .

Thus −α is necessarily a root, when α is.


38 Take E α , a vector in the root space Lα of α. Because of non-degeneracy of k, there must exist a vector E −α in L−α such that k(E α , E −α ) = C ≠ 0. Then k([E α , E −α ], H i ) = Cαi , and thus [E α , E −α ] = C H α where H α = Σi αi H i . Now consider the subspace R = hE α i ⊕ h ⊕ L−α ⊕ · · · ⊕ L−pα , where p is such that −pα is a root but −(p + 1)α is not. This space is a representation of {E α , E −α , H α } as it is closed under their adjoint action. We compute TrR H α = TrR [E α , E −α ] = 0 because it is a commutator. On the other hand

TrR H α = Σi αi Trspaces H i = Σi αi ( αi − Σℓ d−ℓ ℓ αi ) = |α|2 ( 1 − Σℓ d−ℓ ℓ ) ,

where d−ℓ is the dimension of L−ℓα . We know that |α|2 ≠ 0 and d−1 ≥ 1. We conclude that d−1 = 1 and d−ℓ = 0 for ℓ ≥ 2. In particular L−α is one-dimensional.
Repeating the argument for all roots, we conclude that all Lα are one-dimensional, and moreover if α is a root, the only multiple of α which is also a root is −α.
39 Since we are studying the algebra over C, we can always make k positive-definite. What is non-trivial is that, in that basis, the roots are real. To show that, consider

k(H α , H α ) = Σi αi αi = (α, α) = αi αj Tr ad(H i ) ad(H j ) = αi αj Σβ β i β j = Σβ (α, β)2 .

On the other hand, (α, β) = (α, α)(qβ − pβ )/2 where qβ , pβ are the largest possible integers such that β + pβ α and β − qβ α are roots. We conclude that (α, α) = 4/Σβ (qβ − pβ )2 is positive, and so (α, β) is real.

Let us evaluate the restriction of the Killing form to h. We have ad(H i )H j = 0 and
ad(H i )E α = αi E α . Therefore, in this basis, ad(H i ) is a diagonal matrix with entries αi on
E α . This implies that

Tr ad(H i ) ad(H j ) = Σα αi αj .

Imposing k(H i , H j ) = δ ij and contracting with δij , we find that

g̃ = Σα |α|2 / (2r) ,    where |α|2 ≡ Σ_{i=1}^r αi αi .

This fixes g̃ in terms of the normalization of the root lengths.

The Jacobi identity implies

[H i , [E α , E β ]] = (αi + β i ) [E α , E β ] .

(i) If α + β ∈ ∆, the commutator [E α , E β ] must be proportional to E α+β


(with a proportionality constant Nα,β ≠ 0 that one can compute).
(ii) If 0 ≠ α + β ∉ ∆, then [E α , E β ] = 0.
(iii) If β = −α, then [E α , E −α ] commutes with all H i and thus must belong to the Cartan
subalgebra.
Since

k([E α , E −α ], H i ) = k(E −α , [H i , E α ]) = αi k(E α , E −α ) ,

it follows that [E α , E −α ] = k(E α , E −α ) αi H i . We use the symbol α · H ≡ αi H i . We can then rescale the generators such that

k(E α , E −α ) = 2/|α|2 .

We finally get

[H i , H j ] = 0
[H i , E α ] = αi E α
[E α , E β ] = Nα,β E α+β     if α + β ∈ ∆
            = 2 α·H/|α|2     if α = −β
            = 0              otherwise .
Here Nα,β are non-vanishing constants. This is called the Cartan-Weyl basis.

The fundamental role of the Killing form is to establish an isomorphism between the
Cartan subalgebra h and its dual h∗ . The form k(H i , · ) maps h → R and thus is an element
of h∗ .
To every element γ ∈ h∗ , there corresponds an element H γ ∈ h such that

γ = k(H γ , · ) ∈ h∗ .

In particular H α = α · H = Σi αi H i . We can use this isomorphism to induce a positive-definite scalar product on h∗ :

(γ, β) ≡ k(H γ , H β ) = Σi γ i β i .

This is a scalar product on root space. In particular

(α, α) = |α|2 = Σi αi αi .

Notice also that the non-degeneracy of k implies that the roots span h∗ .40

8.6 Weights
So far we have analyzed the algebra from the point of view of the adjoint representation.
But much of what we said can be repeated in other representations.

Given a representation, we consider a basis |λi that diagonalizes the Cartan generators:

H i |λi = λi |λi .

The eigenvalues λi form a vector (λ1 , . . . , λr ), called a weight. Weights, as the roots, are
elements of h∗ :
λ(H i ) = λi ,
and the scalar product extends to them. In fact, the roots are the weights of the adjoint
representation.
The ladder operators E α change the eigenvalue of a weight by α:

H i E α |λi = [H i , E α ] |λi + E α H i |λi = (λi + αi ) E α |λi .




40 Suppose that the roots only span a subspace of h∗ ; then there exist r numbers {xi }, not all zero, such that Σ_{i=1}^r xi αi = 0 for all α ∈ ∆. Construct H x ≡ Σ_{i=1}^r xi H i . We can see that such an element would be orthogonal to all H j :

k(H x , H j ) = Σ_{i=1}^r xi k(H i , H j ) ∝ Σ_{i=1}^r xi Σα αi αj = Σα ( Σi xi αi ) αj = 0 .

This would contradict the fact that k is non-degenerate on h.

We are interested in finite-dimensional representations. For those, there must exist integers
p, q such that

(E α )p+1 |λi ∼ E α |λ + pαi = 0 ,
(E −α )q+1 |λi ∼ E −α |λ − qαi = 0 ,

for any root α. In fact, notice that E α , E −α and α · H/|α|2 form an su(2) subalgebra of g. We can write

[ α · H/|α|2 , E ±α ] = ±E ±α ,    [ E α , E −α ] = 2 α · H/|α|2 .

Thus we identify J3 ≡ α · H/|α|2 and J± = E ±α . The state |λi is part of a finite-dimensional representation of su(2). Let it be the state with J3 equal to m. If the total spin is j, the highest-weight state is reached in p steps and the lowest one in q steps, so that m = j − p = −j + q.
Since m is the eigenvalue of J3 we have:

( α · H/|α|2 ) |λi = ( (α, λ)/|α|2 ) |λi = m |λi = (j − p) |λi = (−j + q) |λi .

Combining the equations we get

2 (α, λ)/|α|2 = −(p − q) ∈ Z .

This will be very important later on.

8.7 Simple roots and the Cartan matrix


The number of roots is equal to the dimension of the algebra minus the rank, and in general
this is much larger than the rank. Thus the roots are linearly dependent. We fix a basis
{β1 , . . . , βr } in h∗ , so that any root can be expanded as
α = Σ_{j=1}^r nj βj .

In this basis we define an ordering:

• α is said to be positive if the first non-zero number in the sequence (n1 , . . . , nr ) is


positive. Denote by ∆+ the set of positive roots. Then ∆− is the set of negative
roots, and
∆− = −∆+ .

• A positive root that cannot be written as the sum of two positive roots is called a
simple root αi . There are exactly41 r simple roots {α1 , . . . , αr }, providing a convenient
basis for h∗ .

Notice that:
(i) αi − αj ∉ ∆.
Proof. Suppose αi − αj = γ ∈ ∆+ ; then αi = αj + γ, contradicting the simplicity of αi . If instead αi − αj = γ ∈ ∆− , then αj = αi + (−γ), contradicting the simplicity of αj .
(ii) Every positive root is a sum of simple roots.
(In particular, every root can be written in the basis of simple roots using integer
coefficients, either all non-negative or all non-positive.)
Proof. Every positive root is either simple, or it can be written as the sum of two
positive roots. We can continue the argument, which has to stop because the number
of roots is finite.

The scalar product of simple roots defines the Cartan matrix:

Aij = 2 (αi , αj )/|αj |2 ∈ Z .

(i) The matrix is not symmetric.


(ii) From the argument on the su(2) subalgebra, its entries are integers. The diagonal
elements are 2.
(iii) The Schwarz inequality42 implies that (not summed) Aij Aji < 4 for i ≠ j.
(iv) Since αi − αj is not a root, E −αj |αi i = 0 and in our argument on the su(2) subalgebra
with αi , αj we have q = 0. Thus

(αi , αj ) ≤ 0 for i ≠ j .

(v) We conclude that the off-diagonal elements Aij , Aji are either both 0, or one is −1 and
the other one is −1, −2 or −3.
Notice that
(αi , αj ) = |αi | |αj | cos θij ,
41 As shown below, the simple roots {αi } span the positive roots ∆+ over Z≥0 and so they span the lattice of roots over Z. Thus they span h∗ over R. Suppose they are not linearly independent, namely Σi ki αi = 0 for some numbers ki . The ki cannot all be positive, thus separate the sum into a positive and a negative part, and write γ = Σi ai αi = Σi bi αi with ai , bi ≥ 0 and ai bi = 0 for all i. Notice that γ ≠ 0. But then (γ, γ) = Σij ai bj (αi , αj ) ≤ 0, using the fact (proven below) that (αi , αj ) ≤ 0 for i ≠ j. The contradiction proves that the simple roots are linearly independent.
42 Cauchy-Schwarz inequality: hu, vi2 ≤ hu, ui hv, vi with equality if and only if u, v are linearly dependent.

where θij is the angle between the two roots, and is ≥ 90◦ . Such an angle is expressed by
the Cartan matrix:
cos θij = −√(Aij Aji )/2 ∈ { 0, −1/2, −√2/2, −√3/2 } .
The quantity Aij /Aji (not summed) is the ratio |αi |2 /|αj |2 of the lengths of the roots αi , αj , whenever they are not orthogonal (namely (αi , αj ) ≠ 0). This ratio can only be 1, 2, 1/2, 3, 1/3, and it turns out43 that there can be at most two different lengths in a simple Lie algebra. When all roots have the same length, the algebra is said to be simply laced.
It is convenient to introduce a special notation for the quantity 2αi /|αi |2 :
αi∨ = 2αi /|αi |2 .

Here αi∨ is called the coroot associated to the root αi . Notice in particular that

Aij = (αi , αj∨ ) ∈ Z .

A distinguished element of ∆ is the highest root θ. It is the unique root for which, in the expansion Σi mi αi , the sum Σi mi is maximized. All elements of ∆ can be obtained by repeated subtraction of simple roots from θ.44 The coefficients of the decomposition of θ in the bases {αi } and {αi∨ } bear special names: the marks ai and the comarks a∨i :

θ = Σ_{i=1}^r ai αi = (|θ|2 /2) Σ_{i=1}^r a∨i αi∨ ,    ai , a∨i ∈ N .

In terms of them one defines the Coxeter number g and the dual Coxeter number g ∨ :
g ≡ 1 + Σ_{i=1}^r ai ,    g ∨ ≡ 1 + Σ_{i=1}^r a∨i .

These definitions are independent of normalizations.


We will choose a normalization in which the longest roots have |α|2 = 2. It turns out that
θ is always a long root, thus we fix |θ|2 = 2. As we will see (Section 10.2), the normalization
of the Killing form is related to the dual Coxeter number by

2g̃ = |θ|2 g ∨ → 2g ∨ .
43
It can be shown as follows. First, the classification of Dynkin diagrams implies that, among the simple
roots, there can be at most two different lengths (see in particular point 7 in Chapter 9 of Cahn’s book).
Second, all roots can be obtained from the simple roots by applying Weyl reflections (see below), which
preserve the scalar product.
44
Therefore θ is the highest weight of the adjoint representation.

8.8 The Chevalley basis
The Cartan matrix contains all the information about the structure of a simple Lie algebra
g. This is made manifest in the so-called Chevalley basis.
To each simple root we associate three generators:
ei ≡ E αi ,    f i ≡ E −αi ,    hi = 2 αi · H/|αi |2 .
Their commutation relations are fixed by the Cartan matrix:

[hi , hj ] = 0
[hi , ej ] = Aji ej
[hi , f j ] = −Aji f j
[ei , f j ] = δij hj .

The remaining (ladder) generators are obtained by repeated commutations of these basic
generators, subject to the Serre relations
(ad ei )^{1−Aji} ej = 0 ,
(ad f i )^{1−Aji} f j = 0 .

For instance, ad(e1 ) e2 = [e1 , e2 ] while (ad(e1 ))2 e2 = [e1 , [e1 , e2 ]]. These relations follow
from the su(2) subalgebra argument, applied to the adjoint representation, noticing that the
difference of two simple roots is never a root. Thus in this procedure the generators ei and
f j never mix, reflecting the separation of the roots into ∆+ and ∆− .
This procedure shows that the Lie algebra is reconstructed from the Cartan matrix Aij .
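As a concrete illustration, the Chevalley relations can be checked in the defining 3×3 representation of su(3) (this explicit realization and the helper names are mine, not spelled out in the notes): the ei , f i are matrix units, the hi are diagonal, and one Serre relation is verified directly:

```python
def unit(a, b):
    """3x3 matrix unit E_{ab}."""
    return [[1 if (i, j) == (a, b) else 0 for j in range(3)] for i in range(3)]

def comm(A, B):
    """Matrix commutator [A, B] for 3x3 matrices."""
    AB = [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    BA = [[sum(B[i][k]*A[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

# Chevalley generators of su(3) in the defining representation
e = [unit(0, 1), unit(1, 2)]
f = [unit(1, 0), unit(2, 1)]
h = [[[1, 0, 0], [0, -1, 0], [0, 0, 0]],
     [[0, 0, 0], [0, 1, 0], [0, 0, -1]]]
A = [[2, -1], [-1, 2]]          # Cartan matrix of su(3)
zero = [[0]*3 for _ in range(3)]

for i in range(2):
    for j in range(2):
        # [h^i, e^j] = A_ji e^j  and  [e^i, f^j] = delta_ij h^j
        assert comm(h[i], e[j]) == [[A[j][i]*x for x in row] for row in e[j]]
        assert comm(e[i], f[j]) == (h[j] if i == j else zero)

# Serre relation: (ad e^1)^{1-A_21} e^2 = [e^1, [e^1, e^2]] = 0
assert comm(e[0], comm(e[0], e[1])) == zero
```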

8.9 Dynkin diagrams


All the information contained in the Cartan matrix can be encapsulated in a planar diagram:
the Dynkin diagram:
(i) To each simple root αi we associate a node.
(ii) We join node i and j with Aij Aji ∈ {0, 1, 2, 3} lines.
(iii) If there is more than one line then the two roots have different length, and we add an
arrow from the longer to the shorter (we can also draw short roots with shaded color).
The Cartan matrix and the Dynkin diagram should satisfy some extra properties:
(a) The Cartan matrix should be indecomposable, i.e. the Dynkin diagram should be
connected (for simple Lie algebras).
If the diagram is disconnected, the Chevalley construction gives two or more commuting
subalgebras which are ideals of the total algebra, in contradiction with simplicity.

(b) The Cartan matrix should have det A > 0 and all principal (sub)minors positive. In
other words (Sylvester’s criterion) the matrix A should be a positive-definite bilinear
form.
The Cartan matrix Aij = 2 (αi , αj )/(αj , αj ) can be written as the product of two matrices:

A = Ã D ,    Djk = ( 2/(αj , αj ) ) δjk ,    Ãij = (αi , αj ) = Σk αik αjk = (ααT )ij .

D is diagonal with positive entries, so it does not affect the conclusion. Ã is written in terms of α, which is the matrix of the roots; since the roots are linearly independent, α has maximal rank. It follows det Ã = (det α)2 > 0. In fact, the matrix ααT is positive definite.
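Property (b) is easy to test numerically. A small sketch (the helper names are mine) checks Sylvester's criterion on candidate Cartan matrices:

```python
def det(M):
    """Determinant by Laplace expansion along the first row
    (fine for the small matrices appearing as Cartan matrices)."""
    if not M:
        return 1
    return sum((-1)**j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def sylvester_positive(A):
    """Sylvester's criterion: all leading principal minors positive."""
    return all(det([row[:k] for row in A[:k]]) > 0 for k in range(1, len(A) + 1))

print(sylvester_positive([[2, -1], [-1, 2]]))   # su(3): True
print(sylvester_positive([[2, -3], [-1, 2]]))   # G2: True
print(sylvester_positive([[2, -2], [-2, 2]]))   # det = 0, fails: False
```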

It is an exercise to classify all possible Cartan matrices / Dynkin diagrams with those
properties (see Chapter 9 of Cahn’s book or Chapter 18 of Kac’s lecture notes). This leads
to the complete classification of simple Lie algebras. There are four infinite families
and five exceptional ones:
Ar≥1 , Br≥2 , Cr≥1 , Dr≥2 , E6,7,8 , F4 , G2 .
The subscript indicates the rank. The simply-laced algebras are Ar , Dr and E6,7,8 .

Ar≥1 ∼= su(r + 1). The Dynkin diagram is
···

with r nodes. All roots are long—the algebra is simply laced. The group SU (r + 1) is given by (r + 1) × (r + 1) unitary matrices, M −1 = M † , with unit determinant. Writing M = eiA , we find

su(r + 1) = { A ∈ C(r+1)×(r+1) | A = A† , Tr A = 0 } .


The dimension is r(r + 2). The dual Coxeter number is g ∨ = r + 1 (as |α|2 = 2 for all roots).

Br≥2 ∼= so(2r + 1). The Dynkin diagram is
···

with r nodes, representing (r − 1) long roots and 1 short root. The group SO(N ) is given
by N × N orthogonal real matrices, M −1 = M T , with unit determinant. Writing M = etA
we find
so(N ) = { A ∈ RN ×N | AT = −A } .


The dimension is N (N − 1)/2. Specialized to the case N = 2r + 1, the dimension is r(2r + 1).
One might be tempted to say that

A1 ∼= B1 ,    su(2) ∼= so(3) .

This is true, but the normalizations are not the same because in B1 the root is short while in A1 it is long.

Cr≥1 ∼= sp(2r). The Dynkin diagram is

···

with r nodes, representing 1 long root and (r − 1) short roots. The group U Sp(2r) is given
by 2r × 2r unitary and symplectic matrices, namely

M −1 = M † ,    M T ΩM = Ω    with    Ω = ( 0 1r ; −1r 0 ) .

This group is compact. Writing M = eitA , we find

usp(2r) = { A ∈ C2r×2r | ΩAT Ω = A = A† } .




Counting the number of independent components,45 the dimension is r(2r + 1).


Alternatively, the group Sp(2r, R) is given by 2r × 2r real symplectic matrices. Writing
M = etA we find again the constraint ΩAT Ω = A, which however can be written as

sp(2r) = { A ∈ R2r×2r | (AΩ)T = AΩ } .




Counting symmetric matrices, the dimension is r(2r + 1) again. [In fact, usp(2r) is the
compact real form while sp(2r) is a non-compact real form.]
Notice that

C1 ∼= A1 , sp(2) ∼= su(2) ;    C2 ∼= B2 , sp(4) ∼= so(5) ,

with the correct normalizations.

Dr≥2 ∼= so(2r). The Dynkin diagram is

···

with r nodes. All roots are long—the algebra is simply laced. Specializing the dimension of
so(N ) to N = 2r, the dimension is r(2r − 1), while g ∨ = 2r − 2.
Notice that

D2 ∼= A1 × A1 , so(4) ∼= su(2) × su(2) ;    D3 ∼= A3 , so(6) ∼= su(4) ,

with the correct normalizations.


 
45 The equations ΩAT Ω = A = A† imply that A = ( A1 A2 ; A∗2 −A∗1 ), with A1 Hermitian (r2 real components) and A2 complex symmetric (r(r + 1) real components).

E6 ∼= e6 . The Dynkin diagram is

The algebra is simply-laced. Its dimension is 78.


Exceptional Lie groups are not easy to describe. For reference, E6 (the compact form) is the
isometry group of a 32-dimensional Riemannian symmetric space known as the “bioctonionic
projective plane”.

E7 ∼= e7 . The Dynkin diagram is

The algebra is simply-laced. Its dimension is 133.


E7 is the isometry group of a 64-dimensional Riemannian symmetric space known as the
“quateroctonionic projective plane”.

E8 ∼= e8 . The Dynkin diagram is

The algebra is simply-laced. Its dimension is 248.


E8 is the isometry group of a 128-dimensional Riemannian symmetric space known as the
“octooctonionic projective plane”.

F4 ∼= f4 . The Dynkin diagram is

The dimension of the algebra is 52.


F4 is the isometry group of a 16-dimensional Riemannian symmetric space known as the
“octonionic projective plane”.

G2 ∼= g2 .

The dimension of the algebra is 14.


G2 is the automorphism group of the octonion algebra, or the little group of a Majorana
spinor of SO(7) (the Majorana spinor representation has real dimension 8).

8.10 Fundamental weights
Weights and roots live in the same r-dimensional vector space, since the roots are the weights
of the adjoint representation. The weights could be expanded in the basis of simple roots,
but the coefficients are (in general) not integers. The convenient basis {ωi } is the one dual
to the simple coroot basis:
(ωi , αj∨ ) = δij .
The ωi are called the fundamental weights.
The expansion coefficients λi of a weight λ in the fundamental weight basis are called
Dynkin labels:
λ = Σ_{i=1}^r λi ωi    ⇔    λi = (λ, αi∨ ) ∈ Z .

They are also the eigenvalues of the Cartan generators in the Chevalley basis:
λi = hi (λ)    with    hi = 2 αi · H/|αi |2 .

The fact that the Dynkin labels are integer (for finite-dimensional representations) follows
from the su(2) argument. We write

λ = (λ1 , . . . , λr )

to indicate the Dynkin labels of a weight. Since we can write

Aij = (αi , αj∨ ) ,

it follows that the entries in the rows of the Cartan matrix are the Dynkin labels of the
simple roots:

αi = Σj Aij ωj .

A weight of special importance, called the Weyl vector or principal vector, is the one
whose Dynkin labels are all 1:
ρ = Σ_{i=1}^r ωi = (1, . . . , 1) .

It has the following property: it is half the sum of all positive roots:
ρ = (1/2) Σ_{α∈∆+} α .

We will prove this formula later.

8.11 The Weyl group
Consider the projection of the adjoint representation to the su(2) subalgebra associated to
a root α.46 Let m be the eigenvalue of the operator J3 ≡ α · H/|α|2 on the state |βi:

m |βi = ( α · H/|α|2 ) |βi = ( (α, β)/|α|2 ) |βi = (1/2) (α∨ , β) |βi .

Then we can express

2m = (α∨ , β) ∈ Z .
If m ≠ 0, there must be another state in the same multiplet with J3 eigenvalue −m, and
this state must be |β + `αi for some `. Thus it must be

(α∨ , β + `α) = (α∨ , β) + 2` = −(α∨ , β) ,

which determines ` = −(α∨ , β). We conclude that if β is a root, then also β − (α∨ , β) α is a
root.
Thus the linear operator
sα β ≡ β − (α∨ , β) α ,
which is a reflection with respect to the hyperplane perpendicular to α, maps roots to roots.
The set of all such reflections along roots forms a group, called the Weyl group W of the
algebra. The Weyl group is the symmetry group of the set of roots ∆.
The r elements si corresponding to simple roots αi are called simple Weyl reflections:

si ≡ sαi ,

and every element w ∈ W can be decomposed as

w = si sj · · · sk .

Simple Weyl reflections have the property that they map positive roots to positive roots,
with the exception of αi ↦ −αi . In other words:
α ∈ ∆+    ⇒    si α ∈ ∆+ if α ≠ αi ,    and    si αi = −αi .

To prove it, expand α = Σ_{j=1}^r kj αj with kj ≥ 0. Then si α = Σ_{j≠i} kj αj + (#) αi . If α ≠ αi then some kj with j ≠ i is > 0.47 Since positive roots have non-negative coefficients in terms of simple roots, and negative roots have non-positive coefficients, we conclude that si α is positive whenever α ≠ αi .
46 Here α can be any root, not necessarily a simple root. We define α∨ ≡ 2α/|α|2 .
47 Here we use the fact, proved in Footnote 38, that the only multiple of a root α which is also a root is −α. Therefore, if α is a positive root and is not αi , it must necessarily contain some other αj in its expansion.

It is easy to see that the simple Weyl reflections satisfy

s2i = 1 , si sj = sj si if Aij = 0 .

With a little bit of work, one can show that the full set of relations is

(si sj )^{mij} = 1 ,    where    mij = 1 if i = j ,  and  mij = 2, 3, 4, 6 if Aij Aji = 0, 1, 2, 3 respectively.

A group having such a presentation is called a Coxeter group.48 On the simple roots, the
simple reflections take the form
si αj = αj − Aji αi .
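These relations can be verified directly. In the basis of Dynkin labels, si acts as λj ↦ λj − λi Aij , so each si is an integer matrix; the sketch below (function names are mine) checks (s1 s2 )^m = 1 for su(3) (m = 3) and G2 (m = 6):

```python
def simple_reflection(A, i):
    """Matrix of s_i on Dynkin labels: (s_i lam)_j = lam_j - lam_i A_ij."""
    r = len(A)
    return [[(1 if j == k else 0) - (A[i][j] if k == i else 0) for k in range(r)]
            for j in range(r)]

def mul(M, N):
    r = len(M)
    return [[sum(M[i][k]*N[k][j] for k in range(r)) for j in range(r)] for i in range(r)]

def power(M, n):
    r = len(M)
    P = [[1 if i == j else 0 for j in range(r)] for i in range(r)]
    for _ in range(n):
        P = mul(P, M)
    return P

I2 = [[1, 0], [0, 1]]

# su(3): A12 A21 = 1, hence (s1 s2)^3 = 1
A = [[2, -1], [-1, 2]]
s1s2 = mul(simple_reflection(A, 0), simple_reflection(A, 1))
assert power(s1s2, 3) == I2

# G2: A12 A21 = 3, hence (s1 s2)^6 = 1 but (s1 s2)^3 != 1
A = [[2, -3], [-1, 2]]
s1s2 = mul(simple_reflection(A, 0), simple_reflection(A, 1))
assert power(s1s2, 6) == I2 and power(s1s2, 3) != I2
```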

Example 8.5. For su(N ), the Weyl group turns out to be the permutation group of N elements, W = SN .
Indeed, we can interpret si , for i = 1, . . . , N −1, as simple permutations of adjacent objects
in a list of N objects {O1 , . . . , ON }, e.g. s1 : O1 ↔ O2 and more generally

si : Oi ↔ Oi+1 .

Each simple permutation squares to 1, non-adjacent permutations commute, while adjacent


permutations satisfy
s1 s2 s1 = s2 s1 s2
when acting on {O1 , O2 , O3 }. This is equivalent to (s1 s2 )3 = 1.

We have shown that W maps ∆ to itself. In fact, this is a way to generate the complete
set of roots ∆ from the simple roots: acting on them with the Weyl group,

∆ = { wα1 , . . . , wαr | w ∈ W } .

From this we see that any set {wαi } with w fixed, provides an alternative, equally good
set of simple roots. In fact, any possible set of simple roots that one obtains by choosing a
different separation of h∗ is obtained by the action of the Weyl group on an initial set.
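This way of generating ∆ is easy to implement. The sketch below (names are mine) works in the Dynkin-label basis, where the i-th simple root has labels given by the i-th row of A and si acts as λj ↦ λj − λi Aij ; the closure of the simple roots under the simple reflections reproduces the full set ∆:

```python
def root_system(A):
    """Orbit of the simple roots (rows of the Cartan matrix, in Dynkin
    labels) under the simple Weyl reflections: the full root system."""
    r = len(A)
    roots = {tuple(A[i]) for i in range(r)}
    frontier = list(roots)
    while frontier:
        lam = frontier.pop()
        for i in range(r):
            refl = tuple(lam[j] - lam[i]*A[i][j] for j in range(r))  # s_i lam
            if refl not in roots:
                roots.add(refl)
                frontier.append(refl)
    return roots

print(len(root_system([[2, -1], [-1, 2]])))   # su(3): 6 roots, dim = 6 + 2 = 8
print(len(root_system([[2, -3], [-1, 2]])))   # G2: 12 roots, dim = 12 + 2 = 14
```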
From the definition, it is immediate to check that the Weyl group preserves the scalar
product on h∗ : 
(sα λ, sα µ) = (λ, µ) .
Thus, it is an orthogonal transformation of h∗ .
48
Therefore all Weyl groups of simple Lie algebras are Coxeter groups. The converse is not true: there are
(an infinite number of) finite Coxeter groups that are not Weyl groups.

Proposition 8.6. The Weyl vector takes the form

ρ = Σ_{i=1}^r ωi = (1/2) Σ_{α∈∆+} α .

Proof. Let us set ρ = (1/2) Σ_{α>0} α. As shown above, the simple Weyl reflection si maps positive roots to positive roots, with the exception of αi ↦ −αi . Therefore

si ρ = (1/2) Σ_{α>0, α≠αi} α − (1/2) αi = ρ − αi .

On the other hand si preserves the scalar product. Therefore

(si ρ, αi∨ ) = (ρ − αi , αi∨ ) = (ρ, αi∨ ) − 2    and    (si ρ, αi∨ ) = (ρ, si αi∨ ) = −(ρ, αi∨ ) .

We conclude that (ρ, αi∨ ) = 1 and thus ρ = Σ_{i=1}^r ωi .

8.12 Lattices and congruence classes


Given a basis (ε1 , . . . , εd ) of Rd , a lattice is the set of all points whose expansion in the basis has integer coefficients:

Zε1 + · · · + Zεd .

In other words, it is the Z-span of {εi }.
There are three important r-dimensional lattices in a Lie algebra g: the weight lattice
P = Zω1 + · · · + Zωr ,
the root lattice
Q = Zα1 + · · · + Zαr ,
and the coroot lattice
Q∨ = Zα1∨ + · · · + Zαr∨ .
The weight lattice P contains all possible weights of finite-dimensional representations of
g. Given a finite-dimensional representation, the effect of the generator E α on a weight is
to shift the weight by α (unless it gives zero), thus the weights in a representation differ by
elements of the root lattice Q. Obviously Q ⊆ P (as roots are the weights of the adjoint
representation).
For the algebras G2 , F4 and E8 it turns out that Q = P . In all other cases, Q is a proper
subset of P and P/Q is a finite group. Its order, |P/Q|, is equal to the determinant of the
Cartan matrix (and P/Q is equal to the center of the simply-connected compact group G
with algebra g).
The distinct elements of P/Q define congruence classes, and each finite-dimensional
representation is in a congruence class. Then P/Q is a conserved charge associated to the
representations, in the sense that the product of representations has a charge which is the
sum of the individual charges.
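The claim |P/Q| = det A can be checked case by case. A sketch (the helper names are mine):

```python
def det(M):
    """Integer determinant by Laplace expansion (fine at these small ranks)."""
    if not M:
        return 1
    return sum((-1)**j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def cartan_su(N):
    """Cartan matrix of A_{N-1} = su(N): 2 on the diagonal, -1 next to it."""
    r = N - 1
    return [[2 if i == j else (-1 if abs(i - j) == 1 else 0) for j in range(r)]
            for i in range(r)]

# |P/Q| = det A:
for N in range(2, 7):
    assert det(cartan_su(N)) == N          # Z_N for su(N)
assert det([[2, -2], [-1, 2]]) == 2        # Z_2 for so(5)
assert det([[2, -3], [-1, 2]]) == 1        # trivial for G2
```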

Example 8.7. For su(2) there are two classes, given by λ1 mod 2: integer or half-integer spin.
Example 8.8. For su(3) there are three classes, identified by the triality: λ1 + 2λ2 mod 3.
Example 8.9. For su(N ) there are N classes, and the N -ality is characterized by

λ1 + 2λ2 + · · · + (N − 1)λN −1 mod N .

The congruence classes in the various cases are the following:

g      su(N )   so(2r + 1)   sp(2r)   so(2r)                        E6   E7   E8   F4   G2
P/Q    ZN       Z2           Z2       Z2 × Z2 if 2r = 0 mod 4       Z3   Z2   1    1    1
                                      Z4      if 2r = 2 mod 4

9 Low-rank examples
Let us discuss the algebras of rank 1 and 2.

9.1 A1 ∼= su(2)
There is only one simple algebra of rank 1: A1 ∼= su(2). The Cartan matrix is

A = (2) ,

therefore there is only one simple root, α1 = θ, obviously equal to the highest root, and
related to the fundamental weight ω1 by

α1 = 2ω1 .

The Weyl group is generated by the simple reflection s1 , that acts on a weight λ = λ1 ω1 as

s1 (λ1 ω1 ) = λ1 ω1 − (α1∨ , λ1 ω1 )α1 = λ1 ω1 − λ1 α1 = −λ1 ω1 ,

which is a reflection with respect to the origin. Therefore W = {1, s1 } = Z2 and the roots
are
∆ = {α1 , −α1 } .
Indeed dim su(2) = 3. The weight lattice is

−3ω1 −α1 −ω1 0 ω1 α1 3ω1

The shaded nodes form the root lattice. Then the group of congruence classes is P/Q = Z2 ,
corresponding to integer or half-integer spin. The non-trivial class is generated by ω1 .

9.2 D2 ∼= so(4) ∼= A1 × A1 ∼= su(2) × su(2)
This case, which is degenerate, corresponds to the Cartan matrix

A = ( 2 0 ; 0 2 ) ,

which is decomposable and so the algebra is semi-simple, rather than simple. The root
system is
[Root system diagram: roots ±α1 , ±α2 along two orthogonal axes; fundamental weights ω1 , ω2 .]

The group of congruence classes is P/Q = Z2 × Z2 , and the non-trivial elements are repre-
sented by ω1 , ω2 and ω1 + ω2 .

9.3 A2 ∼= su(3)
The Cartan matrix is

A = ( 2 −1 ; −1 2 )
and all roots have the same length. The simple roots are α1 , α2 , with an angle of 120◦ , and
are related to the fundamental weights by

α1 = α1∨ = 2ω1 − ω2 = (2, −1)


α2 = α2∨ = −ω1 + 2ω2 = (−1, 2) .

The Weyl group is

W = { 1, s1 , s2 , s1 s2 , s2 s1 , s1 s2 s1 } = S3 ,

which follows from the relation (s1 s2 )3 = 1.
The action of the Weyl group on the simple roots gives all possible roots, and one finds

∆ = { α1 , α2 , α1 + α2 , −α1 , −α2 , −α1 − α2 } .

Indeed dim su(3) = 8. The highest root is θ = α1 + α2 = (1, 1). The root system is

α2 α1 + α2 = θ
ω2
ω1
−α1 α1

−α1 − α2 −α2

The congruence group is P/Q = Z3 : each fundamental weight ω1,2 is in a different non-trivial
class (while the roots are in the trivial class). This is called triality.

9.4 B2 ∼= so(5) ∼= C2 ∼= sp(4)
The Cartan matrix of B2 ∼= so(5) is

A = ( 2 −2 ; −1 2 ) ,

and the algebra is not simply-laced:

|α1 |2 = 2 , |α2 |2 = 1 ,

and the angle between the two simple roots is 135◦ . We have

α1 = α1∨ = 2ω1 − 2ω2 = (2, −2) ,
α2 = (1/2) α2∨ = −ω1 + 2ω2 = (−1, 2) .
The structure of the Weyl group follows from (s1 s2 )4 = 1, so W = D4 , the dihedral group with 8 elements.49
The roots are

∆ = { α1 , α2 , α1 + α2 , α1 + 2α2 , −α1 , −α2 , −α1 − α2 , −α1 − 2α2 } ,
49 The dihedral group Dn is defined as

Dn = ⟨ r, s | rn = s2 = (sr)2 = 1 ⟩ = Zn ⋊ Z2

and has 2n elements. Setting s1 = s, s2 = sr, s1 s2 = r it equals the Weyl group: s1^2 = s2^2 = (s1 s2 )n = 1.

indeed dim so(5) = 10, therefore the root system is

[Root system diagram: roots ±α1 , ±α2 , ±(α1 + α2 ), ±(α1 + 2α2 ); ω1 = α1 + α2 , ω2 ; highest root θ = α1 + 2α2 .]

The highest root is θ = α1 + 2α2 = (0, 2).


The congruence classes are P/Q = Z2 , corresponding to spinors and tensors. The non-
trivial class is represented by ω2 .

If we exchange α1 ↔ α2 we obtain C2 ∼= sp(4) with Cartan matrix

A = ( 2 −1 ; −2 2 ) .

The root system is the same, but rotated by 45◦ . The two congruence classes correspond to
tensors with odd or even rank.

9.5 G2
The Cartan matrix of G2 is

A = ( 2 −3 ; −1 2 ) .

The algebra is not simply laced: |α1 |2 = 2, |α2 |2 = 2/3, and the angle between the two simple roots is 150◦ .
We have

α1 = α1∨ = 2ω1 − 3ω2 = (2, −3) ,
α2 = (1/3) α2∨ = −ω1 + 2ω2 = (−1, 2) .
The Weyl group is given by (s1 s2 )6 = 1, and is the dihedral group D6 with 12 elements. The
positive roots are

∆+ = { α1 , α2 , α1 + α2 , α1 + 2α2 , α1 + 3α2 , 2α1 + 3α2 } ,

indeed dim G2 = 14, therefore the root system is
[Root system diagram: the twelve roots ±α1 , ±α2 , ±(α1 + α2 ), ±(α1 + 2α2 ), ±(α1 + 3α2 ), ±(2α1 + 3α2 ); ω2 = α1 + 2α2 , ω1 = 2α1 + 3α2 = θ.]

The highest root is θ = 2α1 + 3α2 = (1, 0). There are no congruence classes as P/Q = 1.

9.6 Method to construct the root system


There is a simple algorithmic method to construct the root system of any simple Lie algebra.
Recall that
(λ, α∨ ) = 2 (λ, α)/|α|2 = q − p

contains information about how many times we can increase (p) or decrease (q) the weight λ along α. In particular, if we shift λ by the simple roots αi , the information is in the Dynkin labels:

λi = q − p along αi .
Next, we use that the difference of two simple roots is never a root, αi − αj ∉ ∆. Therefore,
starting with the simple roots, we know that the weight cannot be lowered and thus q = 0.
Calling λj^{(i)} = Aij the j-th Dynkin label of αi , we have

λj^{(i)} = −p ,
i.e. a negative Dynkin label tells us how many times we can shift a simple root αi by another
simple root αj .50
Starting from the simple roots at the bottom, we build up all positive roots by adding
simple roots; we keep track of what p and q are along the way, and every time we encounter
a negative Dynkin label for a root that has q = 0, we infer how many times we can further
add another simple root. The process stops when all Dynkin labels are positive, and that is
the highest root θ.
Since the Dynkin labels of the simple roots are the rows of the Cartan matrix, the process
is completely specified by the Cartan matrix A.
50 Notice that (αi , αi∨ ) = 2, because the only multiple of αi that is a root is −αi = αi − 2αi , and of course there is the vanishing weight.
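The procedure above is completely algorithmic, so we can sketch it in code (the function names are mine): a root is stored by its coefficients ni in the simple-root basis, its Dynkin labels are Σi ni Aij , and at each step p = q − (label) tells us whether another αj can still be added:

```python
def positive_roots(A):
    """All positive roots, as coefficient vectors n_i in the simple-root
    basis, built level by level from the Cartan matrix A."""
    r = len(A)
    roots = {tuple(1 if j == i else 0 for j in range(r)) for i in range(r)}
    level = list(roots)
    while level:
        new = set()
        for n in level:
            label = [sum(n[i]*A[i][j] for i in range(r)) for j in range(r)]
            for j in range(r):
                # q_j: length of the string n - a_j, n - 2a_j, ... inside the roots
                q, down = 0, list(n)
                down[j] -= 1
                while tuple(down) in roots:
                    q += 1
                    down[j] -= 1
                if q - label[j] > 0:        # p_j = q_j - label_j > 0: n + a_j is a root
                    up = list(n)
                    up[j] += 1
                    new.add(tuple(up))
        roots |= new
        level = list(new)
    return roots

print(sorted(positive_roots([[2, -3], [-1, 2]])))
# G2: [(0, 1), (1, 0), (1, 1), (1, 2), (1, 3), (2, 3)] — six positive roots
```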

Example 9.1. Let us work out the root system of G2 . The Cartan matrix is

A = ( 2 −3 ; −1 2 ) .

Starting from the bottom, we obtain:

2α1 + 3α2 : (1, 0)
     ↑ α1
α1 + 3α2 : (−1, 3)
     ↑ α2
α1 + 2α2 : (0, 1)
     ↑ α2
α1 + α2 : (1, −1)
     ↑ α2                ↑ α1
α1 : (2, −3)        α2 : (−1, 2)

This reproduces the positive roots described before. In particular there are 6 positive roots, thus the dimension of the algebra is

dim G2 = 6 + 6 + 2 = 14 .

10 Highest-weight representations
Any finite-dimensional irreducible representation has a unique highest-weight state |λi, which
is completely specified by its Dynkin labels λi .
Among all the weights in the representation, the highest weight λ is such that λ + α is
not a weight for any α > 0. That is

E α |λi = 0 ∀α > 0 .

From the su(2) argument,

2 (λ, αi )/|αi |2 = q ≥ 0    because p = 0 ,

and thus the Dynkin labels are non-negative.


Moreover, to each weight λ with non-negative Dynkin labels, called dominant weight,
corresponds a unique finite-dimensional irreducible representation Vλ (sometimes we indicate
the representation as λ).
The highest root θ is the highest weight of the adjoint representation.

Starting from the highest-weight state |λ⟩, all the states in the representation Vλ can be
obtained by the action of the lowering operators:

E −β E −γ · · · E −η |λ⟩    for β, γ, . . . , η ∈ ∆+ .

Let us call the weight system Ωλ the set of weights in a representation. Any weight µ ∈ Ωλ
is such that λ − µ ∈ Q, the root lattice, thus all weights in a given representation lie in the
same congruence class of P/Q.
To construct all weights µ in Ωλ we use the su(2) subalgebra:

(µ, αi∨ ) = µi = qi − pi ,     pi , qi ∈ Z+ .

Since µ = λ − Σi ni αi for some ni ∈ Z+ , we can call Σi ni ∈ Z+ the “level” of the weight µ
and proceed level by level. At each step we know the value of pi , and as long as

qi = µi + pi > 0 ,

we can remove αi qi times. When removing αi , we reduce the Dynkin labels of µ by the
Dynkin labels of αi , which are given in the i-th row of the Cartan matrix:

(αi , αj∨ ) = Aij ⇒ αi = (Ai1 , . . . , Air ) .

This process produces the full set Ωλ .
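This level-by-level procedure can be sketched in code. The following is a minimal illustration (our own naming; it assumes NumPy and ignores multiplicities): each weight is stored by its Dynkin labels, subtracting αi changes the labels by the i-th row of A, and pi is found by checking how many times αi can be added while staying inside the set already built:

```python
import numpy as np

def weight_system(A, hw):
    """Weights (Dynkin labels, without multiplicities) of the irrep with
    highest weight hw, built level by level from the su(2) strings."""
    A = np.asarray(A, dtype=int)
    r = len(A)
    weights = {tuple(hw)}
    frontier = [tuple(hw)]
    while frontier:
        new = []
        for mu in frontier:
            for i in range(r):
                # p_i = how many times alpha_i can be added within the system
                p, up = 0, np.asarray(mu)
                while tuple(up + A[i]) in weights:
                    p += 1
                    up = up + A[i]
                if mu[i] + p > 0:                   # q_i = mu_i + p_i
                    down = tuple(np.asarray(mu) - A[i])
                    if down not in weights:
                        weights.add(down)
                        new.append(down)
        frontier = new
    return weights

adj = weight_system([[2, -1], [-1, 2]], (1, 1))     # adjoint of su(3)
print(sorted(adj))   # 7 distinct weights; (0,0) carries multiplicity 2
```

Running it on the examples below reproduces their weight diagrams (as sets of weights).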

Example 10.1. Adjoint representation of su(3). The Cartan matrix of su(3) is

A = ( 2  −1 ; −1  2 )

and the adjoint representation has highest weight λ = (1, 1). We construct the weight system
Ωλ (each arrow subtracts the indicated simple root):

                (1, 1)
        −α1 ↙          ↘ −α2
    (−1, 2)              (2, −1)
        −α2 ↘          ↙ −α1
                (0, 0)
        −α1 ↙          ↘ −α2
    (−2, 1)              (1, −2)
        −α2 ↘          ↙ −α1
                (−1, −1)

Indeed, these are the roots of su(3).
Example 10.2. Fundamental representation of g2 . The Cartan matrix is

A = ( 2  −3 ; −1  2 )

and the fundamental representation has highest weight (0, 1):

(0, 1)  −α2→  (1, −1)  −α1→  (−1, 2)  −α2→  (0, 0)  −α2→  (1, −2)  −α1→  (−1, 1)  −α2→  (0, −1)

This representation has dimension 7.
¶ Exercise 25. Compute the weight system of the fundamental representation of sp(4), whose
highest weight is (1, 0), and verify that there are 4 weights (in fact it has dimension 4).
Then consider so(5) and its representation with highest weight (0, 1), which is the spinorial
representation.

The procedure we described does not keep track of multiplicities.
Notice that the highest weight has no multiplicity, and multiplicities can only arise when
two or more arrows go into the same weight (or if an arrow starts from a weight with
multiplicity).
To compute multiplicities, one can use Freudenthal’s recursion formula (that we will
prove later):

multλ (µ) = [ Σ_{α>0} Σ_{k=1}^∞ 2 multλ (µ + kα) (µ + kα, α) ] / [ |λ + ρ|² − |µ + ρ|² ] .

The denominator can also be written as (λ + µ + 2ρ, λ − µ) = (λ, λ + 2ρ) − (µ, µ + 2ρ).
The formula gives the multiplicity of µ in terms of the multiplicities of the weights above
it.

Example 10.3. Let us compute the multiplicity of (0, 0) in the adjoint representation of su(3),
assuming that all roots have multiplicity 1. First we use

λ = θ = α1 + α2 ,   ρ = α1 + α2 ,   µ = (0, 0) ,   |λ + ρ|² = |2(α1 + α2 )|² = 8 ,   |µ + ρ|² = 2 .

Then we see that there are three weights above (0, 0), and k can only be 1:

(µ + kα, α) = (α, α) = 2 .

We thus have

multθ (0, 0) = 2 (2 + 2 + 2) / (8 − 2) = 2 .
This is correct, as the Cartan subalgebra of su(3) has dimension 2.
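The arithmetic of this example can be checked numerically. The sketch below (our own, not from the notes) hard-codes the su(3) data; it uses the fact that, with the normalization |α|² = 2, the scalar product of weights written in Dynkin labels is given by the inverse Cartan matrix:

```python
import numpy as np

# Quadratic form on weights in the Dynkin-label basis: for su(3) with
# |alpha|^2 = 2 it is the inverse of the Cartan matrix.
F = np.linalg.inv(np.array([[2.0, -1.0], [-1.0, 2.0]]))

def ip(a, b):
    """Scalar product of two weights given by their Dynkin labels."""
    return float(np.array(a) @ F @ np.array(b))

lam = np.array([1, 1])      # highest weight theta of the adjoint
rho = np.array([1, 1])      # Weyl vector
mu = np.array([0, 0])
# positive roots of su(3) in Dynkin labels: alpha1, alpha2, theta
pos_roots = [np.array([2, -1]), np.array([-1, 2]), np.array([1, 1])]

# Freudenthal: only k = 1 contributes, and each mu + alpha is a root of
# multiplicity 1
num = sum(2 * ip(mu + a, a) for a in pos_roots)
den = ip(lam + rho, lam + rho) - ip(mu + rho, mu + rho)
print(num / den)            # 2.0 up to float rounding
```

The ratio reproduces mult(0, 0) = 2, the dimension of the Cartan subalgebra.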

¶ Exercise 26. Compute the weight system of the adjoint representation of g2 , whose highest
weight is (1, 0). Verify that it has dimension 14, computing the multiplicity of the weight
(0, 0), and that it agrees with what we described before.

Theorem 10.4. Finite-dimensional irreducible (highest-weight) representations Vλ of a complex
semi-simple Lie algebra are always unitary, in the following sense. Using (H i )† = H i
and (E α )† = E −α , the norm of any state |λ′⟩ in Vλ , computed using the commutation relations,
is positive definite:

|λ′⟩ = E −β · · · E −γ |λ⟩   ⇒   ⟨λ′|λ′⟩ = ⟨λ| E γ · · · E β E −β · · · E −γ |λ⟩ > 0

with β, . . . , γ ∈ ∆+ and taking ⟨λ|λ⟩ > 0.


This means that all finite-dimensional representations of the compact real form are unitary:
the algebra can be represented by hermitian or anti-hermitian operators (the complex
algebra is its complexification, while the other real forms have non-unitary finite-dimensional
representations).

10.1 Conjugate representations
Given an irreducible representation Vλ with highest weight λ, it contains a “lowest-weight
state”, whose weight λ̂ has non-positive Dynkin labels. The representation Vλ∗ with highest weight

λ∗ ≡ −λ̂

is called the conjugate representation of Vλ . If λ̂ = −λ, i.e. λ∗ = λ, then the representation
is called self-conjugate.
The conjugate representation Vλ∗ is obtained by turning the representation Vλ
“upside-down”,⁵¹ and so it has the same dimension.
In the case of su(N ), the conjugate representation is obtained by reversing the order of
the Dynkin labels:

λ = (λ1 , . . . , λN −1 ) ⇒ λ∗ = (λN −1 , . . . , λ1 ) .

Notice that this is a symmetry of the Dynkin diagram. The two representations are in
opposite congruence classes. Since the Dynkin diagrams of so(2r + 1), sp(2r), G2 , F4 , E7
and E8 have no symmetry, all their representations are self-conjugate.

10.2 Quadratic Casimir operator


The Casimir operators are operators one can construct by taking products of algebra
generators, such that they commute with all elements of the algebra. The Casimir operators
are not part of the algebra, but rather of the universal enveloping algebra, since we cannot
multiply elements of the Lie algebra.52 On the other hand, they are naturally constructed
in any representation.
Since Casimir operators commute with all elements of g, in an irreducible representation
they are proportional to the identity. The eigenvalue carries the interesting information.
The quadratic Casimir operator Q is given by

Q = Σ_{a,b} k_{ab} J a J b ,    where k_{ab} is the inverse matrix of k^{ab} = k(J a , J b ) .

¶ Exercise 27. Verify that it commutes with all elements of g.


⁵¹ In fact λ̂ = w0 λ, where w0 is the longest element of the Weyl group, which maps the positive chamber
to the negative chamber. It follows that λ∗ = −w0 λ, and all weights of the conjugate representation are
obtained from those of Vλ by the action of −w0 .
In fact w0 is the only element of the Weyl group that maps ∆+ to ∆− , and thus −w0 maps simple roots to
simple roots. Since W preserves the scalar product of roots, conjugation −w0 must implement a symmetry
of the Dynkin diagram. However, notice that the Dynkin diagram could have a symmetry which is not
realized by −w0 (this is the case of D2+2k ).
⁵² The universal enveloping algebra is the set of all formal power series in elements of g.

In the Cartan–Weyl basis, it looks like⁵³

Q = Σ_{i=1}^{r} H i H i + Σ_{α>0} (|α|²/2) ( E α E −α + E −α E α ) .

Since Q has the same eigenvalue on all states of an irreducible representation, let us evaluate
it on the highest-weight state. First,

Σ_i H i H i |λ⟩ = Σ_i λi λi |λ⟩ = (λ, λ) |λ⟩ .

Second,

E α E −α |λ⟩ = [E α , E −α ] |λ⟩ = 2 (α·H)/|α|² |λ⟩ = 2 (α, λ)/|α|² |λ⟩ ,

where we used E α |λ⟩ = 0. Therefore

Q |λ⟩ = ( (λ, λ) + Σ_{α>0} (λ, α) ) |λ⟩ = (λ, λ + 2ρ) |λ⟩ = ( |λ + ρ|² − |ρ|² ) |λ⟩ ,

where ρ is the Weyl vector.


In the special case of the adjoint representation, we compute

(θ, θ + 2ρ) = 2 + 2 (θ, ρ) = 2 + 2 ( Σ_i ai∨ αi∨ , Σ_j ωj ) = 2 + 2 Σ_i ai∨ = 2 g ∨ ,

where we used that |θ|² = 2. Thus, the quadratic Casimir of the adjoint representation is
twice the dual Coxeter number.
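As a quick numerical check in su(3) (a sketch of our own; again, with |θ|² = 2 the quadratic form in the Dynkin basis is the inverse Cartan matrix):

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # Cartan matrix of su(3)
F = np.linalg.inv(A)                        # quadratic form for |alpha|^2 = 2

def ip(a, b):
    """Scalar product of weights written in Dynkin labels."""
    return float(np.array(a) @ F @ np.array(b))

theta = np.array([1, 1])                    # highest root of su(3)
rho = np.array([1, 1])                      # Weyl vector
Q_adj = ip(theta, theta + 2 * rho)
print(Q_adj)   # 6.0 = 2 * g_dual, with g_dual = 3 for su(3)
```

This reproduces (θ, θ + 2ρ) = 2g∨ = 6 for su(3).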

Remark. We can use the Casimir operator to fix the normalization of the Killing form. Recall
that

k(J a , J b ) = (1/2g̃) Tr[ ad(J a ) ad(J b ) ] ≡ k^{ab} .

As we saw before, imposing k(H i , H j ) = δ ij fixes g̃ = Σ_α |α|² / 2r. Now notice

dim g = k_{ab} k^{ab} = (1/2g̃) k_{ab} Tr[ ad(J a ) ad(J b ) ] = (1/2g̃) Tr_adj Q = (1/2g̃) dim g (θ, θ + 2ρ) .

We conclude

2g̃ = (θ, θ + 2ρ) = |θ|² g ∨ ,

which is independent of normalization. The choice |θ|² = 2 leads to g̃ = g ∨ .
⁵³ Recall that k(H i , H j ) = δ ij , k(H i , E α ) = 0, k(E α , E β ) = 0 for β ≠ −α, while k(E α , E −α ) = 2/|α|² .
Therefore the matrix of the Killing form and its inverse are

k^{ab} = diag( 1_r , ( 0 , 2/|α|² ; 2/|α|² , 0 ) , . . . ) ,    k_{ab} = diag( 1_r , ( 0 , |α|²/2 ; |α|²/2 , 0 ) , . . . ) ,

with one 2×2 block for each pair of roots (α, −α).
We have described the quadratic Casimir operator, which exists in all simple Lie algebras.
In su(2), that is the only Casimir invariant.
In general, however, there are r independent Casimir invariants of various degrees. The
degrees minus 1 are called the exponents of the algebra.
Note. The exponents can be computed in the following way. First construct a “symmetrized
Cartan matrix”

Â_ij = 2 (αi , αj ) / ( |αi | |αj | ) .

Then the exponents mi are

mi = (2g/π) arcsin( √( Eigenvalues(Â)_i / 4 ) ) ,

where g is the Coxeter number.

Derivation of Freudenthal’s formula. Consider an irreducible representation Vλ with
highest weight λ. Let Wµ be the eigenspace related to the weight µ. The trace of the Casimir
operator Q in Wµ is

Tr_{Wµ} Q = mult(µ) (λ, λ + 2ρ) .

On the other hand, we can use Q = Σ_i H i H i + Σ_{α>0} (|α|²/2) (E α E −α + E −α E α ). The trace of
the first term is

Tr_{Wµ} Σ_i H i H i = mult(µ) (µ, µ) .

For the second term, we consider the su(2) subalgebra generated by J3 ≡ α·H/|α|² , J± ≡ E ±α
for a positive root α. Each of the states in Wµ will be in an irreducible representation of
this su(2). Let |µj ⟩ be a state in a representation of spin j. From our analysis of the su(2)
algebra we get

( J+ J− + J− J+ ) |j − k⟩ = 2 ( j + 2jk − k² ) |j − k⟩ = 2 ( j² + j − m² ) |m⟩ ,

where we expressed as m = j − k the eigenvalue of J3 . Since such an eigenvalue is, in our
case,

m |µj ⟩ = (α·H/|α|²) |µj ⟩ = ( (α, µ)/|α|² ) |µj ⟩ ,

we conclude that

( E α E −α + E −α E α ) |µj ⟩ = 2 ( j(j + 1) − (α, µ)²/|α|⁴ ) |µj ⟩ .

Alternatively, we can use the Casimir of su(2),

Q_su(2) = 2 J3 J3 + J+ J− + J− J+ = 2 j(j + 1) .

Within the full algebra it takes the form

Q_su(2) |µj ⟩ = ( 2 (α·H)(α·H)/|α|⁴ + E α E −α + E −α E α ) |µj ⟩ = 2 j(j + 1) |µj ⟩ .

We reach the same conclusion.


The state |µj ⟩ is in a spin-j representation, so let µ + kα be the highest weight for some
k ∈ Z≥0 , and let |µ + kα⟩_j be its highest-weight state. Since

j |µ + kα⟩_j = (α·H/|α|²) |µ + kα⟩_j = ( (α, µ + kα)/|α|² ) |µ + kα⟩_j ,

we get an expression for j. Substituting into the previous expression we obtain

( E α E −α + E −α E α ) |µj ⟩ = 2 ( k(k + 1) + (2k + 1) (α, µ)/|α|² ) |µj ⟩ .

There might be many copies of the spin-j representation that show up in Wµ : it is clear
that the number of representations whose highest weight is µ + kα is equal to the dimension
of W_{µ+kα} minus the dimension of W_{µ+(k+1)α} . Therefore

Tr_{Wµ} Σ_{α>0} (|α|²/2) ( E α E −α + E −α E α ) =
  = Σ_{α>0} Σ_{k≥0} [ mult(µ + kα) − mult(µ + (k + 1)α) ] [ k(k + 1)|α|² + (2k + 1)(α, µ) ]
  = Σ_{α>0} ( mult(µ) (α, µ) + 2 Σ_{k≥1} mult(µ + kα) (α, µ + kα) ) .

We conclude that

Tr_{Wµ} Q = mult(µ) (λ, λ + 2ρ) = mult(µ) (µ, µ + 2ρ) + 2 Σ_{α>0} Σ_{k≥1} mult(µ + kα) (α, µ + kα) .

This leads to the multiplicity formula

multλ (µ) = [ Σ_{α>0} Σ_{k≥1} 2 multλ (µ + kα) (α, µ + kα) ] / (λ + µ + 2ρ, λ − µ) .

The denominator can also be written as (λ, λ + 2ρ) − (µ, µ + 2ρ) = |λ + ρ|² − |µ + ρ|² .

10.3 Dynkin index
The invariant bilinear forms computed as traces in different representations are all proportional
to the Killing form, and the relative normalizations are called the Dynkin index xλ
of a representation λ. We define xλ through

Tr_λ[ ρ(J a ) ρ(J b ) ] = (xλ / g ∨) Tr_θ[ ad(J a ) ad(J b ) ] = |θ|² xλ k^{ab} ,

where ρ is the representation λ and k^{ab} is the Killing form (we usually take |θ|² = 2). By
construction, the Dynkin index of the adjoint representation is equal to the dual Coxeter
number:

xθ = g ∨ .

In all other cases, the index is computed by multiplying both sides by k_{ab} . On the LHS we
get Tr_λ Q, while on the RHS we get |θ|² xλ dim g. We thus find

xλ = (λ, λ + 2ρ) dim Vλ / ( 2 dim g ) ,

using |θ|² = 2.
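As an illustration for su(3) (our own sketch, with dim g = 8, 2ρ = (2, 2), and the Weyl dimension formula dim(a, b) = (a + 1)(b + 1)(a + b + 2)/2, a standard result):

```python
import numpy as np
from fractions import Fraction

F = np.linalg.inv(np.array([[2.0, -1.0], [-1.0, 2.0]]))  # quadratic form

def ip(a, b):
    """Scalar product of su(3) weights in the Dynkin-label basis."""
    return float(np.array(a) @ F @ np.array(b))

def dim_su3(a, b):
    return (a + 1) * (b + 1) * (a + b + 2) // 2          # Weyl formula

def index(lam):
    """Dynkin index x = (lam, lam + 2 rho) dim V / (2 dim g) for su(3)."""
    x = ip(lam, np.array(lam) + np.array([2, 2])) * dim_su3(*lam) / (2 * 8)
    return Fraction(x).limit_denominator(1000)

print(index((1, 0)), index((1, 1)))   # 1/2 3
```

The fundamental has index 1/2 in this normalization, and the adjoint reproduces the dual Coxeter number g∨ = 3.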

10.4 Tensor products of representations


Given two irreducible representations Vλ , Vµ , we can construct the tensor product represen-
tation
Vλ ⊗ Vµ .
In terms of the two homomorphisms ρλ and ρµ , we define
ρλ⊗µ = ρλ ⊗ 1 + 1 ⊗ ρµ ,
which is a homomorphism.54 This implies that the weights of Vλ ⊗ Vµ are sums of pairs of
weights of Vλ and Vµ .
However, in general, the representation Vλ ⊗ Vµ is not irreducible; rather, it is a sum, with
multiplicities, of irreducible representations:

Vλ ⊗ Vµ = ⊕_{ν∈P+} Nλµ^ν Vν ,

where P+ is the set of dominant weights while the integers Nλµ^ν are the multiplicities. This
is called a Clebsch–Gordan decomposition. The sum of highest weights λ + µ gives a
highest weight in Vλ ⊗ Vµ , thus the representation Vλ+µ appears once in the decomposition:

Nλµ λ+µ = 1 .
⁵⁴ One could try to define ρλ⊗µ = ρλ ⊗ ρµ , but this would not be a linear map. For instance, one would
find ρλ⊗µ (tX) = ρλ (tX) ⊗ ρµ (tX) = t² ρλ (X) ⊗ ρµ (X), which is not a homomorphism.

After removing the weights of Vλ+µ , we are left with a number of other highest weights,
representing other terms in the decomposition. Proceeding this way, in principle, we can
work out the full decomposition.
The tensor-product coefficients should be compatible with the dimensions:

dim Vλ · dim Vµ = Σ_{ν∈P+} Nλµ^ν dim Vν .

The formal set of representations forms an algebra, called a fusion algebra. Let us call 0
the one-dimensional trivial representation, and µ∗ the conjugate representation to µ. Then
fusion with the trivial representation does not do anything:
Nλ0 ν = δλν .
In the tensor product of a representation λ and its conjugate λ∗ there is always one and only
one singlet:55
Nλλ∗ 0 = 1 .
The result is more general: the product Vλ ⊗ Vµ contains one and only one singlet if µ = λ∗ ,
and no singlet if µ 6= λ∗ . In fact the singlet is a linear map Vλ → Vµ∗ which commutes with
the algebra action, and then by Schur’s lemma its existence implies µ∗ ∼ = λ up to a change
of basis.
We are familiar with the case of su(2):

(2j) ⊗ (2j′) = ( 2(j + j′) ) ⊕ ( 2(j + j′ − 1) ) ⊕ · · · ⊕ ( 2|j − j′| ) ,

where we have indicated the Dynkin labels, and each representation on the RHS has multiplicity 1.
The general case is complicated, and we will only explore the case of su(N ).

10.5 su(N ) and Young diagrams


There is a convenient graphical way to indicate representations of su(N ), called Young
diagrams. Given a representation with Dynkin labels (λ1 , . . . , λN −1 ), we associate a diagram
containing λj columns of j boxes each, for j = 1, . . . , N − 1, with the taller columns on the
left.
⁵⁵ Writing the singlet as v_a ⊗ w_i C^{ai} with v ∈ Vλ and w ∈ Vλ∗ , and recalling that the generators of λ∗ are
−Tα^T in terms of those Tα of λ, we find [Tα^T , C] = 0 and thus, by Schur’s lemma as λ is irreducible, C is
proportional to the identity. We have determined the singlet, which then is unique.

The total number of boxes is Σ_{j=1}^{r} j λj .

For instance, the fundamental representation λ = (1, 0, . . . ), its conjugate the anti-fundamental
representation λ∗ = (. . . , 0, 1), and the adjoint θ = (1, 0, . . . , 0, 1) appear as: a single box; a
single column of N − 1 boxes; and a column of N − 1 boxes next to a second column with a
single box, respectively.
Dimension formula. There is a simple way to compute the dimension of a representation


of su(N ) corresponding to a Young diagram Yλ . We need to fill the boxes with two sets of
numbers, as follows.
(i) Start with the upper-left corner and assign N . Then, every time you move to the right
increase the number by 1, and every time you move down you decrease the number by
1. In this way, fill the diagram with numbers nij .
(ii) Assign to each box its hook factor hij :

hij = # boxes on the right + # boxes below + 1 .

The dimension of the representation is

dim Vλ = Π_{ij} ( n_ij / h_ij ) .

Example 10.5. Consider the representation λ = (1, 2, 0, . . . ) of su(N ). To compute its
dimension, we fill in its Young diagram (two rows, of three and two boxes) with the numbers
n_ij and the hook factors h_ij :

n_ij :   N    N+1   N+2         h_ij :   4   3   1
         N−1  N                          2   1

Then the dimension of the representation is

dim λ = N² (N + 1)(N + 2)(N − 1) / 24 .
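The hook rule can be turned into a short routine. Below is a sketch with our own naming: it converts Dynkin labels into row lengths (row i has Σ_{j≥i} λj boxes) and multiplies the factors n_ij/h_ij, here evaluated at a numeric N:

```python
from fractions import Fraction

def su_n_dim(dynkin, N):
    """Dimension of the su(N) irrep with the given Dynkin labels, via the
    hook rule: n_ij = N + j - i, h_ij = (boxes right) + (boxes below) + 1."""
    rows = [sum(dynkin[i:]) for i in range(len(dynkin))]   # boxes per row
    rows = [r for r in rows if r > 0]
    d = Fraction(1)
    for i, length in enumerate(rows):
        for j in range(length):
            n_ij = N + j - i                               # +1 right, -1 down
            h_ij = (length - j - 1) + sum(1 for r in rows[i + 1:] if r > j) + 1
            d *= Fraction(n_ij, h_ij)
    return int(d)

print(su_n_dim([1, 2], 3), su_n_dim([1, 1], 3))   # 15 8
```

For λ = (1, 2) at N = 3 this gives 9·4·5·2/24 = 15, matching the formula of the example, and (1, 1) gives the adjoint dimension 8.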
¶ Exercise 28. Compute the dimension of the fundamental, the anti-fundamental and the
adjoint representation of su(N ).

Product decomposition. Given two Young diagrams Yλ , Yµ , the Littlewood-Richardson


rule tells us how to decompose the product Vλ ⊗ Vµ into irreducible representations.

(a) Fill the second diagram with numbers: 1 in all boxes of the first row, 2 in the second
row, etc.
(b) Add all boxes with a 1 to the first diagram, keeping only the resulting diagrams that
satisfy:
(i) The resulting diagram is regular: the heights of the columns are non-increasing.
(ii) There should not be two 1’s in the same column.
Then, from each resulting diagram remove columns with N boxes.
(c) Proceed adding all boxes with a 2 to the previous diagrams, keeping only the resulting
diagrams that satisfy (i) and (ii) above, where in (ii) 1 is replaced by 2, as well as:56
(iii) In counting from right to left and top to bottom (i.e. concatenating the reversed
rows into a sequence), the number of 1’s must always be ≥ than the number of
2’s, the number of 2’s ≥ than the number of 3’s, and so on.
Then remove columns with N boxes.
(d) Continue until all numbered boxes are used.
The resulting diagrams, with multiplicity, furnish the decomposition.

Example 10.6. We compute the decomposition of (2, 0) ⊗ (1, 1) in su(3):

□ □   ⊗   1 1
          2

At the first step we add the boxes with 1, obtaining the diagrams

□ □ 1 1       □ □ 1       □ □
              1           1 1

(a diagram with 1’s on top of each other is excluded). At the second step we add the box
with 2, obtaining the diagrams

□ □ 1 1      □ □ 1      □ □ 1      □ □
2            1 2        1          1 1
                        2          2

After removing the columns with three boxes, we find

(2, 0) ⊗ (1, 1) = (3, 1) ⊕ (1, 2) ⊕ (2, 0) ⊕ (0, 1) .

The dimensions add up correctly: 6 × 8 = 24 + 15 + 6 + 3.
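The dimension bookkeeping can be checked with the Weyl dimension formula specialized to su(3), dim(a, b) = (a + 1)(b + 1)(a + b + 2)/2 (a standard result, equivalent to the hook rule above):

```python
def dim_su3(a, b):
    """Dimension of the su(3) irrep with Dynkin labels (a, b)."""
    return (a + 1) * (b + 1) * (a + b + 2) // 2

lhs = dim_su3(2, 0) * dim_su3(1, 1)                       # 6 * 8
rhs = sum(dim_su3(*w) for w in [(3, 1), (1, 2), (2, 0), (0, 1)])
print(lhs, rhs)   # 48 48
```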


⁵⁶ The rules (ii) and (iii) guarantee that the procedure applied to the trivial product 1 ⊗ Yλ gives only Yλ .

¶ Exercise 29. Verify the following decomposition in su(3):

(1, 1) ⊗ (1, 1) = (2, 2) ⊕ (3, 0) ⊕ (0, 3) ⊕ (1, 1) ⊕ (1, 1) ⊕ (0, 0)

and check that the dimensions add up correctly.


¶ Exercise 30. Verify the following decomposition in su(N ):

(1, 0 . . . ) ⊗ (1, 1, 1, 0 . . . ) = (2, 1, 1, 0 . . . ) ⊕ (0, 2, 1, 0 . . . ) ⊕ (1, 0, 2, 0 . . . ) ⊕ (1, 1, 0, 1, 0 . . . )

and check that the dimensions add up correctly.

Generating representations. The fundamental representation (1, 0 . . . ) or □ has dimension
N : it is a vector of su(N ), i.e. a tensor with a single index.
By taking multiple products of the fundamental representation we see that:

(a) The basic representations with a single non-vanishing Dynkin label equal to 1 in the
j-th entry, (. . . 0, 1, 0 . . . ), are antisymmetric tensors with j indices.

(b) The representations (n, 0 . . . ) are symmetric tensors with n indices.

(c) The representations with a single non-vanishing Dynkin label equal to n in the j-th
entry, (. . . 0, n, 0 . . . ), are symmetric products of n copies of antisymmetric tensors with
j indices.

(d) The fact that a column with N boxes can be removed is because the epsilon tensor
ε_{µ1 ...µN} is a singlet of su(N ).

(e) A generic representation (λ1 , . . . , λN −1 ) is a tensor with Σ_{j=1}^{N−1} j λj indices and a certain
symmetry property.

Therefore all representations can be obtained from products of the fundamental representation,
by applying a certain symmetry pattern.

11 Real forms and compact Lie groups


So far we have studied complex Lie algebras (because they are simple to classify), but for
physical applications we are mostly interested in real Lie algebras, possibly of compact Lie
groups.

In the Cartan-Weyl basis {H i , E α } all structure constants are real. We can then restrict
to real linear combinations of those generators, to get a real Lie algebra gR . This real Lie
algebra is called the split real form of g. For classical algebras the split real forms are

Ar → sl(r + 1, R) , Br → so(r + 1, r) , Cr → sp(2r, R) , Dr → so(r, r) .

In the exceptional E-type cases, the split real forms are indicated as

e6(6) , e7(7) , e8(8) ,

while for F4 and G2 there is no standard symbol.

Instead, take a basis given by

iH i ,    J1α ≡ (E α + E −α)/(i√2) ,    J2α ≡ (E α − E −α)/√2 ,

and consider the real algebra gR of real linear combinations. It is easy to check that, again,
the structure constants are real.
Since k(H i , H j ) = δ ij , k(H i , E α ) = k(E α , E β ) = 0 for β 6= −α and k(E α , E −α ) = 2/|α|2 ,
in the basis above the Killing form is diagonal and negative-definite. This guarantees that
the corresponding Lie group is compact, and therefore this is called the compact real form.
To show that the group is compact, one introduces an abstract notion of Hermitian
conjugation † as follows:

(H i )† = H i , (E α )† = E −α ,

then extended to g in an anti-linear way: (cX)† = c∗ X † . Therefore

(iH j )† = −iH j ,    (J1,2 α )† = −J1,2 α ,

i.e. all elements of gR are anti-Hermitian. Then the exponential map

M = e^A

gives a subgroup of a unitary group (since M −1 = e−A = M † ), which is compact. For
classical algebras the compact real forms are

Ar → su(r + 1) ,   Br → so(2r + 1) ,   Cr → usp(2r) ,   Dr → so(2r) ,

while in the exceptional cases we use the same symbols as in the complex case.
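This can be illustrated concretely for su(2), whose compact real form is spanned by i·σa/2 with σa the Pauli matrices. The sketch below (our own, assuming NumPy) exponentiates a random anti-Hermitian element and checks that the result is unitary:

```python
import numpy as np

# Pauli matrices; i*sigma_a/2 span the compact real form su(2)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def expm_antiherm(A):
    """exp(A) for anti-Hermitian A, via the spectral theorem for iA."""
    w, V = np.linalg.eigh(1j * A)           # iA is Hermitian
    return V @ np.diag(np.exp(-1j * w)) @ V.conj().T

rng = np.random.default_rng(0)
A = sum(c * 1j * s / 2 for c, s in zip(rng.normal(size=3), sigma))
assert np.allclose(A.conj().T, -A)          # anti-Hermitian, as in the text
M = expm_antiherm(A)
print(np.allclose(M.conj().T @ M, np.eye(2)))   # True: M is unitary
```

Since the Pauli matrices are traceless, det M = 1 as well, so M lies in SU(2).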
Other (non-compact) real forms are su(p, q) and so(p, q). The full classification of real
forms of g boils down to classifying involutive (σ 2 = 1) automorphisms of its compact real
form (due to a theorem by Cartan), and this is done with Satake diagrams.

A compact real form gR is the Lie algebra of a compact Lie group, for instance constructed
with the exponential map.57 By taking the universal cover, to each compact real form we
associate a compact connected simply-connected Lie group G, such that gR is its Lie algebra:

su(N ) → SU (N ) , so(N ) → Spin(N ) , sp(2r) → U Sp(2r)


er → Er , f4 → F4 , g2 → G2 .

The group of conjugacy classes of representations P/Q is equal to the center Z of G:

Z(G) ≡ { z ∈ G | g z g −1 = z  ∀ g ∈ G } = P/Q .



Representations of the algebra are also representations of the simply-connected group G


(using the exponential map). Since the center of G commutes with G, it is represented by
the identity matrix multiplied by phases in each irreducible representation, and so it gives
a “conserved charge” of representations. This is exactly the same as the conjugacy classes
P/Q.
In fact the centers of those groups are

G       SU(N)   Spin(N)                      USp(2r)   E6    E7    E8   F4   G2
Z(G)    Z_N     Z_2 if N = 1, 3 mod 4        Z_2       Z_3   Z_2   1    1    1
                Z_2 × Z_2 if N = 0 mod 4
                Z_4 if N = 2 mod 4

In particular
SO(N ) = Spin(N )/Z2 .
Spin(N ) has all representations, including spinor representations; SO(N ) has only tensor
representations.
From the overlap of algebras we learn that

Spin(2) = U (1) , Spin(3) = SU (2) , Spin(4) = SU (2) × SU (2)


Spin(5) = U Sp(4) , Spin(6) = SU (4) .

11.1 Spinors in various dimensions


The basic representation of so(N ) and Spin(N ) is not the vector representation, but rather
the one (for N = 2r + 1) or two (for N = 2r) spinor representations, in the sense that
all other representations can be obtained from products of spinor representations. Let us
discuss Lorentz spinors, namely spinors of so(d − 1, 1).

⁵⁷ Starting with the algebra, one can construct the universal enveloping algebra, which is defined as the
set of all formal power series in elements of g. This naturally matches the product of generators in any
representation. In the universal enveloping algebra, the exponential map is well-defined, and it gives a Lie
group whose algebra is the Lie algebra we started with. This particular construction gives the centerless
group, also called the adjoint group. Its universal cover is the simply-connected group G, and the centerless
group is G/Z where Z is the center of G.
We start with the Clifford algebra

{Γµ , Γν } = 2 η µν ,

and take η µν = diag(−1, +1, . . . , +1). The relation between the Clifford algebra and the
orthogonal algebra is that

Σµν = −(i/4) [Γµ , Γν ]

are generators of the Lorentz algebra so(d − 1, 1), so representations of the Clifford algebra
are also representations of the Lorentz group.
We start in even dimension, d = 2k. A faithful representation of the Clifford algebra can
be constructed recursively. For d = 2 take

Γ0 = ( 0  1 ; −1  0 ) ,    Γ1 = ( 0  1 ; 1  0 ) .

Then from d = 2k − 2 to d = 2k take

Γµ = γ µ ⊗ ( −1  0 ; 0  1 )    for µ = 0, . . . , d − 3 ,
Γ d−2 = 1 ⊗ ( 0  1 ; 1  0 ) ,    Γ d−1 = 1 ⊗ ( 0  −i ; i  0 ) ,

with γ µ the Dirac matrices in d = 2k − 2. This is called the Dirac representation and its
dimension is 2^k . Notice that

(Γ0 )† = −Γ0 ,    (Γµ )† = Γµ  for µ ≠ 0 .
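The recursive construction is easy to verify numerically. The following sketch (our own code, assuming NumPy) builds the Γ-matrices and checks the Clifford algebra and the hermiticity properties just stated:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

def gammas(d):
    """Dirac matrices in even dimension d, signature (-,+,...,+),
    via the recursion gamma^mu -> gamma^mu x diag(-1,1) plus two new ones."""
    assert d % 2 == 0 and d >= 2
    if d == 2:
        return [np.array([[0, 1], [-1, 0]], dtype=complex), s1.copy()]
    prev = gammas(d - 2)
    low = np.diag([-1.0, 1.0]).astype(complex)
    one = np.eye(prev[0].shape[0], dtype=complex)
    return [np.kron(g, low) for g in prev] + [np.kron(one, s1), np.kron(one, s2)]

def check_clifford(d):
    G = gammas(d)
    n = G[0].shape[0]                          # Dirac dimension 2**(d/2)
    eta = np.diag([-1.0] + [1.0] * (d - 1))
    for m in range(d):
        for p in range(d):
            acom = G[m] @ G[p] + G[p] @ G[m]
            assert np.allclose(acom, 2 * eta[m, p] * np.eye(n))
    # hermiticity: Gamma^0 anti-Hermitian, the spatial ones Hermitian
    assert np.allclose(G[0].conj().T, -G[0])
    for m in range(1, d):
        assert np.allclose(G[m].conj().T, G[m])
    return n

print([check_clifford(d) for d in (2, 4, 6, 8)])   # [2, 4, 8, 16]
```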
The Dirac representation is an irreducible representation of the Clifford algebra, but a
reducible representation of the Lorentz algebra. Define

Γ = i^{k−1} Γ0 Γ1 · · · Γ d−1 ,

which has the properties

Γ² = 1 ,    {Γ, Γµ } = 0 ,    [Γ, Σµν ] = 0 .

Γ is called the chirality matrix and has eigenvalues ±1. The 2^{k−1} states with chirality +1
form a Weyl spinor representation of the Lorentz algebra, and the 2^{k−1} states with
chirality −1 form a second, inequivalent, Weyl representation.
In terms of Dynkin labels, the two Weyl spinor representations have a single 1 on one of
the two small tails.
In odd dimension, d = 2k + 1, we simply add Γ d ≡ Γ to the Dirac matrices of d = 2k.
This is now an irreducible representation (because Σµd anti-commutes with Γ d ). Thus, in
so(2k, 1) the Dirac spinor representation has dimension 2^k , is irreducible, and in terms of
Dynkin labels it has a single 1 on the shaded node (short root).

For d = 2k, the irreducible 2^k -dimensional representation of the Clifford algebra we constructed
is unique, up to change of basis: indeed the matrices Γµ∗ and −Γµ∗ satisfy the same
Clifford algebra as Γµ , and are related to Γµ by a similarity transformation.
In the basis we chose, Γ3 , Γ5 , . . . , Γd−1 are imaginary while the other ones are real. Define

B1 = Γ3 Γ5 . . . Γd−1 , B2 = ΓB1 ,

then (using the commutation relations)

B1 Γµ B1^{−1} = (−1)^{k−1} Γµ∗ ,    B2 Γµ B2^{−1} = (−1)^k Γµ∗ .

For both matrices,

B Σµν B^{−1} = −Σµν∗ .
It follows that the spinors ζ and B −1 ζ ∗ transform in the same way under the Lorentz group,
so the Dirac representation is self-conjugate (it is invariant under charge conjugation).
Acting on the chirality matrix:

B1 Γ B1^{−1} = B2 Γ B2^{−1} = (−1)^{k−1} Γ∗ ,

therefore for d = 0 mod 4 the two Weyl representations are conjugate to each other (charge
conjugation flips the chirality), while for d = 2 mod 4 each Weyl representation is self-conjugate
(charge conjugation does not flip the chirality).
We can try to impose a Majorana (i.e. reality) condition on spinors, relating ζ and ζ ∗ .
Consistency with Lorentz transformations requires

ζ ∗ = Bζ with either B1 or B2 .

Taking the ∗ we get ζ = B ∗ ζ ∗ = B ∗ Bζ, therefore such a condition is consistent if and only
if B ∗ B = 1. One finds

B1∗ B1 = (−1)k(k−1)/2 , B2∗ B2 = (−1)(k−1)(k−2)/2 .

Therefore a Majorana condition for Dirac spinors is possible only for d = 0, 2, 4 mod 8.
A Majorana condition for Weyl spinors is possible only if, moreover, the representation is
self-conjugate. Therefore in d = 2 mod 8 one can define Majorana-Weyl spinors, while
in d = 0, 4 mod 8 one can have either Weyl or Majorana spinors.
For d = 2k + 1, Γd ≡ Γ and so the conjugation of Γd is compatible with the conjugation
of Γµ only using B1 . Therefore a Majorana condition is possible only for d = 1, 3 mod 8.
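The signs B∗B can be checked numerically in even dimensions, using the recursive Γ-matrices constructed in this subsection (in that basis Γ3, Γ5, . . . , Γ^{d−1} are the imaginary ones). A sketch of our own:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

def gammas(d):
    """Dirac matrices in even d, signature (-,+,...,+), built recursively
    as in the text."""
    if d == 2:
        return [np.array([[0, 1], [-1, 0]], dtype=complex), s1.copy()]
    prev = gammas(d - 2)
    low = np.diag([-1.0, 1.0]).astype(complex)
    one = np.eye(prev[0].shape[0], dtype=complex)
    return [np.kron(g, low) for g in prev] + [np.kron(one, s1), np.kron(one, s2)]

def majorana_signs(d):
    """Return the two signs B1* B1 and B2* B2 for even d."""
    G = gammas(d)
    k, dim = d // 2, 2 ** (d // 2)
    B1 = np.eye(dim, dtype=complex)
    for m in range(3, d, 2):                 # product of the imaginary gammas
        B1 = B1 @ G[m]
    chi = (1j) ** (k - 1) * np.linalg.multi_dot(G)
    signs = []
    for B in (B1, chi @ B1):
        BB = B.conj() @ B
        sign = int(round(BB[0, 0].real))
        assert np.allclose(BB, sign * np.eye(dim))
        signs.append(sign)
    return signs

for d in (2, 4, 6, 8, 10):
    print(d, majorana_signs(d))
# d = 10 yields both signs +1, consistent with Majorana-Weyl spinors there
```

The output matches (−1)^{k(k−1)/2} and (−1)^{(k−1)(k−2)/2}: a Majorana condition is possible exactly for d = 0, 2, 4 mod 8.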
Summarizing:

d Majorana Weyl Majorana-Weyl min. rep. over R
2 yes self yes 1
3 yes − − 2
4 yes complex − 4
5 − − − 8
6 − self − 8
7 − − − 16
8 yes complex − 16
9 yes − − 16
10 = 8 + 2 yes self yes 16
11 = 8 + 3 yes − − 32
12 = 8 + 4 yes complex − 64

Now consider spinors of so(d). The discussion is similar. It turns out that the reality
properties of representations of
so(p, q)
only depend on p − q, therefore so(d) behaves like so(d + 1, 1).
When the Majorana condition can be imposed, the spinor representation is real and the
Lorentz generators can be chosen to be imaginary. Otherwise the representation is pseudo-
real: the conjugate is isomorphic to itself, but the generators cannot be chosen to be imagi-
nary and the representation is not real. A familiar example of pseudo-real representation is
the 2 of so(3). The product of two pseudo-real representations is a real representation.

More details can be found in Polchinski’s book Volume 2, Appendix B.

12 Subalgebras
We would like to classify the possible embeddings of a semi-simple Lie algebra p into a simple
Lie algebra g. To organize the classification, one restricts to maximal embeddings

p⊂g,

for which there is no p′ such that p ⊂ p′ ⊂ g. All non-maximal embeddings can be obtained
from a chain of maximal ones.
We distinguish two categories: regular subalgebras and special (or non-regular) subalgebras.

Regular subalgebras. A regular subalgebra p is one whose generators are a subset of the
generators of g. A maximal regular subalgebra has the same rank as g, thus it retains all
Cartan generators.

[Figure: the extended Dynkin diagrams Â1 , Âr≥2 , B̂r , Ĉr , D̂r , Ê6 , Ê7 , Ê8 , F̂4 , Ĝ2 .]

First, construct the extended (or affine) Dynkin diagram of g, by adding an extra
node associated to −θ (minus the highest root).⁵⁸ Since θ is expanded in the simple-root
basis in terms of marks ai ,

θ = Σ_{i=1}^{r} ai αi ,

one can compute the Cartan matrix of the extended Dynkin diagram (marks can be found
in Di Francesco’s book, Appendix 13.A). The extended Dynkin diagrams are shown in the
figure above.
Then, to maintain linear independence between the simple roots, we should drop (at least)
one of the αi . All semi-simple maximal regular subalgebras are obtained by dropping an αi
whose mark ai is a prime number.
In the few cases in which ai is not prime, one obtains a subalgebra of a maximal semi-simple
subalgebra obtained by dropping another αi . The exhaustive list is

F4 ⊃ B4 ⊃ A3 ⊕ A1
E7 ⊃ D6 ⊕ A1 ⊃ A3 ⊕ A3 ⊕ A1
E8 ⊃ D8 ⊃ D5 ⊕ A3
E8 ⊃ E6 ⊕ A2 ⊃ A5 ⊕ A2 ⊕ A1
E8 ⊃ E7 ⊕ A1 ⊃ A7 ⊕ A1
E8 ⊃ E7 ⊕ A1 ⊃ A5 ⊕ A2 ⊕ A1 .

¶ Exercise 31. Check them.


⁵⁸ Promoting −θ to a “simple root” preserves the property that the difference between two simple roots is
not a root (because αi + θ is not a root). In fact, det A = 0 but all proper principal minors are positive.

Maximal regular subalgebras that are not semi-simple are constructed from the removal
of 2 nodes with mark ai = 1 and the addition of a u(1) factor. For instance

su(p + q) ⊃ su(p) ⊕ su(q) ⊕ u(1) for p, q ≥ 1 .

Special subalgebras. The maximal subalgebras of the classical algebras are of two types.
The first type uses the fact that classical algebras are algebras of matrices. Thus⁵⁹

su(p) ⊕ su(q) ⊂ su(pq)
so(p) ⊕ so(q) ⊂ so(pq)
sp(2p) ⊕ sp(2q) ⊂ so(4pq)
sp(2p) ⊕ so(q) ⊂ sp(2pq)
so(p) ⊕ so(q) ⊂ so(p + q)    for p and q odd .

In the first four we write the index on the right as a double index, realizing the algebras
on the left. The last case is obvious, however it does not appear from manipulations of the
extended Dynkin diagram.
The second type uses the fact that if p has an N -dimensional representation, then it is
a subalgebra of su(N ). Since all representations are isomorphic to unitary representations,
just take the representatives as N × N matrices.
Is p maximal in su(N )? With a few exceptions,60 if the N -dimensional representation
is real (and thus it admits a symmetric bilinear form) then p is maximal in so(N ), if it is
pseudo-real (and thus it admits an anti-symmetric bilinear form) it is maximal in sp(N ),
otherwise it is maximal in su(N ).

The maximal special subalgebras of the exceptional algebras are listed, for instance in
Table 13.1 of Di Francesco’s book.

⁵⁹ The unitary, orthogonal and symplectic groups are groups of matrices that preserve a Hermitian, symmetric
and antisymmetric bilinear form, respectively. Writing C^{pq} = C^p ⊗ C^q , the bilinear form can be
written as (φ1 ⊗ φ2 , η1 ⊗ η2 ) = (φ1 , η1 )1 · (φ2 , η2 )2 . This gives the embedding of groups.
⁶⁰ The exceptions are listed by Dynkin.
