Perry Diff Geometry


Applications of Differential Geometry to Physics

Prof Malcolm Perry, Lent Term 2009


Unofficial lecture notes, LaTeX typeset: Steffen Gielen

Last update: June 19, 2009

Contents
1 Introduction to Differential Forms
1.1 Vectors, Tensors and p-forms
1.2 Operations on Forms
1.3 Electromagnetism and Yang-Mills Theory

2 Connections and General Relativity
2.1 Vielbein Formalism
2.2 Form Notation
2.3 Explicit Example

3 Integration
3.1 Action for General Relativity
3.2 Yang-Mills Action

4 Topologically Non-Trivial Field Configurations

5 Kaluza-Klein Theory
5.1 Particle Motion in Kaluza-Klein Theory
5.2 Magnetic Monopoles
5.3 $S^3$ as a Group Manifold

6 Aspects of Yang-Mills Theory
6.1 Spontaneous Symmetry Breaking
6.2 Magnetic Monopoles
6.3 Instantons in Yang-Mills Theory

7 Gravitational Instantons
7.1 Topological Quantum Numbers for Gravity
7.2 De Sitter Space
7.3 Other Examples

8 Positive Energy
8.1 Geometry of Surfaces
8.2 Spinors in Curved Spacetime
8.3 Definition of Mass
8.4 Energy Conditions
8.5 Proof of Positive Energy

1 Introduction to Differential Forms


Lect. 1
This course will be somewhat different from the course given by Prof Gary Gibbons in previous
years. We will plan to cover applications of differential geometry in general relativity, quantum
field theory, and string theory.

1.1 Vectors, Tensors and p-forms


Assume we have some kind of d-dimensional manifold, possibly representing spacetime, with a set
of co-ordinates $x^a$, $a = 1, \ldots, d$.
In general relativity, typically one thinks of a vector as being represented by $u^a$. But $u^a$ is really
the components of a vector in some particular basis. We need to think about basis-independent
expressions.
In d dimensions, there is always a set of d basis vectors

\[ E_1, \ldots, E_d, \quad \text{collectively } E_a. \tag{1.1} \]

A vector is then
\[ u = \sum_a u^a E_a, \tag{1.2} \]

where $u^a$ are the components of $u$ in the basis $\{E_a\}$.


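A one-form $\omega$ is an object which is dual to a vector, i.e. given a vector $u$ and a one-form $\omega$ there
is a bracket operation $\langle \omega, u \rangle$ giving a real number.
This bracket is linear: If $u = \alpha v + \beta w$ for arbitrary vectors $v, w$ and real numbers $\alpha, \beta$,

\[ \langle \omega, \alpha v + \beta w \rangle = \alpha \langle \omega, v \rangle + \beta \langle \omega, w \rangle. \tag{1.3} \]

We can write a one-form as
\[ \omega = \sum_a \omega_a E^a, \tag{1.4} \]

where $\omega_a$ are numbers and $E^a$ are one-forms. Then the bracket can be defined as

\[ \langle E^a, E_b \rangle = \delta^a{}_b, \tag{1.5} \]

such that the basis of one-forms is dual to the basis of vectors.
The bracket is also linear in $\omega$: If $\omega = \alpha \eta + \beta \lambda$ for one-forms $\eta, \lambda$ and real numbers $\alpha, \beta$,

\[ \langle \alpha \eta + \beta \lambda, u \rangle = \alpha \langle \eta, u \rangle + \beta \langle \lambda, u \rangle. \tag{1.6} \]

Consider now
\[ \langle \omega, u \rangle = \sum_{a,b} \langle \omega_a E^a, u^b E_b \rangle = \sum_{a,b} \omega_a u^b \langle E^a, E_b \rangle = \sum_a \omega_a u^a. \tag{1.7} \]

The bracket corresponds to the usual scalar multiplication.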
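
The duality (1.5) and the contraction (1.7) can be checked numerically. The following is a minimal sketch, assuming a two-dimensional example in which the basis vectors are written as columns of a matrix in some background Cartesian frame; the particular numbers and the helper names (`invert_2x2`, `bracket`, `basis_vector`) are hypothetical and chosen only for illustration. The dual one-forms are then the rows of the inverse matrix.

```python
from fractions import Fraction

def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of `basis` are the basis vectors E_1, E_2 in a background
# Cartesian frame (an arbitrary, non-orthonormal choice for illustration).
basis = [[Fraction(1), Fraction(1)],
         [Fraction(0), Fraction(2)]]

# Rows of the inverse are the dual one-forms E^1, E^2, because
# (inverse * basis) = identity is exactly <E^a, E_b> = delta^a_b.
dual = invert_2x2(basis)

def bracket(one_form_row, vector_col):
    """Pairing <omega, u> of a one-form (row) with a vector (column)."""
    return sum(w * v for w, v in zip(one_form_row, vector_col))

def basis_vector(b):
    """Return E_b as a column of background components."""
    return [basis[0][b], basis[1][b]]

delta = [[bracket(dual[a], basis_vector(b)) for b in range(2)] for a in range(2)]

# Components with respect to the chosen bases: u = u^a E_a, omega = omega_a E^a.
u_comp = [Fraction(3), Fraction(-1)]
omega_comp = [Fraction(2), Fraction(5)]
u = [sum(u_comp[a] * basis[i][a] for a in range(2)) for i in range(2)]
omega = [sum(omega_comp[a] * dual[a][i] for a in range(2)) for i in range(2)]

pairing = bracket(omega, u)                                   # <omega, u>
contraction = sum(w * x for w, x in zip(omega_comp, u_comp))  # omega_a u^a
```

Both `pairing` and `contraction` evaluate the same number, which is the content of (1.7): the bracket does not care that the basis is not orthonormal.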


The next thing is to define the derivative of a function $f(x)$, denoted by $df$ - this is a one-form. It
should have the property
\[ \langle df, X \rangle = X f. \tag{1.8} \]
We can pick a set of one-forms and a basis of vectors to make it explicit. In a co-ordinate basis,
these basis vectors are $\frac{\partial}{\partial x^i}$ and the one-forms are $dx^i$. These are dual,
\[ \left\langle \frac{\partial}{\partial x^i}, dx^j \right\rangle = \delta_i{}^j, \tag{1.9} \]
which is consistent with the definition of $df$, since
\[ \left\langle \frac{\partial}{\partial x^i}, dx^j \right\rangle = \frac{\partial}{\partial x^i} x^j. \tag{1.10} \]
This can also be done for an arbitrary vector $X = X^j \frac{\partial}{\partial x^j}$. From linearity,
\[ \langle df, X \rangle = \left\langle df, X^j \frac{\partial}{\partial x^j} \right\rangle = X^j \frac{\partial}{\partial x^j} f. \tag{1.11} \]
This is the directional derivative of $f$ in the direction $X$.
This, roughly speaking, is what one-forms are. There is a simple geometrical consequence; suppose
that
\[ \langle df, X \rangle = 0. \tag{1.12} \]
Then $f$ is a constant in the direction of the vector $X$, which means that $df$ is normal to surfaces of
$f = \text{constant}$.
We can put this into a bigger perspective: Functions $f$ are often called 0-forms. Then $df$, the
derivative of $f$, is a one-form. We have defined an operator $d$ turning 0-forms into one-forms. In
general, $d$ will turn p-forms into (p + 1)-forms. In terms of a co-ordinate basis,
\[ df = \frac{\partial f}{\partial x^i} dx^i. \tag{1.13} \]
This is exactly as expected from the chain rule for a derivative.
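
As a small numerical illustration of (1.11) and (1.12), the sketch below compares $\langle df, X \rangle = X^j \partial_j f$ with a finite-difference directional derivative. The sample function $f(x, y) = x^2 y$, the point and the vector are arbitrary choices made for this example, not taken from the lectures.

```python
def f(x, y):
    """Sample 0-form f(x, y) = x^2 * y (an arbitrary choice)."""
    return x * x * y

def df_components(x, y):
    """Components (df)_i = partial f / partial x^i, worked out by hand for this f."""
    return (2 * x * y, x * x)

def pair_df_with(X, point):
    """<df, X> = X^i partial_i f at the given point, as in (1.11)."""
    return sum(Xi * dfi for Xi, dfi in zip(X, df_components(*point)))

def directional_derivative(X, point, h=1e-6):
    """Central finite-difference estimate of the derivative of f along X."""
    x, y = point
    forward = f(x + h * X[0], y + h * X[1])
    backward = f(x - h * X[0], y - h * X[1])
    return (forward - backward) / (2 * h)

point = (1.0, 2.0)
X = (3.0, -1.0)
exact = pair_df_with(X, point)
estimate = directional_derivative(X, point)

# A vector tangent to the level set f = const is annihilated by df, cf. (1.12).
grad = df_components(*point)
tangent = (-grad[1], grad[0])
along_level_set = pair_df_with(tangent, point)
```

Here `exact` and `estimate` agree up to discretisation error, and `along_level_set` vanishes because `tangent` was built to lie in the surface $f = \text{constant}$.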

A general tensor is of type $(r, s)$; its components are $T^{a_1 \ldots a_r}{}_{b_1 \ldots b_s}$. We think of this as something
which does not depend on a basis:

\[ T = T^{a_1 \ldots a_r}{}_{b_1 \ldots b_s}\, E_{a_1} \otimes E_{a_2} \otimes \ldots \otimes E_{a_r} \otimes E^{b_1} \otimes \ldots \otimes E^{b_s}. \tag{1.14} \]

This is independent of the particular basis in question.


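In general relativity, a tensor transforms in a particular way under a co-ordinate transformation.
But this is really just a change of basis:

\[ E_a \to E_{a'} = \chi_{a'}{}^a E_a, \tag{1.15} \]

where $\chi_{a'}{}^a$ represents a non-degenerate $d \times d$ matrix. Similarly, one could do a transformation on
the basis one-forms
\[ E^a \to E^{a'} = \Phi^{a'}{}_a E^a. \tag{1.16} \]
This could be a co-ordinate basis, but does not have to be. Looking at the bracket, we must have
\[ \delta^{a'}{}_{b'} = \langle E^{a'}, E_{b'} \rangle = \langle \Phi^{a'}{}_a E^a, \chi_{b'}{}^b E_b \rangle = \Phi^{a'}{}_a \chi_{b'}{}^b \delta^a{}_b = \Phi^{a'}{}_a \chi_{b'}{}^a, \tag{1.17} \]

thus $\chi$ is the matrix inverse of $\Phi$. Under a change of basis, the tensor $T$ must be invariant, thus
\[
\begin{aligned}
T &= T^{a'_1 \ldots a'_r}{}_{b'_1 \ldots b'_s}\, E_{a'_1} \otimes E_{a'_2} \otimes \ldots \otimes E_{a'_r} \otimes E^{b'_1} \otimes \ldots \otimes E^{b'_s} \\
  &= T^{a'_1 \ldots a'_r}{}_{b'_1 \ldots b'_s}\, \chi_{a'_1}{}^{a_1} \ldots \chi_{a'_r}{}^{a_r}\, \Phi^{b'_1}{}_{b_1} \ldots \Phi^{b'_s}{}_{b_s}\, E_{a_1} \otimes E_{a_2} \otimes \ldots \otimes E_{a_r} \otimes E^{b_1} \otimes \ldots \otimes E^{b_s} \\
  &= T^{a_1 \ldots a_r}{}_{b_1 \ldots b_s}\, E_{a_1} \otimes E_{a_2} \otimes \ldots \otimes E_{a_r} \otimes E^{b_1} \otimes \ldots \otimes E^{b_s},
\end{aligned} \tag{1.18}
\]

so the components of $T$ transform as (expressing the old components in terms of the new)
\[ T^{a'_1 \ldots a'_r}{}_{b'_1 \ldots b'_s}\, \chi_{a'_1}{}^{a_1} \ldots \chi_{a'_r}{}^{a_r}\, \Phi^{b'_1}{}_{b_1} \ldots \Phi^{b'_s}{}_{b_s} = T^{a_1 \ldots a_r}{}_{b_1 \ldots b_s}, \tag{1.19} \]

exactly as expected from the co-ordinate formulation of general relativity.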


A p-form is defined to be a tensor of type $(0, p)$ whose components are totally antisymmetric (in
any basis):
\[ T = T_{a_1 \ldots a_p} E^{a_1} \otimes \ldots \otimes E^{a_p} = T_{a_1 \ldots a_p} E^{[a_1} \otimes \ldots \otimes E^{a_p]} = \frac{1}{p!} T_{a_1 \ldots a_p} \left( E^{a_1} \wedge \ldots \wedge E^{a_p} \right), \tag{1.20} \]
where we define the wedge product
\[ E^{a_1} \wedge \ldots \wedge E^{a_p} := \sum_{\sigma \in S_p} \pi(\sigma)\, E^{\sigma(a_1)} \otimes E^{\sigma(a_2)} \otimes \ldots \otimes E^{\sigma(a_p)} \tag{1.21} \]

and the sum is over all permutations $\sigma$ of p elements with parity $\pi(\sigma)$ either +1 or −1, so there
are $p!$ terms in the sum. $\wedge$ basically tells you to take the antisymmetric product:

\[
\begin{aligned}
E^a \wedge E^b &= E^a \otimes E^b - E^b \otimes E^a, \\
E^a \wedge E^b \wedge E^c &= E^a \otimes E^b \otimes E^c + E^b \otimes E^c \otimes E^a + E^c \otimes E^a \otimes E^b \\
  &\quad - E^a \otimes E^c \otimes E^b - E^b \otimes E^a \otimes E^c - E^c \otimes E^b \otimes E^a,
\end{aligned} \tag{1.22}
\]

etc. $E^{a_1} \wedge \ldots \wedge E^{a_p}$ is antisymmetric under the interchange of any adjacent pair of indices. In d
dimensions, the number of linearly independent such objects is
\[ \frac{d(d-1) \ldots (d-p+1)}{p!} = \frac{d!}{p!\,(d-p)!} = \binom{d}{p}. \tag{1.23} \]

This means one must have $p \le d$, because one will get nothing otherwise.
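
The definition (1.21) and the counting (1.23) are easy to reproduce by brute force. The sketch below, with hypothetical helper names, expands a wedge product of basis one-forms into signed tensor-product terms and lists the index sets that label independent basis p-forms.

```python
from itertools import combinations, permutations
from math import comb

def parity(perm):
    """Sign (+1 or -1) of a permutation of 0..n-1, found by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def wedge_of_basis(indices):
    """Expand E^{a_1} ^ ... ^ E^{a_p} as {tensor-product index tuple: coefficient}."""
    terms = {}
    for perm in permutations(range(len(indices))):
        key = tuple(indices[i] for i in perm)
        terms[key] = terms.get(key, 0) + parity(perm)
    # Entries cancel pairwise whenever an index is repeated.
    return {k: v for k, v in terms.items() if v != 0}

def independent_p_forms(d, p):
    """Strictly increasing index sets labelling independent basis p-forms in d dimensions."""
    return list(combinations(range(1, d + 1), p))

two_form = wedge_of_basis(("a", "b"))
three_form = wedge_of_basis(("a", "b", "c"))
repeated = wedge_of_basis(("a", "a"))
count_4_2 = len(independent_p_forms(4, 2))
```

`three_form` contains the six signed terms of (1.22), `repeated` is empty because $E^a \wedge E^a = 0$, and `independent_p_forms(d, p)` returns an empty list as soon as $p > d$.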

1.2 Operations on Forms
Lect. 2
The next thing is to look at a product of a p-form $P$ and a q-form $Q$. A p-form $P$ can in any basis
be written as
\[ P = \frac{1}{p!} P_{a_1 \ldots a_p}\, E^{a_1} \wedge E^{a_2} \wedge \ldots \wedge E^{a_p}, \tag{1.24} \]
similarly
\[ Q = \frac{1}{q!} Q_{b_1 \ldots b_q}\, E^{b_1} \wedge E^{b_2} \wedge \ldots \wedge E^{b_q}. \tag{1.25} \]
We already have a rule for defining the product of one-forms. We define the wedge product of a
p-form with a q-form to be
\[ P \wedge Q = \frac{1}{(p+q)!} P_{a_1 \ldots a_p} Q_{b_1 \ldots b_q}\, E^{a_1} \wedge E^{a_2} \wedge \ldots \wedge E^{a_p} \wedge E^{b_1} \wedge E^{b_2} \wedge \ldots \wedge E^{b_q}. \tag{1.26} \]

You can think of this in a slightly different way. $P \wedge Q$ is really equivalent to a tensor of type
$(0, p + q)$ that is antisymmetric on all its $p + q$ indices. If you wanted to know its components, you
could write down a simple formula
\[ P_{[a_1 \ldots a_p} Q_{b_1 \ldots b_q]}. \tag{1.27} \]
Moving the q indices of $Q$ past the p indices of $P$ costs a sign $(-)^{pq}$, and consequently

\[ P \wedge Q = (-)^{pq}\, Q \wedge P. \tag{1.28} \]

We have discovered that differential forms have a $\mathbb{Z}_2$-grading:

\[ P \wedge Q = \begin{cases} Q \wedge P & \text{if either } p \text{ or } q \text{ is even,} \\ -Q \wedge P & \text{if } p \text{ and } q \text{ are odd.} \end{cases} \tag{1.29} \]

You can think of $P$ or $Q$ as odd objects if $p$, $q$ are odd, and as even objects if $p$ or $q$ are even.
(This is analogous to bosons which are described by even quantum fields, and fermions which are
described by odd quantum fields in quantum field theory.)
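
The sign rule (1.28) can be verified on explicit examples. The sketch below stores a form as a dictionary from increasing index tuples to coefficients; this data layout, the helper names and the sample forms are assumptions made for illustration. The overall normalisation convention of (1.26) does not affect the sign rule, so the code simply multiplies coefficients.

```python
def sort_with_sign(indices):
    """Bubble-sort an index tuple; return (sorted tuple, sign), with sign 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return tuple(idx), 0
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign  # each adjacent swap flips the sign
    return tuple(idx), sign

def wedge(P, Q):
    """Wedge product of forms stored as {increasing index tuple: coefficient}."""
    result = {}
    for a, pa in P.items():
        for b, qb in Q.items():
            key, sign = sort_with_sign(a + b)
            if sign:
                result[key] = result.get(key, 0) + sign * pa * qb
    return {k: v for k, v in result.items() if v != 0}

def scale(form, factor):
    """Multiply every coefficient of a form by a number."""
    return {k: factor * v for k, v in form.items()}

# Hypothetical sample forms in d = 4, basis one-forms labelled 1..4.
one_form_a = {(1,): 2, (3,): -1}
one_form_b = {(2,): 1, (4,): 5}
two_form = {(1, 2): 3, (3, 4): 7}

odd_odd = wedge(one_form_a, one_form_b)    # p = q = 1: anticommute
odd_even = wedge(one_form_a, two_form)     # p = 1, q = 2: commute
```

Two one-forms anticommute, so `wedge(one_form_a, one_form_a)` is empty, while a one-form commutes with a two-form.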
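To avoid possible ambiguities, we write out explicitly what is meant by $[\cdot]$, namely antisymmetrization
with weight one:
\[ X_{[a_1 \ldots a_p]} = \frac{1}{p!} \sum_{\sigma \in S_p} \pi(\sigma)\, X_{\sigma(a_1) \ldots \sigma(a_p)}, \tag{1.30} \]

so that
\[
\begin{aligned}
X_{[ab]} &= \frac{1}{2} \left( X_{ab} - X_{ba} \right), \\
X_{[abc]} &= \frac{1}{6} \left( X_{abc} + X_{bca} + X_{cab} - X_{acb} - X_{bac} - X_{cba} \right),
\end{aligned} \tag{1.31}
\]
etc. Similarly, $(\cdot)$ always means symmetrization with weight one.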
The next thing is to define an exterior derivative $d$ on p-forms. We look at a p-form in a
co-ordinate basis:
\[ P = \frac{1}{p!} \underbrace{P_{a_1 \ldots a_p}}_{\text{set of 0-forms}}\, \underbrace{dx^{a_1} \wedge dx^{a_2} \wedge \ldots \wedge dx^{a_p}}_{\text{p-form}}. \tag{1.32} \]

We already know what $d$ does on 0-forms, and so we define

\[ dP = \frac{1}{p!} \frac{\partial P_{a_1 \ldots a_p}}{\partial x^b}\, dx^b \wedge dx^{a_1} \wedge dx^{a_2} \wedge \ldots \wedge dx^{a_p}. \tag{1.33} \]
This is consistent with how $d$ acts on a 0-form to give a one-form. There is an alternative convention
where $dx^b$ is put at the end which gives you unpleasant factors of $(-)^p$, and which we will not use.
Because
\[ dP = X = \frac{1}{(p+1)!} X_{[a_1 \ldots a_{p+1}]}\, dx^{a_1} \wedge dx^{a_2} \wedge \ldots \wedge dx^{a_{p+1}}, \tag{1.34} \]
we can write the components of $X = dP$ in terms of the components of $\frac{\partial P}{\partial x}$:

\[ X_{a_1 \ldots a_{p+1}} = (-)^p (p+1)\, \partial_{[a_{p+1}} P_{a_1 \ldots a_p]}, \tag{1.35} \]

where we write $\partial_a$ for $\frac{\partial}{\partial x^a}$. It is impossible to suppress all factors of $(-)^p$; this one is a nuisance.
Properties of the operator d:

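• $d$ maps p-forms to (p + 1)-forms. To see this, you have to prove that $dP$ is a tensor. Do the
calculation in a co-ordinate basis: Under a change of co-ordinates $x^a \to x'^{a'} = x'^{a'}(x^a)$, we
define
\[ A^{a'}{}_a = \frac{\partial x'^{a'}}{\partial x^a}, \qquad A_{a'}{}^a = \frac{\partial x^a}{\partial x'^{a'}}. \tag{1.36} \]
Then if $P$ is a p-form,

\[ P_{a_1 \ldots a_p} \to P_{a'_1 \ldots a'_p} = A_{a'_1}{}^{a_1} A_{a'_2}{}^{a_2} \ldots A_{a'_p}{}^{a_p} P_{a_1 \ldots a_p}. \tag{1.37} \]

The components of $dP$ transform as

\[
\begin{aligned}
\partial_{[b} P_{a_1 \ldots a_p]} \to \partial_{[b'} P_{a'_1 \ldots a'_p]}
  &= \partial_{[b'} \left( A_{a'_1}{}^{a_1} A_{a'_2}{}^{a_2} \ldots A_{a'_p]}{}^{a_p} P_{a_1 \ldots a_p} \right) \\
  &= \frac{\partial x^b}{\partial x'^{[b'|}}\, \partial_b \left( A_{|a'_1}{}^{a_1} A_{a'_2}{}^{a_2} \ldots A_{a'_p]}{}^{a_p} P_{a_1 \ldots a_p} \right) \\
  &= A_{b'}{}^b A_{a'_1}{}^{a_1} A_{a'_2}{}^{a_2} \ldots A_{a'_p}{}^{a_p}\, \partial_{[b} P_{a_1 \ldots a_p]} \\
  &\quad + A_{[b'|}{}^b \left( \partial_b A_{|a'_1}{}^{a_1} \right) A_{a'_2}{}^{a_2} \ldots A_{a'_p]}{}^{a_p} P_{a_1 \ldots a_p} + \ldots,
\end{aligned} \tag{1.38}
\]

with more similar terms. These all contain terms of the form
\[ \frac{\partial x^b}{\partial x'^{b'}} \frac{\partial A_{a'_1}{}^{a_1}}{\partial x^b} = \frac{\partial x^b}{\partial x'^{b'}} \frac{\partial}{\partial x^b} \frac{\partial x^{a_1}}{\partial x'^{a'_1}} = \frac{\partial^2 x^{a_1}}{\partial x'^{a'_1} \partial x'^{b'}}, \tag{1.39} \]
antisymmetrized over $a'_1$ and $b'$. Since partial derivatives commute, these terms all vanish.
What you end up with is what you expect for a tensorial object:

\[ \partial_{[b} P_{a_1 \ldots a_p]} \to A_{b'}{}^b A_{a'_1}{}^{a_1} A_{a'_2}{}^{a_2} \ldots A_{a'_p}{}^{a_p}\, \partial_{[b} P_{a_1 \ldots a_p]}. \tag{1.40} \]

Components of $dP$ transform tensorially under a co-ordinate transformation.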

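• $d^2 = 0$. This is most easily seen by looking at the components of $d(dP)$.

\[
\begin{aligned}
\text{components of } P &\sim P_{[a_1 \ldots a_p]} \\
\text{components of } dP &\sim \partial_{[b} P_{a_1 \ldots a_p]} \\
\text{components of } d(dP) &\sim \partial_{[c} \partial_{[b} P_{a_1 \ldots a_p]]} = \partial_{[c} \partial_b P_{a_1 \ldots a_p]} = 0.
\end{aligned} \tag{1.41}
\]

Remember there was a $\mathbb{Z}_2$-grading. $dP$ is a (p + 1)-form and so $d$ changes the $\mathbb{Z}_2$-grading of
the form.
So morally, $d$ had better be odd. Therefore $dd = -dd = 0$.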

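• The operator $d$ is Leibnizian.

\[
\begin{aligned}
P &= P_{a_1 \ldots a_p}\, dx^{a_1} \otimes \ldots \otimes dx^{a_p} \\
dP &= \underbrace{dP_{a_1 \ldots a_p}}_{\frac{\partial P_{a_1 \ldots a_p}}{\partial x^b} dx^b} \wedge\, dx^{a_1} \otimes \ldots \otimes dx^{a_p} + \ldots,
\end{aligned} \tag{1.42}
\]

where all remaining terms contain some $d\,dx^{a_i}$ and will vanish.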

• $d$ acting on the product of a p-form with a q-form:

The components of $P \wedge Q$ are
\[ P_{[a_1 \ldots a_p} Q_{b_1 \ldots b_q]}. \tag{1.43} \]
Then the components of $d(P \wedge Q)$ will be proportional to

\[ \partial_{[b} (P_{a_1 \ldots a_p}) Q_{b_1 \ldots b_q]} + (P_{[a_1 \ldots a_p}) \partial_b Q_{b_1 \ldots b_q]}. \tag{1.44} \]

Since $X_{[a_1 \ldots a_p b b_1 \ldots b_q]} = (-)^p X_{[b a_1 \ldots a_p b_1 \ldots b_q]}$, this shows that

\[ d(P \wedge Q) = dP \wedge Q + (-)^p P \wedge dQ. \tag{1.45} \]

• All manipulations were in a co-ordinate basis but this is inessential. The action of $d$ is
independent of a choice of co-ordinates.
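
A short worked example may make the sign in (1.45) concrete. It is a sketch in three dimensions with co-ordinates $(x, y, z)$; the one-forms $P = x\,dy$ and $Q = z\,dx$ are arbitrary choices, not taken from the lectures.

```latex
\begin{aligned}
P &= x\, dy, & dP &= dx \wedge dy, \\
Q &= z\, dx, & dQ &= dz \wedge dx, \\
P \wedge Q &= xz\, dy \wedge dx = -xz\, dx \wedge dy, \\
d(P \wedge Q) &= -x\, dz \wedge dx \wedge dy = -x\, dx \wedge dy \wedge dz, \\
dP \wedge Q &= z\, dx \wedge dy \wedge dx = 0, \\
(-)^1 P \wedge dQ &= -x\, dy \wedge dz \wedge dx = -x\, dx \wedge dy \wedge dz .
\end{aligned}
```

The two sides of (1.45) agree only because of the factor $(-)^p$ with $p = 1$; with a plus sign the right-hand side would come out as $+x\, dx \wedge dy \wedge dz$. One also sees $d(dP) = d(dx \wedge dy) = 0$ directly, since the coefficient of $dx \wedge dy$ is constant.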

This all looks like messing about, but it is easy to apply these things to electromagnetism, Yang-
Mills theory and general relativity. As of now, the word “metric” has not been mentioned. Forms,
their products and their exterior derivatives are all concepts which are independent of the metric.
We will need an object called the alternating tensor: This is an object $\varepsilon^{a_1 \ldots a_d}$ which is antisymmetric
under the interchange of any adjacent pair of indices. It has components

\[ \varepsilon^{a_1 \ldots a_d} = \frac{1}{\sqrt{|g|}} \begin{cases} +1 & (a_1 \ldots a_d) \text{ is an even permutation of } (1, \ldots, d) \\ -1 & (a_1 \ldots a_d) \text{ is an odd permutation of } (1, \ldots, d) \\ 0 & \text{otherwise.} \end{cases} \tag{1.46} \]
Here $g = \det g_{ab}$ for a metric $g_{ab}$. These form the components of a rank d tensor (proof provided
later). One can also form
\[ \varepsilon_{a_1 \ldots a_d} = g_{a_1 b_1} g_{a_2 b_2} \ldots g_{a_d b_d}\, \varepsilon^{b_1 \ldots b_d}; \tag{1.47} \]

this has components

\[ \varepsilon_{a_1 \ldots a_d} = (-)^t \sqrt{|g|} \begin{cases} +1 & (a_1 \ldots a_d) \text{ is an even permutation of } (1, \ldots, d) \\ -1 & (a_1 \ldots a_d) \text{ is an odd permutation of } (1, \ldots, d) \\ 0 & \text{otherwise.} \end{cases} \tag{1.48} \]
Here $t$ is the number of timelike directions, which may be different depending on the type of
geometry one is studying.
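
The relation between (1.46) and (1.48) can be checked for a diagonal metric. The sketch below is restricted to diagonal metrics as a simplifying assumption, labels indices from 0 rather than 1, and uses hypothetical function names; the sample metric is that of flat spacetime in cylindrical co-ordinates $(t, \rho, \theta, z)$ evaluated at $\rho = 2$.

```python
from math import prod, sqrt

def parity(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def alternating_symbol(indices):
    """+1 / -1 for even / odd permutations of (0, ..., d-1), and 0 otherwise."""
    if sorted(indices) != list(range(len(indices))):
        return 0
    return parity(indices)

def epsilon_upper(indices, diag_metric):
    """epsilon^{a_1...a_d} = symbol / sqrt|g| for a diagonal metric, cf. (1.46)."""
    return alternating_symbol(indices) / sqrt(abs(prod(diag_metric)))

def epsilon_lower(indices, diag_metric):
    """Lower every index with the diagonal metric, cf. (1.47)."""
    return prod(diag_metric[a] for a in indices) * epsilon_upper(indices, diag_metric)

rho = 2
cylindrical = [-1, 1, rho * rho, 1]   # diag(g_tt, g_rho rho, g_theta theta, g_zz), t = 1

upper_0123 = epsilon_upper((0, 1, 2, 3), cylindrical)
lower_0123 = epsilon_lower((0, 1, 2, 3), cylindrical)
lower_0213 = epsilon_lower((0, 2, 1, 3), cylindrical)
```

With one timelike direction, `lower_0123` comes out as $(-)^t \sqrt{|g|} = -\rho$, in agreement with (1.48).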
Pure mathematicians study almost exclusively Riemannian geometry - this is based on the
axiom that if the distance $ds$, as defined by the metric

\[ ds^2 = g_{ab}\, dx^a dx^b, \tag{1.49} \]

between two points is zero, then they are the same point.
This means that the metric $g$ is positive definite, with only positive eigenvalues. The signature is
$(+^d)$. This type of geometry is known in the physics literature, quite confusingly, as “Euclidean”.
It corresponds to $t = 0$.
We contrast this with what happens in general relativity, where one studies pseudo-Riemannian
geometry. Here $g$ is not positive-definite and $ds = 0$ defines how light rays propagate. Typically,
we have signature $(+^{d-1}, -)$, and $t = 1$. (The term spacetime means a manifold with such a
metric in the following.)

There is also Kleinian geometry, which is encountered in twistor theory (t = 3) or in the
F-theory approach to string theory (t = 2). Here one has a general signature $(+^p, -^t)$. One must
remember this when doing calculations with forms.
Lect. 3
Now we prove that $\varepsilon$ is indeed a tensor. That means that under a co-ordinate transformation

\[ x^a \to x'^{a'} = x'^{a'}(x^a), \qquad A^{a'}{}_a = \frac{\partial x'^{a'}}{\partial x^a}, \tag{1.50} \]
it must transform as
\[
\begin{aligned}
\varepsilon^{a'b'c' \ldots} &= A^{a'}{}_a A^{b'}{}_b A^{c'}{}_c \ldots \varepsilon^{abc \ldots} \\
\frac{1}{\sqrt{|g'|}}\, \eta^{a'b'c' \ldots} &= A^{a'}{}_a A^{b'}{}_b A^{c'}{}_c \ldots \frac{1}{\sqrt{|g|}}\, \eta^{abc \ldots},
\end{aligned} \tag{1.51}
\]

where we defined the alternating symbol (not a tensor!)

\[ \eta^{a_1 \ldots a_d} = \begin{cases} +1 & (a_1 \ldots a_d) \text{ is an even permutation of } (1, \ldots, d) \\ -1 & (a_1 \ldots a_d) \text{ is an odd permutation of } (1, \ldots, d) \\ 0 & \text{otherwise.} \end{cases} \tag{1.52} \]
Because of the symmetry, there is really only one equation that has to be satisfied. We multiply
the equation by $\eta^{a'b'c' \ldots}$ and sum over all indices:
\[ \sum_{a'b'c' \ldots} \frac{1}{\sqrt{|g'|}}\, \eta^{a'b'c' \ldots} \eta^{a'b'c' \ldots} = \sum_{a'b'c' \ldots} A^{a'}{}_a A^{b'}{}_b A^{c'}{}_c \ldots \frac{1}{\sqrt{|g|}}\, \eta^{a'b'c' \ldots} \eta^{abc \ldots}. \tag{1.53} \]

The sum on the left-hand side gives $d!$, on the right-hand side we have
\[ \sum_{a'b'c' \ldots} A^{a'}{}_a A^{b'}{}_b A^{c'}{}_c \ldots \eta^{a'b'c' \ldots} = \eta^{abc \ldots} \det A, \tag{1.54} \]

and the remaining summation over $a, b, c, \ldots$ gives

\[ (d!)\, \frac{1}{\sqrt{|g'|}} = (d!)\, \frac{1}{\sqrt{|g|}} \det A. \tag{1.55} \]
For that to be true we must have that under a co-ordinate transformation

\[ |g'| = |g| (\det A)^{-2}. \tag{1.56} \]

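Since the metric is a tensor, it transforms as

\[ g'_{a'b'} = \frac{\partial x^a}{\partial x'^{a'}} \frac{\partial x^b}{\partial x'^{b'}}\, g_{ab} = A_{a'}{}^a A_{b'}{}^b g_{ab}, \tag{1.57} \]
where $A_{a'}{}^a$ is the inverse of $A$. Take the determinant of this equation to get

\[ \det g' = \det(A^{-1} A^{-1} g) = (\det A)^{-2} \det g. \tag{1.58} \]

Putting this together leads to the conclusion that $\varepsilon$ really is tensorial. $\varepsilon$ has the following useful
properties:
\[ \varepsilon^{abcd \ldots} \varepsilon_{abcd \ldots} = (-)^t\, d!; \tag{1.59} \]
from that, you can derive other contractions, such as

\[
\begin{aligned}
\varepsilon^{abc \ldots de}\, \varepsilon_{abc \ldots df} &= (-)^t (d-1)!\, \delta^e{}_f, \\
\varepsilon^{ab \ldots c}\, \varepsilon_{pq \ldots r} &= (-)^t\, d!\, \delta^{[a}{}_{[p} \delta^b{}_q \ldots \delta^{c]}{}_{r]}.
\end{aligned} \tag{1.60}
\]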
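
The contractions (1.59) and the first line of (1.60) can be confirmed by direct summation. The sketch below assumes a diagonal metric with entries ±1, so that $\sqrt{|g|} = 1$ and integer arithmetic suffices; indices run from 0 and the function names are hypothetical.

```python
from itertools import product

def alternating_symbol(indices):
    """+1 / -1 for even / odd permutations of (0, ..., d-1), and 0 otherwise."""
    if sorted(indices) != list(range(len(indices))):
        return 0
    inversions = sum(1 for i in range(len(indices))
                     for j in range(i + 1, len(indices)) if indices[i] > indices[j])
    return -1 if inversions % 2 else 1

def eps_upper(indices):
    """epsilon^{...} when sqrt|g| = 1 (assumed: diagonal metric with entries +1 or -1)."""
    return alternating_symbol(indices)

def eps_lower(indices, diag_metric):
    """epsilon_{...}: lower each index with the diagonal metric."""
    value = alternating_symbol(indices)
    for a in indices:
        value *= diag_metric[a]
    return value

def full_contraction(diag_metric):
    """epsilon^{ab...} epsilon_{ab...}, summed over all d indices, cf. (1.59)."""
    d = len(diag_metric)
    return sum(eps_upper(idx) * eps_lower(idx, diag_metric)
               for idx in product(range(d), repeat=d))

def partial_contraction(diag_metric, e, f):
    """epsilon^{ab...e} epsilon_{ab...f}, summed over the first d-1 indices, cf. (1.60)."""
    d = len(diag_metric)
    return sum(eps_upper(idx + (e,)) * eps_lower(idx + (f,), diag_metric)
               for idx in product(range(d), repeat=d - 1))

minkowski = [-1, 1, 1, 1]   # d = 4, t = 1
euclid3 = [1, 1, 1]         # d = 3, t = 0
```

For the Minkowski metric the full contraction is $(-)^1\, 4! = -24$, and the partial contraction is $-3!\, \delta^e{}_f$.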

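Now we want to construct the dual of a differential form. We start off with a p-form $P$; its dual is
going to be $*P$, a (d − p)-form. We define this in terms of its components, in any basis: If
\[ P = \frac{1}{p!} P_{a_1 \ldots a_p}\, dx^{a_1} \wedge \ldots \wedge dx^{a_p}, \tag{1.61} \]
we define
\[ *P = \frac{1}{(d-p)!} (*P)_{a_1 \ldots a_{d-p}}\, dx^{a_1} \wedge \ldots \wedge dx^{a_{d-p}}, \tag{1.62} \]
where
\[ (*P)_{a_1 \ldots a_{d-p}} = \frac{1}{p!}\, \varepsilon_{a_1 \ldots a_{d-p}}{}^{b_1 \ldots b_p} P_{b_1 \ldots b_p}. \tag{1.63} \]
Note that we contract the last p indices; this is conventional. We can construct the double dual of
$P$, and find that its components are
\[
\begin{aligned}
(**P)_{c_1 \ldots c_p} &= \frac{1}{p!\,(d-p)!}\, \varepsilon_{c_1 \ldots c_p}{}^{a_1 \ldots a_{d-p}}\, \varepsilon_{a_1 \ldots a_{d-p}}{}^{b_1 \ldots b_p} P_{b_1 \ldots b_p} \\
  &= \frac{(-)^{p(d-p)}}{p!\,(d-p)!}\, \varepsilon_{a_1 \ldots a_{d-p} c_1 \ldots c_p}\, \varepsilon^{a_1 \ldots a_{d-p} b_1 \ldots b_p} P_{b_1 \ldots b_p} \\
  &= \frac{(-)^{p(d-p)}}{p!\,(d-p)!}\, (-)^t\, p!\,(d-p)!\; \delta^{[b_1}{}_{[c_1} \delta^{b_2}{}_{c_2} \ldots \delta^{b_p]}{}_{c_p]} P_{b_1 \ldots b_p} \\
  &= (-)^{p(d-p)} (-)^t P_{c_1 \ldots c_p}
\end{aligned} \tag{1.64}
\]

and hence we obtain
\[ **P = (-)^{p(d-p)+t} P. \tag{1.65} \]
This means that if $t$ is even, then
\[ **P = \begin{cases} -P & \text{if } d \text{ even and } p \text{ odd} \\ P & \text{otherwise;} \end{cases} \tag{1.66} \]
for odd $t$ it is the other way around.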
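
The sign in (1.65) is simple enough to tabulate. The sketch below, with a hypothetical function name, lists it for every form degree in four dimensions, once for Lorentzian signature ($t = 1$) and once for Riemannian signature ($t = 0$).

```python
def double_dual_sign(p, d, t):
    """Sign in **P = (-)^{p(d-p)+t} P, equation (1.65)."""
    return -1 if (p * (d - p) + t) % 2 else 1

lorentzian_4d = {p: double_dual_sign(p, d=4, t=1) for p in range(5)}
riemannian_4d = {p: double_dual_sign(p, d=4, t=0) for p in range(5)}
```

In particular, for two-forms in four dimensions $**F = -F$ when $t = 1$ and $**F = +F$ when $t = 0$.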

1.3 Electromagnetism and Yang-Mills Theory


Now we will find a use for forms. The simplest use for forms is Maxwell’s equations, where now
$d = 4$, $t = 1$. These are

\[ \nabla_{[a} F_{bc]} = 0 \;\Leftrightarrow\; \partial_{[a} F_{bc]} = 0; \qquad \nabla_a F^{ab} = -j^b. \tag{1.67} \]

We can rewrite this in terms of forms; this will make life easier:

\[ dF = 0, \qquad *d*F = -j. \tag{1.68} \]

We do this explicitly. $F$ is an antisymmetric tensor, the field strength. We can therefore construct
a two-form
\[ F = \frac{1}{2} F_{ab}\, dx^a \wedge dx^b; \tag{1.69} \]
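then (remember $d^2 \equiv 0$)
\[
\begin{aligned}
dF &= d \left( \frac{1}{2} F_{ab}\, dx^a \wedge dx^b \right) \\
   &= \frac{1}{2}\, dF_{ab} \wedge dx^a \wedge dx^b \\
   &= \frac{1}{2} \frac{\partial F_{ab}}{\partial x^c}\, dx^c \wedge dx^a \wedge dx^b \\
   &= \frac{1}{2} \left( \partial_{[c} F_{ab]} \right) dx^c \wedge dx^a \wedge dx^b = 0
\end{aligned} \tag{1.70}
\]
reproduces the first set of equations. We need to define a current one-form for the other half of
Maxwell’s equations:
\[ j = j_a\, dx^a. \tag{1.71} \]
Now work out $*d*F$:
\[ *F = \frac{1}{2} \left( \frac{1}{2}\, \varepsilon_{ab}{}^{cd} F_{cd} \right) dx^a \wedge dx^b. \tag{1.72} \]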
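We will “cheat” by using Riemann normal co-ordinates. In these co-ordinates,

\[ g \sim \eta, \qquad \Gamma \sim 0, \qquad \partial \Gamma \neq 0. \tag{1.73} \]

All quantities are tensorial, so the results will hold in general. In these co-ordinates $d\varepsilon = 0$; then
\[
\begin{aligned}
d{*F} &= \frac{1}{4}\, d \left( \varepsilon_{ab}{}^{cd} F_{cd}\, dx^a \wedge dx^b \right) \\
      &= \frac{1}{4}\, \varepsilon_{[ab}{}^{cd} \partial_{e]} F_{cd}\, dx^e \wedge dx^a \wedge dx^b.
\end{aligned} \tag{1.74}
\]

This is a three-form with components
\[ (d{*F})_{eab} = \frac{3}{2}\, \varepsilon_{[ab}{}^{cd} \partial_{e]} F_{cd}. \tag{1.75} \]
Then the components of $*d*F$ are
\[
\begin{aligned}
(*d*F)_p &= \frac{1}{6}\, \varepsilon_p{}^{eab}\, \frac{3}{2}\, \varepsilon_{[ab}{}^{cd} \partial_{e]} F_{cd} \\
  &= \frac{1}{4}\, \varepsilon^{ab}{}_p{}^e\, \varepsilon_{ab}{}^{cd}\, \partial_e F_{cd} \\
  &= -\delta^{[c|}{}_p\, g^{e|d]}\, \partial_e F_{cd} \\
  &= -\delta^c{}_p\, g^{ed}\, \partial_e F_{cd} = -\partial^d F_{pd} = \partial^d F_{dp}.
\end{aligned} \tag{1.76}
\]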

On the example sheet, you can do this with combinatorial factors and using Christoffel symbols for
a general metric.
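The simplest example is a current flowing through a wire in the z-direction. In cylindrical
co-ordinates, the metric is
\[ ds^2 = -dt^2 + d\rho^2 + \rho^2 d\theta^2 + dz^2, \tag{1.77} \]
so that $\det g = -\rho^2$. The current density only has a z component

\[ j_z = I \delta^{(2)}(\rho), \qquad j = I \delta^{(2)}(\rho)\, dz. \tag{1.78} \]

We need to figure out $F$. The only component of the electromagnetic field is $B_\theta(\rho)$. That is

\[ F_{\rho z} = -F_{z\rho} = -B_\theta. \tag{1.79} \]

The two-form will be

\[ F = \frac{1}{2} F_{ab}\, dx^a \wedge dx^b = \frac{1}{2} \left( F_{\rho z}\, d\rho \wedge dz + F_{z\rho}\, dz \wedge d\rho \right) = F_{\rho z}\, d\rho \wedge dz = -B_\theta\, d\rho \wedge dz. \tag{1.80} \]
Then automatically
\[ dF = -dB_\theta \wedge d\rho \wedge dz = -\frac{\partial B_\theta}{\partial \rho}\, d\rho \wedge d\rho \wedge dz = 0. \tag{1.81} \]
$*F$ has components
\[ (*F)_{ab} = \frac{1}{2}\, \varepsilon_{ab}{}^{cd} F_{cd} = \frac{1}{2}\, \varepsilon^{pqcd} F_{cd}\, g_{ap} g_{bq}, \tag{1.82} \]
the only component will be
\[ (*F)_{t\theta} = -(*F)_{\theta t} = \frac{1}{2} \left( \varepsilon^{t\theta\rho z} F_{\rho z}\, g_{tt} g_{\theta\theta} + \varepsilon^{t\theta z\rho} F_{z\rho}\, g_{tt} g_{\theta\theta} \right) = -\frac{1}{\rho} F_{\rho z} (-1) \rho^2 = -\rho B_\theta. \tag{1.83} \]
Then $*F = -\rho B_\theta\, dt \wedge d\theta$ and
\[ d{*F} = -d(\rho B_\theta) \wedge dt \wedge d\theta = -\frac{\partial (\rho B_\theta)}{\partial \rho}\, d\rho \wedge dt \wedge d\theta. \tag{1.84} \]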
Lect. 4
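The only component of $*d*F$ is
\[ *d*F = \varepsilon_{z\rho t\theta}\, (d{*F})^{\rho t\theta}\, dz = \rho\, g^{\rho\rho} g^{tt} g^{\theta\theta} \left( -\frac{\partial}{\partial \rho} (\rho B_\theta) \right) dz = \frac{1}{\rho} \frac{\partial}{\partial \rho} (\rho B_\theta)\, dz. \tag{1.85} \]

Maxwell’s equations are
\[ \frac{1}{\rho} \frac{\partial}{\partial \rho} (\rho B_\theta) = -I \delta^{(2)}(\rho), \tag{1.86} \]
which gives the obvious result. So that is how you do electromagnetism.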
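
For completeness, here is a sketch of how (1.86) is solved. It assumes that $\delta^{(2)}$ is normalised with respect to the area element $\rho\, d\rho\, d\theta$ of the plane transverse to the wire, which the notes do not state explicitly; the integration constant $C$ is introduced only for this example.

```latex
\begin{aligned}
&\rho > 0: \quad \frac{\partial}{\partial \rho} (\rho B_\theta) = 0
  \;\Rightarrow\; B_\theta = \frac{C}{\rho}, \\
&\text{and, using the two-dimensional identity }
  \frac{1}{\rho} \frac{\partial}{\partial \rho} \left( \rho \cdot \frac{1}{\rho} \right) = 2\pi\, \delta^{(2)}(\rho): \\
&2\pi C\, \delta^{(2)}(\rho) = -I\, \delta^{(2)}(\rho)
  \;\Rightarrow\; B_\theta(\rho) = -\frac{I}{2\pi \rho}.
\end{aligned}
```

The magnitude $I / (2\pi\rho)$ is the familiar field of a straight wire; the overall sign simply follows from the conventions adopted in (1.67) and (1.79).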

Next consider a generalisation of electromagnetism, developed by Yang and Mills in 1954, and
earlier (1952) by R. Shaw of the University of Hull.
Normally, a one-form $A$ is
\[ A = A_a\, dx^a \tag{1.87} \]
with functions $A_a$. But there is no requirement that $A_a$ should be real-valued functions; they could
be elements of a Lie algebra.
Take some Lie group $G$. There will be a set of generators in the adjoint representation $\{T_\alpha\}$.
The Cartan metric on the Lie algebra of $G$ is

\[ \eta_{\alpha\beta} = -2\, \mathrm{Tr}(T_\alpha T_\beta). \tag{1.88} \]

There will be compact and non-compact directions in general. Compact directions will be represented
by anti-Hermitian generators for which $\eta_{\alpha\beta} = +1$; non-compact directions will be represented
by Hermitian generators for which $\eta_{\alpha\beta} = -1$. This might fit in more with the mathematics than
the physics literature; that is simply too bad. For physical Yang-Mills theories, $G$ is compact as
required to make a unitary quantum field theory.
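
As an illustration of (1.88), the sketch below evaluates $-2\,\mathrm{Tr}(T_\alpha T_\beta)$ for $SU(2)$. It assumes the anti-Hermitian generators $T_\alpha = -\tfrac{i}{2}\sigma_\alpha$ built from the Pauli matrices, i.e. the two-dimensional representation rather than the adjoint one mentioned in the text; this choice is made here only because the factor $-2$ then normalises the metric to $\delta_{\alpha\beta}$. All helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Assumed normalisation: anti-Hermitian generators T_alpha = -i sigma_alpha / 2.
generators = [[[-0.5j * entry for entry in row] for row in sigma] for sigma in pauli]

def cartan_metric(gens):
    """eta_{alpha beta} = -2 Tr(T_alpha T_beta), equation (1.88)."""
    return [[-2 * trace(matmul(ta, tb)) for tb in gens] for ta in gens]

eta = cartan_metric(generators)
t1_t2 = commutator(generators[0], generators[1])
```

All three directions are compact, so `eta` is the identity matrix, and the commutator $[T_1, T_2] = T_3$ shows that in this normalisation the structure constants are the alternating symbol.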
The metric can then be used to raise or lower indices in the Lie algebra.
The group can be specified by the commutation relations

\[ [T_\alpha, T_\beta] = c_{\alpha\beta}{}^\gamma T_\gamma, \tag{1.89} \]

where $c_{\alpha\beta}{}^\gamma$ are structure constants of the Lie algebra. Then we define

\[ A_a = A^\alpha_a T_\alpha, \tag{1.90} \]

where $A^\alpha_a$ are components of the gauge field in question, and $A$ is a Lie algebra valued one-form.
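This generalises the vector potential of electromagnetism. We need to find the analogue of the field
strength. In electromagnetism, the field strength is invariant under gauge transformations. This
requirement is too strong in Yang-Mills theory. We define

\[ F = dA + gA \wedge A, \tag{1.91} \]

which is now a Lie algebra valued two-form, and $g$ is a coupling constant that one introduces in
particle physics. In the mathematics literature, one sets $g = 1$. In the field theory world, this is
written out in terms of components:
\[
\begin{aligned}
\frac{1}{2} F^\alpha_{ab}\, dx^a \wedge dx^b\, T_\alpha
  &= d(A^\alpha_a T_\alpha) \wedge dx^a + g A^\alpha_a A^\beta_b\, dx^a \wedge dx^b\, T_\alpha T_\beta \\
  &= (dA^\alpha_a \wedge dx^a)\, T_\alpha + \frac{1}{2}\, g A^\alpha_a A^\beta_b\, dx^a \wedge dx^b\, [T_\alpha, T_\beta] \\
  &= \partial_b A^\alpha_a\, dx^b \wedge dx^a\, T_\alpha + \frac{1}{2}\, g A^\alpha_a A^\beta_b\, dx^a \wedge dx^b\, [T_\alpha, T_\beta] \\
  &= \frac{1}{2} \left( \partial_a A^\alpha_b - \partial_b A^\alpha_a \right) dx^a \wedge dx^b\, T_\alpha + \frac{1}{2} \left( g A^\alpha_a A^\beta_b\, c_{\alpha\beta}{}^\gamma \right) dx^a \wedge dx^b\, T_\gamma,
\end{aligned} \tag{1.92}
\]

so
\[ F^\alpha_{ab} = \partial_a A^\alpha_b - \partial_b A^\alpha_a + g A^\beta_a A^\gamma_b\, c_{\beta\gamma}{}^\alpha. \tag{1.93} \]
It is often simpler to do abstract calculations using forms.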
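
The component formula (1.93) can be evaluated directly at a point. The sketch below assumes structure constants $c_{\beta\gamma}{}^\alpha = \epsilon_{\beta\gamma\alpha}$ (the $SU(2)$ case in a suitable normalisation), two spacetime directions, and constant potentials; the array layout `A[alpha][a]` and `dA[alpha][a][b]` and the function names are hypothetical.

```python
def levi_civita_3(i, j, k):
    """Alternating symbol on indices 0, 1, 2, used here as structure constants."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(A, dA, g, structure=levi_civita_3):
    """F^alpha_{ab} = d_a A^alpha_b - d_b A^alpha_a + g A^beta_a A^gamma_b c_{beta gamma}^alpha.

    A[alpha][a] holds the gauge field components at one point;
    dA[alpha][a][b] holds the partial derivative d_a A^alpha_b there.
    """
    n_gen, n_dim = len(A), len(A[0])
    F = [[[0] * n_dim for _ in range(n_dim)] for _ in range(n_gen)]
    for alpha in range(n_gen):
        for a in range(n_dim):
            for b in range(n_dim):
                abelian = dA[alpha][a][b] - dA[alpha][b][a]
                non_abelian = sum(
                    A[beta][a] * A[gamma][b] * structure(beta, gamma, alpha)
                    for beta in range(n_gen) for gamma in range(n_gen))
                F[alpha][a][b] = abelian + g * non_abelian
    return F

# Constant potentials: A^1_x = 1 and A^2_y = 1, all derivatives zero.
A = [[1, 0], [0, 1], [0, 0]]
dA = [[[0, 0], [0, 0]] for _ in range(3)]

F_coupled = field_strength(A, dA, g=2)
F_free = field_strength(A, dA, g=0)
```

Unlike in electromagnetism, constant potentials can carry field strength: here $F^3_{xy} = g$, while setting $g = 0$ gives $F = 0$.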
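In electromagnetism, since $F = dA$, one has automatically $dF = 0$. In Yang-Mills theory,

\[ D_A F = 0, \tag{1.94} \]

where $D_A$ is a gauge covariant derivative defined by

\[ D_A F = dF + g[A, F], \tag{1.95} \]

where the commutator of a p-form $P$ and a q-form $Q$ is defined by

\[ [P, Q] = \begin{cases} P \wedge Q - Q \wedge P & \text{if either or both of } P \text{ and } Q \text{ are even,} \\ P \wedge Q + Q \wedge P & \text{if } P \text{ and } Q \text{ are both odd.} \end{cases} \tag{1.96} \]
Substitute this in $F = dA + gA \wedge A$ to discover that $D_A F = 0$ (Bianchi identity):

\[
\begin{aligned}
D_A F &= d(dA + gA \wedge A) + g \left( A \wedge (dA + gA \wedge A) - (dA + gA \wedge A) \wedge A \right) \\
  &= g\, dA \wedge A - gA \wedge dA + gA \wedge dA + g^2 A \wedge A \wedge A - g\, dA \wedge A - g^2 A \wedge A \wedge A \\
  &= 0.
\end{aligned} \tag{1.97}
\]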

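Let us generalise gauge transformations: In electromagnetism, these are

\[ A \to A + d\epsilon, \qquad F \to F. \tag{1.98} \]

Here
\[ A \to A + D_A \epsilon = A + d\epsilon + g[A, \epsilon]. \tag{1.99} \]
Then the infinitesimal change in $F$ is

\[
\begin{aligned}
\delta F &= d\,\delta A + g\, \delta A \wedge A + gA \wedge \delta A \\
  &= d(d\epsilon + gA\epsilon - g\epsilon A) + g(d\epsilon + gA\epsilon - g\epsilon A) \wedge A + gA \wedge (d\epsilon + gA\epsilon - g\epsilon A) \\
  &= g\, dA\, \epsilon - gA \wedge d\epsilon - g\, d\epsilon \wedge A - g\epsilon\, dA + g\, d\epsilon \wedge A + g^2 A\epsilon \wedge A - g^2 \epsilon A \wedge A + gA \wedge d\epsilon \\
  &\quad + g^2 A \wedge A\epsilon - g^2 A \wedge \epsilon A \\
  &= g(dA + gA \wedge A)\epsilon - g\epsilon(dA + gA \wedge A) \\
  &= g[F, \epsilon].
\end{aligned} \tag{1.100}
\]

So $F$ transforms covariantly under gauge transformations, i.e. depends only on $\epsilon$ and not $d\epsilon$. That
should remind us of something, namely curvature.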
In general relativity, under a co-ordinate transformation
\[ g_{ab} \to g_{a'b'} = A_{a'}{}^a A_{b'}{}^b g_{ab}, \qquad \Gamma^a{}_{bc} \to \Gamma^{a'}{}_{b'c'} = A^{a'}{}_a A_{b'}{}^b A_{c'}{}^c\, \Gamma^a{}_{bc} + \ldots, \tag{1.101} \]

where the remaining terms contain derivatives of $A$. The Riemann tensor $R^a{}_{bcd}$ contains derivatives
of $\Gamma$ and squared $\Gamma$ terms, so one would expect second derivatives of $A$ or squared first derivatives
to appear. But
\[ R^{a'}{}_{b'c'd'} = A^{a'}{}_a A_{b'}{}^b A_{c'}{}^c A_{d'}{}^d\, R^a{}_{bcd} \tag{1.102} \]

with no such $\partial A$, $\partial\partial A$ terms. $F$ in Yang-Mills theory has the same property; that is not a
coincidence. In general relativity,
\[ [\nabla_a \nabla_b - \nabla_b \nabla_a] V_c = R_{abc}{}^d V_d. \tag{1.103} \]
The curvature is the commutator of two covariant derivatives. The same is true in Yang-Mills
theory (see later).
We first go back to Maxwell’s equations; the other half of these equations is (in vacuum)

\[ d{*F} = 0. \tag{1.104} \]

The obvious generalisation of this is the Yang-Mills equation

\[ D_A (*F) = 0, \tag{1.105} \]

that is in components,
\[ \nabla_a F^{ab\alpha} + \frac{1}{6}\, g\, c_{\beta\gamma}{}^\alpha A^\beta_c F^\gamma_{de}\, \varepsilon^{bcde} = 0. \tag{1.106} \]
Lect. 5
Let us now calculate $D_A D_A X$, where $X$ is a p-form in the adjoint representation of $G$. Then
$Y = D_A X$ is a (p + 1)-form, so we have

\[ D_A Y = dY + gA \wedge Y + g(-)^p\, Y \wedge A, \qquad Y = D_A X = dX + gA \wedge X + g(-)^{p+1} X \wedge A. \tag{1.107} \]

Then

\[
\begin{aligned}
D_A D_A X &= d \left( dX + gA \wedge X + g(-)^{p+1} X \wedge A \right) + gA \wedge \left( dX + gA \wedge X + g(-)^{p+1} X \wedge A \right) \\
  &\quad + g(-)^p \left( dX + gA \wedge X + g(-)^{p+1} X \wedge A \right) \wedge A \\
  &= g\, dA \wedge X - gA \wedge dX + g(-)^{p+1} dX \wedge A - gX \wedge dA + gA \wedge dX + g^2 A \wedge A \wedge X \\
  &\quad + g^2 (-)^{p+1} A \wedge X \wedge A + g(-)^p\, dX \wedge A + g^2 (-)^p A \wedge X \wedge A - g^2 X \wedge A \wedge A \\
  &= g(dA + gA \wedge A) \wedge X - gX \wedge (dA + gA \wedge A) \\
  &= g[F, X]
\end{aligned} \tag{1.108}
\]

since $F$ is a two-form.
That is exactly what you would expect from a curvature. $F$ is often called the curvature form in
mathematics (or Yang-Mills field strength in physics). So $F$ must be the curvature of something,
so you should think of $A$ as being a connection one-form (in the mathematics world).

2 Connections and General Relativity


2.1 Vielbein Formalism
You should wonder whether the same ideas work in general relativity. In general relativity, everything
involves just the metric tensor $g_{ab}$. All of the geometry of spacetime will be encoded into a
line element
\[ ds^2 = g_{ab}\, dx^a dx^b. \tag{2.1} \]

We try to extend this idea: $g_{ab}$ is a d-dimensional metric with $t$ timelike directions. That means in
practice that you can always construct normal co-ordinates such that
\[ \eta \sim \left( (-)^t, (+)^{d-t} \right). \tag{2.2} \]

One can only do this at a point. But as $\eta$ describes the tangent space of the manifold, we can
rewrite the metric as
\[ g_{ab} = e_a{}^\mu e_b{}^\nu \eta_{\mu\nu}, \qquad \eta_{\mu\nu} = \mathrm{diag} \left( (-)^t, (+)^{d-t} \right). \tag{2.3} \]
The objects $e_a{}^\mu$ are called vierbein or vielbein fields in general relativity, or frame fields in the
mathematics world. It is not entirely obvious that you can always do this construction. At each
point, $g$ is a symmetric matrix, so one can diagonalise it:
\[ g = O^T D O, \qquad D = \sum_i \lambda_i (f_i \otimes f_i), \qquad f_i = (0, \ldots, \underbrace{1}_{i\text{th place}}, 0, \ldots, 0). \tag{2.4} \]

There will be d non-zero eigenvalues $\lambda_i$, of which $t$ will be negative and $d - t$ will be positive. Then
by rescaling the eigenvectors, it should be clear that one can get $g$ to the above form.
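But while $g_{ab}$ has $\frac{1}{2} d(d+1)$ components, $e_a{}^\mu$ has $d^2$ components, so many more. But Lorentz
transformations
\[ V^\mu \to V^\nu = \Lambda^\nu{}_\mu V^\mu \tag{2.5} \]
preserve the Lorentz metric:
\[ \Lambda^T \eta \Lambda = \eta, \qquad \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma \eta_{\mu\nu} = \eta_{\rho\sigma}. \tag{2.6} \]
In the general case, $\Lambda \in SO(d-t, t)$, and it is often useful to restrict attention to the component
connected to the identity. One would not call this a Lorentz transformation, but a generalised
rotation.
Under a (local) transformation of the frame fields,

\[ e_a{}^\mu \to \tilde{e}_a{}^\mu = \Lambda^\mu{}_\nu(x)\, e_a{}^\nu, \tag{2.7} \]

the metric is left invariant:

\[ g_{ab} \to \tilde{g}_{ab} = \tilde{e}_a{}^\mu \tilde{e}_b{}^\nu \eta_{\mu\nu} = \Lambda^\mu{}_\rho(x)\, e_a{}^\rho\, \Lambda^\nu{}_\sigma(x)\, e_b{}^\sigma\, \eta_{\mu\nu} = e_a{}^\rho e_b{}^\sigma \eta_{\rho\sigma} = g_{ab}. \tag{2.8} \]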

You have found a new local invariance. We have enlarged the symmetry of general relativity (or
. . . ) to be

a) general co-ordinate transformations,

b) local generalised rotations.
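
The invariance of the metric under a frame rotation can be checked numerically at a single point. The sketch below assumes $d = 4$, $t = 1$, the diagonal frame $e_a{}^\mu = \mathrm{diag}(1, 1, \rho, 1)$ for flat spacetime in cylindrical co-ordinates $(t, \rho, \theta, z)$, and a boost mixing the first two frame directions; the storage convention `e[a][mu]` and the function names are hypothetical.

```python
from math import cosh, sinh

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta_{mu nu} for t = 1, d = 4

def metric_from_vielbein(e):
    """g_ab = e_a^mu e_b^nu eta_{mu nu}, with the frame stored as e[a][mu]."""
    d = len(e)
    return [[sum(e[a][m] * ETA[m] * e[b][m] for m in range(d)) for b in range(d)]
            for a in range(d)]

def cylindrical_vielbein(rho):
    """Diagonal frame for ds^2 = -dt^2 + drho^2 + rho^2 dtheta^2 + dz^2."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, rho, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_01(rapidity):
    """Lambda^mu_nu for a boost mixing frame directions 0 and 1."""
    c, s = cosh(rapidity), sinh(rapidity)
    return [[c, s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotate_frame(e, lam):
    """New frame e~_a^mu = Lambda^mu_nu e_a^nu, as in (2.7)."""
    d = len(e)
    return [[sum(lam[m][n] * e[a][n] for n in range(d)) for m in range(d)]
            for a in range(d)]

rho = 2.0
frame = cylindrical_vielbein(rho)
g_before = metric_from_vielbein(frame)
# The rapidity may depend on position; 0.3 * rho is an arbitrary choice.
g_after = metric_from_vielbein(rotate_frame(frame, boost_01(0.3 * rho)))
```

`g_before` reproduces $\mathrm{diag}(-1, 1, \rho^2, 1)$, and `g_after` agrees with it up to rounding even though the frame itself has changed.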

One needs the frame fields to describe fermions in general relativity. Greek indices $\mu, \nu, \ldots$ are
tangent space indices (Lorentz indices), Latin indices $a, b, c, \ldots$ are spacetime indices.
$g_{ab}$, $g^{ab}$ can raise and lower spacetime indices; $\eta_{\mu\nu}$, $\eta^{\mu\nu}$ can raise and lower tangent space indices:

\[ e_{a\mu} = \eta_{\mu\nu}\, e_a{}^\nu, \qquad e_a{}^\mu = \eta^{\mu\nu} e_{a\nu}, \quad \text{etc.} \tag{2.9} \]

We can now write
\[ g_{ab} = e_a{}^\mu e_b{}^\nu \eta_{\mu\nu} = e_a{}^\mu e_{b\mu}. \tag{2.10} \]
The objects $e_a{}^\mu$ can also be used to convert spacetime vectors (tensors) into tangent space vectors
(tensors):
\[ V^a \to V^\mu = e_a{}^\mu V^a, \qquad V^a = e^a{}_\mu V^\mu, \tag{2.11} \]
and indeed
\[ e^a{}_\mu V^\mu = e^a{}_\mu\, e_b{}^\mu V^b = \delta^a{}_b V^b = V^a. \tag{2.12} \]
This works similarly for general tensors of type $(r, s)$.
The next thing is some idea of a derivative: A covariant derivative is

\[ V^a \to \nabla_b V^a, \tag{2.13} \]
such that under a co-ordinate transformation, if $V^{a'} = A^{a'}{}_a V^a$,
\[ \nabla_b V^a \to \nabla_{b'} V^{a'} = A_{b'}{}^b A^{a'}{}_a \nabla_b V^a \tag{2.14} \]

with no derivatives of $A$, which are cancelled by the usual Christoffel symbols. What is the covariant
derivative of $e_a{}^\mu$? It should be a $(0, 2)$ spacetime tensor, and a Lorentz vector. Under a Lorentz
transformation $e \to e\Lambda$, one will normally get $\partial e \sim (\partial e)\Lambda + e\, \partial\Lambda$, so we need to add an extra term:

\[ \nabla_b e_a{}^\mu = \partial_b e_a{}^\mu - \Gamma_b{}^c{}_a\, e_c{}^\mu + \omega_b{}^\mu{}_\sigma\, e_a{}^\sigma, \tag{2.15} \]

where $\omega_b{}^\mu{}_\sigma$ is the spin connection. The spin connection is needed to absorb the terms involving
$\partial\Lambda$ if one performs a Lorentz transformation.
In Riemannian geometry and general relativity, one is accustomed to making a certain choice of
connection, such that
\[ \nabla_a g_{bc} = 0. \tag{2.16} \]
One wants to make an analogous choice for frame fields, which is consistent with it. The simplest
way to arrange this is to make
\[ \nabla_b e_a{}^\mu = 0, \qquad \nabla_a \eta_{\mu\nu} = 0. \tag{2.17} \]
We can turn
\[ \partial_b e_a{}^\mu - \Gamma_b{}^c{}_a\, e_c{}^\mu + \omega_b{}^\mu{}_\sigma\, e_a{}^\sigma = 0 \tag{2.18} \]
into an expression for the spin connection by multiplying by $e^a{}_\lambda$:

\[ \omega_b{}^\mu{}_\lambda = \omega_b{}^\mu{}_\sigma\, e_a{}^\sigma e^a{}_\lambda = -e^a{}_\lambda\, \partial_b e_a{}^\mu + \Gamma_b{}^c{}_a\, e_c{}^\mu e^a{}_\lambda. \tag{2.19} \]

We can regard this as a definition of the spin connection (almost). This definition of the spin
connection contains more information than $\Gamma$, so $\omega$ and $\Gamma$ are not equivalent. Remember that a
metric connection consists of two pieces:

\[ \Gamma_b{}^c{}_a = \Gamma_{(b}{}^c{}_{a)} + \Gamma_{[b}{}^c{}_{a]}, \tag{2.20} \]

where the symmetric part is given by the Christoffel symbols and the antisymmetric part defines
the torsion:
\[ T_b{}^c{}_a = 2\Gamma_{[b}{}^c{}_{a]}. \tag{2.21} \]

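Lect. 6
We have not yet looked at $\nabla_a \eta_{\mu\nu} = 0$.
You can obtain an equation analogous to the one above by writing out $\nabla_b e^a{}_\nu = 0$:

\[ \partial_b e^a{}_\nu + \Gamma_b{}^a{}_c\, e^c{}_\nu + \omega_{b\nu\sigma}\, e^{a\sigma} = 0. \tag{2.22} \]

Then multiply this by $e_{a\lambda}$ to get

\[ \omega_{b\nu\lambda} = -e_{a\lambda}\, \partial_b e^a{}_\nu - e_{a\lambda}\, \Gamma_b{}^a{}_c\, e^c{}_\nu. \tag{2.23} \]

This definition is equivalent to the one above.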


You see that calculations like these are rather messy. Cartan called this a “debauch of indices”.
The point of using forms is to get rid of the indices. We still need to look at

\[ \nabla_a \eta_{\mu\nu} = 0. \tag{2.24} \]

This gives
\[ 0 \stackrel{!}{=} \nabla_a \eta_{\mu\nu} = \partial_a \eta_{\mu\nu} + \omega_{a\mu}{}^\sigma \eta_{\sigma\nu} + \omega_{a\nu}{}^\sigma \eta_{\mu\sigma} = \omega_{a\mu\nu} + \omega_{a\nu\mu}. \tag{2.25} \]
Hence a spin connection that is metric is antisymmetric on its Lorentz indices.
So this is how a spin connection is defined, but you really do not want to do it this way in practice.
Let us start again, remembering that a connection can have torsion as well as curvature. We
demanded that
\[ 0 = \nabla_a g_{bc} = \partial_a g_{bc} - \Gamma_a{}^d{}_b\, g_{dc} - \Gamma_a{}^d{}_c\, g_{bd}. \tag{2.26} \]
For a symmetric connection, you can solve this in terms of $\Gamma$ and discover that a symmetric metric
connection is unique. No such luck for us!
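Let us try to repeat the usual calculation with nonvanishing torsion. One starts with

\[ 0 = \nabla_a g_{bc} + \nabla_b g_{ca} - \nabla_c g_{ab}, \tag{2.27} \]

which gives

\[
\begin{aligned}
\partial_a g_{bc} + \partial_b g_{ca} - \partial_c g_{ab}
  &= \Gamma_a{}^d{}_b\, g_{dc} + \Gamma_a{}^d{}_c\, g_{bd} + \Gamma_b{}^d{}_c\, g_{da} + \Gamma_b{}^d{}_a\, g_{cd} - \Gamma_c{}^d{}_a\, g_{db} - \Gamma_c{}^d{}_b\, g_{ad} \\
  &= 2\Gamma_{(a}{}^d{}_{b)}\, g_{dc} + 2\Gamma_{[a}{}^d{}_{c]}\, g_{bd} + 2\Gamma_{[b}{}^d{}_{c]}\, g_{da} \\
  &= 2\Gamma_{(a}{}^d{}_{b)}\, g_{dc} + T_{[a}{}^d{}_{c]}\, g_{bd} + T_{[b}{}^d{}_{c]}\, g_{da}.
\end{aligned} \tag{2.28}
\]

If the torsion vanishes, you can get what you are used to.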
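Let us recall a formula for the curvature:

\[ R^a{}_{bcd} = \partial_c \Gamma_d{}^a{}_b - \partial_d \Gamma_c{}^a{}_b + \Gamma_c{}^a{}_e \Gamma_d{}^e{}_b - \Gamma_d{}^a{}_e \Gamma_c{}^e{}_b \tag{2.29} \]

or equivalently, by commuting covariant derivatives

\[ (\nabla_c \nabla_d - \nabla_d \nabla_c) Z^a = R^a{}_{bcd} Z^b - T_c{}^e{}_d \nabla_e Z^a. \tag{2.30} \]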

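We have another connection which will have a curvature too:

\[
\begin{aligned}
(\nabla_c \nabla_d - \nabla_d \nabla_c) V^\mu
  &= \nabla_c (\partial_d V^\mu + \omega_d{}^\mu{}_\sigma V^\sigma) - \nabla_d (\partial_c V^\mu + \omega_c{}^\mu{}_\sigma V^\sigma) \\
  &= \partial_c \partial_d V^\mu + (\partial_c \omega_d{}^\mu{}_\sigma) V^\sigma + \omega_d{}^\mu{}_\sigma \partial_c V^\sigma - \Gamma_c{}^e{}_d (\partial_e V^\mu + \omega_e{}^\mu{}_\sigma V^\sigma) \\
  &\quad + \omega_c{}^\mu{}_\lambda (\partial_d V^\lambda + \omega_d{}^\lambda{}_\sigma V^\sigma) - (c \leftrightarrow d),
\end{aligned} \tag{2.31}
\]

we would like this to be a spin curvature and a torsion term. We look at the terms involving $V$
and no derivatives of $V$, and identify

\[ R_{cd}{}^\mu{}_\sigma = \partial_c \omega_d{}^\mu{}_\sigma - \partial_d \omega_c{}^\mu{}_\sigma + \omega_c{}^\mu{}_\lambda \omega_d{}^\lambda{}_\sigma - \omega_d{}^\mu{}_\lambda \omega_c{}^\lambda{}_\sigma \tag{2.32} \]

as the curvature of the spin connection. The remaining terms are

\[ (-\Gamma_c{}^e{}_d + \Gamma_d{}^e{}_c)(\partial_e V^\mu + \omega_e{}^\mu{}_\sigma V^\sigma) = -T_c{}^e{}_d \nabla_e V^\mu. \tag{2.33} \]

We obtain the same form as before. Manipulations on Lorentz indices are analogous to manipulations
on spacetime indices. Another fact which is almost miraculous is

\[ R_{ab}{}^\mu{}_\nu = e_c{}^\mu\, e^d{}_\nu\, R_{ab}{}^c{}_d, \tag{2.34} \]

where the term on the left-hand side is the curvature from the spin connection, and the Riemann
tensor on the right-hand side is the curvature from the $\Gamma$ connection. This is true including torsion,
but not for a non-metric connection.
This is not entirely obvious. If you like, you can prove it explicitly using a metric connection.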

2.2 Form Notation


Everybody who has ever calculated Rab c d explicitly knows that it is a nightmare. All these expres-
sions look much easier when written in terms of forms.
We start with a basis of (“pseudo-orthonormal”) one-forms

E µ = ea µ dxa . (2.35)

This is enough to specify the metric by

ηµν E µ ⊗ E ν = ηµν ea µ eb ν dxa ⊗ dxb . (2.36)

Since {E µ } form a basis, one can use them in any practical calculation. There is, in addition, a
connection one-form built from the spin connection

ωµν = −ωνµ = ωaµν dxa . (2.37)

We can define a torsion two-form
\[ T^\lambda = \tfrac{1}{2}\, e_a{}^\lambda\, T_b{}^a{}_c\, dx^b \wedge dx^c . \tag{2.38} \]
Lastly, there is a curvature two-form
\[ R^\mu{}_\nu = \tfrac{1}{2}\, R^\mu{}_{\nu cd}\, dx^c \wedge dx^d . \tag{2.39} \]
These forms contain all the information you could possibly want. Now we will translate everything
into this language. No sane person, after they have seen this, will do calculations any other way.

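We discover that
\[ \begin{aligned}
dE^\mu + \omega^\mu{}_\nu \wedge E^\nu &= d(e_a{}^\mu\, dx^a) + \omega_a{}^\mu{}_\nu\, dx^a \wedge e_b{}^\nu\, dx^b \\
&= (\partial_b e_a{}^\mu)\, dx^b \wedge dx^a + \omega_a{}^\mu{}_\nu\, e_b{}^\nu\, dx^a \wedge dx^b \\
&= \tfrac{1}{2}\left(\partial_a e_b{}^\mu - \partial_b e_a{}^\mu + \omega_a{}^\mu{}_\nu\, e_b{}^\nu - \omega_b{}^\mu{}_\nu\, e_a{}^\nu\right) dx^a \wedge dx^b \\
&= \tfrac{1}{2}\left(\Gamma_a{}^c{}_b\, e_c{}^\mu - \omega_a{}^\mu{}_\sigma\, e_b{}^\sigma - \Gamma_b{}^c{}_a\, e_c{}^\mu + \omega_b{}^\mu{}_\sigma\, e_a{}^\sigma + \omega_a{}^\mu{}_\nu\, e_b{}^\nu - \omega_b{}^\mu{}_\nu\, e_a{}^\nu\right) dx^a \wedge dx^b \\
&= \tfrac{1}{2}\, T_a{}^c{}_b\, e_c{}^\mu\, dx^a \wedge dx^b = T^\mu ,
\end{aligned} \tag{2.40} \]
where we have used
\[ \partial_b e_a{}^\nu = \Gamma_b{}^c{}_a\, e_c{}^\nu - \omega_b{}^\nu{}_\sigma\, e_a{}^\sigma . \tag{2.41} \]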
This is Cartan’s first equation of structure:

dE µ + ω µ ν ∧ E ν = T µ . (2.42)

What about the curvature? You can substitute in the components to see that Cartan’s second
equation of structure holds:
dω λσ + ω λ ν ∧ ω νσ = Rλσ . (2.43)
Lect. 7
The second equation of structure is very similar to Yang-Mills theory, where F = dA + A ∧ A,
except that

F, A take values in the adjoint representation of a gauge group,
(2.44)
R, ω take values in the Lorentz group (or whatever stands in for it).

However, in Yang-Mills theory there is no analogue of torsion T or vielbeins E. This leads to


problems if you try to interpret general relativity as a Yang-Mills theory for the Lorentz group.
You could write something like
R = Dω ω, (2.45)
but we will not use this notation.
Let us look at the Bianchi identities. The first identity is obtained by taking d of Cartan’s first
equation of structure:

dT µ = d(dE µ + ω µ ν ∧ E ν )
= dω µ ν ∧ E ν − ω µ ν ∧ dE ν
= (Rµ ν − ω µ ρ ∧ ω ρ ν ) ∧ E ν − ω µ ν ∧ (T ν − ω ν σ ∧ E σ )
= Rµ ν ∧ E ν − ω µ ν ∧ T ν . (2.46)

For vanishing torsion, as in general relativity, one has

Rλ µ ∧ E µ = 0, (2.47)

which corresponds to the usual Ra [bcd] = 0.
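There is a second Bianchi identity: take d of the definition of curvature.
\[ \begin{aligned}
dR^\lambda{}_\mu &= d\left(d\omega^\lambda{}_\mu + \omega^\lambda{}_\rho \wedge \omega^\rho{}_\mu\right) \\
&= d\omega^\lambda{}_\rho \wedge \omega^\rho{}_\mu - \omega^\lambda{}_\rho \wedge d\omega^\rho{}_\mu \\
&= \left(R^\lambda{}_\rho - \omega^\lambda{}_\nu \wedge \omega^\nu{}_\rho\right) \wedge \omega^\rho{}_\mu - \omega^\lambda{}_\rho \wedge \left(R^\rho{}_\mu - \omega^\rho{}_\nu \wedge \omega^\nu{}_\mu\right) \\
&= R^\lambda{}_\rho \wedge \omega^\rho{}_\mu - \omega^\lambda{}_\rho \wedge R^\rho{}_\mu .
\end{aligned} \tag{2.48} \]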

We could write this as


dR = [R, ω], (2.49)
which is again very similar to Yang-Mills theory. In components, this is ∇[a Rλ |µ|bc] = 0.

2.3 Explicit Example


If one wants to evaluate the curvature and use it for something, then using this formalism is
relatively easy. In general relativity, the torsion vanishes, T µ = 0 (typically). Then you can use

0 = dE µ + ω µ ν ∧ E ν , ωµν = −ωνµ (2.50)

to find a metric connection ω µ ν for a given (pseudo-)orthonormal basis of one-forms E µ . You can
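expand the two-form $dE^\lambda$ as
\[ dE^\lambda = \tfrac{1}{2}\, c^\lambda{}_{\mu\rho}\, E^\mu \wedge E^\rho = -\omega^\lambda{}_\mu \wedge E^\mu \tag{2.51} \]
and invert this relation to get
\[ \omega_{\mu\nu} = \tfrac{1}{2}\left(-c_{\lambda\mu\nu} - c_{\mu\lambda\nu} + c_{\nu\lambda\mu}\right) E^\lambda . \tag{2.52} \]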
This defines the connection one-form. Then you can use the second equation of structure to find
the curvature.
Most of the time, you can actually find ω by inspection, without using this formula.
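As a smaller warm-up, here is a sketch of the same procedure for the unit two-sphere. This example is not part of the lecture; it assumes Euclidean signature (so index position does not matter), vanishing torsion, and the orthonormal basis $E^1 = d\theta$, $E^2 = \sin\theta\, d\phi$.

```latex
\begin{aligned}
dE^1 &= 0, \qquad
dE^2 = \cos\theta\, d\theta \wedge d\phi = \cot\theta\, E^1 \wedge E^2 , \\
dE^2 &= -\omega^2{}_1 \wedge E^1
  \;\Rightarrow\; \omega^2{}_1 = \cot\theta\, E^2 = \cos\theta\, d\phi ,
  \qquad \omega^1{}_2 = -\cos\theta\, d\phi , \\
dE^1 &= -\omega^1{}_2 \wedge E^2 = \cot\theta\, E^2 \wedge E^2 = 0
  \quad \text{(consistent)} , \\
R^1{}_2 &= d\omega^1{}_2 + \omega^1{}_\rho \wedge \omega^\rho{}_2
  = d(-\cos\theta\, d\phi) = \sin\theta\, d\theta \wedge d\phi = E^1 \wedge E^2 .
\end{aligned}
```

Reading off $R^1{}_{212} = 1$ reproduces the constant unit curvature of the sphere, with a single connection one-form found by inspection.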
Example: Spherically symmetric static spacetimes with line element

ds2 = −V 2 (r)dt2 + W (r)2 dr 2 + r 2 dθ 2 + r 2 sin2 θdφ2 = ηµν E µ ⊗ E ν (2.53)

which defines an orthonormal basis of one-forms:

E 0 = V (r)dt, E 1 = W (r)dr, E 2 = r dθ, E 3 = r sin θ dφ. (2.54)

You can think of the coefficients in these expressions as ea µ . This relates the basis {E µ } to a
co-ordinate basis; it is useful to invert this:

\[ dt = \frac{E^0}{V(r)}, \qquad dr = \frac{E^1}{W(r)}, \qquad d\theta = \frac{E^2}{r}, \qquad d\phi = \frac{E^3}{r\sin\theta}. \tag{2.55} \]

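The coefficients appearing here form the inverse matrix $e^a{}_\mu$. The first equation of structure gives
\[ \begin{aligned}
dE^0 &= d(V(r)\,dt) \\
&= -V'(r)\, dt \wedge dr \\
&= -\frac{V'(r)}{V(r)W(r)}\, E^0 \wedge E^1 \\
&= -\omega^0{}_1 \wedge E^1 - \omega^0{}_2 \wedge E^2 - \omega^0{}_3 \wedge E^3 ,
\end{aligned} \tag{2.56} \]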

since $\omega^0{}_0 = -\omega_{00} = 0$. This means that $\omega^0{}_2$ is proportional to $E^2$, $\omega^0{}_3$ is proportional to $E^3$ and
\[ \omega^0{}_1 = \frac{V'(r)}{V(r)W(r)}\, E^0 + \alpha E^1 . \tag{2.57} \]
Similarly,
\[ \begin{aligned}
dE^1 &= d(W(r)\,dr) \\
&= 0 \\
&= -\omega^1{}_0 \wedge E^0 - \omega^1{}_2 \wedge E^2 - \omega^1{}_3 \wedge E^3 .
\end{aligned} \tag{2.58} \]

We use $\omega^1{}_0 = \omega_{10} = -\omega_{01} = \omega^0{}_1$ to see there is no $E^1$ term and hence
\[ \omega^0{}_1 = \frac{V'(r)}{V(r)W(r)}\, E^0 . \tag{2.59} \]
$\omega^1{}_2$ is proportional to $E^2$ and $\omega^1{}_3$ is proportional to $E^3$.

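\[ \begin{aligned}
dE^2 &= d(r\,d\theta) \\
&= dr \wedge d\theta \\
&= \frac{1}{rW(r)}\, E^1 \wedge E^2 \\
&= -\omega^2{}_0 \wedge E^0 - \omega^2{}_1 \wedge E^1 - \omega^2{}_3 \wedge E^3 .
\end{aligned} \tag{2.60} \]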

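Using that $\omega^2{}_0 = \omega_{20} = -\omega_{02} = \omega^0{}_2$ and $\omega^2{}_1 = \omega_{21} = -\omega_{12} = -\omega^1{}_2$, we now have
\[ \omega^0{}_2 = 0, \qquad \omega^1{}_2 = -\frac{1}{rW(r)}\, E^2 . \tag{2.61} \]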
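Finally,
\[ \begin{aligned}
dE^3 &= d(r\sin\theta\, d\phi) \\
&= \sin\theta\, dr \wedge d\phi + r\cos\theta\, d\theta \wedge d\phi \\
&= \frac{1}{rW(r)}\, E^1 \wedge E^3 + \frac{1}{r\tan\theta}\, E^2 \wedge E^3 \\
&= -\omega^3{}_0 \wedge E^0 - \omega^3{}_1 \wedge E^1 - \omega^3{}_2 \wedge E^2 .
\end{aligned} \tag{2.62} \]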

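Using $\omega^3{}_0 = \omega^0{}_3$, $\omega^3{}_1 = -\omega^1{}_3$ and $\omega^3{}_2 = -\omega^2{}_3$ we obtain
\[ \omega^0{}_3 = 0, \qquad \omega^1{}_3 = -\frac{1}{rW(r)}\, E^3, \qquad \omega^2{}_3 = -\frac{1}{r\tan\theta}\, E^3 . \tag{2.63} \]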
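We can summarise this in a table:
\[ \begin{array}{c|cccc}
\omega^a{}_b & b=0 & b=1 & b=2 & b=3 \\ \hline
a=0 & 0 & \frac{V'(r)}{V(r)W(r)}E^0 & 0 & 0 \\
a=1 & \frac{V'(r)}{V(r)W(r)}E^0 & 0 & -\frac{1}{rW(r)}E^2 & -\frac{1}{rW(r)}E^3 \\
a=2 & 0 & \frac{1}{rW(r)}E^2 & 0 & -\frac{1}{r\tan\theta}E^3 \\
a=3 & 0 & \frac{1}{rW(r)}E^3 & \frac{1}{r\tan\theta}E^3 & 0
\end{array} \]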

There are always $\tfrac{1}{2}d(d-1)$ independent connection one-forms $\omega^\mu{}_\nu$. Compare this with $\Gamma_b{}^a{}_c$, which has $\tfrac{1}{2}d^2(d+1)$ components for a symmetric connection.
Lect. 8
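The counting can be made concrete with a short sketch. The function names below are illustrative and not from the notes; the only assumptions are that a metric spin connection is antisymmetric in its Lorentz index pair and that a symmetric Γ is symmetric in its two lower indices.

```python
def connection_one_forms(d: int) -> int:
    """Number of independent one-forms omega_{mu nu} = -omega_{nu mu}."""
    return d * (d - 1) // 2


def spin_connection_components(d: int) -> int:
    """Each one-form omega_{mu nu} = omega_{a mu nu} dx^a has d components."""
    return d * connection_one_forms(d)


def symmetric_gamma_components(d: int) -> int:
    """Gamma_b^a_c with a symmetric lower pair: d * d(d+1)/2 components."""
    return d * (d * (d + 1) // 2)


if __name__ == "__main__":
    for d in (2, 3, 4, 5):
        print(d, connection_one_forms(d),
              spin_connection_components(d),
              symmetric_gamma_components(d))
```

In d = 4 this gives six one-forms, matching the six entries above the diagonal of the table.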
Now we will calculate the curvature two-form using

Rµ ν = dω µ ν + ω µ ρ ∧ ω ρ ν , (2.64)

in the basis of two-forms given by $E^\rho \wedge E^\sigma$, where
\[ R^\mu{}_\nu = \tfrac{1}{2}\, R^\mu{}_{\nu\rho\sigma}\, E^\rho \wedge E^\sigma . \tag{2.65} \]
Since
Rµν = −Rνµ , (2.66)
we again only need to calculate six components:

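\[ \begin{aligned}
R^0{}_1 &= d\omega^0{}_1 + \omega^0{}_\mu \wedge \omega^\mu{}_1 \\
&= d\omega^0{}_1 + \omega^0{}_2 \wedge \omega^2{}_1 + \omega^0{}_3 \wedge \omega^3{}_1 \\
&= \left(\frac{V''(r)}{V(r)W(r)} - \frac{V'^2(r)}{V^2(r)W(r)} - \frac{V'(r)W'(r)}{V(r)W^2(r)}\right) dr \wedge E^0 + \frac{V'(r)}{V(r)W(r)}\, dE^0 \\
&= \left(-\frac{V''(r)}{V(r)W^2(r)} + \frac{V'^2(r)}{V^2(r)W^2(r)} + \frac{V'(r)W'(r)}{V(r)W^3(r)}\right) E^0 \wedge E^1 - \frac{V'^2(r)}{V^2(r)W^2(r)}\, E^0 \wedge E^1 \\
&= \frac{1}{W^2(r)}\left(-\frac{V''(r)}{V(r)} + \frac{V'(r)W'(r)}{V(r)W(r)}\right) E^0 \wedge E^1 .
\end{aligned} \tag{2.67} \]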

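\[ \begin{aligned}
R^0{}_2 &= d\omega^0{}_2 + \omega^0{}_1 \wedge \omega^1{}_2 + \omega^0{}_3 \wedge \omega^3{}_2 \\
&= \frac{V'(r)}{V(r)W(r)}\, E^0 \wedge \left(-\frac{1}{rW(r)}\, E^2\right) \\
&= -\frac{V'(r)}{rV(r)W^2(r)}\, E^0 \wedge E^2 .
\end{aligned} \tag{2.68} \]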

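\[ \begin{aligned}
R^0{}_3 &= d\omega^0{}_3 + \omega^0{}_1 \wedge \omega^1{}_3 + \omega^0{}_2 \wedge \omega^2{}_3 \\
&= \frac{V'(r)}{V(r)W(r)}\, E^0 \wedge \left(-\frac{1}{rW(r)}\, E^3\right) \\
&= -\frac{V'(r)}{rV(r)W^2(r)}\, E^0 \wedge E^3 .
\end{aligned} \tag{2.69} \]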

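\[ \begin{aligned}
R^1{}_2 &= d\omega^1{}_2 + \omega^1{}_0 \wedge \omega^0{}_2 + \omega^1{}_3 \wedge \omega^3{}_2 \\
&= d\left(-\frac{1}{rW(r)}\, E^2\right) + \left(-\frac{1}{rW(r)}\, E^3\right) \wedge \left(\frac{1}{r\tan\theta}\, E^3\right) \\
&= \left(\frac{1}{r^2W(r)} + \frac{W'(r)}{rW^2(r)}\right) dr \wedge E^2 - \frac{1}{rW(r)}\, dE^2 \\
&= \frac{1}{W^2(r)}\left(\frac{1}{r^2} + \frac{W'(r)}{rW(r)}\right) E^1 \wedge E^2 - \frac{1}{r^2W^2(r)}\, E^1 \wedge E^2 \\
&= \frac{W'(r)}{rW^3(r)}\, E^1 \wedge E^2 .
\end{aligned} \tag{2.70} \]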

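\[ \begin{aligned}
R^1{}_3 &= d\omega^1{}_3 + \omega^1{}_0 \wedge \omega^0{}_3 + \omega^1{}_2 \wedge \omega^2{}_3 \\
&= d\left(-\frac{1}{rW(r)}\, E^3\right) + \frac{1}{rW(r)} \cdot \frac{1}{r\tan\theta}\, E^2 \wedge E^3 \\
&= \left(\frac{1}{r^2W(r)} + \frac{W'(r)}{rW^2(r)}\right) dr \wedge E^3 - \frac{1}{rW(r)}\left(\frac{1}{rW(r)}\, E^1 \wedge E^3 + \frac{1}{r\tan\theta}\, E^2 \wedge E^3\right) + \frac{1}{r^2W(r)\tan\theta}\, E^2 \wedge E^3 \\
&= \frac{1}{W^2(r)}\left(\frac{1}{r^2} + \frac{W'(r)}{rW(r)}\right) E^1 \wedge E^3 - \frac{1}{r^2W^2(r)}\, E^1 \wedge E^3 \\
&= \frac{W'(r)}{rW^3(r)}\, E^1 \wedge E^3 .
\end{aligned} \tag{2.71} \]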

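\[ \begin{aligned}
R^2{}_3 &= d\omega^2{}_3 + \omega^2{}_0 \wedge \omega^0{}_3 + \omega^2{}_1 \wedge \omega^1{}_3 \\
&= d\left(-\frac{1}{r\tan\theta}\, E^3\right) - \frac{1}{rW(r)} \cdot \frac{1}{rW(r)}\, E^2 \wedge E^3 \\
&= \frac{1}{r^2\tan\theta}\, dr \wedge E^3 + \frac{1}{r\sin^2\theta}\, d\theta \wedge E^3 - \frac{1}{r\tan\theta}\left(\frac{1}{rW(r)}\, E^1 \wedge E^3 + \frac{1}{r\tan\theta}\, E^2 \wedge E^3\right) - \frac{1}{r^2W^2(r)}\, E^2 \wedge E^3 \\
&= \left(\frac{1}{r^2W(r)\tan\theta} - \frac{1}{r^2W(r)\tan\theta}\right) E^1 \wedge E^3 + \left(\frac{1}{r^2\sin^2\theta} - \frac{1}{r^2\tan^2\theta} - \frac{1}{r^2W^2(r)}\right) E^2 \wedge E^3 \\
&= \frac{1}{r^2}\left(1 - \frac{1}{W^2(r)}\right) E^2 \wedge E^3 .
\end{aligned} \tag{2.72} \]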
Now in practice you want to calculate solutions of Einstein’s equations. For this you need to
calculate the Ricci tensor, defined by
Rµ νµσ = Rνσ . (2.73)
We can read off the components $R^\mu{}_{\nu\mu\sigma}$ from the expressions above, e.g.
\[ R^0{}_{101} = \frac{1}{W^2(r)}\left(-\frac{V''(r)}{V(r)} + \frac{V'(r)W'(r)}{V(r)W(r)}\right), \tag{2.74} \]
noticing that they are only non-vanishing if ν = σ, which is a consequence of the symmetry of the
problem. It follows that we must have
\[ R_{\nu\sigma} = 0, \qquad \nu \neq \sigma . \tag{2.75} \]

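The diagonal components are
\[ \begin{aligned}
R_{00} &= R^1{}_{010} + R^2{}_{020} + R^3{}_{030} \\
&= \frac{1}{W^2(r)}\left(\frac{V''(r)}{V(r)} - \frac{V'(r)W'(r)}{V(r)W(r)}\right) + \frac{1}{W^2(r)}\frac{V'(r)}{rV(r)} + \frac{1}{W^2(r)}\frac{V'(r)}{rV(r)} \\
&= \frac{1}{W^2(r)}\left(\frac{V''(r)}{V(r)} - \frac{V'(r)W'(r)}{V(r)W(r)} + 2\frac{V'(r)}{rV(r)}\right).
\end{aligned} \tag{2.76} \]
\[ \begin{aligned}
R_{11} &= R^0{}_{101} + R^2{}_{121} + R^3{}_{131} \\
&= \frac{1}{W^2(r)}\left(-\frac{V''(r)}{V(r)} + \frac{V'(r)W'(r)}{V(r)W(r)} + 2\frac{W'(r)}{rW(r)}\right).
\end{aligned} \tag{2.77} \]
\[ \begin{aligned}
R_{22} &= R^0{}_{202} + R^1{}_{212} + R^3{}_{232} \\
&= \frac{1}{W^2(r)}\left(-\frac{V'(r)}{rV(r)} + \frac{W'(r)}{rW(r)} - \frac{1}{r^2}\right) + \frac{1}{r^2},
\end{aligned} \tag{2.78} \]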

and $R_{33} = R_{22}$ (exercise). For vacuum solutions, we need $R_{\mu\nu} = 0$, so
\[ 0 = R_{00} + R_{11} = \frac{2}{rW^2(r)}\left(\frac{V'(r)}{V(r)} + \frac{W'(r)}{W(r)}\right), \tag{2.79} \]

and hence
log V (r)W (r) = constant, V (r)W (r) = constant. (2.80)
If we demand that spacetime is flat as r → ∞, it is natural to set
\[ W(r) = \frac{1}{V(r)} . \tag{2.81} \]
Then
\[ R_{22} = V^2(r)\left(-2\frac{V'(r)}{rV(r)} - \frac{1}{r^2}\right) + \frac{1}{r^2} \tag{2.82} \]
and so we have to solve
\[ 2\frac{V'(r)}{V(r)} = -\frac{1}{r} + \frac{1}{rV^2(r)} = \frac{1 - V^2(r)}{rV^2(r)} . \tag{2.83} \]
This is an ordinary differential equation that you can easily solve:
\[ \int \frac{2V(r)\, dV}{1 - V^2(r)} = \int \frac{dr}{r}, \qquad V^2(r) = 1 - \frac{\text{constant}}{r} . \tag{2.84} \]

We have rediscovered the Schwarzschild solution. This is the easiest way to find solutions to
Einstein’s equations.
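The result can be checked numerically. The following is a minimal sketch, not part of the lecture: it evaluates the diagonal Ricci components (2.76)-(2.78) for $V^2 = 1 - c/r$ and $W = 1/V$, with the derivatives entered by hand. The function name `ricci_diagonal` and the parameter `c` (standing in for the integration constant of (2.84)) are illustrative choices.

```python
import math


def ricci_diagonal(r: float, c: float):
    """Return (R00, R11, R22) from (2.76)-(2.78) for V^2 = 1 - c/r, W = 1/V."""
    V = math.sqrt(1.0 - c / r)
    dV = c / (2.0 * r**2 * V)                                # V'(r)
    d2V = -c / (r**3 * V) - c * dV / (2.0 * r**2 * V**2)     # V''(r)
    W = 1.0 / V
    dW = -dV / V**2                                          # W'(r)

    R00 = (d2V / V - dV * dW / (V * W) + 2.0 * dV / (r * V)) / W**2
    R11 = (-d2V / V + dV * dW / (V * W) + 2.0 * dW / (r * W)) / W**2
    R22 = (-dV / (r * V) + dW / (r * W) - 1.0 / r**2) / W**2 + 1.0 / r**2
    return R00, R11, R22


if __name__ == "__main__":
    for r in (3.0, 10.0, 250.0):
        print(r, ricci_diagonal(r, c=2.0))
```

All three components should vanish up to floating-point rounding for any r > c, which is the statement that the metric solves the vacuum equations outside r = c.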

3 Integration
You have learned in general relativity that if we want to integrate a scalar φ over a d-dimensional
domain D (with boundary ∂D), then
\[ I = \int_D d^dx\, \sqrt{g}\, \phi(x) \tag{3.1} \]
is independent of the choice of co-ordinates, where
\[ g = |\det g_{ab}| \tag{3.2} \]
and $\int d^dx$ is interpreted as a Riemann integral.
Now suppose that $\phi = \nabla_a V^a$, then (Gauss’ theorem)
\[ I = \int_D d^dx\, \sqrt{g}\, \nabla_a V^a = \int_{\partial D} d\Sigma_a\, V^a , \tag{3.3} \]

where dΣa = na ·(volume element of ∂D) for an outward unit normal na . The metric on ∂D is

hab = gab ± na nb , (3.4)

where there is a plus if n is timelike and a minus if n is spacelike.
Lect. 9
We will now replace the covariant volume element $d^dx\, \sqrt{g}$ in the above formulation by a volume
form, the d-form
\[ \epsilon = E^1 \wedge E^2 \wedge \ldots \wedge E^d , \tag{3.5} \]
where $\{E^\mu\}$ are a basis of orthonormal one-forms. Remember the alternating symbol (in the tangent
space)
\[ \eta^{\mu\nu\ldots\tau} = \begin{cases} +1 & (\mu\nu\ldots\tau) \text{ is an even permutation of } (1,\ldots,d) \\ -1 & (\mu\nu\ldots\tau) \text{ is an odd permutation of } (1,\ldots,d) \\ 0 & \text{otherwise.} \end{cases} \tag{3.6} \]
$\eta_{\mu\nu\ldots\tau}$ is found by lowering with the Lorentz metric (or more generally, the tangent space metric).
Hence,
\[ \eta_{\mu\nu\ldots\tau} = (-)^t\, \eta^{\mu\nu\ldots\tau} . \tag{3.7} \]
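(Normally you should not write equations like this one.) In terms of components, we then have
\[ \epsilon = \frac{1}{d!}\, (-)^t\, \eta_{\mu\nu\ldots\tau}\, E^\mu \wedge E^\nu \wedge \ldots \wedge E^\tau . \tag{3.8} \]
Expressing this in a co-ordinate basis, using $E^\mu = e_a{}^\mu\, dx^a$,
\[ \begin{aligned}
\epsilon &= \frac{1}{d!}\, (-)^t\, \eta_{\mu\nu\ldots\tau}\, e_a{}^\mu e_b{}^\nu \cdots e_f{}^\tau\, dx^a \wedge dx^b \wedge \ldots \wedge dx^f \\
&= \frac{1}{d!}\, (-)^t\, \eta_{\mu\nu\ldots\tau}\, e_a{}^\mu e_b{}^\nu \cdots e_f{}^\tau\, \eta^{abc\ldots f}\, dx^1 \wedge dx^2 \wedge \ldots \wedge dx^d \\
&= (-)^t\, (\det e)\, d^dx .
\end{aligned} \tag{3.9} \]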

det e is almost the same as g:

gab = ea µ eb ν ηµν ⇒ det g = det(e2 η) = ±(det e)2 . (3.10)

Hence $|\det e| = \sqrt{|\det g|} = \sqrt{g}$. What you discover is that $\epsilon$ reproduces the previous expression
for the volume element. So we define integration over a d-dimensional region to be
\[ \int_D \phi\; d(\mathrm{Vol}) \equiv \int_D \phi\, \epsilon . \tag{3.11} \]
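The relation $|\det e| = \sqrt{|\det g|}$ from (3.10) is easy to check numerically. The sketch below is illustrative only: the vielbein entries are made-up numbers, the signature (−,+,+) in d = 3 is an arbitrary choice, and the helper names `det` and `metric_from_vielbein` are not from the notes.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def metric_from_vielbein(e, eta_diag):
    """g_ab = e_a^mu e_b^nu eta_{mu nu}, with e[a][mu] and a diagonal eta."""
    n = len(e)
    return [[sum(e[a][m] * e[b][m] * eta_diag[m] for m in range(n))
             for b in range(n)] for a in range(n)]


# Hypothetical non-diagonal vielbein, signature (-, +, +)
e = [[2.0, 0.5, 0.0],
     [0.0, 1.5, 0.3],
     [0.1, 0.0, 1.2]]
eta = [-1.0, 1.0, 1.0]
g = metric_from_vielbein(e, eta)

if __name__ == "__main__":
    print(abs(det(e)), math.sqrt(abs(det(g))))
```

The two printed numbers agree, while det g itself is negative here because of the single timelike direction.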

One always integrates a d-form over a space of dimension d. Alternatively,


\[ \int_D \phi\; d(\mathrm{Vol}) \;\Rightarrow\; \int_D {*\phi} . \tag{3.12} \]

The first and most important result for integrals over forms is Stokes’ theorem. We prove a
pedestrian version, where the region D is bounded by two surfaces which can be taken as λ = 0
and λ = 1. We can then choose co-ordinates in D such that the metric is

ds2 = dλ2 + ds2⊥ , (3.13)

such that gλλ = 1 and gλi = 0 (these are Gaussian normal co-ordinates, see GR course). The
volume form on D is dλ ∧ d(V ol)d−1 , where d(V ol)d−1 is a volume form on surfaces λ = constant.
Now take a (d − 1)-form which is proportional to d(V ol), written as f (λ, xi )d(V ol), and integrate
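\[ \int_D \frac{\partial f(\lambda, x^i)}{\partial\lambda}\, d\lambda \wedge d(\mathrm{Vol}) = \int_D df(\lambda, x^i) \wedge d(\mathrm{Vol}) = \int_D d\big(f(\lambda, x^i)\, d(\mathrm{Vol})\big) , \tag{3.14} \]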

since (∂f /∂xi )dxi ∧ d(V ol) = 0 and ∂d(V ol)/∂λ = 0. As a one-dimensional integral over λ this is
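\[ \int_D \frac{\partial f(\lambda, x^i)}{\partial\lambda}\, d\lambda \wedge d(\mathrm{Vol}) = \int_{\partial D(\lambda=1)} f(1, x^i)\, d(\mathrm{Vol}) - \int_{\partial D(\lambda=0)} f(0, x^i)\, d(\mathrm{Vol}) , \tag{3.15} \]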

which is an integral over the boundary. So in this case we have


\[ \int_D d\omega = \int_{\partial D} \omega . \tag{3.16} \]

This is by far the easiest version of Stokes’ theorem. A corollary of this is


\[ 0 = \int_D ddg = \int_{\partial D} dg = \int_{\partial\partial D} g \tag{3.17} \]

for any (d − 2)-form g. So the boundary of a boundary is empty.

3.1 Action for General Relativity


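We want to give an action for general relativity in d = 4. You are probably used to the Einstein-Hilbert action
\[ \int d^dx\, \sqrt{g}\, R ; \tag{3.18} \]
from this you derive $R_{ab} = 0$ in vacuum, but under various assumptions on the connection. There
is an alternative formulation which makes the requirements on the connection appear more natural:
\[ I = \int_D \left(R^{\mu\nu}(\omega) \wedge E^\rho \wedge E^\sigma\right) \eta_{\mu\nu\rho\sigma} = \int_D \left(\left(d\omega^{\mu\lambda} + \omega^{\mu\nu} \wedge \omega_\nu{}^\lambda\right) \wedge E^\rho \wedge E^\sigma\right) \eta_{\mu\lambda\rho\sigma} . \tag{3.19} \]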

This action contains two types of fields: the vielbein fields and the connection ω. We require
I to be stationary under arbitrary variation of both E and ω. Note that in this action, ηµνρσ
projects out the symmetric part of ω. We only need to consider the antisymmetric part of ω, so
one automatically has a metric connection.
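Vary E:
\[ \delta I = \int_D \left(R^{\mu\nu}(\omega) \wedge \delta E^\rho \wedge E^\sigma + R^{\mu\nu}(\omega) \wedge E^\rho \wedge \delta E^\sigma\right) \eta_{\mu\nu\rho\sigma} = \int_D 2\left(R^{\mu\nu}(\omega) \wedge \delta E^\rho \wedge E^\sigma\right) \eta_{\mu\nu\rho\sigma} ; \tag{3.20} \]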

if this is supposed to vanish for arbitrary δE ρ , we must have

Rµν ∧ E ρ ηµνρσ = 0. (3.21)

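In components, this is
\[ \tfrac{1}{2}\, R^{\mu\nu}{}_{\lambda\tau}\, E^\lambda \wedge E^\tau \wedge E^\rho\, \eta_{\mu\nu\rho\sigma} = 0 , \tag{3.22} \]
or
\[ R^{\mu\nu}{}_{[\lambda\tau}\, \eta_{\rho]\mu\nu\sigma} = 0 . \tag{3.23} \]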
Contract this with η λτ ρκ , which does not annihilate any information in the equation:

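\[ \begin{aligned}
0 &= R^{\mu\nu}{}_{\lambda\tau}\, \eta_{\rho\mu\nu\sigma}\, \eta^{\lambda\tau\rho\kappa} \\
&= 2 R^{\mu\nu}{}_{\lambda\tau} \left[\delta_\mu{}^\kappa \delta_\nu{}^\lambda \delta_\sigma{}^\tau + \delta_\mu{}^\lambda \delta_\nu{}^\tau \delta_\sigma{}^\kappa + \delta_\mu{}^\tau \delta_\nu{}^\kappa \delta_\sigma{}^\lambda\right] \\
&= 2\left(R^{\kappa\lambda}{}_{\lambda\tau}\, \delta_\sigma{}^\tau + R^{\lambda\tau}{}_{\lambda\tau}\, \delta_\sigma{}^\kappa + R^{\tau\kappa}{}_{\lambda\tau}\, \delta_\sigma{}^\lambda\right) \\
&= 2\left(-2 R^\kappa{}_\sigma + R\, \delta_\sigma{}^\kappa\right),
\end{aligned} \tag{3.24} \]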

which you recognise as the vacuum Einstein equations. Contraction gives

Rκ σ = 0, (3.25)

so the Ricci tensor of the connection ω vanishes.


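Now we try to vary ω:
\[ \delta I = \int_D \left(\left(d\delta\omega^{\mu\lambda} + \delta\omega^{\mu\nu} \wedge \omega_\nu{}^\lambda + \omega^{\mu\nu} \wedge \delta\omega_\nu{}^\lambda\right) \wedge E^\rho \wedge E^\sigma\right) \eta_{\mu\lambda\rho\sigma} . \tag{3.26} \]
This should vanish for arbitrary δω. Use the identity for a one-form X and a two-form Y
\[ \int_{\partial D} X \wedge Y = \int_D d(X \wedge Y) = \int_D dX \wedge Y - \int_D X \wedge dY . \tag{3.27} \]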

When varying something, you always have to worry about boundary conditions. Here we put
δω µν = 0 on the boundary. One could do this more generally, and consider boundary terms in the
action as well (see Black Holes course).
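Setting the boundary term to zero, we can turn (dδω) ∧ E ∧ E into δω ∧ d(E ∧ E):
\[ \begin{aligned}
\delta I &= \int_D \left\{\delta\omega^{\mu\lambda} \wedge \left(dE^\rho \wedge E^\sigma - E^\rho \wedge dE^\sigma\right) + 2\,(\delta\omega^{\mu\nu}) \wedge \omega_\nu{}^\lambda \wedge E^\rho \wedge E^\sigma\right\} \eta_{\mu\lambda\rho\sigma} \\
&= \int_D \delta\omega^{\mu\nu} \wedge \left(\left(dE^\rho \wedge E^\sigma - E^\rho \wedge dE^\sigma\right) \eta_{\mu\nu\rho\sigma} + 2\,\omega_\nu{}^\lambda \wedge E^\rho \wedge E^\sigma\, \eta_{\mu\lambda\rho\sigma}\right).
\end{aligned} \tag{3.28} \]
This should vanish for arbitrary $\delta\omega^{\mu\nu}$, so we get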

(dE ρ ∧ E σ − E ρ ∧ dE σ ) ηµνρσ + 2ω[ν λ ∧ E ρ ∧ E σ ηµ]λρσ = 0. (3.29)

Recall Cartan’s first equation of structure and replace dE ρ = T ρ − ω ρ τ ∧ E τ :

((T ρ − ω ρ τ ∧ E τ ) ∧ E σ − E ρ ∧ (T σ − ω σ τ ∧ E τ )) ηµνρσ + 2ω[ν λ ∧ E ρ ∧ E σ ηµ]λρσ = 0. (3.30)

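Now use
\[ \begin{aligned}
& 2\,\omega_{[\nu}{}^\lambda \wedge E^\rho \wedge E^\sigma\, \eta_{\mu]\lambda\rho\sigma} + \left(-\omega^\rho{}_\tau \wedge E^\tau \wedge E^\sigma + E^\rho \wedge \omega^\sigma{}_\tau \wedge E^\tau\right) \eta_{\mu\nu\rho\sigma} \\
&\quad = 2\,\omega_{[\nu}{}^\lambda \wedge E^\rho \wedge E^\sigma\, \eta_{\mu]\lambda\rho\sigma} - 2\,\omega^\rho{}_\lambda \wedge E^\lambda \wedge E^\sigma\, \eta_{\mu\nu\rho\sigma} \\
&\quad = 2\left(\omega_{\tau[\nu}{}^\lambda\, \eta_{\mu]\lambda\rho\sigma} - \omega_\tau{}^\lambda{}_\rho\, \eta_{\mu\nu\lambda\sigma}\right) E^\tau \wedge E^\rho \wedge E^\sigma \\
&\quad = 2\left(-\omega_\tau{}^\lambda{}_{[\nu}\, \eta_{\mu]\lambda\rho\sigma} - \omega_\tau{}^\lambda{}_\rho\, \eta_{\mu\nu\lambda\sigma}\right) E^\tau \wedge E^\rho \wedge E^\sigma ,
\end{aligned} \tag{3.31} \]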

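where we expanded $\omega_\nu{}^\lambda = \omega_{\tau\nu}{}^\lambda E^\tau$ etc. Take the components of this three-form and multiply by
$\eta^{\tau\rho\sigma\kappa}$, which is taking the Hodge dual:
\[ \begin{aligned}
& 2\left(-\omega_\tau{}^\lambda{}_{[\nu}\, \eta_{\mu]\lambda\rho\sigma}\, \eta^{\tau\rho\sigma\kappa} - \omega_\tau{}^\lambda{}_\rho\, \eta_{\mu\nu\lambda\sigma}\, \eta^{\tau\rho\sigma\kappa}\right) \\
&\quad = -\omega_\tau{}^\lambda{}_\nu\, \eta_{\rho\sigma\mu\lambda}\, \eta^{\rho\sigma\tau\kappa} + \omega_\tau{}^\lambda{}_\mu\, \eta_{\rho\sigma\nu\lambda}\, \eta^{\rho\sigma\tau\kappa} + 2\,\omega_\tau{}^\lambda{}_\rho\, \eta_{\sigma\mu\nu\lambda}\, \eta^{\sigma\tau\rho\kappa} \\
&\quad = 4\,\omega_\tau{}^\lambda{}_\nu\, \delta_{[\mu}{}^\tau \delta_{\lambda]}{}^\kappa - 4\,\omega_\tau{}^\lambda{}_\mu\, \delta_{[\nu}{}^\tau \delta_{\lambda]}{}^\kappa - 12\,\omega_\tau{}^\lambda{}_\rho\, \delta_{[\mu}{}^\tau \delta_\nu{}^\rho \delta_{\lambda]}{}^\kappa \\
&\quad = 4\,\omega_{[\mu}{}^\lambda{}_{|\nu|}\, \delta_{\lambda]}{}^\kappa - 4\,\omega_{[\nu}{}^\lambda{}_{|\mu|}\, \delta_{\lambda]}{}^\kappa - 12\,\omega_{[\mu}{}^\lambda{}_\nu\, \delta_{\lambda]}{}^\kappa \\
&\quad = 2\left(\omega_\mu{}^\kappa{}_\nu - \omega_\nu{}^\kappa{}_\mu - \omega_\lambda{}^\lambda{}_\nu\, \delta_\mu{}^\kappa + \omega_\lambda{}^\lambda{}_\mu\, \delta_\nu{}^\kappa - \omega_\mu{}^\kappa{}_\nu - \omega_\nu{}^\lambda{}_\lambda\, \delta_\mu{}^\kappa - \omega_\lambda{}^\lambda{}_\mu\, \delta_\nu{}^\kappa + \omega_\mu{}^\lambda{}_\lambda\, \delta_\nu{}^\kappa + \omega_\lambda{}^\lambda{}_\nu\, \delta_\mu{}^\kappa + \omega_\nu{}^\kappa{}_\mu\right) \\
&\quad = 2\left(-\omega_\nu{}^\lambda{}_\lambda\, \delta_\mu{}^\kappa + \omega_\mu{}^\lambda{}_\lambda\, \delta_\nu{}^\kappa\right) = 0 .
\end{aligned} \tag{3.32} \]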

Hence all terms involving the connection cancel, and we finally obtain

(T ρ ∧ E σ ) ηρσµν = 0. (3.33)

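Lect. 10
We claim that it follows from
\[ T^\tau \wedge E^\lambda = 0 \tag{3.34} \]
that the torsion two-form has to vanish identically. To show this, we expand in a basis of one-forms:
\[ T^\tau = \tfrac{1}{2}\, T^\tau{}_{\alpha\beta}\, E^\alpha \wedge E^\beta . \tag{3.35} \]
Then we have
\[ T^\tau{}_{\alpha\beta}\, E^\alpha \wedge E^\beta \wedge E^\lambda = 0 . \tag{3.36} \]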
We multiply this three-form by ητ λρσ and take its components:

T τ [αβ η|τ |λ]ρσ = 0. (3.37)

More explicitly,
T τ αβ ητ λρσ + T τ λα ητ βρσ + T τ βλ ητ αρσ = 0. (3.38)

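Now we contract with $\eta^{\lambda\rho\sigma\kappa}$:
\[ \begin{aligned}
0 &= T^\tau{}_{\alpha\beta}\, \eta_{\tau\lambda\rho\sigma}\, \eta^{\lambda\rho\sigma\kappa} + T^\tau{}_{\lambda\alpha}\, \eta_{\tau\beta\rho\sigma}\, \eta^{\lambda\rho\sigma\kappa} + T^\tau{}_{\beta\lambda}\, \eta_{\tau\alpha\rho\sigma}\, \eta^{\lambda\rho\sigma\kappa} \\
&= 6\, T^\tau{}_{\alpha\beta}\, \delta_\tau{}^\kappa - 2\left(\delta_\tau{}^\lambda \delta_\beta{}^\kappa - \delta_\beta{}^\lambda \delta_\tau{}^\kappa\right) T^\tau{}_{\lambda\alpha} - 2\left(\delta_\tau{}^\lambda \delta_\alpha{}^\kappa - \delta_\alpha{}^\lambda \delta_\tau{}^\kappa\right) T^\tau{}_{\beta\lambda} \\
&= 6\, T^\kappa{}_{\alpha\beta} - 2\left(T^\tau{}_{\tau\alpha}\, \delta_\beta{}^\kappa - T^\kappa{}_{\beta\alpha}\right) - 2\left(T^\tau{}_{\beta\tau}\, \delta_\alpha{}^\kappa - T^\kappa{}_{\beta\alpha}\right) \\
&= 2\, T^\kappa{}_{\alpha\beta} - 2\, T^\tau{}_{\tau\alpha}\, \delta_\beta{}^\kappa - 2\, T^\tau{}_{\beta\tau}\, \delta_\alpha{}^\kappa .
\end{aligned} \tag{3.39} \]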
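The two contraction identities used in this step can be verified by brute force. The sketch below is an illustration under a stated simplification: it works in Euclidean signature in d = 4, so it does not track the overall factor $(-)^t$ or the sign that comes from the position of κ in $\eta^{\lambda\rho\sigma\kappa}$. The helper names are not from the notes.

```python
from itertools import permutations, product


def levi_civita(d):
    """Map each permutation of range(d) to its sign; other index tuples are absent."""
    table = {}
    for perm in permutations(range(d)):
        inversions = sum(1 for i in range(d) for j in range(i + 1, d)
                         if perm[i] > perm[j])
        table[perm] = -1 if inversions % 2 else 1
    return table


EPS = levi_civita(4)


def eps(*idx):
    """Alternating symbol with four indices (0 if any index repeats)."""
    return EPS.get(idx, 0)


def delta(i, j):
    return 1 if i == j else 0


def contract_three(t, k):
    """sum over l, r, s of eps[t l r s] * eps[k l r s]; expected 3! * delta."""
    return sum(eps(t, l, r, s) * eps(k, l, r, s)
               for l, r, s in product(range(4), repeat=3))


def contract_two(t, b, l, k):
    """sum over r, s of eps[t b r s] * eps[l k r s]; expected 2! * (dd - dd)."""
    return sum(eps(t, b, r, s) * eps(l, k, r, s)
               for r, s in product(range(4), repeat=2))


if __name__ == "__main__":
    print(contract_three(0, 0), contract_three(0, 1))
    print(contract_two(0, 1, 0, 1), contract_two(0, 1, 1, 0))
```

The factors 6 and 2 are exactly the coefficients that appear in the second line of (3.39).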

Contract κ in this equation with β to find

2T κ ακ + 8T τ ατ − 2T τ ατ = 0. (3.40)

Therefore we have T κ ακ = 0 and consequently

T κ αβ = 0. (3.41)

We have discovered that the action
\[ I[E, \omega] = \int R^{\mu\nu}(\omega) \wedge E^\rho \wedge E^\sigma\, \eta_{\mu\nu\rho\sigma} , \tag{3.42} \]
when E and ω are independently varied, gives implicitly a metric connection and explicitly vanishing
torsion and the vacuum Einstein equations. In this sense, the action is superior to the Einstein-Hilbert action.

3.2 Yang-Mills Action


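Yang-Mills theory in four dimensions can be defined by the action
\[ I = \frac{1}{2} \int \mathrm{Tr}(F \wedge {*F}) , \tag{3.43} \]
where the gauge group G is compact. (If you do not make this assumption, the quantum theory
will violate unitarity.) For a set of generators $\{T_\alpha\}$ of G,
\[ I = \frac{1}{2} \int \mathrm{Tr}(F^\alpha T_\alpha \wedge {*F^\beta} T_\beta) , \tag{3.44} \]
and we can use the Cartan metric $\mathrm{Tr}(T_\alpha T_\beta) = -\tfrac{1}{2}\eta_{\alpha\beta} = \tfrac{1}{2}\delta_{\alpha\beta}$ (since G is compact) to rewrite this
as
\[ I = \frac{1}{2} \int F^\alpha \wedge {*F^\beta}\; \mathrm{Tr}(T_\alpha T_\beta) = \frac{1}{4} \int F^\alpha(A) \wedge {*F_\alpha}(A) , \tag{3.45} \]
where explicitly
\[ F^\alpha = dA^\alpha + \tfrac{1}{2}\, g\, c_{\beta\gamma}{}^\alpha\, A^\beta \wedge A^\gamma . \tag{3.46} \]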
2
We take G to be compact and semi-simple, so that c is totally antisymmetric in all indices. If you
think about it, the components of G ∧ ∗F , where G and F are two-forms, are proportional to

Gµν F ρσ ηρσλτ , (3.47)

and so the thing that is integrated is

Gµν F ρσ ηρσλτ η µνλτ . (3.48)

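But this is the same as you get for ∗G ∧ F. Hence, under variation of A one gets two identical
terms:
\[ \begin{aligned}
\delta I &= \frac{1}{4} \int \delta F^\alpha(A) \wedge {*F_\alpha}(A) + F^\alpha(A) \wedge {*\delta F_\alpha}(A) \\
&= \frac{1}{2} \int \delta F^\alpha(A) \wedge {*F_\alpha}(A) \\
&= \frac{1}{2} \int \delta\left(dA^\alpha + \tfrac{1}{2}\, g\, c_{\beta\gamma}{}^\alpha\, A^\beta \wedge A^\gamma\right) \wedge {*F_\alpha}(A) \\
&= \frac{1}{2} \int \left(\delta dA^\alpha + \tfrac{1}{2}\, g\, c_{\beta\gamma}{}^\alpha\, \delta A^\beta \wedge A^\gamma + \tfrac{1}{2}\, g\, c_{\beta\gamma}{}^\alpha\, A^\beta \wedge \delta A^\gamma\right) \wedge {*F_\alpha}(A) \\
&= \frac{1}{2} \int \left(\delta dA^\alpha + g\, c_{\beta\gamma}{}^\alpha\, \delta A^\beta \wedge A^\gamma\right) \wedge {*F_\alpha}(A) .
\end{aligned} \tag{3.49} \]
Again, we use
\[ \int_{\partial M} \delta A \wedge {*F} = \int_M d(\delta A \wedge {*F}) = \int_M \left(d\delta A \wedge {*F} - \delta A \wedge d{*F}\right) \tag{3.50} \]
and set the boundary term to zero to obtain
\[ \delta I = \frac{1}{2} \int \delta A^\alpha \wedge \left(d{*F_\alpha} + g\, c_{\alpha\beta}{}^\gamma\, A^\beta \wedge {*F_\gamma}\right) \tag{3.51} \]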
which should vanish for all δAα . One obtains the Yang-Mills equations

d ∗ Fα + gcαβ γ Aβ ∧ ∗Fγ = 0, (3.52)

or alternatively
d ∗ F + g[A, ∗F ] = 0. (3.53)
Conventionally one rescales the fields to remove g from the definition of F:
\[ A \to \frac{A}{g}, \qquad F = dA + gA \wedge A \to \frac{1}{g}(dA + A \wedge A) . \tag{3.54} \]
Since the action is homogeneous of degree two in F, g can be taken outside the integral:
\[ I = \frac{1}{4g^2} \int \mathrm{Tr}(F \wedge {*F}) , \qquad F = dA + A \wedge A . \tag{3.55} \]
Notice that this is just a way of rescaling the fields which is not a change of physics.

4 Topologically Non-Trivial Field Configurations


These are things which you can not see in perturbation theory in quantum field theory, but are
nevertheless important. The simplest example is a domain wall in scalar field theory, where we
take the potential to be
V (φ) = λ(φ2 − a2 )2 . (4.1)

This is a renormalizable sensible field theory. The action for this theory in four-dimensional spacetime is
\[ I = \int d^4x\, \sqrt{g} \left(-\frac{1}{2}\, \partial_a\phi\, \partial_b\phi\, g^{ab} - V(\phi)\right) , \tag{4.2} \]
where we take spacetime to be Minkowski space. The potential has minima at φ = ±a:

[Figure: the double-well potential V(φ) plotted against φ, with minima at φ− = −a and φ+ = a. There are two different choices of vacuum state.]


Perturbation theory describes small fluctuations around one of the minima. We might be interested,
instead of doing this, in domain walls. These are configurations where the field takes two different
asymptotic values in different regions of space.
Fluctuations around a minimum can be described as having some kind of mass. For an ordinary
massive particle,
\[ V(\phi) = \frac{1}{2} m^2 \phi^2 . \tag{4.3} \]
Hence we can define
m2 = V ′′ (vacuum). (4.4)
This describes the curvature at the minimum. For our V (φ),

V (φ) = λ(φ2 − a2 )2 , V ′ (φ) = 4λφ(φ2 − a2 ), V ′′ (φ) = 4λ(φ2 − a2 ) + 8λφ2 (4.5)

and hence
m2 = V ′′ (±a) = 8λa2 . (4.6)

The potential has coupling constant λ and describes particles of mass $\sqrt{8\lambda}\, a$. λ and m describe the
physical variables in this problem. The equations of motion (the analogue of the Klein-Gordon
equation) are
\[ \Box\phi - V'(\phi) = 0 . \tag{4.7} \]
We look for solutions that are static and have planar symmetry. This turns it into a one-dimensional
problem: If φ = φ(z), the Klein-Gordon equation becomes
\[ \frac{d^2\phi}{dz^2} = 4\lambda\phi\,(\phi^2 - a^2) . \tag{4.8} \]

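Multiply this equation by φ′(z) and integrate over z:
\[ \begin{aligned}
\phi''(z)\,\phi'(z) &= 4\lambda\,\phi(z)\,\phi'(z)\,(\phi(z)^2 - a^2) \\
\tfrac{1}{2}\,(\phi'(z))^2 &= \lambda\,(\phi(z)^2 - a^2)^2 + \text{constant} .
\end{aligned} \tag{4.9} \]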
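A long way away from the domain wall, we assume that φ → φ± and φ′(z) → 0. This means the
constant is set to zero. Now integrate again:
\[ \begin{aligned}
\phi'(z) &= \pm\sqrt{2\lambda}\,(a^2 - \phi(z)^2) \\
\int \frac{d\phi}{a^2 - \phi^2} &= \pm\sqrt{2\lambda} \int dz \\
\frac{1}{a}\, \mathrm{artanh}\,\frac{\phi}{a} &= \pm\sqrt{2\lambda}\,(z - z_0) .
\end{aligned} \tag{4.10} \]
Choosing the positive sign, we have found a kink solution
\[ \phi(z) = a \tanh\left(\sqrt{2\lambda}\, a\,(z - z_0)\right) . \tag{4.11} \]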
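A quick numerical sanity check of the kink is sketched below. It is not part of the lecture: the function names are illustrative, the derivatives are taken by central finite differences with hand-picked step sizes, and the parameter values in the demo are arbitrary.

```python
import math


def kink(z, lam, a, z0=0.0):
    """Kink profile (4.11): phi(z) = a tanh(sqrt(2 lam) a (z - z0))."""
    return a * math.tanh(math.sqrt(2.0 * lam) * a * (z - z0))


def ode_residual(z, lam, a, h=1e-3):
    """phi'' - 4 lam phi (phi^2 - a^2), cf. (4.8), with phi'' from a central difference."""
    phi = kink(z, lam, a)
    d2 = (kink(z + h, lam, a) - 2.0 * phi + kink(z - h, lam, a)) / h**2
    return d2 - 4.0 * lam * phi * (phi**2 - a**2)


def first_integral_residual(z, lam, a, h=1e-5):
    """(1/2) phi'^2 - lam (phi^2 - a^2)^2, cf. (4.9) with the constant set to zero."""
    phi = kink(z, lam, a)
    d1 = (kink(z + h, lam, a) - kink(z - h, lam, a)) / (2.0 * h)
    return 0.5 * d1**2 - lam * (phi**2 - a**2) ** 2


if __name__ == "__main__":
    for z in (-2.0, -0.3, 0.0, 0.7, 3.0):
        print(z, ode_residual(z, 0.3, 1.2), first_integral_residual(z, 0.3, 1.2))
```

Both residuals should be zero up to discretisation error, and the profile tends to ±a far from the wall.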
The solution interpolates between two vacua. z = z0 is a domain wall which separates one vacuum
from a different vacuum.
Lect. 11
The energy of the field is in the region around z = z0. To calculate the energy-momentum tensor,
take the covariant Lagrangian
\[ I = \int d^4x\, \sqrt{g} \left(-\frac{1}{2}\, \partial_a\phi\, \partial_b\phi\, g^{ab} - V(\phi)\right) , \tag{4.12} \]
then the energy-momentum tensor is given by
\[ T_{ab} = -\frac{2}{\sqrt{g}}\, \frac{\delta I}{\delta g^{ab}} . \tag{4.13} \]
This gives
\[ T_{ab} = \partial_a\phi\, \partial_b\phi - g_{ab}\left(\frac{1}{2}\, g^{cd}\, \partial_c\phi\, \partial_d\phi + V(\phi)\right) . \tag{4.14} \]
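You want to calculate the energy per unit area in the domain wall, which is
\[ \int_{-\infty}^{\infty} dz\; T_{00} . \tag{4.15} \]
We put z0 = 0 for simplicity. Then
\[ \phi(z) = a \tanh(\sqrt{2\lambda}\, a z), \qquad \phi'(z) = \frac{\sqrt{2\lambda}\, a^2}{\cosh^2(\sqrt{2\lambda}\, a z)} . \tag{4.16} \]
For a static configuration in Minkowski space, $g_{00} = -1$ and $\partial_0\phi = 0$, so (4.14) gives
\[ \begin{aligned}
T_{00} &= \frac{1}{2}\,\phi'^2 + \lambda\,(\phi^2 - a^2)^2 \\
&= \frac{\lambda a^4}{\cosh^4(\sqrt{2\lambda}\, a z)} + \lambda a^4 \left(\tanh^2(\sqrt{2\lambda}\, a z) - 1\right)^2 \\
&= \frac{2\lambda a^4}{\cosh^4(\sqrt{2\lambda}\, a z)} .
\end{aligned} \tag{4.17} \]
The energy per unit area is then
\[ \begin{aligned}
\int_{-\infty}^{\infty} dz\; T_{00} &= 2\lambda a^4 \int_{-\infty}^{\infty} dz\, \frac{1}{\cosh^4(\sqrt{2\lambda}\, a z)} \\
&= 2\lambda a^4 \int_{-\infty}^{\infty} dz \left(\frac{1}{\cosh^2(\sqrt{2\lambda}\, a z)} - \frac{\tanh^2(\sqrt{2\lambda}\, a z)}{\cosh^2(\sqrt{2\lambda}\, a z)}\right) \\
&= 2\lambda a^4 \left[\frac{1}{\sqrt{2\lambda}\, a} \tanh(\sqrt{2\lambda}\, a z) - \frac{1}{3\sqrt{2\lambda}\, a} \tanh^3(\sqrt{2\lambda}\, a z)\right]_{-\infty}^{\infty} \\
&= 2\lambda a^4\, \frac{1}{\sqrt{2\lambda}\, a}\, \frac{4}{3} = \frac{4}{3}\sqrt{2\lambda}\, a^3 = \frac{1}{12}\, \frac{m^3}{\lambda} ,
\end{aligned} \tag{4.18} \]
where we used $a = m/\sqrt{8\lambda}$. The important result is that this is proportional to $\lambda^{-1}$.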
For static configurations, the action is energy times a time interval. The path integral will be
\[ Z \sim \int D[\phi]\, e^{-iI[\phi]} . \tag{4.19} \]
The amplitude for any process that contains a domain wall will be
\[ Z \sim e^{i/\lambda} , \tag{4.20} \]

up to numerical factors. This has an essential singularity at λ = 0.


The key point is that you can never find this process in perturbation theory in λ. Therefore
one has to do things which are inherently non-perturbative in nature. The amplitudes involving
topologically non-trivial configurations always involve inverse powers of the coupling constants.

5 Kaluza-Klein Theory
This, in its simplest form, is just general relativity in five dimensions instead of four. Your first
reaction will be that this makes absolutely no sense.
Imagine that one dimension out of the five is wrapped up in the form of a very small circle. We
would like x5 to be wrapped up with radius R, so we identify

x5 with x5 + 2πR. (5.1)

You could argue that this is a special class of solutions which are irrelevant. But this is nothing
unusual, this is simply what one does to make life easy - compare with isotropic and homogeneous
solutions in cosmology. We assume there is a Killing vector associated with translations in $x^5$,
\[ \frac{\partial}{\partial x^5} = K^a \frac{\partial}{\partial x^a} . \tag{5.2} \]
Then the five-metric can be written so as not to depend explicitly on $x^5$. We write this as
\[ g_{ab} = \begin{pmatrix} g_{55} & g_{5j} \\ g_{i5} & g_{ij} \end{pmatrix} , \tag{5.3} \]
where the indices i, j run from 0 to 3.
From a four-dimensional point of view, g5j looks like a vector field, g55 looks like a scalar field.
That is indicative of what you should expect.
We can write the five-metric in the following form, which is cunningly chosen to make life easy:
\[ ds^2 = e^{2\beta\phi}\, \gamma_{ij}\, dx^i dx^j + e^{2\alpha\phi} \left(dx^5 + A_i\, dx^i\right)^2 , \tag{5.4} \]

where we interpret γij as a four-dimensional metric, Ai as a vector field under four-dimensional co-
ordinate transformations and φ as a scalar field under four-dimensional co-ordinate transformations.
Now draw your attention to what happens under an infinitesimal co-ordinate transformation that
does involve x5 . Suppose that
Ai → Ai + ∂i Λ, (5.5)
then the one-form dx5 + Ai dxi is invariant if

x5 → x5 − Λ. (5.6)

If this were to describe electromagnetism, a gauge transformation is the same as a co-ordinate
transformation. The simplest thing to do is to calculate the Ricci scalar, because this is what
appears in the action
\[ I = \frac{1}{16\pi G_N^{(5)}} \int d^5x\, \sqrt{g}\; {}^{(5)}R = \frac{1}{16\pi G_N^{(5)}} \int d^4x \underbrace{\int dx^5}_{=\,2\pi R} e^{(4\beta+\alpha)\phi}\, \sqrt{\gamma}\; {}^{(5)}R . \tag{5.7} \]

(Note that det g = e(8β+2α)φ det γ.)


The calculation of (5) R is half messy and half straightforward. You should do the straightforward
part yourself. The messy part is an application of the technology of forms.
Step 1 Find an orthonormal basis in d = 5: Define

E 5 = eαφ (dx5 + A), A ≡ Ai dxi , (5.8)

and regard $ds^2_{(4)} = \gamma_{ij}\, dx^i dx^j$ as a four-dimensional line element which defines an orthonormal basis
of one-forms $e^i$, such that
\[ \gamma_{ij}\, dx^i dx^j = e^i \otimes e^j\, \eta_{ij} , \tag{5.9} \]
where we now use spacetime and tangent space indices interchangeably. Then the five-dimensional
one-forms are defined by
E i = eβφ ei , (5.10)
and the five-metric is
ds2 = E 5 ⊗ E 5 + ηij E i ⊗ E j . (5.11)
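Step 2 Calculate the connection one-forms, setting torsion to zero:
\[ \begin{aligned}
dE^i &= \beta\, d\phi\; e^{\beta\phi} \wedge e^i + e^{\beta\phi}\, de^i \\
&= \beta\, d\phi\; e^{\beta\phi} \wedge e^i - e^{\beta\phi}\, \hat\omega^i{}_j \wedge e^j ,
\end{aligned} \tag{5.12} \]
where we use Cartan’s first equation of structure $de^i = -\hat\omega^i{}_j \wedge e^j$ for the four-dimensional connection
$\hat\omega$, and
\[ dE^5 = \alpha\, d\phi \wedge e^{\alpha\phi}\,(dx^5 + A) + e^{\alpha\phi}\, dA . \tag{5.13} \]
Thinking of electromagnetism, we write dA as F, with
\[ F = \frac{1}{2}\, F_{ij}\, e^i \wedge e^j = \frac{1}{2}\left(\partial_i A_j - \partial_j A_i\right) e^i \wedge e^j . \tag{5.14} \]
We can rewrite this in terms of the big E’s with extreme ease:
\[ dE^i = \beta\, e^{-\beta\phi}\, \partial_j\phi\; E^j \wedge E^i - \hat\omega^i{}_j \wedge E^j , \tag{5.15} \]
\[ dE^5 = \alpha\, e^{-\beta\phi}\, \partial_j\phi\; E^j \wedge E^5 + \frac{1}{2}\, e^{(\alpha-2\beta)\phi}\, F_{ij}\, E^i \wedge E^j , \tag{5.16} \]
where $\partial_j\phi$ relates to the components of $d\phi = (\partial_j\phi)\, e^j$ (not written in terms of $E^j$)!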
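From Cartan’s first equation of structure, we obtain the connection components (note that the
connection must be antisymmetric)
\[ \omega^5{}_i = \alpha\, \partial_i\phi\; e^{-\beta\phi}\, E^5 + \frac{1}{2}\, e^{(\alpha-2\beta)\phi}\, F_{ij}\, E^j , \tag{5.17} \]
\[ \omega^i{}_j = \hat\omega^i{}_j - \beta\, e^{-\beta\phi} \left(\partial^i\phi\, E_j - \partial_j\phi\, E^i\right) - \frac{1}{2}\, e^{(\alpha-2\beta)\phi}\, F^i{}_j\, E^5 , \tag{5.18} \]
where we get the first from $dE^5 = -\omega^5{}_i \wedge E^i$, and the second from
\[ dE^i = -\omega^i{}_j \wedge E^j - \omega^i{}_5 \wedge E^5 = -\omega^i{}_j \wedge E^j + \frac{1}{2}\, e^{(\alpha-2\beta)\phi}\, F^i{}_j\, E^j \wedge E^5 . \tag{5.19} \]
Lect. 12
In slightly more general terms, we could do a reduction from (d + 1) dimensions to d dimensions,
where the metric is written as
\[ ds^2 = e^{2\beta\phi}\, \gamma_{ij}\, dx^i dx^j + e^{2\alpha\phi} \left(dz + A_i\, dx^i\right)^2 \tag{5.20} \]
and the z direction is taken to be curled up. Of course, the calculations go through as before.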
You can now calculate the two-form from this. This is rather messy and there are lots of terms you
will get. We will not write the calculation out explicitly.
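You will find that the result is
\[ \begin{aligned}
R^z{}_i ={}& e^{-2\beta\phi} \left(\alpha(\alpha - 2\beta)\, \partial_i\phi\, \partial_j\phi + \alpha\, \partial_j\partial_i\phi + \eta_{ij}\, \alpha\beta\, \partial_k\phi\, \partial^k\phi - \frac{1}{4}\, e^{2(\alpha-\beta)\phi}\, F_{kj} F^k{}_i\right) E^j \wedge E^z \\
& + e^{(\alpha-3\beta)\phi} \left(\frac{1}{2}(\alpha - \beta)\, \partial_i\phi\, F_{kj} - \frac{1}{2}(\alpha - \beta)\, \partial_{[k}\phi\, F_{j]i} - \frac{1}{2}\, \partial_{[k} F_{j]i} + \frac{1}{2}\, \beta\, \eta_{i[j} F_{k]l}\, \partial^l\phi\right) E^k \wedge E^j \\
& - \frac{1}{2}\, e^{(\alpha-2\beta)\phi} \left(F_{ij}\, \hat\omega^j{}_k \wedge E^k + F_{jk}\, \hat\omega^j{}_i \wedge E^k\right) + \alpha\, \partial_j\phi\; e^{-\beta\phi}\, E^z \wedge \hat\omega^j{}_i ,
\end{aligned} \tag{5.21} \]
\[ \begin{aligned}
R^i{}_j ={}& r^i{}_j + E^z \wedge E^k\, e^{(\alpha-3\beta)\phi} \left[(\alpha - \beta)\, F^i{}_j\, \partial_k\phi + \frac{1}{2}\, \partial_k F^i{}_j - \frac{1}{2}(\alpha - \beta) \left(\partial^i\phi\, F_{jk} - F^i{}_k\, \partial_j\phi\right) + \frac{1}{2}\, \beta \left(\partial_n\phi\, F^n{}_j\, \delta^i{}_k + F^i{}_l\, \partial^l\phi\, \eta_{jk}\right)\right] \\
& + E^k \wedge E^l\, e^{-2\beta\phi} \left[\beta \left(\partial_j\partial_{[k}\phi\, \delta^i{}_{l]} - \partial_{[k}\partial^i\phi\, \eta_{l]j}\right) - \frac{1}{4}\, e^{2(\alpha-\beta)\phi} \left(F^i{}_j F_{kl} + F^i{}_{[k|} F_{j|l]}\right) + \beta^2 \left(\partial^i\phi\, \partial_{[k}\phi\, \eta_{l]j} - \partial_p\phi\, \partial^p\phi\, \delta^i{}_{[k}\, \eta_{l]j} - \partial_j\phi\, \partial_{[k}\phi\, \delta^i{}_{l]}\right)\right] \\
& + \beta\, e^{-\beta\phi} \left(\partial_k\phi\; E^i \wedge \hat\omega^k{}_j - \partial^k\phi\; \hat\omega^i{}_k \wedge E_j\right) - \frac{1}{2}\, \beta\, e^{(\alpha-2\beta)\phi} \left(F^k{}_j\, \hat\omega^i{}_k \wedge E^z - F^i{}_k\, \hat\omega^k{}_j \wedge E^z\right) ,
\end{aligned} \tag{5.22} \]
where $r^i{}_j = d\hat\omega^i{}_j + \hat\omega^i{}_k \wedge \hat\omega^k{}_j$ is the d-dimensional Riemann tensor.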
This, as you can gather, is an unpleasant calculation to do. If we look at the action
\[ I = \frac{1}{16\pi G_N^{(d+1)}} \int d^{d+1}x\, \sqrt{g}\, R , \tag{5.23} \]

we will find that this can be brought into an extremely simple form. From the expressions for $R^i{}_j$,
discover that
j
R = 2Rz iz i + Ri ji
 
−2βφ 2 2 1 2(α−β)φ 2
= 2e α(−α + 2β)(∇φ) − α2φ − d · αβ(∇φ) + e F + 2α∂k φe−βφ ω̂ ik i
4

−2βφ −2βφ −2βφ 1 1
+e r + 2e + 2e β(−d + 1)2φ − e2(α−β)φ F 2 + β 2 ((∇φ)2 (d − 1)
4 2

1
− (∇φ)2 (d − 1)d + 2βe−βφ (d − 1)∂n φω̂ jn j
2
 
1
= e−2βφ r − 2e−2βφ (∇φ)2 α2 + (d − 2)αβ + (d − 2)(d − 1)β 2 − 2e−2βφ 2φ(α + β(d − 1))
2
1
+ e(2α−4β)φ F 2 + 2e−βφ ∂k φω̂ ik i (α + β(d − 1)). (5.24)
4
The offending term involving the connection components can be set to zero by choosing

α = −β(d − 1). (5.25)

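Then
\[ R = e^{-2\beta\phi}\, r - e^{-2\beta\phi}\,(d - 1)(d + 2)\,\beta^2\,(\nabla\phi)^2 + \frac{1}{4}\, e^{(2\alpha-4\beta)\phi}\, F^2 . \tag{5.26} \]
If we recall that $\det g = e^{2(\alpha+d\beta)\phi} \det\gamma$, we obtain
\[ \sqrt{g}\, R = \sqrt{\gamma}\, e^{-\beta\phi} \left(r(\gamma) + \frac{1}{4}\, e^{-\beta(2d+1)\phi}\, F^2 - (\nabla\phi)^2\right) . \tag{5.27} \]
The result could also be (choosing α = −β(d − 2), but then where are the connection terms?)
\[ \sqrt{g}\, R = \sqrt{\gamma} \left(r(\gamma) - \frac{1}{4}\, e^{-2\beta(d-1)\phi}\, F^2 - \frac{1}{2}\,(\nabla\phi)^2\right) . \tag{5.28} \]
You can do the integral over z in the action which just gives a constant 2πR, and obtain the
d-dimensional action
\[ I = \underbrace{\frac{2\pi R}{16\pi G_N^{(d+1)}}}_{=:\,(16\pi G_N^{(d)})^{-1}} \int d^dx\, \sqrt{\gamma}\, e^{-\beta\phi} \left(r(\gamma) + \frac{1}{4}\, e^{-\beta(2d+1)\phi}\, F^2 - (\nabla\phi)^2\right) . \tag{5.29} \]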

In a region where φ is more or less constant, we just get a unified theory of a scalar field, an
“electromagnetic field”, and gravity. Probably, this theory would have languished in the physics of
the 1920’s, were it not for string theory. Since string theory only makes sense in higher dimensions,
you have to do the same construction to get rid of the extra dimensions.

5.1 Particle Motion in Kaluza-Klein Theory
Now the first thing is to ask yourself about the motion of particles in this spacetime:

(a) Classical particle motion


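This is given by geodesics, obtained by extremising the action
\[ I = \int ds\; g_{ab}\, \dot{x}^a \dot{x}^b . \tag{5.30} \]
Let us decompose this
\[ I = \int ds \left[\left(e^{2\beta\phi}\, \gamma_{ij} + e^{2\alpha\phi}\, A_i A_j\right) \dot{x}^i \dot{x}^j + 2 e^{2\alpha\phi}\, A_i\, \dot{x}^i \dot{z} + e^{2\alpha\phi}\, \dot{z}^2\right] . \tag{5.31} \]
Since $\partial_z$ is Killing,
\[ \frac{\partial L}{\partial \dot{z}} = 2 e^{2\alpha\phi} \left(\dot{z} + A_i\, \dot{x}^i\right) \tag{5.32} \]
is a constant of the motion along geodesics. (This is of course general: If $K^a$ is Killing, and
$u^a = \dot{x}^a$, then $u^b \nabla_b (K^a u_a) = 0$.)
We only want to consider φ = constant. This is because the vacuum solution will be of the
form
\[ \eta_{ij}\, dx^i dx^j + dz^2 + 2\, dz\, A_i\, dx^i + \ldots \tag{5.33} \]
The last term in the one-particle action looks like a mass term. In a region where φ is constant,
there is also a term
\[ \underbrace{2 e^{2\alpha\langle\phi\rangle}\, \dot{z}}_{=:\; q}\; A_i\, \dot{x}^i , \tag{5.34} \]
where q looks like the charge of a test particle.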


Thus the motion in the z-direction corresponds to electric charge. This is why this does not
make sense as a theory of electromagnetism; test particles have masses proportional to their
charge. As a unified theory of gravity and electromagnetism, this theory was out of fashion
until approximately 1982.

(b) Quantum-mechanical particle motion


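Consider the Klein-Gordon equation
\[ \left(-\Box + \frac{m^2}{\hbar^2}\right) \phi = 0 . \tag{5.35} \]
ħ has been put in for a reason which will become apparent. We consider a semi-classical
approximation of the form
\[ \phi = A\, e^{iS/\hbar} . \tag{5.36} \]
This wavefunction in the ħ → 0 limit gives you back the classical theory. Derivatives of φ are
\[ \nabla_a\phi = \nabla_a A\; e^{iS/\hbar} + \frac{i}{\hbar}\, \nabla_a S\; A\, e^{iS/\hbar} , \tag{5.37} \]
\[ \Box\phi = \Box A\; e^{iS/\hbar} + 2\,\frac{i}{\hbar}\, (\nabla_a A)(\nabla^a S)\, e^{iS/\hbar} + \frac{i}{\hbar}\, \Box S\; A\, e^{iS/\hbar} - \frac{1}{\hbar^2}\, \nabla_a S\, \nabla^a S\; A\, e^{iS/\hbar} . \tag{5.38} \]
Stick this into the Klein-Gordon equation to get
\[ -\frac{\Box A}{A} - 2\,\frac{i}{\hbar}\, \frac{\nabla_a A}{A}\, \nabla^a S - \frac{i}{\hbar}\, \Box S + \frac{1}{\hbar^2}\, (\nabla S)^2 + \frac{m^2}{\hbar^2} = 0 . \tag{5.39} \]
Multiply by ħ² and take the limit ħ → 0 to get
\[ (\nabla S)^2 + m^2 = 0 . \tag{5.40} \]
So we can identify ∇S with the momentum, $u^a \propto \nabla^a S$. The velocity vector of a particle is
orthogonal to the surfaces of constant phase of the wavefunction.
This means that $u^a$ obeys the geodesic equation:
\[ u^a (\nabla_a u_b) = (\nabla^a S)\, \nabla_a (\nabla_b S) = (\nabla^a S)\, \nabla_b \nabla_a S = \frac{1}{2}\, \nabla_b \left((\nabla S)^2\right) = \frac{1}{2}\, \nabla_b (-m^2) = 0 . \tag{5.41} \]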
The geodesic equation is absolutely inevitable quantum-mechanically.
Lect.
Let us think of a Kaluza-Klein spacetime with metric 13

    -dt^2 + dr^2 + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2) + (dx^5)^2,    (5.42)

where the x^5 direction is curled up into a circle, and try to solve the Klein-Gordon equation for this metric. We will have
    \left( -\Box_{(4)} - \frac{\partial^2}{\partial (x^5)^2} + \frac{m^2}{\hbar^2} \right) \phi = 0.    (5.43)

If we separate variables
    \phi = X(x^5) f(x^1, \ldots, x^4),    (5.44)
you will discover by the usual argument that
    -\frac{1}{X} \frac{\partial^2 X}{\partial (x^5)^2} = \text{constant} = k^2,    (5.45)
which gives X(x^5) = e^{ikx^5} with real k for k^2 > 0, and X(x^5) = e^{\pm|k|x^5} for k^2 < 0. But the wavefunction must be single-valued, so under x^5 ↦ x^5 + 2πR, the wavefunction must not change. This means we must have k^2 > 0 with
    k = \frac{n}{R},    (5.46)
where n is an integer. k must be quantised in units of 1/R.
Recall that velocity in the x5 direction looks like electric charge. But the component of
velocity in the x5 direction is k, so charge is quantised.
But now we see precisely what is bad: Go back to the Klein-Gordon equation, which becomes
    \left( -\Box_{(4)} + k^2 + \frac{m^2}{\hbar^2} \right) \phi = 0.    (5.47)

We see that the effective mass is also quantised, which is not observed.
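
As a small numerical illustration of the argument above, the following sketch tabulates the allowed momenta k = n/R and the effective four-dimensional masses read off from (5.47), m_eff^2 = m^2 + ħ^2 k^2. The function names `kk_tower` and `is_single_valued` are hypothetical helpers introduced here, and units with ħ = 1 by default are an assumption.

```python
import math

def kk_tower(m, R, n_max, hbar=1.0):
    """Return (n, k, m_eff) for the modes X = exp(i k x5) with k = n/R."""
    rows = []
    for n in range(-n_max, n_max + 1):
        k = n / R
        # (5.47): -box_4 + k^2 + m^2/hbar^2 acts like a mass term with
        # m_eff^2 = m^2 + hbar^2 k^2
        m_eff = math.sqrt(m**2 + (hbar * k)**2)
        rows.append((n, k, m_eff))
    return rows

def is_single_valued(k, R, tol=1e-9):
    """True if exp(i k x5) is unchanged under x5 -> x5 + 2 pi R, i.e. k R is an integer."""
    phase = k * R
    return abs(phase - round(phase)) < tol

if __name__ == "__main__":
    for n, k, m_eff in kk_tower(m=3.0, R=0.25, n_max=2):
        print(n, k, m_eff)
```

Both the charge (proportional to k) and the effective mass step together with n, which is the feature the text identifies as unobserved.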

5.2 Magnetic Monopoles
There is some folklore that any theory with charge quantisation has magnetic monopoles in it.
Kaluza-Klein theory in dimension five is
    I = \frac{1}{16\pi G_{(5)}} \int d^5x \, \sqrt{g} \, R;    (5.48)
this has as its symmetry group five-dimensional co-ordinate transformations.
Normally one would try to find interesting five-dimensional spacetimes such as Minkowski,
Schwarzschild, etc. Kaluza-Klein theory means specifying that the five-dimensional spacetime must
have a Killing vector that generates a circle S 1 of radius R.

From a four-dimensional point of view, the action becomes something like
    I = \frac{1}{16\pi G_{(4)}} \int d^4x \, \sqrt{g} \left( R - \frac{1}{4} F_{ij} F^{ij} e^{-\sqrt{3}\phi} - \frac{1}{2} (\nabla\phi)^2 \right),    (5.49)

where the four-dimensional Newton's constant G_{(4)} is
    G_{(4)} = \frac{G_{(5)}}{2\pi R}.    (5.50)
This is four-dimensional general relativity coupled to a U (1) vector field, which is the Abelian gauge
invariance found in electromagnetism.
We have broken the symmetry group from five-dimensional co-ordinate transformations (diffeomor-
phisms) into the group of four-dimensional co-ordinate transformations ×U (1). This is an instance
of symmetry breaking, rather similar to what you do in Grand Unified Theories.
The vacuum solution in five dimensions is Minkowski space R^{4,1},
    -dt^2 + dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 + d\tau^2;    (5.51)

the vacuum of Kaluza-Klein theory is metrically identical:
    -dt^2 + dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 + dz^2,    (5.52)

where the z direction is now a circle. To get from five-dimensional general relativity to four-dimensional general relativity, you need to choose to wrap one direction up to form a circle. That choice may seem restrictive, but it is simply what you do. You should really think of this as (another) example of symmetry breaking.

In the vacuum, we have φ = 0, A = 0, R_{ij} = 0. You can always find another solution built from a Ricci-flat four-dimensional metric times a flat fifth dimension, such as a magnetic monopole:
    ds^2 = -dt^2 + \frac{1}{V(r)} (dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2) + V(r) (dz + 4m(1 - \cos\theta) d\phi)^2, \qquad V = \frac{1}{1 + \frac{4m}{r}}.    (5.53)

The spacelike part of this metric is the “Euclidean” version of a four-dimensional space that has vanishing Ricci tensor, known as Taub-NUT space, which has rather strange properties in its Lorentzian version. The solution obviously solves the five-dimensional Einstein equations.
The co-ordinate ranges are

0 ≤ r < ∞, 0 ≤ θ ≤ π, 0 ≤ φ < 2π. (5.54)

r = 0 corresponds to a co-ordinate singularity, and z must be identified with period 8πm; the radius
of the Kaluza-Klein circle is 4m. The last term in the metric represents the Kaluza-Klein direction.
Thus the vector potential of the electromagnetic field has non-vanishing expectation value,

A = 4m(1 − cos θ)dφ. (5.55)

This looks kind of weird, but we have

F = dA = 4m sin θ dθ ∧ dφ. (5.56)

This is a magnetic field with Fθφ = 4m sin θ.


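You will remember that as a three vector,
    B_\alpha = \frac{1}{2} \varepsilon_{\alpha\beta\gamma} F^{\beta\gamma}.    (5.57)
If we look near r → ∞, we have
    F^{\theta\phi} = \frac{4m \sin\theta}{r^4 \sin^2\theta}, \qquad B_r = \varepsilon_{r\theta\phi} F^{\theta\phi} = \underbrace{\sqrt{g_{(3)}}}_{r^2 \sin\theta} F^{\theta\phi} = \frac{4m}{r^2}.    (5.58)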

This represents a magnetic monopole of strength 4m. All other components of the electromagnetic
field fall off faster.
The vector potential (5.55), which we call A_-, is singular for θ = π, where dφ is ill-defined but the coefficient 1 − cos θ does not vanish. This is a gauge artefact. We can remove it by a gauge transformation
    A_- \mapsto A_- + d\Lambda \equiv A_+    (5.59)
and a corresponding transformation on z. If we choose Λ = −8mφ, then
    A_+ = -4m(1 + \cos\theta) \, d\phi.    (5.60)

The magnetic field from this is the same, but the singularity has been moved to θ = 0, the north axis. Just as for co-ordinate patches in general relativity, you have found two regions which are related by a gauge transformation.
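
The statement that both potentials give the same field can be checked numerically. The sketch below is an illustration only: the helper names are hypothetical, the dφ-coefficients are taken from (5.55) and (5.60), and F_{θφ} is obtained by a finite difference. A potential a(θ) dφ is regular on a polar axis only if a vanishes there, which is what the last two printed values probe.

```python
import math

def a_minus(theta, m):
    """d(phi)-coefficient of A- = 4m(1 - cos theta) dphi, eq. (5.55)."""
    return 4.0 * m * (1.0 - math.cos(theta))

def a_plus(theta, m):
    """d(phi)-coefficient of A+ = A- + d(Lambda) with Lambda = -8 m phi, eq. (5.60)."""
    return a_minus(theta, m) - 8.0 * m

def f_theta_phi(a_phi, theta, m, h=1e-6):
    """F_{theta phi} = d(A_phi)/d(theta), approximated by a central difference."""
    return (a_phi(theta + h, m) - a_phi(theta - h, m)) / (2.0 * h)

def flux(a_phi, m, n=2000):
    """Integral of F over the sphere: 2 pi * int_0^pi F_{theta phi} dtheta (midpoint rule)."""
    dth = math.pi / n
    total = sum(f_theta_phi(a_phi, (i + 0.5) * dth, m) for i in range(n)) * dth
    return 2.0 * math.pi * total

if __name__ == "__main__":
    m = 0.5
    print(f_theta_phi(a_minus, 1.0, m), f_theta_phi(a_plus, 1.0, m), 4 * m * math.sin(1.0))
    print(flux(a_minus, m), 16 * math.pi * m)
    print(a_minus(0.0, m), a_plus(math.pi, m))  # coefficients that vanish on an axis
```

The total flux 16πm corresponds to B_r = 4m/r^2 integrated over a sphere of area 4πr^2.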

[Figure: a sphere with its two polar axes marked; A_- is singular on one axis and A_+ on the opposite one.]
The radius of the Kaluza-Klein circle is R = 4m = P, the magnetic monopole strength. Electric charge for particles moving in this field is quantised in units of 1/R, as we saw before. It follows that for any particle charge q, qP = n must be an integer. This is the Dirac quantisation condition.
You can discover that r = 0 is only a co-ordinate singularity by calculating the curvature. Near r = 0,
    V(r) \sim \frac{r}{4m}.    (5.61)
The metric near r = 0 is
    ds^2 = -dt^2 + \frac{4m}{r} (dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2) + \frac{r}{4m} (dz + 4m(1 - \cos\theta) d\phi)^2.    (5.62)
We try to invent a co-ordinate transformation that gets rid of the singularity at r = 0: Under
    \rho = \sqrt{r}, \qquad dr = 2\rho \, d\rho,    (5.63)
the metric becomes
    ds^2 = -dt^2 + 4m (4 d\rho^2 + \rho^2 d\theta^2 + \rho^2 \sin^2\theta \, d\phi^2) + \frac{\rho^2}{4m} (dz + 4m(1 - \cos\theta) d\phi)^2.    (5.64)
You can get rid of the constants by overall rescaling, and discover that the spatial part is the metric on flat R^4, written as
    d\rho^2 + \rho^2 \times (\text{metric on } S^3).    (5.65)
There is an entertaining generalization of the magnetic monopole metric, which we can write as
    -dt^2 + \frac{1}{V(|\vec{x}|)} (d\vec{x} \cdot d\vec{x}) + V(|\vec{x}|) (dz + A)^2,    (5.66)
where V and A satisfy
    \nabla^2_{\text{flat}} \frac{1}{V(|\vec{x}|)} = 0 \quad (\vec{x} \neq 0), \qquad \vec{\nabla} \times \vec{A} = \vec{\nabla} \frac{1}{V(|\vec{x}|)}.    (5.67)

We found that a simple pole in r in 1/V does not cause a spacetime singularity. This suggests that we can move a single monopole to \vec{x} = \vec{x}_1, such that
    V = \frac{1}{1 + \frac{4m}{|\vec{x} - \vec{x}_1|}}    (5.68)
or replace it by
    V = \frac{1}{1 + 4m \sum_{i=1}^{N} \frac{1}{|\vec{x} - \vec{x}_i|}}.    (5.69)

Then \vec{A} still satisfies the above relation. This is a configuration of N monopoles of strength 4m, which are in neutral equilibrium!

5.3 S 3 as a Group Manifold
Lect. 14
Recall that the form of the Kaluza-Klein monopole metric near r = 0 is
    ds^2 = -dt^2 + 4m (4 d\rho^2 + \rho^2 d\theta^2 + \rho^2 \sin^2\theta \, d\phi^2) + \frac{\rho^2}{4m} (dz - 4m \cos\theta \, d\phi)^2,    (5.70)
where we have shifted z. We claimed that the spatial part of this is flat R4 . We need to investigate
the S 3 part of this metric further. Start with flat four-dimensional space with metric

ds2 = dτ 2 + dx2 + dy 2 + dz 2 , (5.71)

where we define
x2 + y 2 + z 2 + τ 2 = ρ2 . (5.72)
We want to look at the metric on surfaces of constant ρ, this will give a metric on S 3 .
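We use the Euler angle parametrization
    x = \rho \cos\frac{\theta}{2} \cos\left( \frac{1}{2}(\phi - \psi) \right), \qquad y = \rho \cos\frac{\theta}{2} \sin\left( \frac{1}{2}(\phi - \psi) \right),    (5.73)
    z = \rho \sin\frac{\theta}{2} \cos\left( \frac{1}{2}(\phi + \psi) \right), \qquad \tau = \rho \sin\frac{\theta}{2} \sin\left( \frac{1}{2}(\phi + \psi) \right).    (5.74)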
The ranges of the co-ordinates are

0 ≤ ρ < ∞, 0 ≤ θ ≤ π, 0 ≤ φ < 2π, 0 ≤ ψ < 4π. (5.75)

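You can make life easier by assembling (x, y) and (z, τ) into a pair of complex numbers
    u = x + iy = \rho \cos\frac{\theta}{2} \exp\left( \frac{i}{2}(\phi - \psi) \right), \qquad w = z + i\tau = \rho \sin\frac{\theta}{2} \exp\left( \frac{i}{2}(\phi + \psi) \right).    (5.76)
There is no escape from the following mess. To get the metric, notice that flat space in these co-ordinates has line element
    ds^2 = du \, d\bar{u} + dw \, d\bar{w}.    (5.77)
Note that not all spaces allow for the introduction of complex co-ordinates. We have
    du = d\rho \cos\frac{\theta}{2} e^{\frac{i}{2}(\phi - \psi)} - \frac{1}{2} \rho \sin\frac{\theta}{2} d\theta \, e^{\frac{i}{2}(\phi - \psi)} + \frac{i}{2} (d\phi - d\psi) \rho \cos\frac{\theta}{2} e^{\frac{i}{2}(\phi - \psi)},    (5.78)
    dw = d\rho \sin\frac{\theta}{2} e^{\frac{i}{2}(\phi + \psi)} + \frac{1}{2} \rho \cos\frac{\theta}{2} d\theta \, e^{\frac{i}{2}(\phi + \psi)} + \frac{i}{2} (d\phi + d\psi) \rho \sin\frac{\theta}{2} e^{\frac{i}{2}(\phi + \psi)}.    (5.79)
The metric of R^4 in these co-ordinates is
    ds^2 = du \, d\bar{u} + dw \, d\bar{w}
         = d\rho^2 + \frac{1}{4} \rho^2 d\theta^2 + \frac{1}{4} \rho^2 \cos^2\frac{\theta}{2} (d\phi - d\psi)^2 + \frac{1}{4} \rho^2 \sin^2\frac{\theta}{2} (d\phi + d\psi)^2;    (5.80)
you can see that all cross-terms cancel out. We can rewrite this metric as
    ds^2 = d\rho^2 + \frac{1}{4} \rho^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 + (d\psi - \cos\theta \, d\phi)^2 \right),    (5.81)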

which looks like the metric we had for the magnetic monopole.
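
The passage from the Euler-angle parametrization to (5.81) can be cross-checked numerically. The following sketch (hypothetical helper names, finite-difference Jacobian) pulls the flat metric on R^4 back to the co-ordinates (ρ, θ, φ, ψ) and compares it with the components read off from (5.81): g_ρρ = 1, g_θθ = g_φφ = g_ψψ = ρ^2/4 and g_φψ = −(ρ^2/4) cos θ.

```python
import math

def embed(rho, theta, phi, psi):
    """Cartesian (x, y, z, tau) from the Euler-angle parametrization (5.73)-(5.74)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (rho * c * math.cos((phi - psi) / 2), rho * c * math.sin((phi - psi) / 2),
            rho * s * math.cos((phi + psi) / 2), rho * s * math.sin((phi + psi) / 2))

def induced_metric(q, h=1e-5):
    """g_ij = sum_k (dX^k/dq^i)(dX^k/dq^j), with derivatives by central differences."""
    jac = []
    for i in range(4):
        up, dn = list(q), list(q)
        up[i] += h
        dn[i] -= h
        jac.append([(a - b) / (2 * h) for a, b in zip(embed(*up), embed(*dn))])
    return [[sum(jac[i][k] * jac[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]

def expected_metric(q):
    """Components of (5.81) in the co-ordinate order (rho, theta, phi, psi)."""
    rho, theta, _, _ = q
    f = rho**2 / 4
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0
    g[1][1] = g[2][2] = g[3][3] = f      # sin^2 + cos^2 = 1 in the phi-phi entry
    g[2][3] = g[3][2] = -f * math.cos(theta)
    return g

if __name__ == "__main__":
    q = (1.3, 0.7, 0.4, 2.1)
    for row_num, row_exp in zip(induced_metric(q), expected_metric(q)):
        print([round(v, 6) for v in row_num], [round(v, 6) for v in row_exp])
```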
This is related to S^3 also being the group manifold of SU(2). What do we mean by group manifold? Elements of SU(2) are of the form
    \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}, \qquad aa^* + bb^* = 1.    (5.82)

This is the same as the condition uu^* + ww^* = 1 (with ρ = 1) that we used previously.
How is one to form a metric, given that S^3 can be represented as a matrix group? We need to find a basis of one-forms and construct a metric. Suppose you have g ∈ G. The first thing to do is to construct the Lie algebra; then g^{-1} dg will give you a basis of one-forms which are left-invariant: under g → hg with constant h, (hg)^{-1} d(hg) = g^{-1} h^{-1} h \, dg = g^{-1} dg is invariant.
Alternatively, one could construct right-invariant one-forms which are invariant under g → gh. Then one would use dg \, g^{-1}.
Suppose one starts with g^{-1} dg and sends g ↦ g^{-1}. Then
    g^{-1} dg \to g \, dg^{-1}.    (5.83)
But if you take d of the equation g g^{-1} = 1, you find that
    dg \, g^{-1} + g \, dg^{-1} = 0,    (5.84)
and hence dg^{-1} = −g^{-1} dg \, g^{-1}. So g \, dg^{-1} = −dg \, g^{-1}, and the inverse map maps left-invariant one-forms to right-invariant one-forms.
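You can construct a bi-invariant metric by taking
    -\frac{1}{2} \mathrm{Tr}(g^{-1} dg \otimes g^{-1} dg) = -\frac{1}{2} \mathrm{Tr}(dg \, g^{-1} \otimes dg \, g^{-1}),    (5.85)
where ⊗ is a tensor product of forms. To construct the Lie algebra of SU(2), we choose generators, the Pauli matrices
    \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.    (5.86)

They satisfy \sigma_i \sigma_j = \delta_{ij} 1 + i \epsilon_{ijk} \sigma_k. The Euler angle parametrization of SU(2) is then
    g = e^{i \frac{\phi}{2} \sigma_3} e^{i \frac{\theta}{2} \sigma_2} e^{-i \frac{\psi}{2} \sigma_3}.    (5.87)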

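These angles are almost the same co-ordinates as we used before. Use \sigma_i^2 = 1 to write this as
    g = \left( 1 \cos\frac{\phi}{2} + i\sigma_3 \sin\frac{\phi}{2} \right) \left( 1 \cos\frac{\theta}{2} + i\sigma_2 \sin\frac{\theta}{2} \right) \left( 1 \cos\frac{\psi}{2} - i\sigma_3 \sin\frac{\psi}{2} \right)    (5.88)
      = \left( \cos\frac{\phi}{2} \cos\frac{\theta}{2} \cos\frac{\psi}{2} + \sin\frac{\phi}{2} \cos\frac{\theta}{2} \sin\frac{\psi}{2} \right) 1 + \left( -\cos\frac{\phi}{2} \sin\frac{\theta}{2} \sin\frac{\psi}{2} - \sin\frac{\phi}{2} \sin\frac{\theta}{2} \cos\frac{\psi}{2} \right) i\sigma_1
        + \left( -\sin\frac{\phi}{2} \sin\frac{\theta}{2} \sin\frac{\psi}{2} + \cos\frac{\phi}{2} \sin\frac{\theta}{2} \cos\frac{\psi}{2} \right) i\sigma_2 + \left( \sin\frac{\phi}{2} \cos\frac{\theta}{2} \cos\frac{\psi}{2} - \cos\frac{\phi}{2} \cos\frac{\theta}{2} \sin\frac{\psi}{2} \right) i\sigma_3
      = \begin{pmatrix} \cos\frac{\theta}{2} e^{\frac{i}{2}(\phi - \psi)} & i \sin\frac{\theta}{2} e^{-\frac{i}{2}(\phi + \psi)} \\ i \sin\frac{\theta}{2} e^{\frac{i}{2}(\phi + \psi)} & \cos\frac{\theta}{2} e^{-\frac{i}{2}(\phi - \psi)} \end{pmatrix}.    (5.89)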
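Clearly this is an element of SU(2). You will notice that in previous notation, this is
    g = \begin{pmatrix} u & i\bar{w} \\ iw & \bar{u} \end{pmatrix}.    (5.90)

We need to calculate g^{-1} dg:
    g^{-1} = \begin{pmatrix} \bar{u} & -i\bar{w} \\ -iw & u \end{pmatrix},    (5.91)
    g^{-1} dg = \begin{pmatrix} \bar{u} & -i\bar{w} \\ -iw & u \end{pmatrix} \begin{pmatrix} du & i \, d\bar{w} \\ i \, dw & d\bar{u} \end{pmatrix} = \begin{pmatrix} \bar{u} \, du + \bar{w} \, dw & i \bar{u} \, d\bar{w} - i \bar{w} \, d\bar{u} \\ -i w \, du + i u \, dw & w \, d\bar{w} + u \, d\bar{u} \end{pmatrix}.    (5.92)
We use \mathrm{Tr} M^2 = M_{11}^2 + 2 M_{12} M_{21} + M_{22}^2 to get the metric
    -\frac{1}{2} \mathrm{Tr}(g^{-1} dg \, g^{-1} dg) = -\frac{1}{2} \left[ (\bar{u} \, du + \bar{w} \, dw)^2 + (w \, d\bar{w} + u \, d\bar{u})^2 - 2 (\bar{u} \, d\bar{w} - \bar{w} \, d\bar{u})(-w \, du + u \, dw) \right].    (5.93)
To get back to the correct answer, use u\bar{u} + w\bar{w} = 1, and
    du \, \bar{u} + u \, d\bar{u} + w \, d\bar{w} + dw \, \bar{w} = 0.    (5.94)

This gives
    -\frac{1}{2} \mathrm{Tr}(g^{-1} dg \, g^{-1} dg) = -(\bar{u} \, du + \bar{w} \, dw)^2 + (\bar{u} \, d\bar{w} - \bar{w} \, d\bar{u})(-w \, du + u \, dw)
        = (\bar{u} \, du + \bar{w} \, dw)(u \, d\bar{u} + w \, d\bar{w}) + (\bar{u} \, d\bar{w} - \bar{w} \, d\bar{u})(-w \, du + u \, dw)
        = u\bar{u} \, du \, d\bar{u} + w\bar{w} \, dw \, d\bar{w} + \bar{u}w \, d\bar{w} \, du + \bar{w}u \, dw \, d\bar{u}
          - \bar{u}w \, du \, d\bar{w} + w\bar{w} \, du \, d\bar{u} + u\bar{u} \, dw \, d\bar{w} - u\bar{w} \, dw \, d\bar{u}
        = du \, d\bar{u} + dw \, d\bar{w}.    (5.95)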
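
The algebra leading to (5.95) can be spot-checked with ordinary complex arithmetic. In the sketch below (helper names are hypothetical), the one-forms du and dw are evaluated on a single tangent vector to the unit sphere, so that products of forms become products of numbers; the tangent vector is projected so that the constraint (5.94) holds.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def su2(u, w):
    """The matrix [[u, i conj(w)], [i w, conj(u)]] of (5.90); also used for dg, which is linear in (du, dw)."""
    return [[u, 1j * w.conjugate()], [1j * w, u.conjugate()]]

def tangent(u, w, du, dw):
    """Project (du, dw) onto the tangent space of |u|^2 + |w|^2 = 1, so that (5.94) holds."""
    radial = (u.conjugate() * du + w.conjugate() * dw).real
    return du - radial * u, dw - radial * w

def line_element(u, w, du, dw):
    """-1/2 Tr(g^-1 dg g^-1 dg) evaluated on the tangent vector (du, dw)."""
    g_inv = [[u.conjugate(), -1j * w.conjugate()], [-1j * w, u]]  # (5.91), valid on the unit sphere
    m = matmul(g_inv, su2(du, dw))
    m2 = matmul(m, m)
    return -0.5 * (m2[0][0] + m2[1][1])

if __name__ == "__main__":
    u = math.cos(0.4) * cmath.exp(0.3j)
    w = math.sin(0.4) * cmath.exp(-1.1j)
    du, dw = tangent(u, w, 0.2 - 0.5j, -0.7 + 0.1j)
    print(line_element(u, w, du, dw), abs(du)**2 + abs(dw)**2)
```

Without the projection step the two numbers differ, which reflects the role of (5.94) in the derivation.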

In practice, this procedure does not work for any matrix bigger than 4 × 4.

We have a group G, and construct an element of the Lie algebra. You can think of
    A = g^{-1} dg    (5.96)
as a Lie algebra valued connection one-form. Thus A can be regarded as a Yang-Mills field. A obeys the Maurer-Cartan equations. The first thing to calculate would be the curvature (field strength) of A. This is
    F = dA + A \wedge A = dg^{-1} \wedge dg + A \wedge A = -g^{-1} dg \wedge g^{-1} dg + A \wedge A = 0.    (5.97)

For this reason, g^{-1} dg is sometimes referred to as a flat connection. Such an A is usually called a pure gauge field, which means that it is just a gauge transformation of nothing. (You can see this infinitesimally: if g = 1 + ε, then g^{-1} dg = dε to first order in ε.)
These fields represent the classical vacuum states of Yang-Mills theory.
6 Aspects of Yang-Mills Theory
6.1 Spontaneous Symmetry Breaking
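Lect. 15
We consider Yang-Mills theory with an SU(2) gauge group. The Lagrangian is
    L = -\frac{1}{4} F^\alpha_{ab} F^{\alpha \, ab},    (6.1)
where
    F^\alpha_{ab} = \partial_a A^\alpha_b - \partial_b A^\alpha_a + g \varepsilon^{\alpha\beta\gamma} A^\beta_a A^\gamma_b.    (6.2)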
Here Greek indices run from one to three and εαβγ is the alternating symbol in three dimensions
which gives the structure constants for SU (2).
The idea of symmetry breaking is to break the gauge group G into a subgroup H by introducing a
triplet of scalar fields φα with a potential in the Lagrangian.
We need to introduce a gauge covariant derivative because the fields φα are charged under SU (2):

Da φα = ∂a φα + gεαβγ Aβa φγ . (6.3)

Then what you do is to add this scalar field into the action,
    L_{\text{scalar}} = -\frac{1}{2} D_a \phi^\alpha D^a \phi^\alpha.    (6.4)
If that is all you have, nothing very interesting will happen; you need to add a potential. What
you add is entirely and utterly up to you. We choose this such that the theory is renormalizable,
which means one can have φ2 , φ3 and φ4 terms, and gauge invariant, so that V must be a singlet
under SU (2). This leaves two possibilities only,

φα φα and (φα φα )2 , (6.5)

where the first is a mass term and the second a quartic coupling. For a conventional mass term,
    V(\phi) = \frac{1}{2} m^2 \phi^\alpha \phi^\alpha + \frac{\lambda}{4} (\phi^\alpha \phi^\alpha)^2,    (6.6)
where we must have λ > 0 to obtain a stable theory, the only vacuum is φ^α = 0 everywhere.
In the situation we are interested in here, which corresponds to the Higgs mechanism, one changes the shape of the potential to be
    V(\phi) = -\frac{1}{2} m^2 \phi^\alpha \phi^\alpha + \frac{\lambda}{4} (\phi^\alpha \phi^\alpha)^2.    (6.7)
We obtain the familiar picture (compare with the discussion of domain walls, where one only had
a single scalar field):

[Figure: V(φ) plotted against |φ|; the local maximum at φ = 0 is labelled "unbroken SU(2) symmetry".]

The phase of the theory that has unbroken SU (2) is sitting at a local maximum of V (φ). This
is unstable, excitations around this point have mass im and correspond to “tachyons”. We are
looking for vacua, these are configurations φ = constant satisfying the field equations

D 2 φ − V ′ (φ) = 0, (6.8)

where also Aa ≡ 0. We must have V ′ (φ) = 0, hence

−m2 φα + λ(φα φα )φα = 0. (6.9)

One possible solution is φ^α ≡ 0, which is called a false vacuum. The second solution is
    \phi^\alpha \phi^\alpha = \frac{m^2}{\lambda}.    (6.10)
This defines a sphere in field space, which is a symmetric space S^2 = SU(2)/U(1) = G/H, where G is the original gauge group and H is the group which one has broken G into. It is called the true vacuum. The potential in the true vacuum takes the value
    -\frac{1}{2} m^2 \frac{m^2}{\lambda} + \frac{\lambda}{4} \left( \frac{m^2}{\lambda} \right)^2 = -\frac{m^4}{4\lambda} < 0.    (6.11)
This model might have cosmological implications since in the false vacuum, one would measure a
vacuum energy corresponding to a “cosmological constant” relative to the true vacuum.
In the true vacuum, there is a massless excitation around the sphere which is called a Goldstone
mode.
Now consider the gauge bosons. If you fix φ^1 = φ^2 = 0 and φ^3 = m/\sqrt{\lambda} (say), so that φ^α φ^α = m^2/λ as required by (6.10), you will discover that A^3 remains massless, while A^1 and A^2 end up with a mass term in the Lagrangian. To see the mass term, look at
    -\frac{1}{2} D_a \phi^\alpha D^a \phi^\alpha, \qquad D_a \phi^\alpha = \partial_a \phi^\alpha + g \varepsilon^{\alpha\beta\gamma} A^\beta_a \phi^\gamma.    (6.12)
There is a term
    -\frac{1}{2} g^2 \varepsilon^{\alpha\beta\gamma} A^\beta_a \phi^\gamma \varepsilon^{\alpha\delta\epsilon} A^{\delta a} \phi^\epsilon    (6.13)
in the Lagrangian. With the given values for φ^α, this is equal to
    -\frac{1}{2} g^2 \frac{m^2}{\lambda} \varepsilon^{\alpha\beta 3} A^\beta_a \varepsilon^{\alpha\delta 3} A^{\delta a} = -\frac{1}{2} \frac{g^2 m^2}{\lambda} \left( A^1_a A^{1a} + A^2_a A^{2a} \right),    (6.14)
which indeed is a mass term for A^1 and A^2. There is no mass term for A^3, so this massless field still has a U(1) gauge symmetry.
This is a toy model for symmetry breaking, called the Georgi-Glashow model, which is a prototype for unifying electromagnetic and weak interactions. The masses of A^1 and A^2 are gm/\sqrt{\lambda}; these correspond to the W^± bosons. The massless A^3 corresponds to the photon.

Let us check that the number of degrees of freedom is the same in the true vacuum and in the false vacuum: Massless gauge bosons have two degrees of freedom per point in space. Here we started with A^1, A^2, A^3 and φ^1, φ^2, φ^3, so this gives 3 × 2 + 3 = 9 degrees of freedom in the false vacuum.
In the true vacuum, A^1 and A^2 are massive, and massive fields of spin s have (2s + 1) degrees of freedom. What you see is that A^1 and A^2 have eaten the degrees of freedom of φ^1 and φ^2.
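
The bookkeeping can be written out explicitly. This is only a counting sketch with hypothetical helper names; it uses the two rules quoted above (two polarisations for a massless vector, 2s + 1 states for a massive field of spin s) and one degree of freedom per real scalar.

```python
def dof_massless_vector():
    """A massless gauge boson has two polarisations."""
    return 2

def dof_massive(spin):
    """A massive field of spin s has 2s + 1 degrees of freedom."""
    return 2 * spin + 1

# False vacuum: three massless gauge bosons and three real scalars.
false_vacuum = 3 * dof_massless_vector() + 3 * 1

# True vacuum: massive A^1 and A^2, massless A^3, and one remaining scalar.
true_vacuum = 2 * dof_massive(1) + dof_massless_vector() + 1

if __name__ == "__main__":
    print(false_vacuum, true_vacuum)
```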
The only field left is the scalar φ^3. The mass of this field at the minimum of the potential φ_0 is given by
    V(\phi) = V(\phi_0) + \frac{1}{2} (\phi - \phi_0)^2 \underbrace{V''(\phi_0)}_{=:\, (\text{mass})^2} + \ldots    (6.15)
Here, we have
    V''(\phi_0) = \left. -m^2 + 3\lambda \phi^\alpha \phi^\alpha \right|_{\phi^\alpha \phi^\alpha = m^2/\lambda} = -m^2 + 3m^2 = 2m^2.    (6.16)
The mass of the Higgs boson is \sqrt{2} m.
This is the simplest theory in which you unify electromagnetism with something else.
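
The numbers quoted for the true vacuum can be reproduced from the potential (6.7) restricted to a radial direction in field space. The sketch below is an illustration with arbitrarily chosen sample values of m and λ; the function names are hypothetical and the second derivative is taken by a finite difference.

```python
import math

def potential(phi, m, lam):
    """V of (6.7) as a function of the field magnitude phi = |phi|."""
    return -0.5 * m**2 * phi**2 + 0.25 * lam * phi**4

def second_derivative(f, x, h=1e-4):
    """Central second difference approximation to f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

m, lam = 1.5, 0.8                      # sample values, chosen only for illustration
phi0 = m / math.sqrt(lam)              # true vacuum: phi^2 = m^2 / lambda, eq. (6.10)
v_min = potential(phi0, m, lam)        # should equal -m^4 / (4 lambda), eq. (6.11)
curvature = second_derivative(lambda p: potential(p, m, lam), phi0)   # 2 m^2, eq. (6.16)
higgs_mass = math.sqrt(curvature)      # sqrt(2) m
curvature_at_origin = second_derivative(lambda p: potential(p, m, lam), 0.0)  # -m^2 (tachyonic)

if __name__ == "__main__":
    print(v_min, -m**4 / (4 * lam))
    print(curvature, 2 * m**2, higgs_mass, math.sqrt(2) * m)
    print(curvature_at_origin, -m**2)
```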

6.2 Magnetic Monopoles


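The action of the theory is
    I = \int d^4x \left( -\frac{1}{4} F^\alpha_{ab} F^{\alpha \, ab} - \frac{1}{2} D_a \phi^\alpha D^a \phi^\alpha + \frac{1}{2} m^2 \phi^\alpha \phi^\alpha - \frac{\lambda}{4} (\phi^\alpha \phi^\alpha)^2 \right).    (6.17)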

We found vacuum solutions by asking for φ and A not depending on space. The next step then is
to look for static, spherically symmetric solutions.
You might think that spherical symmetry tells you that Aαa and φα only depend on r. But this is
not a gauge-invariant statement. You can only say that φα φα only depends on r, since this is a
singlet under the gauge group.

We are in flat space, where the metric in Cartesian co-ordinates is

ds2 = −dt2 + d~x · d~x. (6.18)

So try
    \phi^\alpha = \frac{x^\alpha}{g r^2} H(r), \qquad \phi^\alpha \phi^\alpha = \frac{H^2(r)}{g^2 r^4} (x^\alpha x^\alpha) = \frac{H^2(r)}{g^2 r^2}.    (6.19)
This is a reasonable guess as long as we do not mind using α as a spacetime index. Although this
was originally a gauge group index, it can be interpreted as a spatial index here. This looks special
to SU (2), but you can always apply it to any SU (2) subgroup of a given G.
Lect. 16
You then have to decide what to do with your gauge field:
    A^\alpha_0 = \frac{x^\alpha}{g r^2} J(r), \qquad A^\alpha_i = \frac{\varepsilon^\alpha{}_{ij} x^j}{g r^2} (1 - K(r)).    (6.20)
You find the equations of motion by substituting this ansatz into the Lagrangian and finding the
Euler-Lagrange equations for H, J and K. This of course is not a mathematically correct thing to
do. While it works, it is not guaranteed to work. At the end, you ought to check that H, J and K
really do satisfy the equations of motion.

The equations that you get are
    r^2 K''(r) = K(r) (K^2(r) - 1) + K(r) (H^2(r) - J^2(r)),    (6.21)
    r^2 J''(r) = 2 J(r) K^2(r),    (6.22)
    r^2 H''(r) = 2 H(r) K^2(r) + \frac{\lambda}{g^2} \left( H^3(r) - \frac{m^2 g^2}{\lambda} r^2 H(r) \right).    (6.23)

If H, J and K all go to zero as r → ∞, there will be a long range classical gauge field.
You can solve these equations numerically and get a mess. What is more entertaining is that you can solve these equations analytically in a particular limit (the Prasad-Sommerfield¹ limit).
From the Yang-Mills coupling g and the scalar field coupling λ, we can construct a length scale
    C = \frac{mg}{\sqrt{\lambda}}    (6.24)
which controls the scale of the problem. The limit in which you can solve the equations analytically is λ → 0 and g → 0 at fixed C. The solutions are
    K = \frac{Cr}{\sinh Cr}, \qquad J = 0, \qquad H = Cr \coth(Cr) - 1,    (6.25)
so as r → ∞, K → 0 but A does not go to zero.
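
As the text recommends, one ought to check that the ansatz functions really satisfy the equations of motion. The sketch below does this numerically for (6.25), under the assumption that in the limit considered the λ-dependent bracket in (6.23) drops out and J = 0, so that the equations reduce to r^2 K'' = K(K^2 − 1) + K H^2 and r^2 H'' = 2 H K^2. The helper names are hypothetical and derivatives are finite differences.

```python
import math

def K(r, C):
    """K = C r / sinh(C r), eq. (6.25)."""
    return C * r / math.sinh(C * r)

def H(r, C):
    """H = C r coth(C r) - 1, eq. (6.25)."""
    return C * r / math.tanh(C * r) - 1.0

def d2(f, r, h=1e-3):
    """Central second difference approximation to f''(r)."""
    return (f(r + h) - 2.0 * f(r) + f(r - h)) / h**2

def residuals(r, C):
    """Left minus right sides of the reduced (6.21) and (6.23), with J = 0."""
    k, hh = K(r, C), H(r, C)
    res_k = r**2 * d2(lambda x: K(x, C), r) - (k * (k**2 - 1.0) + k * hh**2)
    res_h = r**2 * d2(lambda x: H(x, C), r) - 2.0 * hh * k**2
    return res_k, res_h

if __name__ == "__main__":
    for r in (0.5, 1.3, 3.0):
        print(r, residuals(r, C=1.0))
```

The residuals are at the level of the finite-difference error, and K falls off exponentially while 1 − K, which multiplies the gauge field in (6.20), tends to one.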
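One identifies the physical electromagnetic field as
    F^{(\text{electromagn.})}_{ab} = \partial_a \left( \hat{\phi}^\alpha A^\alpha_b \right) - \partial_b \left( \hat{\phi}^\alpha A^\alpha_a \right) - \frac{1}{g} \varepsilon^{\alpha\beta\gamma} \hat{\phi}^\alpha \partial_a \hat{\phi}^\beta \partial_b \hat{\phi}^\gamma,    (6.26)
where \hat{\phi}^\alpha is a rescaled Higgs field:
    \hat{\phi}^\alpha = \frac{\phi^\alpha}{\sqrt{\phi^\beta \phi^\beta}}.    (6.27)
The output of this is of course well-known to all of us. As r → ∞,
    E_i = 0, \qquad B_i = \frac{x_i}{g r^3},    (6.28)
where we define F^{(\text{electromagn.})}_{0i} = -E_i and \frac{1}{2} \varepsilon_{ijk} F^{(\text{electromagn.}) \, jk} = B_i. This is a magnetic monopole of charge 1/g.

¹ Sommerfield is not a famous person!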
To see this, you can define magnetic charge by the following: Take the spatial R^3 and consider a sphere S^2 at spatial infinity. Then perform a Gaussian integral of the magnetic flux:
    \frac{1}{4\pi} \int_{S^2_\infty} \vec{B} \cdot d\vec{S} = \frac{1}{g}.    (6.29)

Or as a differential form, you can integrate
    \frac{1}{4\pi} \int_{S^2_\infty} F = \frac{1}{4\pi g} \int \sin\theta \, d\theta \wedge d\phi = \frac{1}{g}.    (6.30)

You should feel miserable about this for the following reason: From Stokes' theorem, you would expect
    \frac{1}{4\pi} \int_{S^2} F = \frac{1}{4\pi} \int_{S^2} dA = \frac{1}{4\pi} \int_{\partial S^2} A = 0.    (6.31)
This argument fails because A is not globally defined. Charges of this type are regarded as topological. To see that A cannot be globally defined, try
    A = \frac{1}{g} (1 - \cos\theta) \, d\phi,    (6.32)
where the spatial metric is
    ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2.    (6.33)
The norm of A is
    ||A||^2 = A_a A^a = \frac{1}{g^2} \frac{(1 - \cos\theta)^2}{r^2 \sin^2\theta}.    (6.34)
Hence the norm blows up along the south axis θ = π. To remove this, you can perform a gauge transformation which gives the same F, e.g.
    A \to \frac{1}{g} (-1 - \cos\theta) \, d\phi, \qquad A \to A + d\Lambda, \qquad \Lambda = -\frac{2}{g} \phi.    (6.35)

This then gives ||A||2 → ∞ along the north axis θ = 0.
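
The behaviour of the norm (6.34) near the two axes, and the value of the flux integral (6.30), can be made concrete with a few lines of code. This is a sketch with hypothetical helper names; the parameter `sign` switches between the potential (6.32) (sign = +1) and its gauge transform (6.35) (sign = −1).

```python
import math

def norm_sq(theta, r, g, sign=+1):
    """||A||^2 for A = (1/g)(sign - cos theta) dphi on a sphere of radius r, cf. (6.34)."""
    a_phi = (sign - math.cos(theta)) / g
    return a_phi**2 / (r**2 * math.sin(theta)**2)   # g^{phi phi} = 1/(r^2 sin^2 theta)

def magnetic_charge(g, n=2000):
    """(1/4 pi) * integral of F = (1/g) sin(theta) dtheta ^ dphi over the sphere, cf. (6.30)."""
    dth = math.pi / n
    theta_integral = sum(math.sin((i + 0.5) * dth) for i in range(n)) * dth
    return 2.0 * math.pi * theta_integral / (4.0 * math.pi * g)

if __name__ == "__main__":
    eps = 1e-3
    print(norm_sq(eps, 1.0, 1.0, +1), norm_sq(math.pi - eps, 1.0, 1.0, +1))
    print(norm_sq(eps, 1.0, 1.0, -1), norm_sq(math.pi - eps, 1.0, 1.0, -1))
    print(magnetic_charge(2.0))
```

Each potential is well behaved on one axis only, while the charge computed from F does not depend on which of the two is used.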

6.3 Instantons in Yang-Mills Theory


The magnetic charge in the last example is of a topological nature. There are other examples
which are of great interest. A second example of topological charges are instantons, which arise
in Yang-Mills theory and gravity.
Let us consider Yang-Mills theory in flat R^4 with positive signature, i.e. metric
    ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2 + (dx^4)^2.    (6.36)

Instantons are solutions of the Yang-Mills equations with no singularities, with finite action. We take G to be compact, then the action is
    I = \frac{1}{4} \int F^\alpha \wedge *F_\alpha.    (6.37)
You can put F into canonical form (the normal form for an antisymmetric 4 × 4 matrix) by writing it as
    F^\alpha = \frac{1}{2} F^\alpha_{ab} \, dx^a \wedge dx^b = F^\alpha_{12} \, dx^1 \wedge dx^2 + F^\alpha_{34} \, dx^3 \wedge dx^4.    (6.38)
Then
    *F^\alpha = F^\alpha_{34} \, dx^1 \wedge dx^2 + F^\alpha_{12} \, dx^3 \wedge dx^4.    (6.39)
The action is then
    I = \frac{1}{4} \int d^4x \, \underbrace{(F_{12}^2 + F_{34}^2)}_{\geq 0}.    (6.40)

The Yang-Mills equations are
    D_A F = 0, \qquad D_A * F = 0.    (6.41)
If we are interested in solutions with finite action, we must have F → 0 at infinity, and so A = g−1 dg
for some g at infinity.

It is fairly easy to discover that there is a topological charge for this system. Take the Chern-Simons three-form (see first example sheet)
    CS_3 = A \wedge dA + \frac{2}{3} A \wedge A \wedge A.    (6.42)
The associated topological charge is
    \int_{S^3_\infty} \mathrm{Tr}(CS_3) = \int_{R^4} \mathrm{Tr}(d \, CS_3) = \int_{R^4} \mathrm{Tr} \left( dA \wedge dA + 2 \, dA \wedge A \wedge A \right).    (6.43)

Recall the definition
    F = dA + A \wedge A    (6.44)
to write this as
    \int_{S^3_\infty} \mathrm{Tr}(CS_3) = \int_{R^4} \mathrm{Tr} \left( F \wedge F - A \wedge A \wedge A \wedge A \right) = \int_{R^4} \mathrm{Tr} \left( F \wedge F \right),    (6.45)

since the trace is invariant under cyclic permutations, which change the sign of A ∧ A ∧ A ∧ A, so that its trace vanishes. Thus
    \frac{1}{8\pi^2} \int \mathrm{Tr}(F \wedge F),    (6.46)
called the instanton number, is a topological charge.
The topological charge is related in a simple way to the action: In four dimensions with Euclidean signature, ∗∗ = 1 on two-forms. Hence ∗ has eigenvalues ±1. You can therefore decompose two-forms into self-dual and anti-self-dual parts. The self-dual part of F is
    F_+ = \frac{1}{2} (F + *F),    (6.47)
the anti-self-dual part is
    F_- = \frac{1}{2} (F - *F).    (6.48)
Clearly ∗F_+ = F_+ and ∗F_- = −F_-.

Lect. 17
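The action can be rewritten as
    I = \frac{1}{4} \int (F^\alpha_+ + F^\alpha_-) \wedge (F^\alpha_+ - F^\alpha_-) = \frac{1}{4} \int \left( F^\alpha_+ \wedge F^\alpha_+ - F^\alpha_- \wedge F^\alpha_- \right).    (6.49)
The two remaining terms are positive definite. To see this, take F in canonical form:
    F^\alpha = F^\alpha_{12} \, dx^1 \wedge dx^2 + F^\alpha_{34} \, dx^3 \wedge dx^4 \equiv F_{12} \, dx^{12} + F_{34} \, dx^{34}    (6.50)
in “cheating notation”. Then
    F_+ = \frac{1}{2} (F_{12} + F_{34}) \left( dx^{12} + dx^{34} \right), \qquad F_- = \frac{1}{2} (F_{12} - F_{34}) \left( dx^{12} - dx^{34} \right)    (6.51)
and
    F_+ \wedge F_+ = \frac{1}{2} (F_{12} + F_{34})^2 \, dx^{1234}, \qquad F_- \wedge F_- = -\frac{1}{2} (F_{12} - F_{34})^2 \, dx^{1234}.    (6.52)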
Now let us compare this to the topological invariant we found, the “instanton number”, which was
    k = \frac{1}{8\pi^2} \int F \wedge F = \frac{1}{8\pi^2} \int (F_+ + F_-) \wedge (F_+ + F_-) = \frac{1}{8\pi^2} \int \left( F^\alpha_+ \wedge F^\alpha_+ + F^\alpha_- \wedge F^\alpha_- \right),    (6.53)
since F_+ ∧ F_- = 0. Then there is a simple inequality:
    I \geq 2\pi^2 |k|,    (6.54)
with equality if and only if
    I = 0, hence F ≡ 0, for k = 0;
    F_- ≡ 0 for k > 0;
    F_+ ≡ 0 for k < 0.    (6.55)
The solutions where equality holds are self-dual, anti-self-dual, or both.
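
For F in canonical form the bound already holds pointwise, which the following sketch illustrates. The helper names are hypothetical; the densities follow from I = (1/4) ∫ F ∧ ∗F as in (6.37) together with (6.38), (6.39) and the instanton number (6.53), so that the comparison reduces to (F_12^2 + F_34^2)/4 ≥ |F_12 F_34|/2.

```python
def action_density(f12, f34):
    """Integrand of I = 1/4 int F ^ *F for canonical F: (F12^2 + F34^2) / 4."""
    return 0.25 * (f12**2 + f34**2)

def bound_density(f12, f34):
    """2 pi^2 times |integrand of k|: (2 pi^2 / 8 pi^2) * |2 F12 F34| = |F12 F34| / 2."""
    return 0.5 * abs(f12 * f34)

def self_dual_parts(f12, f34):
    """Coefficients of F+ along (dx12 + dx34) and of F- along (dx12 - dx34), cf. (6.51)."""
    return 0.5 * (f12 + f34), 0.5 * (f12 - f34)

if __name__ == "__main__":
    for f12, f34 in [(1.0, 1.0), (1.0, -1.0), (2.0, 0.5), (0.0, 3.0)]:
        gap = action_density(f12, f34) - bound_density(f12, f34)
        print((f12, f34), self_dual_parts(f12, f34), gap)
```

The gap vanishes exactly when F_12 = ±F_34, that is, when one of the two parts returned by `self_dual_parts` is zero.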
These bounds are absolutely universal in these kinds of situations in physics. The simplest instanton ought then to be self-dual. For G = SU(2), this is
    A_a(x) = \frac{1}{x^2 + \lambda^2} (x_b \tau^b)^\dagger \tau_a,    (6.56)
where τ_1, τ_2 and τ_3 are Pauli matrices and τ_4 = 1.
This self-dual solution has instanton number k = 1, and λ defines an arbitrary scale.

Physically instantons can be interpreted as how tunnelling proceeds in the context of quantum
field theory. The mathematics of instantons are a fascinating topic in their own right.

7 Gravitational Instantons
7.1 Topological Quantum Numbers for Gravity
Gravitational instantons have a number of similarities and a number of differences. We consider
non-singular solutions of the Einstein equations Rab = Λgab which have positive signature (++++).
The action for gravity is a bit of an embarrassment:
    I = -\frac{1}{16\pi} \int d^4x \, \sqrt{g} \, (R - 2\Lambda) + \text{possible boundary terms (neglected here)}.    (7.1)
The big problem here is that, unlike in Yang-Mills theory, this has no nice boundedness properties.
This usually suggests an instability in the theory.
The simplest way to see the unboundedness is to do a conformal transformation

gab → ĝab = Ω2 (x)gab . (7.2)

You can do a calculation and see that
    I[\hat{g}] = -\frac{1}{16\pi} \int d^4x \, \sqrt{g} \left( \Omega^2(x) R + 6 (\nabla\Omega(x))^2 - 2\Lambda \Omega^4 \right)    (7.3)
up to integration by parts. To get this you need to find the conformally rescaled R (see example sheet).
We are not asking for solutions to the equations of motion, but think about a general variation of
the action. As long as Λ > 0, the Λ Ω4 term corresponds to a positive potential, so that is fine.
But the term involving (∇Ω)2 has the wrong sign. The action can be made arbitrarily negative by
picking a rapidly oscillating Ω.
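
A toy version of this argument can be evaluated directly. The sketch below assumes a flat background (R = 0) and a one-dimensional trial profile Ω = 1 + ε sin(kx), neither of which is taken from the notes; it averages the integrand of (7.3) over one period and shows the action per unit volume decreasing without bound as k grows. The function name is hypothetical.

```python
import math

def action_per_volume(k, eps, lam, n=4000):
    """Average of -(1/16 pi)[6 (dOmega/dx)^2 - 2 Lambda Omega^4] over one period, with R = 0."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * 2.0 * math.pi / (k * n)     # midpoint samples over one period 2 pi / k
        omega = 1.0 + eps * math.sin(k * x)
        d_omega = eps * k * math.cos(k * x)
        total += 6.0 * d_omega**2 - 2.0 * lam * omega**4
    return -total / (n * 16.0 * math.pi)

if __name__ == "__main__":
    for k in (1.0, 10.0, 100.0):
        print(k, action_per_volume(k, eps=0.1, lam=1.0))
```

For this profile the average can also be done by hand, giving −[3ε^2 k^2 − 2Λ(1 + 3ε^2 + 3ε^4/8)]/(16π), so the gradient term dominates at large k.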

In gravity, there are two topological quantum numbers:

1. The Euler character. In two dimensions, this is
    \chi = \frac{1}{4\pi} \int_\Sigma R.    (7.4)

If Σ is compact and orientable, χ classifies these surfaces. You can write it as
    \chi = 2 - 2g,    (7.5)
where g is the genus of Σ. Examples are

[Figure: three closed surfaces, with genus g = 0, g = 1 and g = 2.]

Alternatively,
    \chi = 2 - b_1,    (7.6)
where b_1 is the first Betti number. b_p is the number of S^p that cannot be contracted to a point or deformed into each other.
On S^2, you can always contract a circle to a point, so b_1 = 0. If you can catch the manifold with a piece of string, it has b_1 > 0.
There is a theorem by Hodge which states that for compact manifolds, b_p is equal to the number of square integrable harmonic p-forms; these are p-forms satisfying
    dp = 0, \qquad d * p = 0, \qquad \int p_a p^a < \infty.    (7.7)

It immediately follows that for these manifolds, b_p = b_{d-p}, which is known as Poincaré duality.

For d = 4, the Euler character can be written as
    \chi = \frac{1}{32\pi^2} \int R^{ab} \wedge * R_{ab} = \frac{1}{128\pi^2} \int d^4x \, \sqrt{g} \, \varepsilon^{abcd} \varepsilon^{efgh} R_{abef} R_{cdgh},    (7.8)
where you can have extra boundary terms if there is a boundary. In terms of Betti numbers,
    \chi = 2 - 2b_1 + b_2.    (7.9)

2. The Hirzebruch signature
    \tau = \frac{1}{48\pi^2} \int R^{ab} \wedge R_{ab} = \frac{1}{96\pi^2} \int d^4x \, \sqrt{g} \, R_{abcd} R^{ab}{}_{ef} \varepsilon^{cdef}.    (7.10)
It, too, has a topological interpretation in terms of Betti numbers:
    \tau = b_2^+ - b_2^-,    (7.11)
where b_2 = b_2^+ + b_2^-, b_2^+ is the number of self-dual harmonic square integrable 2-forms and b_2^- is the number of anti-self-dual such forms.

There are generalizations of the Euler character for all even dimensions, and of the Hirzebruch
signature for all d which are multiples of four.
These look a bit like action and instanton number for Yang-Mills theory, respectively.
The intellectual history of general relativity is littered with the corpses of people who tried to
manipulate the Euler character into the action for general relativity. But general relativity just is
not like Yang-Mills theory.

The inequality analogous to I ≥ 2π^2 |k| for Yang-Mills theory is
    2\chi \geq 3|\tau|,    (7.12)
with equality if and only if the Riemann tensor is self-dual or anti-self-dual (see example sheet). Self-duality here means
    R_{abcd} = \frac{1}{2} \varepsilon_{abef} R^{ef}{}_{cd}.    (7.13)
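
The Betti-number formulas (7.9) and (7.11) and the inequality (7.12) can be tabulated for a few compact four-manifolds. In the sketch below the function names are hypothetical, and the Betti numbers entered for S^4, S^2 × S^2 and CP^2 are standard values quoted as an assumption rather than derived here.

```python
def euler_character(b1, b2):
    """chi = 2 - 2 b1 + b2 for a compact orientable four-manifold, eq. (7.9)."""
    return 2 - 2 * b1 + b2

def signature(b2_plus, b2_minus):
    """tau = b2+ - b2-, eq. (7.11)."""
    return b2_plus - b2_minus

# (b1, b2+, b2-) for three compact examples; quoted values, not computed.
examples = {"S4": (0, 0, 0), "S2xS2": (0, 1, 1), "CP2": (0, 1, 0)}

def check(name):
    """Return (chi, tau, whether 2 chi >= 3 |tau| holds) for a named example."""
    b1, bp, bm = examples[name]
    chi, tau = euler_character(b1, bp + bm), signature(bp, bm)
    return chi, tau, 2 * chi >= 3 * abs(tau)

if __name__ == "__main__":
    for name in examples:
        print(name, check(name))
```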

7.2 De Sitter Space
As an example of a gravitational instanton, we want to consider de Sitter spacetime, written in static co-ordinates
    ds^2 = -\left( 1 - \frac{\Lambda}{3} r^2 \right) dt^2 + \frac{dr^2}{1 - \frac{\Lambda}{3} r^2} + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2),    (7.14)
where Λ > 0. This satisfies R_{ab} = Λ g_{ab}.

The instanton associated with de Sitter spacetime is found by sending t to iτ. This preserves the field equations. The metric of the instanton is
    ds^2 = \left( 1 - \frac{\Lambda}{3} r^2 \right) d\tau^2 + \frac{dr^2}{1 - \frac{\Lambda}{3} r^2} + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2).    (7.15)

Lect. 18
De Sitter space can be easily viewed as a hyperboloid embedded in R^{4,1} with metric
    ds^2 = -dv^2 + dw^2 + dx^2 + dy^2 + dz^2.    (7.16)

The hyperboloid is
    -v^2 + w^2 + x^2 + y^2 + z^2 = \frac{3}{\Lambda} =: \alpha^2 > 0.    (7.17)
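In FRW co-ordinates, you can view de Sitter space as a space of constant spatial curvature k = 1 (you could also view it as k = 0 or k = −1). Here the co-ordinates cover the entirety of de Sitter space. They are
    v = \alpha \sinh\left( \frac{t}{\alpha} \right), \qquad w = \alpha \cosh\left( \frac{t}{\alpha} \right) \cos\chi, \qquad x = \alpha \cosh\left( \frac{t}{\alpha} \right) \sin\chi \cos\theta,
    y = \alpha \cosh\left( \frac{t}{\alpha} \right) \sin\chi \sin\theta \cos\phi, \qquad z = \alpha \cosh\left( \frac{t}{\alpha} \right) \sin\chi \sin\theta \sin\phi.    (7.18)

If you are interested in the line element in the (t, χ, θ, φ) co-ordinates, you will discover it is
    ds^2 = -dt^2 + \alpha^2 \cosh^2\left( \frac{t}{\alpha} \right) \underbrace{\left[ d\chi^2 + \sin^2\chi \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right]}_{\text{metric on } S^3 \text{ in hyperspherical co-ordinates}}.    (7.19)

The scale factor for this universe is
    a(t) = \alpha \cosh\left( \frac{t}{\alpha} \right),    (7.20)
the Hubble parameter is
    \frac{\dot{a}}{a} = \frac{1}{\alpha} \tanh\left( \frac{t}{\alpha} \right),    (7.21)
the acceleration is
    \frac{\ddot{a}}{a} = \frac{1}{\alpha^2} = \text{constant} > 0.    (7.22)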
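
A quick numerical check of the embedding and of the constant acceleration may be useful here. The sketch below (hypothetical helper names) evaluates the embedding (7.18) at a sample point, confirms that it lies on the hyperboloid (7.17), and estimates ä/a for the scale factor (7.20) by a finite difference.

```python
import math

def embed(t, chi, theta, phi, alpha):
    """Embedding (7.18) of the global de Sitter co-ordinates into R^{4,1}."""
    ch, sh = math.cosh(t / alpha), math.sinh(t / alpha)
    v = alpha * sh
    w = alpha * ch * math.cos(chi)
    x = alpha * ch * math.sin(chi) * math.cos(theta)
    y = alpha * ch * math.sin(chi) * math.sin(theta) * math.cos(phi)
    z = alpha * ch * math.sin(chi) * math.sin(theta) * math.sin(phi)
    return v, w, x, y, z

def hyperboloid(point):
    """Left-hand side of (7.17): -v^2 + w^2 + x^2 + y^2 + z^2."""
    v, w, x, y, z = point
    return -v**2 + w**2 + x**2 + y**2 + z**2

def scale_factor(t, alpha):
    """a(t) = alpha cosh(t / alpha), eq. (7.20)."""
    return alpha * math.cosh(t / alpha)

def acceleration_ratio(t, alpha, h=1e-4):
    """a''/a, with a'' from a central second difference."""
    a = scale_factor
    return (a(t + h, alpha) - 2 * a(t, alpha) + a(t - h, alpha)) / (h**2 * a(t, alpha))

if __name__ == "__main__":
    alpha = math.sqrt(3.0 / 0.75)          # Lambda = 0.75 gives alpha = 2
    print(hyperboloid(embed(0.7, 1.1, 0.5, 2.0, alpha)), alpha**2)
    print(acceleration_ratio(0.7, alpha), 1 / alpha**2)
```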
We can picture de Sitter space as the hyperboloid

[Figure: hyperboloid with the t axis running vertically; horizontal cross-sections are S^3 at fixed t.]

What makes the spacetime interesting is the presence of cosmological horizons.

To the future of the light-cone of an observer moving on a timelike geodesic is a region of spacetime
the observer cannot see. This is not an event horizon. These regions are different for different
observers, the horizon is called a cosmological horizon. This leads to all kinds of interesting
paradoxes.
The Penrose diagram is

[Figure: square Penrose diagram bounded by t = +∞ (top), t = −∞ (bottom), χ = 0 (left) and χ = π (right), with the two diagonals drawn.]

where we have drawn cosmological horizons for particular observers.
That looks a bit like the Penrose diagram for a black hole, but horizons depend on the observer.
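Now we want to construct Schwarzschildesque co-ordinates for this spacetime:
    v = \alpha \sqrt{1 - \frac{r^2}{\alpha^2}} \sinh\left( \frac{t'}{\alpha} \right), \qquad w = \alpha \sqrt{1 - \frac{r^2}{\alpha^2}} \cosh\left( \frac{t'}{\alpha} \right),
    x = r \cos\theta, \qquad y = r \sin\theta \cos\phi, \qquad z = r \sin\theta \sin\phi.    (7.23)

Then the line element is, as claimed above,
    ds^2 = -\left( 1 - \frac{r^2}{\alpha^2} \right) dt'^2 + \frac{dr^2}{1 - \frac{r^2}{\alpha^2}} + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2),    (7.24)
which only makes sense for 0 < r < α. This only covers the following region of de Sitter space:

[Figure: Penrose diagram with one triangular wedge marked, bounded by the lines r = α, t' → ∞ and r = α, t' → −∞.]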
In cosmology, calculations are normally done in k = 0 co-ordinates, but these do not cover the entire spacetime. It is better to use k = 1 co-ordinates.
Now look at the de Sitter instanton: Find a “Euclidean” solution to R_{ab} = Λ g_{ab} by a co-ordinate transformation t' → iτ:
    ds^2 = \left( 1 - \frac{r^2}{\alpha^2} \right) d\tau^2 + \frac{dr^2}{1 - \frac{r^2}{\alpha^2}} + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2).    (7.25)
This is a perfect local solution, but has a singularity at r = α = \sqrt{3/\Lambda}. This is actually a co-ordinate singularity. To see this, define new co-ordinates. Since we want to analyse what happens near r = \sqrt{3/\Lambda}, set
    r = \sqrt{\frac{3}{\Lambda}} \left( 1 - \frac{\Lambda}{6} \delta^2 \right),    (7.26)
then
    dr = -\sqrt{\frac{\Lambda}{3}} \, \delta \, d\delta, \qquad r^2 \approx \frac{3}{\Lambda} - \delta^2,    (7.27)
and the line element becomes
    ds^2 = \frac{\Lambda}{3} \delta^2 d\tau^2 + d\delta^2 + \frac{3}{\Lambda} (d\theta^2 + \sin^2\theta \, d\phi^2) + \text{terms higher order in } \delta.    (7.28)
The singularity is now at δ = 0. This metric now consists of a metric on S 2 and a part which is
the metric on a plane (almost).
The metric on a plane with co-ordinates (ρ, ψ) is

ds2 = dρ2 + ρ2 dψ 2 . (7.29)

At ρ = 0, you need to identify ψ with ψ + 2π. If not, you get a conical singularity; then there is a defect angle ∆ and ρ = 0 has a δ-function in curvature. In the present case, δ = 0 is a co-ordinate singularity as long as τ is identified with period 2π\sqrt{3/\Lambda}.
This has some interesting physical consequences (see Black Holes course).

We already know that the solution is four-dimensional and has constant curvature

Rabcd ∝ (gac gbd − gad gbc ). (7.30)

It has ten Killing vectors, and so must be the metric on S 4 . You can embed it into R5 . We want
to evaluate the topological quantum numbers.
First try to get the proportionality constant. Contract R_{abcd} to get
    R_{ac} = 3 \, (\text{constant}) \, g_{ac} = \Lambda g_{ac}, \qquad \text{constant} = \frac{\Lambda}{3}.    (7.31)
The Hirzebruch signature of this space is
    \tau = \frac{1}{96\pi^2} \int d^4x \, \sqrt{g} \, \varepsilon^{ab}{}_{ef} R_{abcd} R^{cdef} = 0.    (7.32)
All even-dimensional spheres have Euler character two, all odd-dimensional spheres have χ = 0. We can compute it explicitly using
    \chi = \frac{1}{128\pi^2} \int d^4x \, \sqrt{g} \, \varepsilon^{abcd} \varepsilon^{efgh} R_{abef} R_{cdgh}.    (7.33)
Whilst this is a true formula, it is inconvenient for this calculation. You can show (on the example sheet) that an equivalent formula gives
    \chi = \frac{1}{32\pi^2} \int d^4x \, \sqrt{g} \left( R^{abcd} R_{abcd} - 4 R^{ab} R_{ab} + R^2 \right) = 2.    (7.34)
The action for the de Sitter instanton is
    I = -\frac{1}{16\pi} \int d^4x \, \sqrt{g} \, (R - 2\Lambda) = -\frac{\Lambda}{8\pi} \cdot \text{Vol},    (7.35)
which is negative!

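From Schwarzschildesque co-ordinates, you get √g = r² sin θ. Then the volume of S^4 is
    \text{Vol} = \int_0^{2\pi\sqrt{3/\Lambda}} d\tau \int_0^{\sqrt{3/\Lambda}} dr \int d\theta\, d\phi\; r^2 \sin\theta = 4\pi \int_0^{\sqrt{3/\Lambda}} r^2\, dr \int_0^{2\pi\sqrt{3/\Lambda}} d\tau = \frac{8\pi^2}{3} \left( \frac{3}{\Lambda} \right)^2 = \frac{24\pi^2}{\Lambda^2},    (7.36)
and the action is
    I = -\frac{3\pi}{\Lambda}.    (7.37)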
The partition function in statistical mechanics would be
    Z = \int D\phi\; e^{-I[\phi]},    (7.38)
where one integrates over all metrics that are periodic in complex time, with period 1/T. It follows
that the temperature of de Sitter space is
    T = \frac{1}{2\pi} \sqrt{\frac{\Lambda}{3}}.    (7.39)
It is believed, but nobody has proved, that for fixed Λ this is a lower bound of the action, so that
always
I ≥ IdeSitter , (7.40)
with equality only for de Sitter.
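The numbers in (7.36), (7.37) and (7.39) are easy to confirm numerically. The following is a minimal sketch, assuming the co-ordinate ranges 0 ≤ r ≤ √(3/Λ) and 0 ≤ τ < 2π√(3/Λ) used above and the on-shell value R = 4Λ; the midpoint rule and the function names are choices made for this illustration only.

```python
import math


def de_sitter_volume(lam, steps=20000):
    """Midpoint-rule estimate of Vol = int dtau dr dtheta dphi r^2 sin(theta)."""
    r_max = math.sqrt(3.0 / lam)        # location of the co-ordinate singularity
    tau_period = 2.0 * math.pi * r_max  # period of tau that removes it
    dr = r_max / steps
    radial = sum(((i + 0.5) * dr) ** 2 for i in range(steps)) * dr
    # The angular integral of sin(theta) over the two-sphere gives 4 pi.
    return 4.0 * math.pi * radial * tau_period


def de_sitter_action(lam):
    """I = -(1/16 pi) int sqrt(g) (R - 2 Lambda) = -(Lambda / 8 pi) Vol, using R = 4 Lambda."""
    return -lam / (8.0 * math.pi) * de_sitter_volume(lam)


def de_sitter_temperature(lam):
    """T is the inverse of the period of tau."""
    return 1.0 / (2.0 * math.pi * math.sqrt(3.0 / lam))
```

For any positive Λ the estimates should agree with 24π²/Λ², −3π/Λ and (1/2π)√(Λ/3) up to the discretisation error of the radial integral.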

7.3 Other Examples


Now we can find other instantons, some of which are interesting, all of which are fun:

1. S 2 × S 2 , a direct product of two two-spheres (of radius a).


The metric is a direct product
    ds^2 = \begin{pmatrix} \text{Metric on } S^2 & 0 \\ 0 & \text{Metric on } S^2 \end{pmatrix}.    (7.41)

[Lect. 19]
These are two Einstein manifolds, satisfying R_ab = Λ g_ab.
The Ricci tensor also factorises:
    R_{ab} = \begin{pmatrix} \frac{1}{a^2} \cdot \text{Metric on } S^2 & 0 \\ 0 & \frac{1}{a^2} \cdot \text{Metric on } S^2 \end{pmatrix}.    (7.42)

Then the Ricci scalar is 4/a², and the metric solves Einstein's equations with Λ = 1/a².
The topological quantum numbers are (Exercise)

χ = 4, τ = 0. (7.43)

The action is negative:
    I = -\frac{1}{16\pi} \int d^4x \sqrt{g}\, (R - 2\Lambda) = -\frac{\Lambda}{8\pi} \int d^4x \sqrt{g} = -\frac{\Lambda}{8\pi} (4\pi a^2)^2 = -2\pi \Lambda a^4 = -\frac{2\pi}{\Lambda}.    (7.44)

2. Fubini-Study Metric on CP2 (a compact manifold)


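This is
    ds^2 = \frac{dr^2}{\left( 1 + \frac{\Lambda r^2}{6} \right)^2} + \frac{1}{4} r^2 \left( \frac{(d\psi \pm \cos\theta\, d\phi)^2}{\left( 1 + \frac{\Lambda r^2}{6} \right)^2} + \frac{d\theta^2 + \sin^2\theta\, d\phi^2}{1 + \frac{\Lambda r^2}{6}} \right).    (7.45)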

If Λ = 0, then this is just flat space.


For Λ > 0, it has finite volume and is non-singular.
The topological quantum numbers are

χ = 3, τ = 1. (7.46)

The action is again negative,
    I = -\frac{9\pi}{4\Lambda}.    (7.47)

Its curvature has an interesting property regarding the Weyl tensor C^a_bcd. Consider the
Weyl two-form
    C^a{}_b = \frac{1}{2} C^a{}_{bcd}\, dx^c \wedge dx^d.    (7.48)
Then for CP2 , the Weyl two-form is (anti-)self-dual,

∗C a b = ±C a b , (7.49)

depending on the choice of sign in the metric.


All of the examples presented here are of great interest in string theory.

3. K3 is the unique simply connected compact four-manifold that satisfies R_ab = 0.


The proof of this does not rely on the construction of a metric; the metric is not known. We
know that the topological quantum numbers are

χ = 24, τ = ±16. (7.50)

The curvature form must be either self-dual or anti-self-dual.
There is a 58-parameter family of solutions of Rab = 0.
To see this, consider the Betti numbers: Since K3 is simply connected, b_1 = 0, so
    \chi = 2 - 2b_1 + b_2^+ + b_2^- = 2 + b_2^+ + b_2^-, \qquad \tau = b_2^+ - b_2^-,    (7.51)
which tells you that
    b_2^+ = 19, \qquad b_2^- = 3.    (7.52)
There are 19 self-dual harmonic two-forms with components F^I_ab (I = 1, . . . , 19) and three
anti-self-dual harmonic two-forms with components G^J_ab (J = 1, 2, 3).


Assume that we have a metric g which solves

Rab [g] = 0. (7.53)

Under a slight deformation of g, if
    R_{ab}[g + \epsilon h] = 0, \qquad \epsilon \ll 1,    (7.54)
then g + εh is a new solution of the Einstein equations.


The condition for this (cf. gravitational waves, GR course) is the following differential equation
    -\Box h_{ab} - 2 R_{acbd} h^{cd} = 0,    (7.55)
where h is not of the form ∇_(a ξ_b), which would just be a co-ordinate transformation. These
two conditions can be summarised as
    \nabla^a h_{ab} = 0.    (7.56)
We know that F^I_ab and G^J_ab obey
    \nabla_a F^{I\,ab} = 0, \quad F^{I\,ab} = \frac{1}{2} \epsilon^{abcd} F^I_{cd}; \qquad \nabla_a G^{J\,ab} = 0, \quad G^{J\,ab} = -\frac{1}{2} \epsilon^{abcd} G^J_{cd}.    (7.57)
You can construct
    h_{ab} = G^J_{ac} F^{I\,c}{}_b + G^J_{bc} F^{I\,c}{}_a,    (7.58)
which generically will give 57 independent perturbations, one for each combination of I and
J.
This is transverse, ∇^a h_ab = 0, so not a co-ordinate transformation, and satisfies
    -\Box h_{ab} - 2 R_{acbd} h^{cd} = 0.    (7.59)

There must be one more such deformation. Since no scale is associated with this metric, it
can be multiplied by a constant. If gab obeys Rab = 0, then so does λgab for any constant λ.
The interesting thing to show is that this hab actually satisfies the wave equation (see example
sheet).

If the curvature is self-dual, this tells you something about holonomy.

A vector at any given point p can be parallelly transported around a closed loop C. Then the
components of the vector after and before parallel transport, written in some orthonormal
basis, satisfy a relation
V a (C, p) = M a b (C)V b (p). (7.60)
What are the properties of the matrix M ? Parallel transport does not change the norm, so
M (C) is a Lorentz transformation. This is because one uses a metric-preserving connection.
Therefore in the Euclidean case,
M ∈ SO(4). (7.61)
What is remarkably true for any metric with self-dual (or anti-self-dual) curvature is that
M ∈ SU (2). (This is a consequence of supersymmetry.)

4. Another example of huge importance in string theory is the Eguchi-Hanson metric on
a non-compact manifold. This is an attempt to construct the analogue of the Yang-Mills
instanton.
You start from the assumption that curvature should go to zero on an enormous S^3 near
infinity. You start with a metric on R^4:
    ds^2 = \frac{1}{f(r)} dr^2 + \frac{1}{4} r^2 \left( d\theta^2 + \sin^2\theta\, d\phi^2 + f(r) (d\psi \pm \cos\theta\, d\phi)^2 \right),    (7.62)

where θ, φ, ψ are Euler angles on S 3 with ranges

0 ≤ θ ≤ π, 0 ≤ φ < 2π, 0 ≤ ψ < 4π (7.63)

which cover all of S 3 , and 0 ≤ r < ∞.


This looks very reasonable. The function f can be determined to be
    f(r) = 1 - \left( \frac{a}{r} \right)^4,    (7.64)
where a is an arbitrary scale. This approaches 1 very fast as r → ∞, so the metric is asymptotically
flat. There is a problem at r = a, which is identified as a co-ordinate singularity.
You can look at the (r, ψ) plane to discover this is a conical singularity. What is the condition
that the singularity at r = a is removed? We have done this many times, it tells you that the
period of ψ is 2π.
But we set the period of ψ to be 4π before. You can take the solution to be asymptotically
locally Euclidean, such that the boundary at infinity is not S 3 but S 3 /Z2 , a three-sphere
with antipodal points identified. (This is useful in string theory.)
The topological quantum numbers of this metric are

χ = 0, τ = ±1, (7.65)

where the given formulae need boundary corrections.


The curvature is again self-dual (or anti-self-dual). The action (which also requires boundary
corrections) is zero.

It can be generalised to the Multi-Eguchi-Hanson metric, which is of the form
    ds^2 = \frac{1}{V(\vec{x})} (d\psi + \omega_i\, dx^i)^2 + V(\vec{x})\, \delta_{ij}\, dx^i dx^j.    (7.66)

This metric looks like the multi-monopole metric in Kaluza-Klein theory, but with t =
constant. Whereas in Kaluza-Klein theory we had
    V = 1 + \lambda \sum_{i=1}^{k} \frac{1}{|\vec{x} - \vec{x}_i|},    (7.67)
for the Multi-Eguchi-Hanson metric, we delete the one:
    V = \lambda \sum_{i=1}^{k} \frac{1}{|\vec{x} - \vec{x}_i|},    (7.68)
where as usual,
    \vec{\nabla} V = \vec{\nabla} \times \vec{\omega}.    (7.69)
k = 1 is flat space, k = 2 is the Eguchi-Hanson metric; k > 2 is a Multi-Eguchi-Hanson
metric. These have self-dual curvature, boundary S^3/Z_k (consistent with S^3/Z_2 for the
Eguchi-Hanson case k = 2), and zero action.
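The actions of the two compact examples with Λ > 0 in this list can be cross-checked in the same way as for de Sitter space. The sketch below assumes that both metrics are Einstein with R_ab = Λ g_ab, so that I = −(Λ/8π)·Vol as in (7.35); for CP^2 it assumes the volume element √g = r³ sin θ / (8 F³) with F = 1 + Λr²/6, read off from (7.45), and the Euler-angle ranges (7.63). The change of variable r = t/(1 − t) and all function names are choices made for this illustration.

```python
import math


def action_from_volume(lam, volume):
    """I = -(Lambda / 8 pi) Vol for a four-dimensional Einstein metric with R = 4 Lambda."""
    return -lam * volume / (8.0 * math.pi)


def volume_s2_x_s2(lam):
    """Product of two round two-spheres of radius a, with a^2 = 1 / Lambda."""
    a_squared = 1.0 / lam
    return (4.0 * math.pi * a_squared) ** 2


def volume_cp2(lam, steps=100000):
    """Midpoint-rule volume of the Fubini-Study metric (7.45).

    The radial integral over 0 <= r < infinity is mapped to 0 <= t < 1
    with r = t / (1 - t), dr = dt / (1 - t)^2.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        r = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2
        f = 1.0 + lam * r * r / 6.0
        total += r ** 3 / (8.0 * f ** 3) * jacobian
    radial = total * h
    # Angular integral of sin(theta) over theta, phi, psi: 2 * 2 pi * 4 pi = 16 pi^2.
    return 16.0 * math.pi ** 2 * radial
```

Under these assumptions the S^2 × S^2 action reproduces −2π/Λ from (7.44), and the CP^2 volume comes out as 18π²/Λ², which gives an action of −9π/(4Λ).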

8 Positive Energy
8.1 Geometry of Surfaces
[Lect. 20]
Consider a (d − 1)-dimensional surface Σ, embedded in a d-dimensional manifold (possibly spacetime).
If the surface is characterised by an equation f(x) = 0, the unit normal to the surface
is
    n_a = N \partial_a f,    (8.1)
where N is a normalisation factor. In a Riemannian manifold, one can always normalise n_a such
that n_a n^a = 1.
In Lorentzian signature, you can have a timelike n_a (then Σ is called spacelike), a spacelike n_a (then
Σ is called timelike), or a null n_a (then Σ is called null). We will ignore the case where n_a n^a = 0
since it is more difficult.
We can assume in the following that
    n_a n^a = \pm 1.    (8.2)
We need other quantities to characterise the geometry of Σ: We can define a symmetric rank two
tensor h, called the first fundamental form for historical reasons, by

    h^a{}_b = \delta^a{}_b \mp n^a n_b.    (8.3)

It has the following properties.

1. h is a projection.

    h^a{}_b h^b{}_c = (\delta^a{}_b \mp n^a n_b)(\delta^b{}_c \mp n^b n_c)
                    = \delta^a{}_c \mp n^a n_c \mp n^a n_c + n^a \underbrace{n_b n^b}_{\pm 1} n_c
                    = \delta^a{}_c \mp n^a n_c = h^a{}_c.    (8.4)

2. In d dimensions, the trace of h is

    h^a{}_a = \delta^a{}_a \mp n^a n_a = d - 1.    (8.5)

3. h is orthogonal to n in the following sense:

    h^a{}_b n_a = (\delta^a{}_b \mp n^a n_b) n_a = n_b \mp \underbrace{n^a n_a}_{\pm 1} n_b = 0.    (8.6)

You can deduce that any vector Y^a defined on Σ can be decomposed into two parts:

    Y^a = \delta^a{}_b Y^b = h^a{}_b Y^b \pm n^a n_b Y^b,    (8.7)

which is a part tangential to Σ and a part perpendicular to Σ. Indeed, the first part is orthogonal
to n_a, while the second is annihilated by h.
Absolutely any vector or tensor can be decomposed in this way. In particular, you can project the
metric tensor into the surface Σ. You get

    h^a{}_c h^b{}_d g_{ab} = h^a{}_c h_{ad} = h_{cd}.    (8.8)

You can consider h as the induced metric on Σ.
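The three properties (8.4) to (8.6) can be checked numerically for a concrete normal. The sketch below is an illustration under stated assumptions: it takes the Minkowski metric with signature (−, +, +, +) and a boosted unit timelike normal, neither of which is singled out in the text, and the helper names are invented for this example.

```python
import math


def lower_index(metric, v_up):
    """Return v_a = g_ab v^b."""
    d = len(metric)
    return [sum(metric[a][b] * v_up[b] for b in range(d)) for a in range(d)]


def first_fundamental_form(metric, n_up):
    """Return (h, n_down, norm), where h[a][b] = h^a_b = delta^a_b -/+ n^a n_b."""
    d = len(metric)
    n_down = lower_index(metric, n_up)
    norm = sum(n_up[a] * n_down[a] for a in range(d))  # n_a n^a, expected to be +1 or -1
    # The sign in front of n^a n_b is opposite to the sign of n_a n^a.
    h = [
        [(1.0 if a == b else 0.0) - norm * n_up[a] * n_down[b] for b in range(d)]
        for a in range(d)
    ]
    return h, n_down, norm


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Assumed example: flat metric and a unit timelike normal with rapidity 0.7.
ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
n_up = [math.cosh(0.7), math.sinh(0.7), 0.0, 0.0]

h, n_down, norm = first_fundamental_form(ETA, n_up)
h_squared = matmul(h, h)                                              # property 1
trace = sum(h[a][a] for a in range(4))                                # property 2
h_dot_n = [sum(h[a][b] * n_down[a] for a in range(4)) for b in range(4)]  # property 3
```

Here `h_squared` should agree with `h`, `trace` with d − 1 = 3, and `h_dot_n` with zero, all up to rounding.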


There is also a second fundamental form, which describes how n changes as one moves around
Σ:
    K_{cd} = h^a{}_c h^b{}_d \nabla_a n_b.    (8.9)
This is symmetric too: Since n_a = N ∇_a f, we have
    \nabla_a n_b = N \nabla_a \nabla_b f + \nabla_a N\, \nabla_b f = N \nabla_a \nabla_b f + \frac{\nabla_a N}{N} n_b.    (8.10)
Then
    K_{cd} = h^a{}_c h^b{}_d \left( N \nabla_a \nabla_b f + \frac{\nabla_a N}{N} n_b \right) = h^a{}_c h^b{}_d N \nabla_a \nabla_b f,    (8.11)
since h annihilates n. This is symmetric for a torsion-free connection.
The covariant derivative of n will have components tangential to Σ and perpendicular to Σ:

    \nabla_a n_b = (h^c{}_a \pm n^c n_a)(h^d{}_b \pm n^d n_b) \nabla_c n_d
                 = K_{ab} + (\pm n^c n_a h^d{}_b \pm h^c{}_a n^d n_b) \nabla_c n_d + n^c n_a n^d n_b \nabla_c n_d
                 = K_{ab} \pm n^c n_a h^d{}_b \nabla_c n_d
                 = K_{ab} \pm n^c n_a (\delta^d{}_b \mp n_b n^d) \nabla_c n_d
                 = K_{ab} \mp n_a \omega_b,    (8.12)

where we have used
    n^d \nabla_c n_d = \frac{1}{2} \nabla_c (n^d n_d) = 0    (8.13)
and
    \omega_b = -n^c \nabla_c n_b    (8.14)
is sometimes called the acceleration vector.
There is a notion of covariant derivative in Σ, which is defined by projecting the covariant derivative
of the d-dimensional manifold into Σ:
    {}^{(d-1)}\nabla_e T^{a \ldots}{}_{b \ldots} = h^{e'}{}_e\, h^a{}_{a'} \ldots h^{b'}{}_b \ldots \nabla_{e'} T^{a' \ldots}{}_{b' \ldots}    (8.15)

That may seem a bit perverse, but it is actually quite useful.


If you take ∇ to be the symmetric metric connection,
    {}^{(d-1)}\nabla_a {}^{(d-1)}\nabla_b f = h^{a'}{}_a h^{b'}{}_b \nabla_{a'} (h^x{}_{b'} \nabla_x f)
        = h^{a'}{}_a h^x{}_b \nabla_{a'} \nabla_x f + h^{a'}{}_a h^{b'}{}_b (\nabla_{a'} h^x{}_{b'})(\nabla_x f)
        = h^{a'}{}_a h^x{}_b \nabla_{a'} \nabla_x f + h^{a'}{}_a h^{b'}{}_b (\nabla_{a'} (\delta^x{}_{b'} \mp n^x n_{b'}))(\nabla_x f)
        = h^{a'}{}_a h^x{}_b \nabla_{a'} \nabla_x f \mp h^{a'}{}_a h^{b'}{}_b n^x (\nabla_{a'} n_{b'})(\nabla_x f),    (8.16)

where we used h^{b'}_b n_{b'} = 0. The first term is symmetric, the second term is proportional to K_ab,
which we know is symmetric.
So ^(d−1)∇_a is a symmetric connection and its torsion vanishes. It also turns out to be a metric
connection:
    {}^{(d-1)}\nabla_c h_{ab} = h^g{}_c h^e{}_a h^f{}_b \nabla_g h_{ef}
        = h^g{}_c h^e{}_a h^f{}_b \nabla_g (g_{ef} \mp n_e n_f)
        = \mp h^g{}_c h^e{}_a h^f{}_b n_f \nabla_g n_e \mp h^g{}_c h^e{}_a h^f{}_b n_e \nabla_g n_f = 0.    (8.17)

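Thus ^(d−1)∇ is the unique symmetric metric connection of h. One can find the curvature of it by
calculating
    \left( {}^{(d-1)}\nabla_a {}^{(d-1)}\nabla_b - {}^{(d-1)}\nabla_b {}^{(d-1)}\nabla_a \right) V_c = {}^{(d-1)}R_{abc}{}^d V_d    (8.18)
for a vector V_c lying in Σ, i.e. satisfying n^c V_c = 0. Doing this calculation requires a certain amount
of concentration:
    {}^{(d-1)}R_{abc}{}^d V_d = h^p{}_a h^q{}_b h^r{}_c \nabla_p (h^x{}_q h^y{}_r \nabla_x V_y) - (a \leftrightarrow b)    (8.19)
        = h^p{}_a h^q{}_b h^r{}_c \nabla_p \left( (\delta^x{}_q \delta^y{}_r \mp n^x n_q \delta^y{}_r \mp \delta^x{}_q n^y n_r + n^x n_q n^y n_r) \nabla_x V_y \right) - (a \leftrightarrow b).

Since h annihilates n, the only contributions can come from

    {}^{(d-1)}R_{abc}{}^d V_d = h^p{}_a h^q{}_b h^r{}_c \left( \delta^x{}_q \delta^y{}_r \nabla_p \nabla_x V_y \mp n^x (\nabla_p n_q) \delta^y{}_r \nabla_x V_y \mp \delta^x{}_q n^y (\nabla_p n_r) \nabla_x V_y \right) - (a \leftrightarrow b)
        = h^p{}_a h^x{}_b h^y{}_c \nabla_p \nabla_x V_y \mp h^p{}_a h^q{}_b (\nabla_p n_q) h^y{}_c n^x \nabla_x V_y \mp h^p{}_a h^x{}_b h^r{}_c (\nabla_p n_r) n^y \nabla_x V_y - (a \leftrightarrow b)
        = h^p{}_a h^x{}_b h^y{}_c R_{pxyz} V^z \mp 2 K_{[ab]} h^y{}_c n^x \nabla_x V_y \mp 2 h^p{}_{[a} h^x{}_{b]} h^r{}_c (\nabla_p n_r) n^y \nabla_x V_y.    (8.20)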

The second term is zero, for the third term use

ny ∇x Vy = ∇x (ny Vy ) − Vy ∇x ny = −Vy ∇x ny , (8.21)

so that
    {}^{(d-1)}R_{abc}{}^d V_d = h^p{}_a h^x{}_b h^y{}_c R_{pxyz} V^z \pm 2 h^p{}_{[a} h^x{}_{b]} h^r{}_c (\nabla_p n_r) V_y \nabla_x n^y
        = h^p{}_a h^x{}_b h^y{}_c R_{pxyz} V^z \pm 2 K_{c[a} h^x{}_{b]} V_y \nabla_x n^y
        = h^p{}_a h^x{}_b h^y{}_c R_{pxyz} V^z \pm 2 K_{c[a} h^x{}_{b]} \delta^y{}_z V^z \nabla_x n_y
        = h^p{}_a h^x{}_b h^y{}_c R_{pxyz} V^z \pm 2 K_{c[a} h^x{}_{b]} (h^y{}_z \pm n^y n_z) V^z \nabla_x n_y
        = h^p{}_a h^x{}_b h^y{}_c R_{pxyz} V^z \pm 2 K_{c[a} K_{b]z} V^z
        = \left( h^p{}_a h^x{}_b h^y{}_c R_{pxyz} \pm K_{ca} K_{bz} \mp K_{cb} K_{az} \right) V^z.    (8.22)

[Lect. 21]
We have obtained Gauss' equation

    {}^{(d-1)}R_{abcd} = h^p{}_a h^q{}_b h^r{}_c h^s{}_d R_{pqrs} \mp K_{ac} K_{bd} \pm K_{bc} K_{ad}.    (8.23)

For the Ricci tensor, we obtain

    {}^{(d-1)}R_{bd} = h^{ac}\, {}^{(d-1)}R_{abcd} = h^q{}_b h^s{}_d h^{pr} R_{pqrs} \mp K K_{bd} \pm K_{bc} K^c{}_d.    (8.24)

You can also find a formula for the Ricci scalar:

    {}^{(d-1)}R = h^{bd}\, {}^{(d-1)}R_{bd}
        = h^{pr} h^{qs} R_{pqrs} \mp K^2 \pm K_{bc} K^{bc}
        = (g^{pr} \mp n^p n^r)(g^{qs} \mp n^q n^s) R_{pqrs} \mp K^2 \pm K_{bc} K^{bc}
        = R \mp 2 n^p n^r R_{pr} \mp K^2 \pm K_{bc} K^{bc}.    (8.25)

This is very useful if you want to divide up spacetime into space and time. There is another useful
equation, the Codazzi equation. This comes from taking the divergence of K:
    {}^{(d-1)}\nabla_a K_c{}^a - {}^{(d-1)}\nabla_c K = h^f{}_a h^e{}_c h^a{}_g \nabla_f K_e{}^g - h^b{}_c \nabla_b (h^{ad} \nabla_a n_d)
        = h^f{}_g h^e{}_c \nabla_f (h^{gx} h^y{}_e \nabla_y n_x) - h^b{}_c \nabla_b (h^{ad} \nabla_a n_d)
        = h^{fx} h^y{}_c \nabla_f \nabla_y n_x - h^b{}_c h^{ad} \nabla_b \nabla_a n_d
        = h^{fx} h^y{}_c (\nabla_f \nabla_y n_x - \nabla_y \nabla_f n_x)
        = h^{fx} h^y{}_c R_{fyx}{}^z n_z = h^y{}_c R_{yz} n^z.    (8.26)

We have used the fact that the connection (d−1) ∇ preserves h.


Two obvious uses for this formalism are the canonical formulation of general relativity (ADM
formalism), and the positive energy theorem, which we will do next.

8.2 Spinors in Curved Spacetime


In Minkowski space, a spinor ψ is a four-component object which transforms in the following way:
Under a Lorentz transformation
    x \to x' = Lx,    (8.27)
it transforms as
    \psi \to \left( 1 + \frac{1}{2} \gamma_{\mu\nu} \Lambda^{\mu\nu} + \ldots \right) \psi,    (8.28)
where
    \gamma_{\mu\nu} = \frac{1}{2} (\gamma_\mu \gamma_\nu - \gamma_\nu \gamma_\mu) = \frac{1}{2} [\gamma_\mu, \gamma_\nu]    (8.29)
and Λ^{μν} are the parameters of an infinitesimal Lorentz transformation. The matrices J_{μν} = (1/2) γ_{μν}
are the generators of the Lorentz group in the spinor representation; they satisfy

    [J_{\mu\nu}, J_{\rho\sigma}] = -\eta_{\mu\rho} J_{\nu\sigma} + \eta_{\nu\rho} J_{\mu\sigma} + \eta_{\mu\sigma} J_{\nu\rho} - \eta_{\nu\sigma} J_{\mu\rho}.    (8.30)

What happens in an arbitrary spacetime? At each point, you can always construct the tangent space
by finding the vierbeins satisfying g_ab = e_a^μ e_b^ν η_μν. Then under a local Lorentz transformation of
the vierbeins, a spinor field transforms like a Minkowski space spinor:
    \psi \to \left( 1 + \frac{1}{2} \gamma_{\mu\nu} \Lambda^{\mu\nu}(x) + \ldots \right) \psi.    (8.31)

Since you have a flat metric at each point, you define as before

{γ µ , γ ν } = 2η µν · 1, (8.32)

such that γ 0 is anti-Hermitian and γ i are Hermitian.


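In practice, a useful representation of the γ-matrices is
    \gamma^0 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ \sigma^i & 0 \end{pmatrix},    (8.33)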

where σ i are the Pauli matrices. These matrices do not depend on the co-ordinates.
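It is straightforward to confirm that the representation (8.33) satisfies the Clifford relation (8.32) and has the stated Hermiticity properties. The following sketch assumes the signature η = diag(−1, 1, 1, 1) and reads each entry of (8.33) as a 2 × 2 block; the helper functions are written out by hand for this illustration and are not taken from any library.

```python
I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def negate(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from four 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(m):
    """Hermitian conjugate."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def gamma_matrices():
    """Return [gamma^0, gamma^1, gamma^2, gamma^3] in the representation (8.33)."""
    gamma0 = block(ZERO2, negate(I2), I2, ZERO2)
    return [gamma0] + [block(ZERO2, s, s, ZERO2) for s in PAULI]


def clifford_defect(gammas):
    """Largest entry of {gamma^mu, gamma^nu} - 2 eta^{mu nu} 1 over all mu, nu."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gammas[mu], gammas[nu])
            ba = matmul(gammas[nu], gammas[mu])
            anti = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
            target = [[2 * ETA[mu][nu] if i == j else 0 for j in range(4)] for i in range(4)]
            worst = max(worst, max_abs_diff(anti, target))
    return worst
```

A vanishing `clifford_defect(gamma_matrices())` confirms (8.32); comparing each matrix with its `dagger` shows that γ^0 is anti-Hermitian and the γ^i are Hermitian.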
You can as always turn the Lorentz index into a spacetime index by contracting with e:

γ a = ea µ γ µ . (8.34)

Then the matrices γ a generally depend on co-ordinates.


You cannot accommodate spinors without using either vielbeins or a basis of one-forms.

You want some notion of a covariant derivative of a spinor. This should be a quantity Da ψ
which transforms as a spinor under local Lorentz transformations, and a covector under co-ordinate
transformations.
It is easier to invent Dψ, a spinor-valued one-form, and treat ψ as a spinor-valued 0-form. This is
    D\psi = \partial_a \psi\, dx^a + \frac{1}{4} \gamma_{\mu\nu} \omega^{\mu\nu} \psi.    (8.35)
As usual, under a Lorentz transformation of ψ, generated by Λ, the first term gives ∂Λ terms,
which are compensated by the connection. If ω is torsion-free, we have

dE µ = −ω µ ν ∧ E ν , (8.36)

and under a Lorentz transformation E → LE you get (schematically)

L dE + dL E = −ω ∧ LE. (8.37)

Adding a connection term cancels the ∂Λ terms you get under a Lorentz transformation.

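There is a Ricci identity for spinors:

    DD\psi = \left( d + \frac{1}{4} \gamma_{\rho\sigma} \omega^{\rho\sigma} \right) \wedge \left( d\psi + \frac{1}{4} \gamma_{\mu\nu} \omega^{\mu\nu} \psi \right)
           = \frac{1}{4} \gamma_{\mu\nu}\, d\omega^{\mu\nu} \psi - \frac{1}{4} \gamma_{\mu\nu} \omega^{\mu\nu} \wedge d\psi + \frac{1}{4} \gamma_{\rho\sigma} \omega^{\rho\sigma} \wedge d\psi + \frac{1}{16} \gamma_{\rho\sigma} \gamma_{\mu\nu} \omega^{\rho\sigma} \wedge \omega^{\mu\nu} \psi
           = \left( \frac{1}{4} \gamma_{\mu\nu}\, d\omega^{\mu\nu} + \frac{1}{32} [\gamma_{\rho\sigma}, \gamma_{\mu\nu}]\, \omega^{\rho\sigma} \wedge \omega^{\mu\nu} \right) \psi
           = \left( \frac{1}{4} \gamma_{\mu\nu}\, d\omega^{\mu\nu} + \frac{1}{16} (\eta_{\mu\rho} \gamma_{\nu\sigma} - \eta_{\nu\rho} \gamma_{\mu\sigma} - \eta_{\mu\sigma} \gamma_{\nu\rho} + \eta_{\nu\sigma} \gamma_{\mu\rho})\, \omega^{\rho\sigma} \wedge \omega^{\mu\nu} \right) \psi
           = \left( \frac{1}{4} \gamma_{\mu\nu}\, d\omega^{\mu\nu} + \frac{1}{4} \gamma_{\nu\sigma}\, \omega_\mu{}^\sigma \wedge \omega^{\mu\nu} \right) \psi
           = \left( \frac{1}{4} \gamma_{\mu\nu}\, d\omega^{\mu\nu} + \frac{1}{4} \gamma_{\mu\nu}\, \omega^{\mu\tau} \wedge \omega_\tau{}^\nu \right) \psi = \frac{1}{4} \gamma_{\mu\nu} R^{\mu\nu} \psi,    (8.38)

where R^{μν} is just the curvature 2-form. You could turn this into components:
    (D_a D_b - D_b D_a) \psi = \frac{1}{4} R_{ab\mu\nu} \gamma^{\mu\nu} \psi = \frac{1}{4} R_{abcd} \gamma^{cd} \psi.    (8.39)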
The Dirac equation is
(γ a Da + m)ψ = 0. (8.40)
An idea of great technological importance is that of a constant spinor:

Da ψ = 0, (8.41)

these are 16 equations. If a constant spinor exists, one must have Da Db ψ = 0 and hence

Rabcd γ cd ψ = 0. (8.42)

Thus curvature is an obstruction to having a constant spinor.

[Lect. 22]
We are trying to find constant spinors in Minkowski spacetime, i.e. solutions of

    \nabla_a \epsilon = 0.    (8.43)

We use the γ matrices that we defined before. This is best done in spherical co-ordinates, where
the metric is
ds2 = −dt2 + dr 2 + r 2 dθ 2 + r 2 sin2 θdφ2 (8.44)
We pick a basis of one-forms:

E 0 = dt, E 1 = dr, E 2 = r dθ, E 3 = r sin θ dφ (8.45)

As usual we calculate the components of the connection one-form using

dE α = −ω α β ∧ E β . (8.46)

The nonvanishing connection components are
    \omega^2{}_1 = \frac{1}{r} E^2, \qquad \omega^3{}_2 = \frac{1}{r} \cot\theta\, E^3, \qquad \omega^3{}_1 = \frac{1}{r} E^3.    (8.47)
You find that the possible spinors are the following
    \epsilon = \begin{pmatrix} e^{i\theta/2} (A e^{i\phi/2} + B e^{-i\phi/2}) \\ e^{-i\theta/2} (A e^{i\phi/2} - B e^{-i\phi/2}) \\ e^{i\theta/2} (C e^{i\phi/2} + D e^{-i\phi/2}) \\ e^{-i\theta/2} (C e^{i\phi/2} - D e^{-i\phi/2}) \end{pmatrix},    (8.48)

where A, B, C and D are complex constants.
There is something inherently spinorial about this. If you rotate φ ↦ φ + 2π around the z-axis, ε
will go to −ε. That is characteristic of a spinor, which is not single-valued in spacetime.
Under a rotation φ ↦ φ + 4π, the spinor transforms into itself. This reflects the fact that spinors
are not representations of SO(3, 1), but rather its universal cover.
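The sign flip under φ ↦ φ + 2π can be seen directly by evaluating (8.48). The sketch below simply codes up the four components; the sample values of A, B, C, D, θ and φ used to exercise it are arbitrary choices for illustration.

```python
import cmath
import math


def constant_spinor(theta, phi, A, B, C, D):
    """Components of the constant spinor (8.48) at the angles (theta, phi)."""
    phi_plus, phi_minus = cmath.exp(0.5j * phi), cmath.exp(-0.5j * phi)
    theta_plus, theta_minus = cmath.exp(0.5j * theta), cmath.exp(-0.5j * theta)
    return [
        theta_plus * (A * phi_plus + B * phi_minus),
        theta_minus * (A * phi_plus - B * phi_minus),
        theta_plus * (C * phi_plus + D * phi_minus),
        theta_minus * (C * phi_plus - D * phi_minus),
    ]


# Arbitrary sample point and constants, chosen only for this check.
constants = (1.0 + 2.0j, -0.5j, 0.3, 2.0 - 1.0j)
theta, phi = 0.4, 1.1

eps_0 = constant_spinor(theta, phi, *constants)
eps_2pi = constant_spinor(theta, phi + 2.0 * math.pi, *constants)
eps_4pi = constant_spinor(theta, phi + 4.0 * math.pi, *constants)
```

Every component of `eps_2pi` is minus the corresponding component of `eps_0`, while `eps_4pi` agrees with `eps_0`, since each term carries a factor e^{±iφ/2}.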

8.3 Definition of Mass


In general relativity, conserved quantities are only associated with Killing vectors. It is difficult to
give a definition of energy.
We consider asymptotically flat spacetimes, which have Penrose diagram

[Figure: Penrose diagram of an asymptotically flat spacetime, with a spacelike surface Σ marked.]
Here Σ is some spacelike surface.


You would like to invent an integral of the Gauss' law type; we need to find a two-form to integrate
over the S^2 at infinity.
For stationary spacetimes there is a notion of energy, since we have a Killing vector
    k^a \frac{\partial}{\partial x^a} = \frac{\partial}{\partial t}.
k then defines a one-form and you can integrate
    \frac{1}{8\pi} \int_{S^2_\infty} {*dk} = M \quad \text{for Schwarzschild}.    (8.49)

All that is required for this definition is a metric asymptotic to the Schwarzschild metric,
    ds^2 = -\left( 1 - \frac{2M}{r} + \ldots \right) dt^2 + \left( 1 - \frac{2M}{r} + \ldots \right)^{-1} dr^2 + r^2 \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right)    (8.50)
The one-form k associated with the Killing vector is
    k = \left( -1 + \frac{2M}{r} + \ldots \right) dt,    (8.51)

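then
    dk = \left( -\frac{2M}{r^2} + \ldots \right) dr \wedge dt = \frac{2M}{r^2}\, dt \wedge dr + \ldots,
    {*dk} = \frac{2M}{r^2}\, r^2 \sin\theta\, d\theta \wedge d\phi + \ldots = 2M \sin\theta\, d\theta \wedge d\phi + O\!\left( \frac{1}{r} \right).    (8.52)
Hence
    \int_{S^2_\infty} {*dk} = \int_{S^2_\infty} 2M \sin\theta\, d\theta \wedge d\phi = 8\pi M,    (8.53)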

which works for any stationary metric.


We need to apply the divergence theorem to this:
    M = -\frac{1}{4\pi} \int_{S^2_\infty} dS^{ab}\, \nabla_a k_b = -\frac{1}{4\pi} \int_\Sigma d\Sigma^b\, \nabla^a \nabla_a k_b,    (8.54)

where Σ is any surface asymptotic to a two-sphere. Since

    \Box k_b = -\nabla^a \nabla_b k_a = -\nabla_b \nabla^a k_a - R^a{}_{bac} k^c = -R_{bc} k^c,    (8.55)

where we have used Killing's equation ∇_(a k_b) = 0, this becomes

    M = \frac{1}{4\pi} \int_\Sigma d\Sigma^b\, R_{bc} k^c.    (8.56)
You can pick the surface such that k is everywhere normal to Σ. Then
    M = \frac{1}{4\pi} \int_\Sigma d\Sigma\; R_{bc} k^b k^c = 2 \int_\Sigma d\Sigma \left( T_{ab} - \frac{1}{2} T g_{ab} \right) k^a k^b    (8.57)
by the Einstein equations. This is the closest you can get to something like a Gaussian integral in
general relativity. For a perfect fluid,

Tab = (p + ρ)ua ub + pgab , (8.58)

where ρ is the energy density of the fluid, p its pressure and u its velocity. If you pick an orthonormal
frame, where u^μ = (1, 0, 0, 0),
    T_{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}, \qquad T = 3p - \rho.    (8.59)
For k^μ = (1, 0, 0, 0), the mass is
    M = 2 \int_\Sigma d\Sigma \left( \rho + \frac{1}{2} (3p - \rho) \right) = \int_\Sigma d\Sigma\, (\rho + 3p).    (8.60)
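The step from (8.57) to (8.60) is a short contraction, which the following sketch spells out numerically. It assumes the orthonormal frame of (8.59) with η = diag(−1, 1, 1, 1) and k^μ = (1, 0, 0, 0); the function names are illustrative only.

```python
def diag(entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def mass_integrand(rho, p):
    """Return 2 (T_ab - T g_ab / 2) k^a k^b for a perfect fluid at rest."""
    eta = diag([-1.0, 1.0, 1.0, 1.0])
    T_lower = diag([rho, p, p, p])
    # Trace T = eta^{mu nu} T_{mu nu}; for a diagonal metric eta^{mu mu} = 1 / eta_{mu mu}.
    trace = sum(T_lower[m][m] / eta[m][m] for m in range(4))
    k_up = [1.0, 0.0, 0.0, 0.0]
    contracted = sum(
        (T_lower[m][n] - 0.5 * trace * eta[m][n]) * k_up[m] * k_up[n]
        for m in range(4)
        for n in range(4)
    )
    return 2.0 * contracted
```

For any ρ and p the result equals ρ + 3p; for instance ρ = 2, p = 0.5 gives 3.5.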

Why should M be positive? If not, perpetual motion machines using gravity seem possible. This
would be a sign of instability in the theory.
In Newtonian theory (+ special relativity), the total energy would be

rest mass energy + kinetic energy + potential energy. (8.61)

Since potential energy is negative and scales with M², there is no guarantee that this is bounded
by the positive contributions.
In general relativity, an example of a spacetime with negative energy is given by the Schwarzschild
metric
    ds^2 = -\left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 - \frac{2M}{r} \right)^{-1} dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2    (8.62)
M is just a constant of integration, and the spacetime with negative M is still a solution of R_ab = 0
locally. It contains a naked singularity at r = 0.

8.4 Energy Conditions


We must require the energy-momentum tensor to satisfy certain conditions. Possible conditions
are

• Weak Energy Condition


T ab ta tb ≥ 0 (8.63)
for any timelike vector t. This means the energy density must be positive in any frame. In
an orthonormal frame in which the matter is at rest,
    T^{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}    (8.64)
is isotropic in space. We can therefore take t to be

tµ = (cosh θ, sinh θ, 0, 0). (8.65)

Then we have
ρ cosh2 θ + p sinh2 θ ≥ 0. (8.66)
Setting θ = 0, we see we must have ρ ≥ 0; in the limit θ → ∞ we get ρ ≥ −p. Almost all
known forms of matter obey this condition. It is not very useful for proving theorems.

• Dominant Energy Condition


This states that T 00 ≥ |T ab | in any orthonormal frame, or

ρ ≥ |p| ≥ 0. (8.67)

Another way of expressing this is to say that

wa = T ab tb (8.68)

is not spacelike for arbitrary timelike or null vectors tb . Examples of use are

(i) If the dominant energy condition holds, the event horizon of a black hole is spherical in
d = 4.

(ii) If the dominant energy condition holds, energy in general relativity is positive (see
below.)

Examples:

1. The condition holds for all plausible classical matter including a cosmological constant.
2. The condition is not true in QFT. Consider the Casimir effect which has negative energy
density.

[Lect. 23]
• Strong Energy Condition

This is useful for singularity theorems. The condition is

    R_{ab} t^a t^b \ge 0,    (8.69)

where t^a is an arbitrary timelike or null vector, and is a geometrical condition rather than a
physical one. You can translate this into a condition on T_ab via the Einstein equations:
    \left( T_{ab} - \frac{1}{2} T g_{ab} \right) t^a t^b \ge 0.    (8.70)

If you pick t to be a unit vector as before, then you can turn this into conditions on the
pressure and energy density:
p + ρ ≥ 0, 3p + ρ ≥ 0. (8.71)
This proves positivity of mass for static spacetimes, where ka is everywhere timelike, and
Z
M = dΣ (ρ + 3p). (8.72)

The strong energy condition does not hold for a positive cosmological constant, which has

ρ > 0, p = −ρ. (8.73)

This messes up some singularity theorems.
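For a perfect fluid, the three conditions above reduce to simple inequalities on ρ and p, which makes it easy to test particular equations of state. The sketch below encodes the inequalities quoted in the text, ρ ≥ 0 and ρ + p ≥ 0 for the weak condition, (8.67) for the dominant one and (8.71) for the strong one, and adds a sampled version of (8.66) for comparison. The function names, the rapidity range and the tolerance are assumptions made for this illustration.

```python
import math


def weak_energy_condition(rho, p):
    """rho >= 0 and rho + p >= 0, the two limits of (8.66)."""
    return rho >= 0.0 and rho + p >= 0.0


def dominant_energy_condition(rho, p):
    """rho >= |p| >= 0, as in (8.67)."""
    return rho >= abs(p)


def strong_energy_condition(rho, p):
    """p + rho >= 0 and 3p + rho >= 0, as in (8.71)."""
    return rho + p >= 0.0 and rho + 3.0 * p >= 0.0


def weak_condition_sampled(rho, p, max_rapidity=5.0, samples=200, tol=1e-9):
    """Check rho cosh^2 + p sinh^2 >= 0 on a finite grid of boosts (8.65).

    A finite grid can miss violations that only appear at very large rapidity.
    """
    for i in range(samples + 1):
        theta = max_rapidity * i / samples
        if rho * math.cosh(theta) ** 2 + p * math.sinh(theta) ** 2 < -tol:
            return False
    return True
```

With a positive cosmological constant, ρ = 1 and p = −1, the weak and dominant conditions hold while the strong one fails, in line with the statements above; pressureless dust with ρ = 1, p = 0 satisfies all three.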

8.5 Proof of Positive Energy


We want to prove that energy is positive in general. We suppose our spacetime is asymptotically
flat, and contains a spacelike surface Σ that does not contain singularities. This will have an
induced metric h_ab and an outward normal t^a, which we assume is normalised. We assume the
surface is asymptotic to a two-surface with volume element
    dS^{ab} = \frac{1}{2} (t^a r^b - t^b r^a)\, dS.    (8.74)
Asymptotically, we have t^a ∼ (1, 0, 0, 0) and r^a ∼ (0, 1, 0, 0) (in spherical polar co-ordinates). Then
    dS^{01} \sim \frac{1}{2} r^2 \sin\theta\, d\theta\, d\phi, \qquad dS^{10} \sim -\frac{1}{2} r^2 \sin\theta\, d\theta\, d\phi, \qquad \text{all others vanish}.    (8.75)

All that is needed is to find a vector field X^a such that X ∼ m/r² in the r-direction. Then
    I = -\frac{1}{8\pi} \int_{S^2_\infty} dS^{ab}\, t_a X_b    (8.76)

will correspond to mass. The idea is to turn this into an integral over Σ and find something that
is positive, applying the divergence theorem:
    I = -\frac{1}{16\pi} \int_{S^2_\infty} dS^{ab} (t_a X_b - t_b X_a)
      = -\frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h}\; t^a \nabla^b (t_a X_b - t_b X_a)
      = -\frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( (t^a \nabla^b t_a) X_b + t^a t_a \nabla^b X_b - (t^a \nabla^b t_b) X_a - t^a t_b \nabla^b X_a \right)
      = -\frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( -\nabla^b X_b - (t^a \nabla^b t_b) X_a - t^a t_b \nabla^b X_a \right),    (8.77)

where we write dΣ_a = d³x √h t_a and we have used t^a t_a = −1. Since h_ab = g_ab + t_a t_b, we can write
this as
    I = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( h^{ab} \nabla_a X_b + t^a X_a \nabla^b t_b \right).    (8.78)
If we choose a vector field X that lies in Σ everywhere, the second term vanishes:
    I = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( h^{ab} \nabla_a X_b \right).    (8.79)

To show that this is positive, you have to invent something that turns this into an integral over a
square of something. The only possibility is to use a spinor to do this:

    X_a = \epsilon^\dagger \nabla_a \epsilon.    (8.80)

Then
    h^{ab} \nabla_a X_b = h^{ab} \nabla_a \epsilon^\dagger\, \nabla_b \epsilon + h^{ab} \epsilon^\dagger \nabla_a \nabla_b \epsilon    (8.81)
To get an X which actually lies in Σ, we project it:

    X^a = h^{ab} \epsilon^\dagger \nabla_b \epsilon.    (8.82)

The motivation to do things this way came from supergravity.

We must invent an equation for ε to satisfy. Remember we need X^r ∼ m/r² near infinity. A first
guess would be ε ∼ r^(−1/2), but this does not quite work. We rather assume boundary conditions

    \epsilon \to \text{constant} \quad \text{as } r \to \infty.    (8.83)

It is possible to find such an ε near infinity.

We only need a spinor in the surface Σ. Consider the Dirac equation γ^a ∇_a ε = 0, which does not
only describe things in Σ, so take its projection into Σ. This is the Witten equation

    h^{ab} \gamma_a \nabla_b \epsilon = 0.    (8.84)

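The rest of this is an awful calculation. We write
    \epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{pmatrix}.    (8.85)

You can pick one of the spinors that are covariantly constant in flat space:

    \epsilon_1 = e^{i(\theta + \phi)/2}, \qquad \epsilon_2 = e^{i(\phi - \theta)/2}, \qquad \epsilon_3 = \epsilon_4 = 0.    (8.86)

The metric on Σ, near infinity, must look like the Schwarzschild metric,
    ds^2_\Sigma = \left( 1 + \frac{2M}{r} + \ldots \right) dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2).    (8.87)

If you solve the Witten equation in powers of 1/r, you find

    \epsilon_1 = e^{i(\theta + \phi)/2} \left( 1 - \frac{2M}{r} + O\!\left( \frac{1}{r^2} \right) \right), \qquad \epsilon_2 = e^{i(\phi - \theta)/2} \left( 1 - \frac{2M}{r} + O\!\left( \frac{1}{r^2} \right) \right), \qquad \epsilon_3 = \epsilon_4 = 0.    (8.88)
Then near infinity, X^r is
    X^r = h^{rb} \epsilon^\dagger \nabla_b \epsilon = e^{-i(\theta + \phi)/2}\, \frac{2M}{r^2}\, e^{i(\theta + \phi)/2} + \frac{2M}{r^2} + \ldots = \frac{4M}{r^2} + \ldots    (8.89)
Then indeed
    M = -\frac{1}{16\pi} \int_{S^2_\infty} dS_{ab}\, t^a X^b,    (8.90)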

so we can take the integral I as a definition of mass.

How do you know that such a solution to the Witten equation exists? Let us write

    W = h^{ab} \gamma_a \nabla_b,    (8.91)

so that we try to find a solution of Wε = 0. You can write ε = ε_0 + ε_1, where ∇_a ε_0 = 0 in flat
space. Then we want
    W \epsilon_1 = -W \epsilon_0,    (8.92)
where the right-hand side is fixed. You can find the Green's function of W, call it G. Then
    \epsilon_1(x) = -\int d^3x' \sqrt{h}\; G(x, x') \left( W \epsilon_0(x') \right).    (8.93)

Since this method seems to be able to prove existence of any kind of solution you can think of, it
is not a rigorous proof, which however apparently exists. Now let us go back to
    M = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h}\; h^{ab} \nabla_a X_b, \qquad X_b = h^a{}_b\, \epsilon^\dagger \nabla_a \epsilon.    (8.94)

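Square the Witten equation to get
    0 = h^{cd} \gamma_c \nabla_d \left( h^{ab} \gamma_a \nabla_b \epsilon \right)
      = h^{cd} \gamma_c h^{ab} \gamma_a \nabla_d \nabla_b \epsilon + h^{cd} \gamma_c \gamma_a (\nabla_b \epsilon)(\nabla_d h^{ab})
      = h^{cd} h^{ab} (g_{ca} + \gamma_{ca}) \nabla_d \nabla_b \epsilon + h^{cd} \gamma_c \gamma_a (\nabla_b \epsilon)(\nabla_d h^{ab})
      = h^{db} \nabla_d \nabla_b \epsilon + h^{cd} h^{ab} \gamma_{ca} \nabla_{[d} \nabla_{b]} \epsilon + h^{cd} \gamma_c \gamma_a (\nabla_b \epsilon)(\nabla_d h^{ab})
      = h^{db} \nabla_d \nabla_b \epsilon + \frac{1}{8} h^{cd} h^{ab} \gamma_{ca} R_{dbef} \gamma^{ef} \epsilon + h^{cd} \gamma_c \gamma_a (\nabla_b \epsilon)(\nabla_d h^{ab}),    (8.95)
where we have used the Ricci identity. The first term is a sort of Laplacian projected into Σ.

[Lect. 24]
We will use this to show that the mass integral is positive:
    M = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h}\; h^{ab} \nabla_a \left( h^c{}_b\, \epsilon^\dagger \nabla_c \epsilon \right)
      = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( h^{ab} (\nabla_a \epsilon^\dagger) h^c{}_b \nabla_c \epsilon + h^{ab} h^c{}_b\, \epsilon^\dagger \nabla_a \nabla_c \epsilon + h^{ab} (\nabla_a h^c{}_b)\, \epsilon^\dagger \nabla_c \epsilon \right)    (8.96)
Since the Dirac conjugate in a curved spacetime must be taken to be

    \bar{\epsilon} = \epsilon^\dagger \gamma^a t_a,    (8.97)

we have
    \bar{\epsilon} \gamma^b t_b = \epsilon^\dagger \gamma^a \gamma^b t_a t_b = \epsilon^\dagger t^a t_a = -\epsilon^\dagger.    (8.98)
Hence

    (\nabla_a \epsilon)^\dagger = -\overline{\nabla_a \epsilon}\, \gamma^b t_b = -\nabla_a \bar{\epsilon}\, \gamma^b t_b = -\nabla_a (\epsilon^\dagger \gamma^c t_c) \gamma^b t_b = \nabla_a \epsilon^\dagger - \epsilon^\dagger \gamma^c (\nabla_a t_c) \gamma^b t_b,    (8.99)

and
    M = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( h^{ac} (\nabla_a \epsilon)^\dagger \nabla_c \epsilon + h^{ac} \epsilon^\dagger \gamma^d (\nabla_a t_d) \gamma^b t_b \nabla_c \epsilon + h^{ac} \epsilon^\dagger \nabla_a \nabla_c \epsilon + h^{ab} (\nabla_a h^c{}_b)\, \epsilon^\dagger \nabla_c \epsilon \right).    (8.100)
The first term is positive, and only zero if ∇_a ε = 0 everywhere. For the remaining terms, use the
squared Witten equation to get
    M = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \Big( h^{ac} (\nabla_a \epsilon)^\dagger \nabla_c \epsilon + h^{ac} \epsilon^\dagger \gamma^d (\nabla_a t_d) \gamma^b t_b \nabla_c \epsilon + h^{ab} (\nabla_a h^c{}_b)\, \epsilon^\dagger \nabla_c \epsilon
        - \frac{1}{8} \epsilon^\dagger h^{cd} h^{ab} \gamma_{ca} R_{dbef} \gamma^{ef} \epsilon - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)(\nabla_d h^{ab}) \Big).    (8.101)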
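First consider the term involving the Riemann tensor:
    -\frac{1}{8} \epsilon^\dagger h^{cd} h^{ab} \gamma_{ca} R_{dbef} \gamma^{ef} \epsilon = -\frac{1}{8} \epsilon^\dagger \left( g^{cd} + t^c t^d \right) \left( g^{ab} + t^a t^b \right) \gamma_c \gamma_a \gamma_e \gamma_f R_{db}{}^{ef} \epsilon
        = -\frac{1}{8} \epsilon^\dagger \left( R^{caef} \gamma_c \gamma_a \gamma_e \gamma_f + 2 t^c t_d R^{daef} \gamma_c \gamma_a \gamma_e \gamma_f \right) \epsilon.    (8.102)
From the Bianchi identity,
    R^{daef} \gamma_a \gamma_e \gamma_f = -\left( R^{defa} + R^{dfae} \right) \gamma_a \gamma_e \gamma_f
        = -R^{defa} (\gamma_e \gamma_f \gamma_a + 2 g_{ae} \gamma_f - 2 g_{af} \gamma_e) - R^{dfae} (\gamma_f \gamma_a \gamma_e + 2 g_{ef} \gamma_a - 2 g_{af} \gamma_e)
        = -2 R^{daef} \gamma_a \gamma_e \gamma_f - 6 R^{df} \gamma_f,    (8.103)
hence
    R^{daef} \gamma_a \gamma_e \gamma_f = -2 R^{df} \gamma_f    (8.104)
and
    -\frac{1}{8} \epsilon^\dagger h^{cd} h^{ab} \gamma_{ca} R_{dbef} \gamma^{ef} \epsilon = -\frac{1}{8} \epsilon^\dagger \left( -2R - 4 t^c t_d R^{df} \gamma_c \gamma_f \right) \epsilon
        = \frac{1}{8} \epsilon^\dagger \left( 2R + 4 t^c t^d \left( 8\pi T_d{}^f + \frac{1}{2} R\, \delta_d{}^f \right) \gamma_c \gamma_f \right) \epsilon
        = 4\pi T_d{}^f \left( \epsilon^\dagger t^c \gamma_c \gamma_f \epsilon \right) t^d,    (8.105)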

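where we used the Einstein equations. Now

    w_f := \epsilon^\dagger t^c \gamma_c \gamma_f \epsilon    (8.106)

is a timelike vector (check), and so if dominant energy is satisfied, this term is positive. There are
three other terms in M which cancel:

    h^{ac} \epsilon^\dagger \gamma^d (\nabla_a t_d) \gamma^b t_b \nabla_c \epsilon + h^{ab} (\nabla_a h^c{}_b)\, \epsilon^\dagger \nabla_c \epsilon - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)(\nabla_d h^{ab})
      = h^{ac} \epsilon^\dagger \gamma^d (\nabla_a t_d) \gamma^b t_b \nabla_c \epsilon + h^{ab} \nabla_a (t^c t_b)\, \epsilon^\dagger \nabla_c \epsilon - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon) \nabla_d (t^a t^b)
      = h^{ac} \epsilon^\dagger \gamma^d (\nabla_a t_d) \gamma^b t_b \nabla_c \epsilon + h^{ab} t^c (\nabla_a t_b)\, \epsilon^\dagger \nabla_c \epsilon - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)\, t^a \nabla_d t^b - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)\, t^b (\nabla_d t^a)
      = h^{ac} \epsilon^\dagger \gamma^d K_{ad} \gamma^b t_b \nabla_c \epsilon + h^{ab} t^c K_{ab}\, \epsilon^\dagger \nabla_c \epsilon - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)\, t^a K_d{}^b - h^{cd} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)\, t^b K_d{}^a
      = \epsilon^\dagger \gamma^d K^c{}_d \gamma^b t_b \nabla_c \epsilon + t^c K \epsilon^\dagger \nabla_c \epsilon - \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)\, t^a K^{cb} - K^{ca} \epsilon^\dagger \gamma_c \gamma_a (\nabla_b \epsilon)\, t^b = 0.    (8.107)

One is left with
    M = \frac{1}{16\pi} \int_\Sigma d^3x \sqrt{h} \left( h^{ac} (\nabla_a \epsilon)^\dagger \nabla_c \epsilon + 4\pi T_d{}^f w_f t^d \right).    (8.108)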
You can interpret the first part as the energy of the gravitational field, and the second as the energy
of matter. The amazing thing is that the way this proof is done is based on supergravity.
The result that M ≥ 0 as long as dominant energy holds is absolutely true in classical general
relativity.
It must mean that gravitational energy is not localised: Imagine some matter distribution in a
region on the surface Σ, then you could deform Σ slightly to a new surface Σ′ not including this
region. M would be the same for Σ and Σ′ , but the contributions from the two terms could be
quite different.
The partition between “gravitational” energy and “matter” energy is different on different surfaces.
So one sees that gravitational energy cannot be localised.
Finally,
    M = 0 \;\Rightarrow\; \nabla_a \epsilon = 0, \quad T_{ab} = 0,    (8.109)
so if spacetime is asymptotically flat with no horizons and zero mass, it must be flat space.

- END -
