
June 2012

Introduction to quantum field theory I

lectures by
Horatiu Nastase

Instituto de Física Teórica, UNESP

São Paulo 01140-070, SP, Brazil

Contents
1 Lecture 1: Review of classical field theory: Lagrangeans, Lorentz group and its representations, Noether theorem
2 Lecture 2: Quantum mechanics: harmonic oscillator and QM in terms of path integrals
3 Lecture 3: Canonical quantization of scalar fields
4 Lecture 4: Propagators for free scalar fields
5 Lecture 5: Interaction picture and Wick theorem for λϕ⁴ in operator formalism
6 Lecture 6: Feynman rules for λϕ⁴ from the operator formalism
7 Lecture 7: The driven (forced) harmonic oscillator
8 Lecture 8: Euclidean formulation and finite temperature field theory
9 Lecture 9: The Feynman path integral for a scalar field
10 Lecture 10: Wick theorem for path integrals and Feynman rules part I
11 Lecture 11: Feynman rules in x-space and p-space
12 Lecture 12: Quantization of the Dirac field and fermionic path integral
13 Lecture 13: Wick theorem, gaussian integration and Feynman rules for fermions
14 Lecture 14: Spin sums, Dirac field bilinears and C,P,T symmetries for fermions
15 Lecture 15: Dirac quantization of constrained systems
16 Lecture 16: Quantization of gauge fields, their path integral, and the photon propagator
17 Lecture 17: Generating functional for connected Green's functions and the effective action (1PI diagrams)
18 Lecture 18: Dyson-Schwinger equations and Ward identities
19 Lecture 19: Cross sections and the S-matrix
20 Lecture 20: S-matrix and Feynman diagrams
21 Lecture 21: The optical theorem and the cutting rules
22 Lecture 22: QED: Definition and Feynman rules; Ward-Takahashi identities
23 Lecture 23: Nonrelativistic processes: Yukawa potential, Coulomb potential and Rutherford scattering
24 Lecture 24: e⁺e⁻ → l l̄ unpolarized cross section
25 Lecture 25: e⁺e⁻ → l l̄ polarized cross section; crossing symmetry
26 Lecture 26: (Unpolarized) Compton scattering
27 Lecture 27: One-loop determinants, vacuum energy and zeta function regularization
28 Lecture 28: One-loop divergences for scalars; power counting
29 Lecture 29: Regularization, definitions: cut-off, Pauli-Villars, dimensional regularization
1 Lecture 1: Review of classical field theory: Lagrangeans,
Lorentz group and its representations, Noether the-
orem
In these lectures, I will assume that classical field theory and quantum mechanics are known, and I will only review a few notions immediately useful in the first two lectures. In this lecture, I will start by describing what QFT is, and then I will review a few things about classical field theory.
What is and why Quantum Field Theory?
Quantum mechanics deals with the quantization of particles, and is a nonrelativistic
theory: time is treated as special, and for the energy we use nonrelativistic formulas.
On the other hand, we want to do quantum field theory, which is an application of
quantum mechanics to fields instead of particles, and it has the property of being relativistic
as well.
Quantum field theory is often called (when derived from first principles) second quanti-
zation, the idea being that:
-the first quantization is when we have a single particle and we quantize its behaviour
(its motion) in terms of a wavefunction describing probabilities.
-the second quantization is when we quantize the wavefunction itself (instead of a func-
tion now we have an operator), the quantum object now being the number of particles the
wavefunction describes, which is an arbitrary (a variable) quantum number. Therefore now
the field is a description of an arbitrary number of particles (and antiparticles), and this
number can change, i.e. it is not a constant.
People have tried to build a relativistic quantum mechanics, but it was quickly observed
that if we do that we cannot describe a single particle.
-First, the relativistic relation E = mc2 , together with the existence (experimentally
confirmed) of antiparticles which annihilate with particles giving only energy (photons),
means that if we have an energy E > mp c2 + mp̄ c2 we can create a particle-antiparticle pair,
and therefore the number of particles cannot be a constant in a relativistic theory.
-Second, even if E < mp c2 + mp̄ c2 , the particle-antiparticle pair can still be created for
a short time. Indeed, Heisenberg’s uncertainty principle in the (E, t) sector (as opposed to
the usual (x, p) sector) means that ∆E · ∆t ∼ ~, meaning that for a short time ∆t ∼ ~/∆E
I can have an uncertainty in the energy ∆E, for instance such that E + ∆E > mp c2 + mp̄ c2 .
That means that I can create a pair of virtual particles, i.e. particles that are forbidden
by energy and momentum conservation to exist as asymptotic particles, but can exist as
quantum fluctuations for a short time.
-Thirdly, causality is violated by a single particle propagating via the usual quantum mechanics formulas, even with the relativistic formula for the energy, $E = \sqrt{\vec p^2 + m^2}$.
The amplitude for propagation from ⃗x₀ to ⃗x in a time t is, in quantum mechanics,

$$U(t) = \langle\vec x|e^{-iHt}|\vec x_0\rangle \qquad (1.1)$$


and replacing $E$, the eigenvalue of $H$, by $\sqrt{\vec p^2+m^2}$, we obtain

$$U(t) = \langle\vec x|e^{-it\sqrt{\vec p^2+m^2}}|\vec x_0\rangle = \frac{1}{(2\pi)^3}\int d^3\vec p\, e^{-it\sqrt{\vec p^2+m^2}}\, e^{i\vec p\cdot(\vec x-\vec x_0)} \qquad (1.2)$$

But

$$\int d^3\vec p\, e^{i\vec p\cdot\vec x} = \int p^2 dp \int 2\pi\sin\theta\, d\theta\, e^{ipx\cos\theta} = \int p^2 dp\, \frac{2\pi}{ipx}\left(e^{ipx}-e^{-ipx}\right) = \int p^2 dp\, \frac{4\pi}{px}\sin(px) \qquad (1.3)$$

therefore

$$U(t) = \frac{1}{2\pi^2|\vec x-\vec x_0|}\int_0^\infty p\, dp\, \sin(p|\vec x-\vec x_0|)\, e^{-it\sqrt{p^2+m^2}} \qquad (1.4)$$
For $x^2\gg t^2$, we use a saddle point approximation, which is the idea that the integral $I=\int dx\, e^{f(x)}$ can be approximated by the gaussian around the saddle point $x_0$, where $f'(x_0)=0$, i.e.

$$I \simeq e^{f(x_0)}\int d\delta x\, e^{f''(x_0)\delta x^2/2} \simeq e^{f(x_0)}\sqrt{\frac{2\pi}{-f''(x_0)}}$$

Generally, if we are interested in the leading behaviour in some large parameter, the exponential $e^{f(x_0)}$ dominates over the prefactor, and we can just approximate $I\sim e^{f(x_0)}$.
In our case, we obtain

$$\frac{d}{dp}\left(ipx - it\sqrt{p^2+m^2}\right) = 0 \Rightarrow x = \frac{tp}{\sqrt{p^2+m^2}} \Rightarrow p = p_0 = \frac{imx}{\sqrt{x^2-t^2}} \qquad (1.5)$$

Since we are at $x^2\gg t^2$, we obtain

$$U(t) \propto e^{ip_0 x - it\sqrt{p_0^2+m^2}} \sim e^{-m\sqrt{x^2-t^2}} \neq 0 \qquad (1.6)$$

So we see that even far outside the lightcone, at $x^2\gg t^2$, we have a nonzero amplitude for propagation, meaning a breakdown of causality.
However, we will see that this problem is fixed in quantum field theory, which will be
causal.
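The saddle-point computation above is easy to check symbolically. The sympy sketch below is my own check (not part of the original notes): it parametrizes the spacelike separation as s, with s² = x² − t² > 0, verifies the saddle point (1.5), and evaluates the exponent at the saddle, which comes out real and negative, ∼ −m·s, hence a decaying but nonzero amplitude outside the lightcone.

```python
import sympy as sp

# outside the lightcone: x^2 - t^2 = s^2 > 0
t, m, s = sp.symbols('t m s', positive=True)
x = sp.sqrt(t**2 + s**2)
p = sp.symbols('p')

# exponent of the integrand in U(t): f(p) = i p x - i t sqrt(p^2 + m^2)
f = sp.I*p*x - sp.I*t*sp.sqrt(p**2 + m**2)

# claimed saddle point (1.5): p0 = i m x / sqrt(x^2 - t^2)
p0 = sp.I*m*x/s
assert sp.simplify(sp.diff(f, p).subs(p, p0)) == 0

# exponent at the saddle: real and negative, nonzero for any finite separation
print(sp.simplify(f.subs(p, p0)))
```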
In quantum field theory, the fields describe many particles. One example of this that is easy to understand is the electromagnetic field, $(\vec E,\vec B)\to F_{\mu\nu}$, which describes many photons. Indeed, we know from the correspondence principle of quantum mechanics that a classical state is equivalent to a state with many photons, and also that the number of photons is not a constant in any sense: we can define a (quantum) average number of photons that is related to the classical intensity of an electromagnetic beam, but the number of photons is not a classically measurable quantity.
We will describe processes involving many particles by Feynman diagrams, which will be an important part of this course. In quantum mechanics, a particle propagates forever, so its "Feynman diagram" is always a single line, as in Fig. 1.
In quantum field theory however, we will derive the mathematical form of Feynman
diagrams, but the simple physical interpretation for which Feynman introduced them is that
we can have processes where for instance a particle splits into two (or more) (see Fig.1 a)),
two (or more) particles merge into one (see Fig. 1 b)), or two (or more) particles of one type

Figure 1: Quantum Mechanics: particle goes on forever. Quantum Field Theory: particles can split (a), join (b), and particles of different types can appear and disappear, like in the QED process (c).

disappear and another type is created, like for instance in the annihilation of an $e^+$ (positron) with an $e^-$ (electron) into a photon ($\gamma$) as in Fig. 1 c), etc.
Moreover, we can have (as we mentioned) virtual processes, like a photon $\gamma$ creating an $e^+e^-$ pair, which lives for a short time $\Delta t$ and then annihilates into a $\gamma$, creating an $e^+e^-$ virtual loop inside the propagating $\gamma$, as in Fig. 2. Of course, $(E,\vec p)$ conservation means that $(E,\vec p)$ is the same for the $\gamma$ before and after the loop.
Now we review a few notions of classical field theory. To understand that, we begin with
classical mechanics.
We have a Lagrangean $L(q_i,\dot q_i)$, and the corresponding action

$$S = \int_{t_1}^{t_2} dt\, L(q_i(t),\dot q_i(t)) \qquad (1.7)$$

By varying the action with fixed boundary values, $\delta S=0$, we obtain the Lagrange equations (equations of motion)

$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} = 0 \qquad (1.8)$$

We can also do a Legendre transformation from $L(q_i,\dot q_i)$ to $H(q_i,p_i)$ in the usual way,

$$H(p,q) = \sum_i p_i\dot q_i - L(q_i,\dot q_i) \qquad (1.9)$$

Figure 2: Virtual particles can appear for a short time in a loop. Here a photon creates a virtual electron-positron pair, that then annihilates back into the photon.

where

$$p_i \equiv \frac{\partial L}{\partial \dot q_i} \qquad (1.10)$$

is the momentum conjugate to the coordinate $q_i$.
Differentiating the Legendre transformation formula we get the first order Hamilton equations (instead of the second order Lagrange equations)

$$\frac{\partial H}{\partial p_i} = \dot q_i;\qquad \frac{\partial H}{\partial q_i} = -\frac{\partial L}{\partial q_i} = -\dot p_i \qquad (1.11)$$
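The Legendre transformation and the Hamilton equations above can be illustrated concretely. The following sympy sketch is my own (not from the notes), using a generic one-dimensional Lagrangean $L = \dot q^2/2 - V(q)$:

```python
import sympy as sp

q, qdot, p = sp.symbols('q qdot p')
V = sp.Function('V')

# Lagrangean L = qdot^2/2 - V(q)
L = qdot**2/2 - V(q)

# conjugate momentum p = dL/d(qdot), inverted for qdot
qdot_sol = sp.solve(sp.Eq(p, sp.diff(L, qdot)), qdot)[0]

# Legendre transform: H = p*qdot - L, expressed in terms of (q, p)
H = sp.simplify((p*qdot - L).subs(qdot, qdot_sol))
print(H)               # p**2/2 + V(q)

# Hamilton's equations (1.11)
print(sp.diff(H, p))   # qdot = p
print(-sp.diff(H, q))  # pdot = -V'(q)
```

The same steps, with $q_i \to \phi(\vec x)$, are what is done below for fields.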
The generalization to field theory is that instead of a set {qi (t)}i , which is a collection
of given particles, we now have fields ϕ(⃗x, t), where ⃗x is a generalization of i, and not a
coordinate of a particle!!
We will be interested in local field theories, which means all objects are integrals over $\vec x$ of functions defined at a point; in particular, the Lagrangean is

$$L(t) = \int d^3\vec x\, \mathcal{L}(\vec x,t) \qquad (1.12)$$

Here $\mathcal{L}$ is called the Lagrange density, but by an abuse of notation, one usually refers to it also as the Lagrangean.
We are also interested in relativistic field theories, which means that L(⃗x, t) is a relativis-
tically invariant function of fields and their derivatives,

L(⃗x, t) = L(ϕ(⃗x, t), ∂µ ϕ(⃗x, t)) (1.13)

Considering also several fields $\phi_a$, we have an action

$$S = \int L\, dt = \int d^4x\, \mathcal{L}(\phi_a,\partial_\mu\phi_a) \qquad (1.14)$$

where d4 x = dtd3⃗x is the relativistically invariant volume element for spacetime.
The Lagrange equations are obtained in the same way, as

$$\frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi_a)}\right] = 0 \qquad (1.15)$$

Note that you could think of $L(q_i)$ as a discretization over $\vec x$ of $\int d^3\vec x\, \mathcal{L}(\phi_a)$, but it is not particularly useful.
In the Lagrangean we have relativistic fields, i.e. fields that have a well defined transfor-
mation property under Lorentz transformations

x′µ = Λµ ν xν (1.16)

namely

$$\phi'_i(x') = R_i{}^j \phi_j(x) \qquad (1.17)$$

where $i$ is some index for the fields, related to their Lorentz properties. We will come back to this later, but for now let us just observe that for a scalar field there is no index $i$ and $R\equiv 1$, i.e. $\phi'(x')=\phi(x)$.
In these notes I will use the convention for the spacetime metric with ”mostly plus” on
the diagonal, i.e. the Minkowski metric is

ηµν = diag(−1, +1, +1, +1) (1.18)

Note that this is the convention that is the most natural in order to make heavy use of
Euclidean field theory via Wick rotation, as we will do (by just redefining the time t by a
factor of i), and so is very useful if we work with the functional formalism, where Euclidean
field theory is essential.
On the other hand, for various reasons, people connected with phenomenology and making heavy use of the operator formalism often use the "mostly minus" metric ($\eta_{\mu\nu}=\mathrm{diag}(+1,-1,-1,-1)$); for instance, Peskin and Schroeder do so, so one has to be very careful when translating results from one convention to the other.
With this metric, the Lagrangean for a scalar field is generically

$$\mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m^2\phi^2 - V(\phi) = \frac{1}{2}\dot\phi^2 - \frac{1}{2}|\vec\nabla\phi|^2 - \frac{1}{2}m^2\phi^2 - V(\phi) \qquad (1.19)$$

so it is of the general type $\dot q^2/2 - \tilde V(q)$, as it should be (where $\frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m^2\phi^2$ is also part of $\tilde V(q)$).
To go to the Hamiltonian formalism, we first must define the momentum conjugate to the field $\phi(\vec x)$ (remembering that $\vec x$ is a label like $i$),

$$p(\vec x) = \frac{\partial L}{\partial\dot\phi(\vec x)} = \frac{\partial}{\partial\dot\phi(\vec x)}\int d^3\vec y\, \mathcal{L}(\phi(\vec y),\partial_\mu\phi(\vec y)) = \pi(\vec x)\, d^3\vec x \qquad (1.20)$$

where

$$\pi(\vec x) = \frac{\delta \mathcal{L}}{\delta\dot\phi(\vec x)} \qquad (1.21)$$

is a conjugate momentum density, but by an abuse of notation again will be called just the conjugate momentum.
Then the Hamiltonian is

$$H = \sum_{\vec x} p(\vec x)\dot\phi(\vec x) - L \to \int d^3\vec x\left[\pi(\vec x)\dot\phi(\vec x) - \mathcal{L}\right] \equiv \int d^3\vec x\, \mathcal{H} \qquad (1.22)$$

where H is a Hamiltonian density.


Noether theorem
The statement of the Noether theorem is that for every symmetry of the Lagrangean L,
there is a corresponding conserved charge.
The best known examples are the time translation $t\to t+a$ invariance, corresponding to the conserved energy $E$, and the space translation $\vec x\to\vec x+\vec a$, corresponding to the conserved momentum $\vec p$; together they form the spacetime translation $x^\mu\to x^\mu+a^\mu$, corresponding to the conserved 4-momentum $P^\mu$. The currents corresponding to these charges form the energy-momentum tensor $T_{\mu\nu}$.
Consider the symmetry $\phi(x)\to\phi'(x)=\phi(x)+\alpha\Delta\phi$ that takes the Lagrangean density

$$\mathcal{L} \to \mathcal{L} + \alpha\partial_\mu J^\mu \qquad (1.23)$$

such that the Lagrangean $L$ is invariant, if the fields vanish on the boundary, usually considered at $t=\pm\infty$, since the boundary term

$$\int d^4x\, \partial_\mu J^\mu = \oint_{bd} dS_\mu J^\mu = \int d^3\vec x\, J^0\Big|_{t=-\infty}^{t=+\infty} \qquad (1.24)$$

is then zero. In this case, there exists a conserved current $j^\mu$, i.e.

$$\partial_\mu j^\mu(x) = 0 \qquad (1.25)$$

where

$$j^\mu(x) = \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)}\Delta\phi - J^\mu \qquad (1.26)$$
For linear symmetries (linear in $\phi$), we can define

$$(\alpha\Delta\phi)^i \equiv \alpha^a (T^a)^i{}_j \phi^j \qquad (1.27)$$

such that, if $J^\mu=0$, we have the Noether current

$$j^{\mu,a} = \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)}(T^a)^i{}_j \phi^j \qquad (1.28)$$

Applying this to translations, $x^\mu\to x^\mu+a^\mu$, we have for an infinitesimal parameter $a^\mu$

$$\phi(x)\to\phi(x+a) = \phi(x) + a^\mu\partial_\mu\phi \qquad (1.29)$$

which are the first terms in the Taylor expansion. The corresponding conserved current is then

$$T^\mu{}_\nu \equiv \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)}\partial_\nu\phi - \mathcal{L}\delta^\mu_\nu \qquad (1.30)$$

where we have added a term $J^\mu_{(\nu)} = \mathcal{L}\delta^\mu_\nu$ to get the conventional definition of the energy-momentum tensor or stress-energy tensor. The conserved charges are integrals of the energy-momentum tensor, i.e. $P^\mu$. Note that the above translation can be considered as giving also the term $J^\mu_{(\nu)}$ from the general formalism, since we can check that for $\alpha^\nu = a^\nu$, the Lagrangean changes by $\partial_\mu J^\mu_{(\nu)}$.
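Conservation $\partial_\mu T^\mu{}_\nu = 0$ holds on-shell, and can be checked explicitly on a solution. The sympy sketch below is my own check (not from the notes), done in 1+1 dimensions with the mostly-plus metric used here, on an on-shell plane wave of the free massive scalar:

```python
import sympy as sp

t, x, m, k = sp.symbols('t x m k', positive=True)
w = sp.sqrt(k**2 + m**2)      # on-shell frequency
phi = sp.cos(k*x - w*t)       # solves (-d_t^2 + d_x^2) phi = m^2 phi

dphi_t, dphi_x = sp.diff(phi, t), sp.diff(phi, x)
Lag = sp.Rational(1, 2)*(dphi_t**2 - dphi_x**2 - m**2*phi**2)

# T^mu_nu = dL/d(d_mu phi) d_nu phi - L delta^mu_nu; with eta = diag(-1,+1),
# dL/d(d_mu phi) = -d^mu phi, so the t-component is +d_t phi and the x-component -d_x phi
for nu in (t, x):
    T_t_nu = dphi_t*sp.diff(phi, nu) - (Lag if nu == t else 0)
    T_x_nu = -dphi_x*sp.diff(phi, nu) - (Lag if nu == x else 0)
    # divergence d_t T^t_nu + d_x T^x_nu vanishes on-shell
    assert sp.simplify(sp.diff(T_t_nu, t) + sp.diff(T_x_nu, x)) == 0
```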
Lorentz representations
The Lorentz group is SO(1, 3), i.e. an orthogonal group that generalizes SO(3), the
group of rotations in the (euclidean) 3 spatial dimensions.
Its basic objects in the fundamental representation, defined as the representation that
acts onto coordinates xµ (or rather dxµ ), are called Λµ ν , and thus

dx′µ = Λµ ν dxν (1.31)

If η is the matrix ηµν , the Minkowski metric diag(−1, +1, +1, +1), the orthogonal group
SO(1, 3) is the group of elements Λ that satisfy

ΛηΛT = η (1.32)

Note that the usual rotation group SO(3) is an orthogonal group satisfying

ΛΛT = 1 ⇒ Λ−1 = ΛT (1.33)

but we should actually write it as

$$\Lambda \mathbf{1} \Lambda^T = \mathbf{1} \qquad (1.34)$$
which admits a generalization to SO(p, q) groups as

ΛgΛT = g (1.35)

where $g=\mathrm{diag}(-1,...,-1,+1,...,+1)$ with $p$ minuses and $q$ pluses. In the above, $\Lambda$ satisfies the group property, namely if $\Lambda_1,\Lambda_2$ belong to the group, then

$$\Lambda_1\cdot\Lambda_2 \equiv \Lambda \qquad (1.36)$$

is also in the group.
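Both the defining property (1.32) and the group property are easy to verify numerically for explicit boosts. The numpy snippet below is my own illustration (boosts in units with c = 1, mostly-plus metric as in these notes):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # mostly-plus Minkowski metric

def boost_x(beta):
    """Lorentz boost along x with velocity beta (|beta| < 1)."""
    g = 1.0/np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = g
    L[0, 1] = L[1, 0] = -g*beta
    return L

L1, L2 = boost_x(0.3), boost_x(0.5)

# defining property Lambda eta Lambda^T = eta, eq. (1.32)
assert np.allclose(L1 @ eta @ L1.T, eta)

# the product of two group elements is again in the group
L12 = L1 @ L2
assert np.allclose(L12 @ eta @ L12.T, eta)
```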


General representations are a generalization of (1.31), namely instead of acting on x, the
group acts on a vector space ϕa by

ϕ′a (Λx) = R(Λ)a b ϕb (x) (1.37)

such that it respects the group property, i.e.

R(Λ1 )R(Λ2 ) = R(Λ1 · Λ2 ) (1.38)

Group elements are represented, for infinitesimally small parameters $\beta^a$, as exponentials of the Lie algebra generators in the representation $R$, $t_a^{(R)}$, i.e.

$$R(\beta) = e^{i\beta^a t_a^{(R)}} \qquad (1.39)$$

The statement that the $t_a^{(R)}$ form a Lie algebra is the statement that we have a relation

$$[t_a^{(R)}, t_b^{(R)}] = i f_{ab}{}^c\, t_c^{(R)} \qquad (1.40)$$

where the $f_{ab}{}^c$ are called the structure constants. Note that the factor of $i$ is conventional: with this definition we can have hermitian generators, for which $\mathrm{tr}(t_a t_b)=\delta_{ab}$; if we redefine $t_a$ by an $i$ we can remove it from there, but then $\mathrm{tr}(t_a t_b)$ can only be set equal to $-\delta_{ab}$ (antihermitian generators).
The representations of the Lorentz group are:
-bosonic: scalars $\phi$, for which $\phi'(x')=\phi(x)$; vectors like the electromagnetic field $A_\mu=(\phi,\vec A)$, which transform as $\partial_\mu$ (covariant) or $dx^\mu$ (contravariant); and representations with products of indices, like for instance the electromagnetic field strength $F_{\mu\nu}$, which transforms as

$$F_{\mu\nu}(\Lambda x) = \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma F_{\rho\sigma}(x) \qquad (1.41)$$

where $\Lambda_\mu{}^\nu = \eta_{\mu\rho}\eta^{\nu\sigma}\Lambda^\rho{}_\sigma$; a field with more indices, $B^{\nu_1...\nu_k}_{\mu_1...\mu_j}$, transforms with the appropriate products of $\Lambda$.
-fermionic: spinors, which will be treated in more detail later on in the course. For now, let us just say that fundamental spinor representations $\psi$ are acted upon by the gamma matrices $\gamma^\mu$.
The Lie algebra of the Lorentz group SO(1, 3) is

[Jµν , Jρσ ] = −iηµρ Jνσ + iηµσ Jνρ − iηνσ Jµρ + iηνρ Jµσ (1.42)

Note that if we denote $a\equiv(\mu\nu)$, $b\equiv(\rho\sigma)$ and $c\equiv(\lambda\pi)$, we then have

$$f_{ab}{}^c = -\eta_{\mu\rho}\delta^\lambda_{[\nu}\delta^\pi_{\sigma]} + \eta_{\mu\sigma}\delta^\lambda_{[\nu}\delta^\pi_{\rho]} - \eta_{\nu\sigma}\delta^\lambda_{[\mu}\delta^\pi_{\rho]} + \eta_{\nu\rho}\delta^\lambda_{[\mu}\delta^\pi_{\sigma]} \qquad (1.43)$$

so (1.42) is indeed of the Lie algebra type.


The Lie algebra $SO(1,3)$ is (modulo some global subtleties) the same as the product of two $SU(2)$'s, i.e. $SU(2)\times SU(2)$, which can be seen by first defining

$$J_{0i} \equiv K_i;\qquad J_{ij} \equiv \epsilon_{ijk} J_k \qquad (1.44)$$

where $i,j,k=1,2,3$, and then redefining

$$M_i \equiv \frac{J_i + iK_i}{2};\qquad N_i \equiv \frac{J_i - iK_i}{2} \qquad (1.45)$$
after which we obtain

[Mi , Mj ] = iϵijk Mk
[Ni , Nj ] = iϵijk Nk
[Mi , Nj ] = 0 (1.46)

which we leave as an exercise to prove.
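A quick numerical check of (1.46), not a substitute for the analytic proof asked for in the exercise, can be done in the 4-vector representation. The numpy snippet below is my own construction (conventions: hermitian rotation generators with Λ = exp(−iθ·J), and boost generators built the same way):

```python
import numpy as np

def E(a, b):
    m = np.zeros((4, 4), dtype=complex)
    m[a, b] = 1.0
    return m

# real rotation and boost generators in the 4-vector rep, x^mu = (t, x, y, z)
R = [E(3, 2) - E(2, 3), E(1, 3) - E(3, 1), E(2, 1) - E(1, 2)]
B = [E(0, i) + E(i, 0) for i in (1, 2, 3)]

J = [1j*r for r in R]                      # hermitian rotation generators J_i
K = [1j*b for b in B]                      # boost generators K_i
M = [(J[i] + 1j*K[i])/2 for i in range(3)]
N = [(J[i] - 1j*K[i])/2 for i in range(3)]

def comm(A, B2):
    return A @ B2 - B2 @ A

# the two su(2)'s close separately and commute with each other, eq. (1.46)
assert np.allclose(comm(M[0], M[1]), 1j*M[2])
assert np.allclose(comm(N[0], N[1]), 1j*N[2])
assert np.allclose(comm(M[0], N[1]), np.zeros((4, 4)))
```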

Important concepts to remember

• Quantum field theory is a relativistic quantum mechanics, which necessarily describes


an arbitrary number of particles.

• Particle-antiparticle pairs can be created and disappear, both as real (energetically


allowed) and virtual (energetically disallowed, only possible due to Heisenberg’s uncer-
tainty principle).

• If we use the usual quantum mechanics rules, even with $E=\sqrt{p^2+m^2}$, we have a causality breakdown: the amplitude for propagation is nonzero even far outside the lightcone.

• Feynman diagrams represent the interaction processes of creation and annihilation of


particles.

• When generalizing classical mechanics to field theory, the label i is generalized to ⃗x in


ϕ(⃗x, t), and we have a Lagrangean density L(⃗x, t), conjugate momentum density π(⃗x, t)
and Hamiltonian density H(⃗x, t).

• For relativistic and local theories, L is a relativistically invariant function defined at a


point xµ .

• The Noether theorem associates a conserved current (∂µ j µ = 0) with a symmetry of


the Lagrangean L, in particular the energy-momentum tensor Tνµ with translations
xµ → xµ + aµ .

• Lorentz representations act on the fields ϕa , and are the exponentials of Lie algebra
generators.

• The Lie algebra of SO(1, 3) splits into two SU (2)’s.

Further reading: See chapters 2.1 and 2.2 in [2] and 1 in [1].

Exercises, Lecture 1

1) Prove that for the Lorentz Lie algebra,

[Jµν , Jρσ ] = − (−iηµρ Jνσ + iηµσ Jνρ − iηνσ Jµρ + iηνρ Jµσ ) , (1.47)

defining

J0i ≡ Ki ; Jij ≡ ϵijk Jk


Ji + iKi Ji − iKi
Mi ≡ ; Ni ≡ (1.48)
2 2
we obtain that the Mi and Ni satisfy

[Mi , Mj ] = iϵijk Mk
[Ni , Nj ] = iϵijk Nk
[Mi , Nj ] = 0 (1.49)

2) Consider the action in Minkowski space

$$S = \int d^4x\left[-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \bar\psi(\slashed{D} + m)\psi - (D_\mu\phi)^* D^\mu\phi\right] \qquad (1.50)$$

where $D_\mu = \partial_\mu - ieA_\mu$, $\slashed{D} = D_\mu\gamma^\mu$, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, $\bar\psi = \psi^\dagger i\gamma_0$, $\psi$ is a spinor field and $\phi$ is a scalar field, and the $\gamma_\mu$ are the gamma matrices, satisfying $\{\gamma_\mu,\gamma_\nu\} = 2\eta_{\mu\nu}$. Consider the electromagnetic $U(1)$ transformation

ψ ′ (x) = eieλ(x) ψ(x); ϕ′ (x) = eieλ(x) ϕ(x); A′µ (x) = Aµ (x) + ∂µ λ(x) (1.51)

Calculate the Noether current.

2 Lecture 2: Quantum mechanics: harmonic oscillator
and QM in terms of path integrals
”The career of a young theoretical physicist
consists of treating the harmonic oscillator
in ever-increasing levels of abstraction”
Sidney Coleman

In this lecture, I will review some facts about the harmonic oscillator in quantum mechan-
ics, and then present how to do quantum mechanics in terms of path integrals, something
that should be taught in a standard quantum mechanics course, though it does not always
happen.
As the quote above shows, understanding really well the harmonic oscillator is crucial:
we understand everything if we understand really well this simple example, such that we
can generalize it to more complicated systems. Similarly, most of the issues of quantum
field theory in path integral formalism can be described by using the simple example of the
quantum mechanical path integral.
Harmonic oscillator
The harmonic oscillator is the simplest possible nontrivial quantum system, with a
quadratic potential, i.e. with the Lagrangean

$$L = \frac{\dot q^2}{2} - \omega^2\frac{q^2}{2} \qquad (2.1)$$

giving the Hamiltonian

$$H = \frac{1}{2}(p^2 + \omega^2 q^2) \qquad (2.2)$$

Using the definition

$$a = \frac{1}{\sqrt{2\omega}}(\omega q + ip);\qquad a^\dagger = \frac{1}{\sqrt{2\omega}}(\omega q - ip) \qquad (2.3)$$

inverted as

$$p = -i\sqrt{\frac{\omega}{2}}(a - a^\dagger);\qquad q = \frac{1}{\sqrt{2\omega}}(a + a^\dagger) \qquad (2.4)$$

we can write the Hamiltonian as

$$H = \frac{\omega}{2}(aa^\dagger + a^\dagger a) \qquad (2.5)$$

where, even though we are now at the classical level, we have been careful to keep the order
of a, a† as it is. Of course, classically we could then write

H = ωa† a (2.6)

In classical mechanics, one can define the Poisson bracket of two functions $f(p,q)$ and $g(p,q)$ as

$$\{f,g\}_{P.B.} \equiv \sum_i\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right) \qquad (2.7)$$

With this definition, we can immediately check that

$$\{p_i, q_j\}_{P.B.} = -\delta_{ij} \qquad (2.8)$$

The Hamilton equations of motion then become

$$\dot q_i = \frac{\partial H}{\partial p_i} = \{q_i, H\}_{P.B.};\qquad \dot p_i = -\frac{\partial H}{\partial q_i} = \{p_i, H\}_{P.B.} \qquad (2.9)$$
Then, canonical quantization is simply the procedure of replacing the c-number variables $(q,p)$ with the operators $(\hat q,\hat p)$, and replacing the Poisson brackets $\{,\}_{P.B.}$ with $\frac{1}{i\hbar}[,]$ (the commutator).
In this way, in theoretical physicist's units, with $\hbar = 1$, we have

$$[\hat p, \hat q] = -i \qquad (2.10)$$

Replacing in the definition of $a, a^\dagger$, we find also

$$[\hat a, \hat a^\dagger] = 1 \qquad (2.11)$$

One thing that is not obvious from the above is the picture we are in. We know that
we can describe quantum mechanics in the Schrödinger picture, with operators independent
of time, or in the Heisenberg picture, where operators depend on time. There are other
pictures, in particular the interaction picture that will be relevant for us later, but are not
important at this time.
In the Heisenberg picture, we can translate the classical Hamilton equations in terms of Poisson brackets into equations for the time evolution of the Heisenberg picture operators, obtaining

$$i\hbar\frac{d\hat q_i}{dt} = [\hat q_i, H];\qquad i\hbar\frac{d\hat p_i}{dt} = [\hat p_i, H] \qquad (2.12)$$

For the quantum Hamiltonian of the harmonic oscillator, we write from (2.5),

$$\hat H_{qu} = \frac{\hbar\omega}{2}\left(\hat a\hat a^\dagger + \hat a^\dagger\hat a\right) = \hbar\omega\left(\hat a^\dagger\hat a + \frac{1}{2}\right) \qquad (2.13)$$
where we have reintroduced $\hbar$ just so we remember that $\hbar\omega$ is an energy. The operators $\hat a$ and $\hat a^\dagger$ are called destruction (annihilation) or lowering, and creation or raising operators, since the eigenstates of the harmonic oscillator Hamiltonian are defined by an occupation number $n$, such that

$$\hat a^\dagger|n\rangle \propto |n+1\rangle;\qquad \hat a|n\rangle \propto |n-1\rangle \qquad (2.14)$$

such that

$$\hat a^\dagger\hat a|n\rangle \equiv \hat N|n\rangle = n|n\rangle \qquad (2.15)$$

That means that in the vacuum, for occupation number $n=0$, we still have an energy

$$E_0 = \frac{\hbar\omega}{2} \qquad (2.16)$$

called the vacuum energy or zero point energy. On the other hand, the remainder is called the normal ordered Hamiltonian $:\hat H:$,

$$:\hat H: \,= \hbar\omega\, \hat a^\dagger\hat a \qquad (2.17)$$

where we define the normal order such that $\hat a^\dagger$ is to the left of $\hat a$, i.e.

$$:\hat a^\dagger\hat a: \,= \hat a^\dagger\hat a;\qquad :\hat a\hat a^\dagger: \,= \hat a^\dagger\hat a \qquad (2.18)$$
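The oscillator algebra above can be made concrete with truncated matrices for â and â†. The numpy snippet below is my own illustration (the truncation artifact in the commutator sits only in the last row and column), checking $[\hat a,\hat a^\dagger]=1$ and the spectrum $E_n = \hbar\omega(n+1/2)$ with $\hbar=1$:

```python
import numpy as np

nmax = 40
n = np.arange(nmax)

# lowering operator: a|n> = sqrt(n)|n-1>, as a truncated matrix
a = np.diag(np.sqrt(n[1:].astype(float)), 1)
ad = a.T.conj()

# [a, a^dagger] = 1, exact away from the truncation edge
comm = a @ ad - ad @ a
assert np.allclose(comm[:-1, :-1], np.eye(nmax - 1))

# H = omega (a^dagger a + 1/2) has eigenvalues omega (n + 1/2)
omega = 2.0
H = omega*(ad @ a + 0.5*np.eye(nmax))
assert np.allclose(np.diag(H), omega*(n + 0.5))
```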

Feynman path integral in quantum mechanics in phase space


We now turn to the discussion of the Feynman path integral.
Given a position $q$ at time $t$, an important quantity is the amplitude (whose modulus squared gives the probability) to find the particle at point $q'$ and time $t'$,

$$F(q',t';q,t) = {}_H\langle q',t'|q,t\rangle_H \qquad (2.19)$$
where |q, t >H is the state, eigenstate of q̂(t) at time t, in the Heisenberg picture.
Let us remember a bit about pictures in Quantum Mechanics. There are more pictures,
but for now we will be interested in the two basic ones, the Schrödinger picture and the
Heisenberg picture. In the Heisenberg picture, operators depend on time, in particular we
have q̂H (t), and the state |q, t >H is independent of time, and t is just a label, which means
that it is an eigenstate of q̂H (t) at time t, i.e.

q̂H (τ = t)|q, t >H = q|q, t >H (2.20)

and it is not an eigenstate for τ ̸= t. The operator in the Heisenberg picture q̂H (t) is related
to the one in the Schrödinger picture q̂S by

q̂H (t) = eiĤt q̂S e−iĤt (2.21)

and the Schrödinger picture state is related as

|q >= e−iĤt |q, t >H (2.22)

and is an eigenstate of $\hat q_S$, i.e.

$$\hat q_S|q\rangle = q|q\rangle \qquad (2.23)$$

In terms of the Schrödinger picture we then have

$$F(q',t';q,t) = \langle q'|e^{-i\hat H(t'-t)}|q\rangle \qquad (2.24)$$

From now on we will drop the indices $H$ and $S$ for states, since it is obvious: if we write $|q,t\rangle$ we are in the Heisenberg picture, if we write $|q\rangle$ we are in the Schrödinger picture.
Let us now derive the path integral representation.
Divide the time interval between $t$ and $t'$ into a large number $n+1$ of equal intervals, and denote

$$\epsilon \equiv \frac{t'-t}{n+1};\qquad t_0 = t,\ t_1 = t+\epsilon,\ ...,\ t_{n+1} = t' \qquad (2.25)$$

But at any fixed $t_i$, the set $\{|q_i,t_i\rangle_H\ |\ q_i\in\mathbb{R}\}$ is a complete set, meaning that we have the completeness relation

$$\int dq_i\, |q_i,t_i\rangle\langle q_i,t_i| = \mathbf{1} \qquad (2.26)$$

We then introduce $n$ factors of $\mathbf 1$, one for each $t_i$, $i=1,...,n$, in $F(q',t';q,t)$ in (2.19), obtaining

$$F(q',t';q,t) = \int dq_1...dq_n\, \langle q',t'|q_n,t_n\rangle\langle q_n,t_n|q_{n-1},t_{n-1}\rangle ... |q_1,t_1\rangle\langle q_1,t_1|q,t\rangle \qquad (2.27)$$

where $q_i\equiv q(t_i)$ give us a regularized path between $q$ and $q'$.
But note that this is not a classical path, since at any $t_i$, $q_i$ can be anything ($q_i\in\mathbb{R}$), independent of $q_{i-1}$, and independent of how small $\epsilon$ is, whereas classically we have a continuous path, meaning that as $\epsilon$ gets smaller, $q_i - q_{i-1}$ can only be smaller and smaller. But integrating over all $q_i = q(t_i)$ means we integrate over these quantum paths, where $q_i$ is arbitrary (independent of $q_{i-1}$), as in Fig. 3. Therefore we denote
$$\mathcal{D}q(t) \equiv \prod_{i=1}^{n} dq(t_i) \qquad (2.28)$$

and this is an "integral over all paths", or "path integral".


On the other hand, considering that

$$|q\rangle = \int \frac{dp}{2\pi}|p\rangle\langle p|q\rangle;\qquad |p\rangle = \int dq\, |q\rangle\langle q|p\rangle \qquad (2.29)$$

(note the factor of $2\pi$, which is necessary, since $\langle q|p\rangle = e^{ipq}$ and $\int dq\, e^{iq(p-p')} = 2\pi\delta(p-p')$), we have

$$_H\langle q(t_i),t_i|q(t_{i-1}),t_{i-1}\rangle_H = \langle q(t_i)|e^{-i\epsilon\hat H}|q(t_{i-1})\rangle = \int\frac{dp(t_i)}{2\pi}\langle q(t_i)|p(t_i)\rangle\langle p(t_i)|e^{-i\epsilon\hat H}|q(t_{i-1})\rangle \qquad (2.30)$$

Figure 3: In the quantum mechanical path integral, we integrate over discretized paths. The paths are not necessarily smooth, as classical paths are: we divide the path into a large number of discrete points, and then integrate over the positions of these points.

Now we need a technical requirement on the quantum Hamiltonian: it has to be ordered


such that all the p̂’s are to the left of the q̂’s.
Then, to the first order in ϵ, we can write

< p(ti )|e−iϵĤ(p̂,q̂) |q(ti−1 ) >= e−iϵH(p(ti ),q(ti−1 )) < p(ti )|q(ti−1 ) >= e−iϵH(p(ti ),q(ti−1 )) e−ip(ti )q(ti−1 )
(2.31)
since p̂ will act on the left on < p(ti )| and q̂ will act on the right on |q(ti−1 ) >. Of course,
to higher order in ϵ, we have Ĥ(p̂, q̂)Ĥ(p̂, q̂) which will have terms like p̂q̂ p̂q̂ which are more
complicated. But since we have ϵ → 0, we only need the first order in ϵ.
Then we get

$$F(q',t';q,t) = \int\prod_{i=1}^{n+1}\frac{dp(t_i)}{2\pi}\prod_{j=1}^{n}dq(t_j)\, \langle q(t_{n+1})|p(t_{n+1})\rangle\langle p(t_{n+1})|e^{-i\epsilon\hat H}|q(t_n)\rangle ... \langle q(t_1)|p(t_1)\rangle\langle p(t_1)|e^{-i\epsilon\hat H}|q(t_0)\rangle$$
$$= \int\mathcal{D}p(t)\mathcal{D}q(t)\exp\left\{i\left[p(t_{n+1})(q(t_{n+1})-q(t_n)) + ... + p(t_1)(q(t_1)-q(t_0)) - \epsilon\left(H(p(t_{n+1}),q(t_n)) + ... + H(p(t_1),q(t_0))\right)\right]\right\}$$
$$= \int\mathcal{D}p(t)\mathcal{D}q(t)\exp\left\{i\int_{t_0}^{t_{n+1}}dt\, [p(t)\dot q(t) - H(p(t),q(t))]\right\} \qquad (2.32)$$

where we have used that $q(t_{i+1})-q(t_i)\to dt\,\dot q(t_i)$. The above expression is called the path integral in phase space.
We note that this was derived rigorously (for a physicist, of course...). But we would
like a path integral in configuration space. For that however, we need one more technical

requirement: we need the Hamiltonian to be quadratic in momenta, i.e.

$$H(p,q) = \frac{p^2}{2} + V(q) \qquad (2.33)$$
If this is not true, we have to start in phase space and see what we get in configuration space.
But if we have only quadratic terms in momenta, we can use gaussian integration to derive
the path integral in configuration space. Therefore we will make a math interlude to define
some gaussian integration formulas that will be useful here and later.
Gaussian integration
The basic gaussian integral is

$$I = \int_{-\infty}^{+\infty}e^{-\alpha x^2}dx = \sqrt{\frac{\pi}{\alpha}} \qquad (2.34)$$

Squaring this integral formula (at $\alpha=1$), we obtain also

$$I^2 = \int dx\, dy\, e^{-(x^2+y^2)} = \int_0^{2\pi}d\phi\int_0^\infty r\, dr\, e^{-r^2} = \pi \qquad (2.35)$$

We can generalize it as

$$\int d^n x\, e^{-\frac{1}{2}x_i A_{ij} x_j} = (2\pi)^{n/2}(\det A)^{-1/2} \qquad (2.36)$$

which can be proven for instance by diagonalizing the matrix $A$: since $\det A = \prod_i \alpha_i$, with $\alpha_i$ the eigenvalues of $A$, we get the above.
Finally, consider the object

$$S = \frac{1}{2}x^T A x + b^T x \qquad (2.37)$$

(which will later on in the course be used as an action). Considering it as an action, the classical solution will be $\partial S/\partial x_i = 0$,

$$x_c = -A^{-1}b \qquad (2.38)$$

and then

$$S(x_c) = -\frac{1}{2}b^T A^{-1} b \qquad (2.39)$$

which means that we can write

$$S = \frac{1}{2}(x-x_c)^T A (x-x_c) - \frac{1}{2}b^T A^{-1} b \qquad (2.40)$$

and thus we find

$$\int d^n x\, e^{-S(x)} = (2\pi)^{n/2}(\det A)^{-1/2}e^{-S(x_c)} = (2\pi)^{n/2}(\det A)^{-1/2}e^{+\frac{1}{2}b^T A^{-1}b} \qquad (2.41)$$
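Formula (2.41) can be spot-checked numerically in a small example. The scipy snippet below is my own check, at $n=2$, with a randomly generated positive-definite $A$ and shift $b$, comparing direct numerical integration against the closed form:

```python
import numpy as np
from scipy.integrate import dblquad

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 2))
A = X @ X.T + 2*np.eye(2)      # symmetric positive definite
b = rng.normal(size=2)

def S(x, y):
    v = np.array([x, y])
    return 0.5*v @ A @ v + b @ v

# direct 2d integration of exp(-S) over a range wide enough for the tails
num, _ = dblquad(lambda y, x: np.exp(-S(x, y)),
                 -10, 10, lambda x: -10, lambda x: 10)

# closed form (2.41): (2 pi)^{n/2} (det A)^{-1/2} exp(b^T A^{-1} b / 2)
exact = (2*np.pi)*np.linalg.det(A)**(-0.5)*np.exp(0.5*b @ np.linalg.inv(A) @ b)
assert np.isclose(num, exact, rtol=1e-5)
```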

Path integral in configuration space

We are now ready to go to configuration space. The gaussian integration we need to do is then the one over $\mathcal{D}p(t)$, which is

$$\int\mathcal{D}p(\tau)\, e^{i\int_t^{t'}d\tau\,[p(\tau)\dot q(\tau) - \frac{1}{2}p^2(\tau)]} \qquad (2.42)$$

which discretized is

$$\prod_i\frac{dp(t_i)}{2\pi}\exp\left[i\Delta\tau\left(p(t_i)\dot q(t_i) - \frac{1}{2}p^2(t_i)\right)\right] \qquad (2.43)$$

and therefore $x_i = p(t_i)$, $A_{ij} = i\Delta\tau\,\delta_{ij}$, $b_i = -i\Delta\tau\,\dot q(t_i)$, giving

$$\int\mathcal{D}p(\tau)\, e^{i\int_t^{t'}d\tau\,[p(\tau)\dot q(\tau) - \frac{1}{2}p^2(\tau)]} = \mathcal{N}e^{i\int_t^{t'}d\tau\,\frac{\dot q(\tau)^2}{2}} \qquad (2.44)$$
where N contains constant factors of 2, π, i, ∆τ , which we will see are irrelevant.


$$F(q',t';q,t) = \mathcal{N}\int\mathcal{D}q\exp\left\{i\int_t^{t'}d\tau\left[\frac{\dot q^2(\tau)}{2} - V(q)\right]\right\} = \mathcal{N}\int\mathcal{D}q\exp\left\{i\int_t^{t'}d\tau\, L(q(\tau),\dot q(\tau))\right\} = \mathcal{N}\int\mathcal{D}q\, e^{iS[q]} \qquad (2.45)$$

This is the path integral in configuration space that we were seeking. But we have to remember that this is valid only if the Hamiltonian is quadratic in momenta.
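The key step, the gaussian $\mathcal{D}p$ integral done one time slice at a time, can be checked with sympy. The sketch below is my own check, done on the Euclidean (Wick-rotated, absolutely convergent) analogue of a single slice; completing the square produces exactly the kinetic term $\dot q^2/2$, times an (irrelevant) normalization $1/\sqrt{2\pi\epsilon}$ that goes into $\mathcal{N}$:

```python
import sympy as sp

p, qdot = sp.symbols('p qdot', real=True)
eps = sp.symbols('epsilon', positive=True)

# Euclidean analogue of one time slice of the Dp integral:
# int dp/(2 pi) exp(-eps (p^2/2 - p qdot))
res = sp.integrate(sp.exp(-eps*(p**2/2 - p*qdot))/(2*sp.pi),
                   (p, -sp.oo, sp.oo))

# completing the square leaves exp(+eps qdot^2/2) times 1/sqrt(2 pi eps)
expected = sp.exp(eps*qdot**2/2)/sp.sqrt(2*sp.pi*eps)
assert sp.simplify(res - expected) == 0
```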
Correlation functions
We have found how to write the amplitude for the transition between $(q,t)$ and $(q',t')$, and that is good. But there are other observables that we can construct which are of interest, for instance the correlation functions.
The simplest one is the one-point function

$$\langle q',t'|\hat q(\bar t)|q,t\rangle \qquad (2.46)$$
where we can make it such that t̄ coincides with a ti in the discretization of the time interval.
The calculation now proceeds as before, but in the step (2.27) we introduce the 1’s such that
we have (besides the usual products) also the expectation value
< qi+1 , ti+1 |q̂(t̄)|qi , ti >= q(t̄) < qi+1 , ti+1 |qi , ti > (2.47)
since t̄ = ti ⇒ q(t̄) = qi . Then the calculation proceeds as before, since the only new thing
is the appearance of $q(\bar t)$, leading to

$$\langle q',t'|\hat q(\bar t)|q,t\rangle = \int\mathcal{D}q\, e^{iS[q]}q(\bar t) \qquad (2.48)$$

Consider next the two-point functions,


< q ′ , t′ |q̂(t̄2 )q̂(t̄1 )|q, t > (2.49)

If we have $\bar t_1 < \bar t_2$, we can proceed as before. Indeed, remember that in (2.27) the $\mathbf 1$'s were introduced in time order, such that we can do the rest of the calculation and get the path integral. Therefore, if $\bar t_1 < \bar t_2$, we can choose $\bar t_1 = t_i$, $\bar t_2 = t_j$, such that $j > i$, and then we have

$$...\langle q_{j+1},t_{j+1}|\hat q(\bar t_2)|q_j,t_j\rangle ... \langle q_{i+1},t_{i+1}|\hat q(\bar t_1)|q_i,t_i\rangle ... = ...q(\bar t_2)\langle q_{j+1},t_{j+1}|q_j,t_j\rangle ... q(\bar t_1)\langle q_{i+1},t_{i+1}|q_i,t_i\rangle ... \qquad (2.50)$$

besides the usual products, leading to

$$\int\mathcal{D}q\, e^{iS[q]}q(\bar t_2)q(\bar t_1) \qquad (2.51)$$

Conversely then, the path integral leads to the two-point function where the $q(t)$ are ordered according to time (time ordering), i.e.

$$\int\mathcal{D}q\, e^{iS[q]}q(\bar t_1)q(\bar t_2) = \langle q',t'|T\{\hat q(\bar t_1)\hat q(\bar t_2)\}|q,t\rangle \qquad (2.52)$$

where time ordering is defined as

$$T\{\hat q(\bar t_1)\hat q(\bar t_2)\} = \hat q(\bar t_1)\hat q(\bar t_2)\ \ \mathrm{if}\ \bar t_1 > \bar t_2;\qquad = \hat q(\bar t_2)\hat q(\bar t_1)\ \ \mathrm{if}\ \bar t_2 > \bar t_1 \qquad (2.53)$$

which has an obvious generalization to

$$T\{\hat q(\bar t_1)...\hat q(\bar t_N)\} = \hat q(\bar t_1)...\hat q(\bar t_N)\ \ \mathrm{if}\ \bar t_1 > \bar t_2 > ... > \bar t_N \qquad (2.54)$$

and otherwise they are ordered in the order of their times.


Then we similarly find that the $n$-point function or correlation function is

$$G_n(\bar t_1,...,\bar t_n) \equiv \langle q',t'|T\{\hat q(\bar t_1)...\hat q(\bar t_n)\}|q,t\rangle = \int\mathcal{D}q\, e^{iS[q]}q(\bar t_1)...q(\bar t_n) \qquad (2.55)$$

In math, for a set {an }n , we can define a generating function F (z),


F(z) ≡ ∑_n (1/n!) an z^n (2.56)

such that we can find an from its derivatives,


an = (d^n/dz^n) F(z)|_{z=0} (2.57)
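As a quick check of (2.56)-(2.57), one can take the sequence an = a^n, whose generating function is F(z) = e^{az}, and recover an by differentiating at z = 0; a sympy sketch:

```python
import sympy as sp

z, a = sp.symbols('z a')
F = sp.exp(a*z)          # generating function of the sequence a_n = a**n
# a_n = d^n/dz^n F(z) at z = 0, as in (2.57)
a3 = sp.diff(F, z, 3).subs(z, 0)
print(a3)  # a**3
```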
Similarly now, we can define a generating functional
Z[J] = ∑_{N≥0} (i^N/N!) ∫ dt1 ... ∫ dtN GN(t1, ..., tN) J(t1)...J(tN) (2.58)

As we see, the difference is that now we have GN(t1, ..., tN) instead of aN, so we need to
integrate over the times, and instead of z we introduce J(t); the factor of i is conventional.
Using (2.55), the integrals factorize, and we obtain just powers of the same integral,

Z[J] = ∫ Dq e^{iS[q]} ∑_{N≥0} (1/N!) [i ∫ dt q(t)J(t)]^N (2.59)

so finally

Z[J] = ∫ Dq e^{iS[q,J]} = ∫ Dq e^{iS[q] + i∫dt J(t)q(t)} (2.60)

We then find that this object indeed generates the correlation functions by

(δ^N/(iδJ(t1))...(iδJ(tN))) Z[J]|_{J=0} = ∫ Dq e^{iS[q]} q(t1)...q(tN) = GN(t1, ..., tN) (2.61)
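A finite-dimensional analogue may make (2.61) concrete: for a single Gaussian "path integral" (Euclidean, for convergence) Z(J) = ∫ dq e^{−q²/2+Jq}, two J-derivatives at J = 0 give the normalized two-point function. This toy model is an illustration, not the Minkowski construction of the text:

```python
import sympy as sp

q, J = sp.symbols('q J', real=True)
# One-variable Euclidean analogue of Z[J]: action S_E = q**2/2, source term J*q
Z = sp.integrate(sp.exp(-q**2/2 + J*q), (q, -sp.oo, sp.oo))  # sqrt(2*pi)*exp(J**2/2)
# Normalized two-point function G2 = (1/Z(0)) d^2 Z / dJ^2 at J = 0
G2 = (sp.diff(Z, J, 2) / Z.subs(J, 0)).subs(J, 0)
print(sp.simplify(G2))  # 1
```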

Important concepts to remember

• For the harmonic oscillator, the Hamiltonian in terms of a, a† is H = ω/2(aa† + a† a).

• Canonical quantization replaces classical functions with quantum operators, and Poisson
brackets with commutators, {, }P.B. → 1/(i~)[, ].

• At the quantum level, the harmonic oscillator Hamiltonian is the sum of a normal
ordered part and a zero point energy part.

• The transition amplitude from (q, t) to (q′, t′) is a path integral in phase space,
F(q′, t′; q, t) = ∫ DqDp e^{i∫(pq̇−H)}.

• If the Hamiltonian is quadratic in momenta, we can go to the path integral in configuration
space and find F(q′, t′; q, t) = ∫ Dq e^{iS}.

• The n-point functions or correlation functions, with insertions of q(ti) in the path
integral, give the expectation values of time-ordered q̂(t)'s.

• The n-point functions can be found from the derivatives of the generating functional
Z[J].

Further reading: See chapters 1.3 in [4] and 2 in [3].

Exercises, Lecture 2

1) Let
L(q, q̇) = q̇²/2 − (λ/4!) q⁴ (2.62)
Write down the Hamiltonian equations of motion and the path integral in phase space for
this model.

2) Let

ln Z[J] = ∫ dt f(t) J²(t)/2 + λ ∫ dt J³(t)/3! + λ̃ ∫ dt J⁴(t)/4! (2.63)
Calculate the 3-point function and the 4-point function.

3 Lecture 3: Canonical quantization of scalar fields
As we saw, in quantum mechanics, for a particle with Hamiltonian H(p, q), we replace the
Poisson bracket
{f, g}P.B. = ∑_i (∂f/∂qi ∂g/∂pi − ∂f/∂pi ∂g/∂qi) (3.1)

with {pi, qj}P.B. = −δij, with the commutator, {, }P.B. → (1/i~)[, ], and all functions of (p, q)
become quantum operators, in particular [p̂i, q̂j] = −i~, and for the harmonic oscillator we
have
[â, ↠] = 1 (3.2)
Then, in order to generalize to field theory, we must first define the Poisson brackets. As
we already have a definition in the case of a set of particles, we will discretize space, in order
to use that definition. Therefore we consider the coordinates and conjugate momenta

qi(t) = √(∆V) ϕi(t); pi(t) = √(∆V) πi(t) (3.3)

where ϕi (t) ≡ ϕ(⃗xi , t) and similarly for πi (t). We should also define how to go from the
derivatives in the Poisson brackets to functional derivatives. The recipe is

1 ∂fi (t) δf (ϕ(⃗xi , t), π(⃗xi , t))



∆V ∂ϕj (t) δϕ(⃗xj , t)
∆V → d x3
(3.4)

where functional derivatives are defined such that, for instance



H(t) = ∫ d³x ϕ²(⃗x, t)/2 ⇒ δH(t)/δϕ(⃗x, t) = ϕ(⃗x, t) (3.5)

i.e., by dropping the integral sign and then taking normal derivatives.
Replacing these definitions in the Poisson brackets (3.1), we get
{f, g}P.B. = ∫ d³x [δf/δϕ(⃗x, t) · δg/δπ(⃗x, t) − δf/δπ(⃗x, t) · δg/δϕ(⃗x, t)] (3.6)

and then we immediately find

{ϕ(⃗x, t), π(⃗x′, t)}P.B. = δ³(⃗x − ⃗x′)
{ϕ(⃗x, t), ϕ(⃗x′, t)}P.B. = {π(⃗x, t), π(⃗x′, t)}P.B. = 0 (3.7)

where we note that these are equal-time brackets, in the same way that before we really had
{qi(t), pj(t)}P.B. = δij.
We can now easily do canonical quantization of this scalar field. We just replace classical
fields ϕ(⃗x, t) with quantum Heisenberg operators ϕH (⃗x, t) (we will drop the H, understanding
that if there is time dependence we are in the Heisenberg picture and if we don’t have

time dependence we are in the Schrödinger picture), and {, }P.B. → 1/(i~)[, ], obtaining the
fundamental equal time commutation relations

[ϕ(⃗x, t), π(⃗x′ , t)] = i~δ 3 (⃗x − ⃗x′ )


[ϕ(⃗x, t), ϕ(⃗x′ , t)] = [π(⃗x, t), π(⃗x′ , t)] = 0 (3.8)

We further define the Fourier transforms



ϕ(⃗x, t) = ∫ d³p/(2π)³ e^{i⃗p·⃗x} ϕ(⃗p, t) ⇒ ϕ(⃗p, t) = ∫ d³x e^{−i⃗p·⃗x} ϕ(⃗x, t)
π(⃗x, t) = ∫ d³p/(2π)³ e^{i⃗p·⃗x} π(⃗p, t) ⇒ π(⃗p, t) = ∫ d³x e^{−i⃗p·⃗x} π(⃗x, t) (3.9)

We further define, using the same formulas as we did for the harmonic oscillator, but
now for the momentum modes,

a(⃗k, t) = √(ωk/2) ϕ(⃗k, t) + (i/√(2ωk)) π(⃗k, t)
a†(⃗k, t) = √(ωk/2) ϕ†(⃗k, t) − (i/√(2ωk)) π†(⃗k, t) (3.10)

where we will find later that ωk = √(⃗k² + m²), but for the moment we will only need
that ωk = ω(|⃗k|). Then replacing these definitions in ϕ and π, we obtain

ϕ(⃗x, t) = ∫ d³p/(2π)³ (1/√(2ωp)) (a(⃗p, t) e^{i⃗p·⃗x} + a†(⃗p, t) e^{−i⃗p·⃗x})
        = ∫ d³p/(2π)³ (1/√(2ωp)) e^{i⃗p·⃗x} (a(⃗p, t) + a†(−⃗p, t))
π(⃗x, t) = ∫ d³p/(2π)³ (−i)√(ωp/2) (a(⃗p, t) e^{i⃗p·⃗x} − a†(⃗p, t) e^{−i⃗p·⃗x})
        = ∫ d³p/(2π)³ (−i)√(ωp/2) e^{i⃗p·⃗x} (a(⃗p, t) − a†(−⃗p, t)) (3.11)

In terms of a(⃗p, t) and a† (⃗p, t) we obtain the commutators

[a(⃗p, t), a† (⃗p′ , t)] = (2π)3 δ 3 (⃗p − p⃗′ )


[a(⃗p, t), a(⃗p′ , t)] = [a† (⃗p, t), a† (⃗p′ , t)] = 0 (3.12)

We can check, for instance, that the [ϕ, π] commutator is correct with these commutators:
[ϕ(⃗x, t), π(⃗x′, t)] = ∫ d³p/(2π)³ ∫ d³p′/(2π)³ (−i/2)√(ωp′/ωp) ([a†(−⃗p, t), a(⃗p′, t)] − [a(⃗p, t), a†(−⃗p′, t)]) e^{i(⃗p·⃗x + ⃗p′·⃗x′)}
= iδ³(⃗x − ⃗x′). (3.13)
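The algebra relating (3.10) and (3.11) can also be checked symbolically, treating the momentum-space fields as commuting symbols (enough for the inversion, though of course not for the commutators):

```python
import sympy as sp

w = sp.symbols('omega', positive=True)
phi, pi = sp.symbols('phi pi')   # stand-ins for phi(p, t) and pi(p, t)

a       = sp.sqrt(w/2)*phi + sp.I/sp.sqrt(2*w)*pi   # a(p, t), eq. (3.10)
a_dag_m = sp.sqrt(w/2)*phi - sp.I/sp.sqrt(2*w)*pi   # a†(-p, t), using phi†(p) = phi(-p)

# Invert to recover the mode expansion coefficients of (3.11)
phi_back = sp.simplify((a + a_dag_m)/sp.sqrt(2*w))
pi_back  = sp.simplify(-sp.I*sp.sqrt(w/2)*(a - a_dag_m))
print(phi_back, pi_back)  # phi pi
```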

We note now that the calculation above was independent of the form of ωp (= √(p² + m²));
we only used that ωp = ω(|⃗p|), but otherwise we just used the definitions from the harmonic
oscillator. We also have not written any explicit time dependence, it was left implicit through
a(⃗p, t), a† (⃗p, t). Yet, we obtained the same formulas as for the harmonic oscillator.
We should now understand the dynamics, which will give us the formula for ωp .
We therefore go back, and start systematically. We work with a free scalar field, i.e. one
with V = 0, with Lagrangean

L = −(1/2) ∂µϕ ∂^µϕ − (m²/2) ϕ² (3.14)

and action S = ∫ d⁴x L. Partially integrating, −(1/2)∫ ∂µϕ ∂^µϕ = +(1/2)∫ ϕ ∂µ∂^µϕ, we obtain
the Klein-Gordon (KG) equation of motion

(∂µ ∂ µ − m2 )ϕ = 0 ⇒ (−∂t2 + ∂⃗x2 − m2 )ϕ = 0 (3.15)

Going to momentum (⃗p) space via a Fourier transform, we obtain


[∂²/∂t² + (⃗p² + m²)] ϕ(⃗p, t) = 0 (3.16)

which is the equation of motion for a harmonic oscillator with ω = ωp = √(⃗p² + m²).
This means that the Hamiltonian is

H = (1/2)(p² + ωp² ϕ²) (3.17)

and we can use the transformation

ϕ = (1/√(2ω))(a + a†); p = −i√(ω/2)(a − a†) (3.18)

and we have [a, a† ] = 1. Therefore now we can justify the transformations that we did a
posteriori.
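The transformation (3.18) can also be checked numerically in a truncated Fock space: building a, a† as matrices, the Hamiltonian (3.17) comes out diagonal with eigenvalues ω(n + 1/2), up to truncation effects in the last entry. A sketch (the truncation size Nmax and the value of ω are arbitrary choices):

```python
import numpy as np

# Truncated Fock-space check of H = (p**2 + w**2 * phi**2)/2 = w*(N + 1/2),
# assuming the transformation (3.18).
Nmax, w = 8, 2.0
a = np.diag(np.sqrt(np.arange(1, Nmax)), k=1)   # annihilation operator
ad = a.T                                        # creation operator
phi = (a + ad)/np.sqrt(2*w)
p = -1j*np.sqrt(w/2)*(a - ad)
H = 0.5*(p @ p + w**2 * phi @ phi)
# Diagonal reads w*(n + 1/2); only the last entry is polluted by the truncation.
print(np.round(np.real(np.diag(H))[:4], 6))  # [1. 3. 5. 7.]
```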
Let’s also calculate the Hamiltonian. As explained in the first lecture, using the dis-
cretization of space, we write

H = ∑_⃗x p(⃗x, t)ϕ̇(⃗x, t) − L = ∫ d³x [π(⃗x, t)ϕ̇(⃗x, t) − L] ≡ ∫ d³x H (3.19)

and from the Lagrangean (3.14) we obtain

π(⃗x, t) = ϕ̇(⃗x, t) ⇒
H = (1/2)π² + (1/2)(∇⃗ϕ)² + (1/2)m²ϕ² (3.20)
Substituting ϕ and π inside this, we obtain

H = ∫ d³x ∫ d³p/(2π)³ ∫ d³p′/(2π)³ e^{i(⃗p+⃗p′)·⃗x} { −(√(ωp ωp′)/4)(a(⃗p, t) − a†(−⃗p, t))(a(⃗p′, t) − a†(−⃗p′, t))
+ ((−⃗p·⃗p′ + m²)/(4√(ωp ωp′)))(a(⃗p, t) + a†(−⃗p, t))(a(⃗p′, t) + a†(−⃗p′, t)) }
= ∫ d³p/(2π)³ (ωp/2)(a†(⃗p, t)a(⃗p, t) + a(⃗p, t)a†(⃗p, t)) (3.21)

where in the last line we have first done the integral over ⃗x, obtaining δ 3 (⃗p + p⃗′ ) and then we
integrate over p⃗′ , obtaining p⃗′ = −⃗p. We have finally reduced the Hamiltonian to an infinite
(continuum, even) sum over harmonic oscillators.
We have dealt with the first point observed earlier, about the dynamics of the theory.
We now address the second, of the explicit time dependence.
We have Heisenberg operators, for which the time evolution is
d
i a(⃗p, t) = [a(⃗p, t), H] (3.22)
dt
Calculating the commutator from the above Hamiltonian (using the fact that [a, aa† ] =
[a, a† a] = a), we obtain
d
i a(⃗p, t) = ωp a(⃗p, t) (3.23)
dt
More generally, the time evolution of the Heisenberg operators in field theories is given by

O(x) = OH (⃗x, t) = eiHt O(⃗x)e−iHt (3.24)

which is equivalent to

i ∂/∂t OH = [OH, H] (3.25)

via

i d/dt (e^{iAt} B e^{−iAt}) = [e^{iAt} B e^{−iAt}, A] (3.26)
The solution of (3.23) is

a(⃗p, t) = ap⃗ e−iωp t


a† (⃗p, t) = a†p⃗ e+iωp t (3.27)

Replacing in ϕ(⃗x, t) and π(⃗x, t), we obtain

ϕ(⃗x, t) = ∫ d³p/(2π)³ (1/√(2Ep)) (a⃗p e^{ip·x} + a†⃗p e^{−ip·x})|_{p⁰=Ep}
π(⃗x, t) = ∂ϕ(⃗x, t)/∂t (3.28)
so we have formed the Lorentz invariants e^{±ip·x}, though we haven't yet written an explicitly
Lorentz invariant formula. We will do that next lecture. Here we have denoted Ep = √(⃗p² + m²),
remembering that it is the relativistic energy of a particle of momentum ⃗p and mass m.
Finally, of course, if we want the Schrödinger picture operators, we have to remember
the relation between the Heisenberg and Schrödinger pictures,

ϕH (⃗x, t) = eiHt ϕ(⃗x)e−iHt (3.29)

Discretization
Continuous systems are hard to understand, so it would be better if we could find a
rigorous way to discretize the system. Luckily, there is such a method, namely we consider a
space of finite volume V , i.e. we ”put the system in a box”. Obviously, this doesn’t discretize
space, but it does discretize momenta, since in a direction z of length Lz , allowed momenta
will be only kn = 2πn/Lz .
Then, the discretization is defined, as always, by

∫ d³k → (1/V) ∑_⃗k
δ³(⃗k − ⃗k′) → V δ_{⃗k⃗k′} (3.30)

to which we add the redefinition

a_⃗k → √(V(2π)³) α_⃗k (3.31)

which allows us to keep the usual orthonormality condition in the discrete limit, [α⃗k , α⃗k† ′ ] =
δ⃗k⃗k′ .
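The statement that the box discretizes momenta can be illustrated numerically: in one dimension the allowed momenta are k_n = 2πn/Lz with spacing 2π/Lz, and the mode sum weighted by the spacing approaches the continuum integral as the box grows. (In the conventions of (3.30) the 2π factors are bundled differently, but the idea is the same.) A sketch with an arbitrary test function:

```python
import numpy as np

# Momenta in a box of length L are discretized: k_n = 2*pi*n/L, spacing 2*pi/L.
# The weighted mode sum approaches the continuum integral as L grows; here for
# the sample function f(k) = exp(-k**2), whose integral is sqrt(pi).
f = lambda k: np.exp(-k**2)
for L in (5.0, 50.0):
    n = np.arange(-1000, 1001)
    k = 2*np.pi*n/L
    approx = (2*np.pi/L) * f(k).sum()   # (2*pi/L) * sum_n f(k_n) ~ integral of f
    print(L, approx)                    # -> sqrt(pi) ~ 1.7724539 as L grows
```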
Using these relations, and replacing the time dependence, which cancels out of the Hamil-
tonian, we get the Hamiltonian of the free scalar field in a box of volume V ,
H = ∑_⃗k (~ω⃗k/2)(α†_⃗k α_⃗k + α_⃗k α†_⃗k) = ∑_⃗k h_⃗k (3.32)

where h⃗k is the Hamiltonian of a single harmonic oscillator,


h_⃗k = ω_⃗k (N_⃗k + 1/2), (3.33)

N⃗k = α⃗k† α⃗k is the number operator for mode ⃗k, with eigenstates |n⃗k >,

N⃗k |n⃗k >= n⃗k |n⃗k > (3.34)

and the orthonormal eigenstates are


|n> = (1/√(n!)) (α†)^n |0>; <n|m> = δmn (3.35)

Here α†_⃗k is a raising/creation operator and α_⃗k a lowering/annihilation (destruction) operator,
named so since they create and annihilate a particle, respectively, i.e.

α_⃗k |n_⃗k> = √(n_⃗k) |n_⃗k − 1>
α†_⃗k |n_⃗k> = √(n_⃗k + 1) |n_⃗k + 1>
h_⃗k |n_⃗k> = ω_⃗k (n_⃗k + 1/2) |n_⃗k> (3.36)
Therefore, as we know from quantum mechanics, there is a ground state |0>, and states |n_⃗k> with
n_⃗k ∈ N, where n_⃗k is called the occupation number, or number of particles in the state ⃗k.
Fock space
The Hilbert space of states in terms of eigenstates of the number operator is called Fock
space, or Fock space representation. The Fock space representation for the states of a single
harmonic oscillator is H⃗k = {|n⃗k >}.
Since the total Hamiltonian is the sum of the Hamiltonians for each mode, the total
Hilbert space is the direct product of the Hilbert spaces of the Hamiltonians for each mode,
H = ⊗⃗k H⃗k . Its states are then
 
∏ ∏ 1
|{n⃗k } >= |n⃗k >=  √ (α⃗k† )n⃗k  |0 > (3.37)
⃗ ⃗
n⃗k !
k k

Note that we have defined a unique vacuum for all the Hamiltonians, |0>, such that α_⃗k |0> = 0,
∀⃗k, instead of denoting it as ∏_⃗k |0>_⃗k.
Normal ordering
For a single harmonic oscillator mode, the ground state energy, or zero point energy, is
~ω_⃗k/2, which we might think could have some physical significance. But for a free scalar
field, even one in a box ("discretized"), the total ground state energy is ∑_⃗k ~ω_⃗k/2 = ∞,
and since an observable of infinite value doesn't make sense, we have to consider it
unobservable, and set it to zero.
In this simple model, that’s no problem, but consider the case where this free scalar field
is coupled to gravity. In a gravitational theory, energy is equivalent to mass, and gravitates,
i.e. it can be measured by its gravitational effects. So how can we drop a constant piece from
the energy? Are we allowed to do that? In fact, this is part of one of the biggest problems
of modern theoretical physics, the cosmological constant problem, and the answer to this
question is far from obvious. At this level however, we will not bother with this question
anymore, and drop the infinite constant.
But it is also worth mentioning, that while infinities are of course unobservable, the finite
difference between two infinite quantities might be observable, and in fact one such case was
already measured. If we consider the difference in the zero point energies between fields in
two different boxes, one of volume V1 and another of volume V2 , that is measurable, and
leads to the so-called Casimir effect, which we will discuss at the end of the course.
We are then led to define the normal ordered Hamiltonian,
:H: = H − (1/2)∑_⃗k ~ω_⃗k = ∑_⃗k ~ω_⃗k N_⃗k (3.38)

by dropping the infinite constant. The normal order is to have always a† before a, i.e.
: a† a := a† a, : aa† := a† a. Since a’s commute among themselves, as do a† ’s, and operators
from different modes, in case these appear, we don’t need to bother with their order. For
instance then, : aa† a† aaaa† := a† a† a† aaa.
We then consider that only normal ordered operators have physical expectation values,
i.e. for instance
< 0| : O : |0 > (3.39)
is measurable.
One more observation to make is that in the expansion of ϕ we have

(ap⃗ eip·x + a†p⃗ e−ip·x )p0 =Ep (3.40)



and here Ep = +√(⃗p² + m²), but we note that the second term has positive frequency (energy),
a†_⃗p e^{−iEp t}, whereas the first has negative frequency (energy), a_⃗p e^{+iEp t}, i.e. we create E > 0 and
destroy E < 0, which means that in this context we have only positive energy excitations.
But we will see in the next lecture that in the case of the complex scalar field, we create
both E > 0 and E < 0 and similarly destroy, leading to the concept of anti-particles. At
this time however, we don’t have that.
Bose-Einstein statistics
Since [a†_⃗k, a†_⃗k′] = 0, for a general state defined by a wavefunction ψ(⃗k1, ⃗k2),

|ψ> = ∑_{⃗k1,⃗k2} ψ(⃗k1, ⃗k2) α†_⃗k1 α†_⃗k2 |0>
    = ∑_{⃗k1,⃗k2} ψ(⃗k2, ⃗k1) α†_⃗k1 α†_⃗k2 |0> (3.41)

where in the second line we have commuted the two α† ’s and then renamed ⃗k1 ↔ ⃗k2 .
We then obtain Bose-Einstein statistics,

ψ(⃗k1 , ⃗k2 ) = ψ(⃗k2 , ⃗k1 ) (3.42)

i.e. for indistinguishable particles (permuting them we obtain the same state).
As an aside, note that the Hamiltonian of the free (bosonic) oscillator is 1/2(aa† + a† a)
(and of the free fermionic oscillator is 1/2(b† b−bb† )), so in order to have a well defined system
we must have [a, a† ] = 1, {b, b† } = 1. In turn, [a, a† ] = 1 leads to Bose-Einstein statistics, as
above.

Important concepts to remember

• The commutation relations for scalar fields are defined at equal time. The Poisson
brackets are defined in terms of integrals of functional derivatives.

• The canonical commutation relations for the free scalar field imply that we can use the
same redefinitions as for the harmonic oscillator, for the momentum modes, to obtain
the [a, a† ] = 1 relations.

• The Klein-Gordon equation for the free scalar field implies the Hamiltonian of the free
harmonic oscillator for each of the momentum modes.

• Putting the system in a box, we find a sum over discrete momenta of harmonic oscil-
lators, each with a Fock space.

• The Fock space for the free scalar field is the direct product of the Fock space for each
mode.

• We must use normal ordered operators, for physical observables, in particular for the
Hamiltonian, in order to avoid unphysical infinities.

• The scalar field is quantized in terms of Bose-Einstein statistics.

Further reading: See chapters 2.3 in [2] and 2.1 and 2.3 in [1].

Exercises, Lecture 3

1) Consider the classical Hamiltonian


H = ∫ d³x { π²(⃗x, t)/2 + (1/2)(∇⃗ϕ)² + (λ/3!) ϕ³(⃗x, t) + (λ̃/4!) ϕ⁴(⃗x, t) } (3.43)

Using the Poisson brackets, write the Hamiltonian equations of motion. Quantize canonically
the free system and compute the equations of motion for the Heisenberg operators.

2) Write down the Hamiltonian above in terms of a(⃗p, t) and a†(⃗p, t) at the quantum
level, and then write down the normal ordered Hamiltonian.

4 Lecture 4: Propagators for free scalar fields
Relativistic invariance
We still need to understand the relativistic invariance of the quantization of the free
scalar field. The first issue is the normalization of the states. We saw that in the discrete
version of the quantized scalar field, we had in each mode states
|n_⃗k> = (1/√(n_⃗k!)) (α†_⃗k)^{n_⃗k} |0> (4.1)

normalized as <m|n> = δmn. In discretizing, we had a_⃗k → √(V(2π)³) α_⃗k and δ³(⃗k − ⃗k′) → V δ_{⃗k⃗k′}.
However, we want to have a relativistic normalization,
< ⃗p|⃗q > = 2E_⃗p (2π)³ δ³(⃗p − ⃗q) (4.2)
or in general, for occupation numbers in all momentum modes,
< {⃗ki}|{⃗qj} > = ∑_{π(j)} ∏_i 2ω_⃗ki (2π)³ δ³(⃗ki − ⃗q_{π(j)}) (4.3)

We see that we are missing a factor of 2ω_k V in each mode in order to get 2ω_k δ³(⃗k − ⃗k′)
instead of δ_{⃗k⃗k′}. We therefore take the normalized states

∏_⃗k (1/√(n_⃗k!)) (√(2ω_⃗k) √(V(2π)³) α†_⃗k)^{n_⃗k} |0> → ∏_⃗k (1/√(n_⃗k!)) [a†_⃗k √(2ω_⃗k)]^{n_⃗k} |0> ≡ |{⃗ki}> (4.4)

We now prove that we have a relativistically invariant formula.


First, we look at the normalization. It is obviously invariant under rotations, so we need
only look at boosts,
p′3 = γ(p3 + βE); E ′ = γ(E + βp3 ) (4.5)
Since

δ(f(x) − f(x0)) = (1/|f′(x0)|) δ(x − x0) (4.6)
and a boost acts only on p3 , but not on p1 , p2 , we have

δ³(⃗p − ⃗q) = δ³(⃗p′ − ⃗q′) dp′3/dp3 = δ³(⃗p′ − ⃗q′) γ(1 + β dE/dp3) = δ³(⃗p′ − ⃗q′) (γ/E)(E + βp3) = δ³(⃗p′ − ⃗q′) E′/E (4.7)
That means that Eδ 3 (⃗p − ⃗q) is relativistically invariant, as we wanted.
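The chain of equalities in (4.7) can be checked numerically: dp′3/dp3 = γ(1 + β dE/dp3) indeed equals E′/E for any momentum, which is what makes E δ³(⃗p − ⃗q) invariant. A sketch with arbitrary values of m, β and ⃗p:

```python
import numpy as np

# Numerical check of (4.7): under a boost along the 3-axis,
# dp3'/dp3 = gamma*(1 + beta*dE/dp3) equals E'/E.
m, beta = 1.0, 0.6
gamma = 1/np.sqrt(1 - beta**2)
p = np.array([0.3, -0.7, 1.2])
E = np.sqrt(p @ p + m**2)
p3p = gamma*(p[2] + beta*E)          # boosted p3
Ep = gamma*(E + beta*p[2])           # boosted energy
jacobian = gamma*(1 + beta*p[2]/E)   # dp3'/dp3, using dE/dp3 = p3/E
print(jacobian, Ep/E)                # the two agree
```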
Also the expansion of the scalar field,

ϕ(⃗x, t) = ∫ d³p/(2π)³ (1/(2Ep)) (a_⃗p e^{ip·x} + a†_⃗p e^{−ip·x})|_{p⁰=Ep} (4.8)

contains the relativistic invariants e^{±ip·x}, but we also have a relativistically invariant measure

∫ d³p/(2π)³ (1/(2Ep)) = ∫ d⁴p/(2π)⁴ (2π)δ(p² + m²)|_{p⁰>0} (4.9)

(since δ(p² + m²) = δ(−(p⁰)² + Ep²) and then we use (4.6)), allowing us to write

ϕ(x) ≡ ϕ(⃗x, t) = ∫ d⁴p/(2π)⁴ (2π)δ(p² + m²)|_{p⁰>0} (a_⃗p e^{ip·x} + a†_⃗p e^{−ip·x})|_{p⁰=Ep} (4.10)
Complex scalar field
We now turn to the quantization of the complex scalar field, in order to understand better
the physics of propagation in quantum field theory.
The Lagrangean is
L = −∂µϕ ∂^µϕ* − m²|ϕ|² − U(|ϕ|²) (4.11)
This Lagrangean has a U (1) global symmetry ϕ → ϕeiα , or in other words ϕ is charged
under the U (1) symmetry.
Note the absence of the factor 1/2 in the kinetic term for ϕ with respect to the real scalar
field. The reason is that we treat ϕ and ϕ∗ as independent fields. Then the equation of
motion of ϕ is (∂µ ∂ µ − m2 )ϕ∗ = ∂U/∂ϕ. We could write the Lagrangean as a sum of two
real scalars, but then with a factor of 1/2, −∂µ ϕ1 ∂ µ ϕ1 /2 − ∂µ ϕ2 ∂ µ ϕ2 /2, since then we get
(∂µ ∂ µ − m2 )ϕ1 = ∂U/∂ϕ1 .
Exactly paralleling the discussion of the real scalar field, we obtain an expansion in terms
of a and a† operators, just that now we have complex fields, with twice as many degrees of
freedom, so we have a± and a†± , with half of the degrees of freedom in ϕ and half in ϕ† ,

d3 p 1
ϕ(⃗x, t) = 3
√ (a+ (⃗p, t)ei⃗p·⃗x + a†− (⃗p, t)e−i⃗p·⃗x )
(2π) 2ωp
∫ 3
dp 1
ϕ(⃗x, t) = 3
√ (a†+ (⃗p, t)e−i⃗p·⃗x + a− (⃗p, t)ei⃗p·⃗x )
(2π) 2ωp
∫ 3
( √ )
dp ωp
π(⃗x, t) = −i (a− (⃗p, t)ei⃗p·⃗x − a†+ (⃗p, t)e−i⃗p·⃗x )
(2π)3 ( √ 2)

d3 p ωp

π (⃗x, t) = 3
i (a†− (⃗p, t)e−i⃗p·⃗x − a+ (⃗p, t)ei⃗p·⃗x ) (4.12)
(2π) 2
As before, this ansatz is based on the harmonic oscillator, whereas the form of ωp comes out
of the KG equation. Substituting this ansatz inside the canonical quantization commutators,
we find
[a± (⃗p, t), a†± (⃗p′ , t)] = (2π)3 δ 3 (⃗p − p⃗′ ) (4.13)
and the rest zero. Again, we note the equal time for the commutators. Also, the time
dependence is the same as before.
We can calculate the U(1) charge operator (left as an exercise), obtaining

Q = ∫ d³k/(2π)³ [a†_{+⃗k} a_{+⃗k} − a†_{−⃗k} a_{−⃗k}] (4.14)

Thus, as expected from the notation used, a+ has charge + and a− has charge −, and therefore
we have

Q = ∫ d³k/(2π)³ [N_{+⃗k} − N_{−⃗k}] (4.15)

(the number of + charges minus the number of − charges).
We then see that ϕ creates − charge and annihilates + charge, and ϕ† creates + charge
and annihilates − charge.
Since in this simple example there are no other charges, we see that + and − particles are
particle/antiparticle pairs, i.e. pairs which are equal in everything, except they have opposite
charges. As promised last lecture, we have now introduced the concept of antiparticle, and
it is related to the existence of positive and negative frequency modes.
Therefore in a real field, the particle is its own antiparticle.
We consider the object
< 0|ϕ† (x)ϕ(y)|0 > (4.16)
corresponding to propagation from y = (ty , ⃗y ) to x = (tx , ⃗x), the same way as in quan-
tum mechanics < q ′ , t′ |q, t > corresponds to propagation from (q, t) to (q ′ , t′ ). This object
corresponds to a measurement of the field ϕ at y, then of ϕ† at x.
For simplicity, we will analyze the real scalar field, and we will use the complex scalar
only for interpretation. Substituting the expansion of ϕ(x), since a|0 >=< 0|a† = 0, and we
have < 0|(a + a† )(a + a† )|0 >, only
< 0|ap⃗ a†q⃗|0 > ei(⃗p·⃗x+⃗q·⃗y) (4.17)

survives in the sum, and, as < 0|aa† |0 >=< 0|a† a|0 > +[a, a† ] < 0|0 >, we get (2π)3 δ(⃗p − ⃗q)
from the expectation value. Then finally we obtain for the scalar propagation from y to x,

D(x − y) ≡ <0|ϕ(x)ϕ(y)|0> = ∫ d³p/(2π)³ (1/(2Ep)) e^{ip·(x−y)} (4.18)
We now analyze what happens for varying x − y. By Lorentz transformations, we have
only two cases to analyze.
a) For timelike separation, we can put tx − ty = t and ⃗x − ⃗y = 0. In this case, using
d³p = dΩ p²dp and dE/dp = p/√(p² + m²),

D(x − y) = 4π ∫_0^∞ (p²dp/(2π)³) (1/(2√(p² + m²))) e^{−i√(p²+m²) t}
         = (1/(4π²)) ∫_m^∞ dE √(E² − m²) e^{−iEt} ∝_{t→∞} e^{−imt} (4.19)
which is oscillatory, i.e. it doesn’t vanish. But that’s OK, since in this case, we remain in
the same point as time passes, so the probability should be large.
b) For spacelike separation, tx = ty and ⃗x − ⃗y = ⃗r, we obtain

D(⃗x − ⃗y) = ∫ d³p/(2π)³ (1/(2Ep)) e^{i⃗p·⃗r} = 2π ∫_0^∞ (p²dp/(2Ep(2π)³)) ∫_{−1}^{1} d(cos θ) e^{ipr cos θ}
= 2π ∫_0^∞ (p²dp/(2Ep(2π)³)) (e^{ipr} − e^{−ipr})/(ipr) = (−i/((2π)² 2r)) ∫_{−∞}^{+∞} p dp e^{ipr}/√(p² + m²) (4.20)

where in the last line we have redefined p → −p in the second term, and then added up ∫_0^∞
and ∫_{−∞}^0.

Figure 4: By closing the contour in the upper-half plane, we pick up the residue of the pole
in the upper-half plane, at +im.

In the last form, we have eipr multiplying a function with poles at p = ±im, so we know,
by a theorem from complex analysis, that we can consider the integral in the complex p
plane, and add for free the integral on an infinite semicircle in the upper half plane, since
then eipr ∝ e−Im(p)r → 0. Thus closing the contour, we can use the residue theorem and say
that our integral equals the residue in the upper half plane, i.e. at +im, see Fig.4. Looking
at the residue for r → ∞, its leading behaviour is

D(⃗x − ⃗y ) ∝ ei(im)r = e−mr (4.21)

But for spacelike separation, at r → ∞, we are much outside the lightcone (in no time, we
move in space), and yet, propagation gives a small but nonzero amplitude, which is not OK.
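The e^{−mr} falloff can be verified numerically. Using the known closed form D(r) = m K1(mr)/(4π²r) (a standard result assumed here, not derived in the text), with K1(x) = ∫_0^∞ du cosh u e^{−x cosh u}, the decay rate at large r is indeed m, up to a power-law prefactor:

```python
import numpy as np

def D(r, m=1.0):
    # Spacelike two-point function via D(r) = m*K1(m*r)/(4*pi**2*r), with
    # K1(x) = integral over u of cosh(u)*exp(-x*cosh(u)), done by a Riemann sum.
    u = np.linspace(0.0, 12.0, 200001)
    du = u[1] - u[0]
    k1 = np.sum(np.cosh(u)*np.exp(-m*r*np.cosh(u)))*du
    return m*k1/(4*np.pi**2*r)

r1, r2 = 50.0, 51.0
ratio = (D(r2)*r2**1.5)/(D(r1)*r1**1.5)   # divide out the power-law prefactor
print(ratio, np.exp(-1.0))                # both ~ 0.3679: the falloff is e^{-m r}
```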
But the relevant question is, will measurements be affected? We will see later why,
but the only relevant issue is whether the commutator [ϕ(x), ϕ(y)] is nonzero for spacelike

separation. We thus compute

[ϕ(x), ϕ(y)] = ∫ d³p/(2π)³ ∫ d³q/(2π)³ (1/√(2Ep 2Eq)) [(a_⃗p e^{ip·x} + a†_⃗p e^{−ip·x}), (a_⃗q e^{iq·y} + a†_⃗q e^{−iq·y})]
= ∫ d³p/(2π)³ (1/(2Ep)) (e^{ip·(x−y)} − e^{ip·(y−x)})
= D(x − y) − D(y − x) (4.22)

But if (x−y)2 > 0 (spacelike), (x−y) = (0, ⃗x −⃗y ) and we can make a Lorentz transforma-
tion (a rotation, really) (⃗x −⃗y ) → −(⃗x −⃗y ), leading to (x−y) → −(x−y). But since D(x−y)
is Lorentz invariant, it follows that for spacelike separation we have D(x − y) = D(y − x),
and therefore
[ϕ(x), ϕ(y)] = 0 (4.23)
and we have causality. Note that this is due to the existence of negative frequency states
(eip·x ) in the scalar field expansion. On the other hand, we should also check that for
timelike separation we have a nonzero result. Indeed, for (x − y)2 < 0, we can set (x − y) =
(tx − ty , 0) and so −(x − y) = (−(tx − ty ), 0) corresponds to time reversal, so is not a Lorentz
transformation, therefore we have D(−(x − y)) ̸= D(x − y), and so [ϕ(x), ϕ(y)] ̸= 0.
Klein-Gordon propagator
We are finally ready to describe the propagator. Consider the c-number

[ϕ(x), ϕ(y)] = <0|[ϕ(x), ϕ(y)]|0> = ∫ d³p/(2π)³ (1/(2Ep)) (e^{ip·(x−y)} − e^{−ip·(x−y)})
= ∫ d³p/(2π)³ [ (1/(2Ep)) e^{ip·(x−y)}|_{p⁰=Ep} + (1/(−2Ep)) e^{ip·(x−y)}|_{p⁰=−Ep} ] (4.24)

For x0 > y 0 , we can write it as


∫ d³p/(2π)³ ∫_C dp⁰/(2πi) (1/(p² + m²)) e^{ip·(x−y)} (4.25)

where the contour C is on the real line, except it avoids slightly above the two poles at
p0 = ±Ep , and then, in order to select both poles, we need to close the contour below,
with an infinite semicircle in the lower half plane, as in Fig.5. Closing the contour below
works, since for x⁰ − y⁰ > 0 we have |e^{−ip⁰(x⁰−y⁰)}| = e^{Im(p⁰)(x⁰−y⁰)} → 0 as Im(p⁰) → −∞.
Note that this way we get a contour closed clockwise, hence its result is minus the residue
(plus the residue is for a contour closed anti-clockwise), giving the extra minus sign for the
contour integral to reproduce the right result.
On the other hand, for this contour, if we have x0 < y 0 instead, the same reason above
says we need to close the contour above (with an infinite semicircle in the upper half plane).
In this case, there are no poles inside the contour, therefore the integral is zero.
We can then finally write for the retarded propagator (which vanishes for x⁰ < y⁰)

DR(x − y) ≡ θ(x⁰ − y⁰) <0|[ϕ(x), ϕ(y)]|0> = ∫ d³p/(2π)³ ∫_C dp⁰/(2πi) (1/(p² + m²)) e^{ip·(x−y)}

Figure 5: For the retarded propagator, the contour is such that closing it in the lower-half
plane picks up both poles at ±Ep .


= ∫ d⁴p/(2π)⁴ (−i/(p² + m²)) e^{ip·(x−y)} (4.26)

This object is a Green's function for the KG operator. This is easier to see in momentum
space. Indeed, making a Fourier transform,

DR(x − y) = ∫ d⁴p/(2π)⁴ e^{ip·(x−y)} DR(p) (4.27)

we obtain
DR(p) = −i/(p² + m²) ⇒ (p² + m²)DR(p) = −i (4.28)
which means that in x space (the Fourier transform of 1 is δ(x), and the Fourier transform
of p2 is −∂ 2 ),

(∂ 2 − m2 )DR (x − y) = iδ 4 (x − y) ↔ −i(∂ 2 − m2 )DR (x − y) = δ 4 (x − y) (4.29)

The Feynman propagator


Consider now a different ”iϵ prescription” for the contour of integration C. Consider a
contour that avoids slightly below the −Ep pole and avoids slightly above the +Ep pole, as
in Fig.6. This is equivalent to the Feynman prescription,

DF(x − y) = ∫ d⁴p/(2π)⁴ (−i/(p² + m² − iϵ)) e^{ip·(x−y)} (4.30)

Since p² + m² − iϵ = −(p⁰)² + Ep² − iϵ = −(p⁰ + Ep − iϵ/2)(p⁰ − Ep + iϵ/2), we have poles at
p⁰ = ±(Ep − iϵ/2), so in this form we have a contour along the real line, but the poles are
modified instead (+Ep is moved below, and −Ep is moved above the contour).
For this contour, as for DR (for the same reasons), for x0 > y 0 we close the contour below
(with an infinite semicircle in the lower half plane). The result for the integration is then

Figure 6: The contour for the Feynman propagator avoids −Ep from below and +Ep from
above.

the residue inside the closed contour, i.e. the residue at +Ep . But we saw that the clockwise
residue at +Ep is D(x − y) (and the clockwise residue at −Ep is −D(y − x)).
For x0 < y 0 , we need to close the contour above (with an infinite semicircle in the upper
half plane), and then we get the anticlockwise residue at −Ep , therefore +D(y − x). The
final result is then

DF (x−y) = θ(x0 −y 0 ) < 0|ϕ(x)ϕ(y)|0 > +θ(y 0 −x0 ) < 0|ϕ(y)ϕ(x)|0 >≡< 0|T (ϕ(x)ϕ(y))|0 >
(4.31)
This is then the Feynman propagator, which again is a Green’s function for the KG operator,
with the time ordering operator in the two-point function, the same that we defined in the
n-point functions in quantum mechanics, e.g. < q ′ , t′ |T [q̂(t1 )q̂(t2 )]|q, t >.
As suggested by the quantum mechanical case, the Feynman propagator will appear in the
Feynman rules, and has a physical interpretation as propagation of the particle excitations
of the quantum field.
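A quantum-mechanical toy version of (4.31) can be checked directly: for a single harmonic oscillator, <0|T q(t1)q(t2)|0> = e^{−iω|t1−t2|}/(2ω), which is the Feynman propagator of that single mode. A truncated-matrix sketch (Nmax and ω are arbitrary choices):

```python
import numpy as np

# Quantum-mechanics analogue of (4.31): for a harmonic oscillator,
# <0| T q(t) q(0) |0> = exp(-i*w*|t|)/(2*w).  Check with truncated matrices.
Nmax, w = 10, 1.3
a = np.diag(np.sqrt(np.arange(1, Nmax)), k=1)
ad = a.T

def q(t):  # Heisenberg-picture q(t) = (a e^{-iwt} + a† e^{iwt})/sqrt(2w)
    return (a*np.exp(-1j*w*t) + ad*np.exp(1j*w*t))/np.sqrt(2*w)

vac = np.zeros(Nmax); vac[0] = 1.0
t = 0.7
G = vac @ (q(t) @ q(0.0)) @ vac        # t > 0, so this is the T-ordered product
print(G, np.exp(-1j*w*abs(t))/(2*w))   # the two agree
```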

Important concepts to remember

• The expansion of the free scalar field in quantum fields is relativistically invariant, as
is the relativistic normalization.

• The complex scalar field is quantized in terms of a± and a†± , which correspond to U (1)
charge ±1. ϕ creates minus particles and destroys plus particles, and ϕ† creates plus
particles and destroys minus particles.

• The plus and minus particles are particle/antiparticle pairs, since they only differ by
their charge.

• The object D(x−y) =< 0|ϕ(x)ϕ(y)|0 > is nonzero much outside the lightcone, however
[ϕ(x), ϕ(y)] is zero outside the lightcone, and since only this object leads to measurable
quantities, QFT is causal.

• DR(x − y) is a retarded propagator and corresponds to a contour of integration that
avoids the poles ±Ep from slightly above, and is a Green's function for the KG operator.

• DF (x − y) =< 0|T [ϕ(x)ϕ(y)]|0 >, the Feynman propagator, corresponds to the −iϵ
prescription, i.e. avoids −Ep from below and +Ep from above, is also a Green’s function
for the KG operator, and it will appear in Feynman diagrams. It has the physical
interpretation of propagation of particle excitations of the quantum field.

Further reading: See chapters 2.4 in [2] and 2.4 in [1].

Exercises, Lecture 4

1) For
L = −∂µ ϕ∂ µ ϕ∗ − m2 |ϕ|2 − U (|ϕ|2 ) (4.32)
calculate the Noether current for the U (1) symmetry ϕ → ϕeiα in terms of ϕ(⃗x, t) and show
that it then reduces to the expression in the text,

d3 k †
Q= 3
[a+⃗k a+⃗k − a†−⃗k a−⃗k ] (4.33)
(2π)

2) Calculate the advanced propagator, DA (x − y), by using the integration contour that
avoids the ±Ep poles from below, instead of the contours for DR , DF , as in Fig.7.


Figure 7: The contour for the advanced propagator avoids both poles at ±Ep from below.

5 Lecture 5: Interaction picture and Wick theorem for
λϕ4 in operator formalism
Quantum mechanics pictures
One can transform states and operators (similarly to the canonical transformations in
classical mechanics) with unitary operators W (W† = W⁻¹) by

|ψW >= W |ψ >


AW = W AW −1 (5.1)

Then the Hamiltonian in the new formulation is


H′ = HW + i~ (∂W/∂t) W† (5.2)
where HW stands for the original Hamiltonian, transformed as an operator by W . The
statement is that H′ generates the new time evolution, so in particular we obtain for the
time evolution of the transformed operators
i~ ∂AW/∂t = i~ (∂A/∂t)_W + [i~ (∂W/∂t) W†, AW] (5.3)

and that the new evolution operator is

U ′ (t, t0 ) = Ŵ (t)U (t, t0 )W † (t0 ) ̸= ÛW (5.4)

(note that on the left we have time t and on the right t0 , so we don’t obtain UW ). Here the
various pictures are assumed to coincide at time t = 0.
With this general formalism, let’s review the usual pictures.
Schrödinger picture (usual)
In the Schrödinger picture, ψS(t) depends on time, and the usual operators do not, ∂ÂS/∂t = 0.
We could have at most a c-number time dependence, like AS(t) · 1, maybe.
Heisenberg picture
In the Heisenberg picture, wavefunctions (states) are independent of time, ∂ψH/∂t = 0,
which is the same as saying that H′ = 0, i.e. the Hamiltonian generating the evolution of states
vanishes. That means that we also have

HH = HS = −i~ (∂W/∂t) W† (5.5)
giving the time evolution of operators,

i~ (∂/∂t) ÂH(t) = [ÂH(t), H] (5.6)
If at t0 , the Schrödinger picture equals the Heisenberg picture, then

W (t) = US−1 (t, t0 ) = US (t0 , t) (5.7)

42
where US (t, t0 ) is the evolution operator (giving the time evolution of states) in the Schrödinger
picture.
Dirac (interaction) picture
The need for the general theory of QM pictures was so that we could define the picture
used in perturbation theory, the Dirac, or interaction, picture. We need it when we have

Ĥ = Ĥ0 + Ĥ1 (5.8)

where Ĥ0 is a free quadratic piece, and Ĥ1 is an interacting piece. Then, we would like to
write formulas like the ones already written, for the quantization of fields. Therefore we need
the time evolution of operators to be only in terms of Ĥ0 , leading to

iℏ (∂/∂t) |ψ_I(t) > = Ĥ_{1,I} |ψ_I(t) >
iℏ (∂/∂t) Â_I(t) = [Â_I(t), Ĥ₀]    (5.9)
where note that H0,I = H0,S . Note that here we have denoted the interaction piece by H1
instead of Hi , as we will later on, in order not to confuse the subscript with I, meaning in
the interaction picture. Once the distinction will become clear, we will go back to the usual
notation Hi .
Why is this picture, where operators (for instance the quantum fields) evolve with the
free part (Ĥ0 ) only, preferred? Because of the usual physical set-up. The interaction region,
where there is actual interaction between fields (or particles), like for instance in a scattering
experiment at CERN, is finite both in space and in time. That means that at t = ±∞, we
can treat the states as free. And therefore, it would be useful if we could take advantage of
the fact that asymptotically the states are the free states considered in the previous lectures.
We must distinguish then between the true vacuum of the interacting theory, a state
that we will call |Ω >, and the vacuum of the free theory, which we will call |0 > as before,
that satisfies ap⃗ |0 >= 0, ∀⃗p. For the full theory we should use |Ω >, but to use perturbation
theory we will relate to |0 >.
The basic objects to study are correlators, like the two-point function < Ω|T [ϕ(x)ϕ(y)]|Ω >,
where ϕ(x) are Heisenberg operators. The interaction picture field is related to the field at
the reference time t0 , where we define that the interaction picture field is the same as the
Heisenberg and the Schrödinger picture field, by

ϕ_I(t, x⃗) = e^{iH₀(t−t₀)} ϕ(t₀, x⃗) e^{−iH₀(t−t₀)}
          = ∫ d³p/(2π)³ (1/√(2E_p⃗)) (a_p⃗ e^{ip·x} + a†_p⃗ e^{−ip·x})|_{x⁰=t−t₀, p⁰=E_p⃗}    (5.10)

since, as explained in a previous lecture, iℏ∂A/∂t = [A, H₀] is equivalent to

A(t) = e^{iH₀(t−t₀)} A(t₀) e^{−iH₀(t−t₀)},    (5.11)

so the first line relates ϕI (t) with ϕI (t0 ), and then by the definition of the reference point t0
as the one point of equality of pictures, we can replace with the Schrödinger picture operator,

with no time dependence and the usual definition of space dependence in terms of a and a†
(as in the canonical quantization lecture). Then the action of H0 is replaced with the usual
(free) p0 in the exponent of eip·x (again as in the canonical quantization lecture).
λϕ4 theory
We will specialize to λϕ4 theory, with the free KG piece, plus an interaction,

H = H₀ + H₁ = H₀ + ∫ d³x (λ/4!) ϕ⁴    (5.12)
for simplicity.
Let us consider the interaction picture evolution operator (note that here t0 is the refer-
ence time when the pictures are all equal; compare with (5.4))

UI (t, t0 ) = eiH0 (t−t0 ) e−iH(t−t0 ) (5.13)

where e−iH(t−t0 ) is the Schrödinger picture evolution operator US (t, t0 ), i.e.

|ψS (t) > = US (t, t0 )|ψS (t0 ) >


|ψI (t) > = UI (t, t0 )|ψI (t0 ) > (5.14)

Since the Heisenberg operator has ϕ(t, ⃗x) = eiH(t−t0 ) ϕ(t0 , ⃗x)e−iH(t−t0 ) , we have

ϕI (t, ⃗x) = eiH0 (t−t0 ) e−iH(t−t0 ) ϕ(t, ⃗x)eiH(t−t0 ) e−iH0 (t−t0 ) ⇒


ϕ(t, ⃗x) = UI† (t, t0 )ϕI (t, ⃗x)UI (t, t0 ) (5.15)

The goal is to find an expression for U_I(t, t₀), but to do that, we will first prove a differential equation:

i (∂/∂t) U_I(t, t₀) = H_{i,I} U_I(t, t₀)    (5.16)
Let’s prove this. When we take the derivative on UI , we choose to put the resulting H and
H0 in the same place, in the middle, as:

i (∂/∂t) U_I(t, t₀) = e^{iH₀(t−t₀)} (H − H₀) e^{−iH(t−t₀)}    (5.17)
But H − H0 = Hi,S (in the Schrödinger picture), and by the relation between Schrödinger
and interaction picture operators,

Hi,I = eiH0 (t−t0 ) Hi,S e−iH0 (t−t0 ) (5.18)

and therefore

i (∂/∂t) U_I(t, t₀) = H_{i,I} e^{iH₀(t−t₀)} e^{−iH(t−t₀)} = H_{i,I} U_I(t, t₀)    (5.19)
q.e.d. We then also note that

H_{i,I} = e^{iH₀(t−t₀)} H_{i,S} e^{−iH₀(t−t₀)} = ∫ d³x (λ/4!) ϕ_I⁴    (5.20)
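As a cross-check of (5.16), one can replace H₀ and Hᵢ by finite Hermitian matrices and verify the differential equation numerically. This is only a toy-model sketch (not from the lectures; the matrices are arbitrary stand-ins):

```python
import numpy as np

def exp_iH(H, t):
    """e^{i t H} for a Hermitian matrix H, via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * t * w)) @ V.conj().T

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H0 = np.diag(rng.normal(size=3))   # stand-in for the free Hamiltonian
Hi = (X + X.conj().T) / 10         # stand-in for the interaction Hamiltonian
H = H0 + Hi

def U_I(t):
    """U_I(t, t0) = e^{i H0 (t-t0)} e^{-i H (t-t0)}, as in (5.13), with t0 = 0."""
    return exp_iH(H0, t) @ exp_iH(H, -t)

t, h = 0.8, 1e-5
lhs = 1j * (U_I(t + h) - U_I(t - h)) / (2 * h)   # i dU_I/dt by central difference
Hi_I = exp_iH(H0, t) @ Hi @ exp_iH(H0, -t)       # interaction picture Hi, as in (5.18)
rhs = Hi_I @ U_I(t)
assert np.max(np.abs(lhs - rhs)) < 1e-5          # verifies (5.16) numerically
```

The check works because the operator manipulations in the proof above use only the group property of the exponentials, which holds equally for finite matrices.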

We now try to solve this differential equation for something like U ∼ e−iHI t by analogy with
the Schrödinger picture. However, now we need to be more careful. We can write for the
first few terms in the solution
U_I(t, t₀) = 1 + (−i) ∫_{t₀}^{t} dt₁ H_{i,I}(t₁) + (−i)² ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t₁} dt₂ H_{i,I}(t₁) H_{i,I}(t₂) + ...    (5.21)

and naming these the zeroth term, the first term and the second term, we can easily check
that we have
i∂t (first term) = Hi,I (t) × zeroth term
i∂t (second term) = Hi,I (t) × first term (5.22)
and this is like what we have when we write for c-numbers, for instance ∂_x e^{ax} = a e^{ax}: expanding e^{ax} = Σ_n (ax)ⁿ/n!, we have a similar relation between consecutive terms.


Figure 8: The integration in (t1 , t2 ) is over a triangle, but can be written as half the integra-
tion over the rectangle.

But the integration region is a problem. In the (t1 , t2 ) plane, the integration domain is
the lower right-angle triangle in between t0 and t, i.e. half the rectangle bounded by t0 and
t. So we can replace the triangular domain by half the rectangular domain, see Fig.8. But
we have to be careful, since [H_I(t), H_I(t′)] ≠ 0, and in the integral expression above we actually have time ordering (since t₁ > t₂); to obtain equality, it suffices to put the time ordering operator in front, i.e.
U_I(t, t₀) = 1 + (−i) ∫_{t₀}^{t} dt₁ H_{i,I}(t₁) + ((−i)²/2!) ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t} dt₂ T{H_{i,I}(t₁) H_{i,I}(t₂)} + ...    (5.23)

and the higher order terms can be written in the same way, leading finally to
U_I(t, t₀) = T{exp[−i ∫_{t₀}^{t} dt′ H_{i,I}(t′)]}    (5.24)
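The replacement of the triangular domain by half the rectangular one, used in (5.23), can be checked numerically for a commuting (c-number) integrand, where time ordering is trivial. A sketch (the test function is arbitrary):

```python
import numpy as np

t0, t = 0.0, 2.0
f = lambda s: np.sin(3 * s) + 0.5 * s   # arbitrary c-number profile

n = 4000
ts = np.linspace(t0, t, n)
dt = ts[1] - ts[0]
F = np.cumsum(f(ts)) * dt               # F(t1) ~ int_{t0}^{t1} dt2 f(t2)

# triangular domain: int_{t0}^{t} dt1 f(t1) int_{t0}^{t1} dt2 f(t2)
triangle = np.sum(f(ts) * F) * dt
# half the square domain: (1/2!) [int_{t0}^{t} dt' f(t')]^2
half_square = 0.5 * (np.sum(f(ts)) * dt) ** 2

assert abs(triangle - half_square) < 1e-2
```

For operators the two domains differ by the commutator, which is precisely what the T symbol in (5.23) compensates.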

Now that we have an expression for UI , we still want an expression for the vacuum of
the full theory, |Ω >. We want to express it in terms of the free vacuum, |0 >. Consider

< Ω|H|Ω >= E0 (the energy of the full vacuum), and we also have states |n > of higher
energy, En > E0 . On the other hand for the free vacuum, H0 |0 >= 0. The completeness
relation for the full theory is then

1 = |Ω >< Ω| + Σ_{n≠0} |n >< n|    (5.25)

and introducing it in between e−iHT and |0 > in their product, we obtain



e^{−iHT} |0 > = e^{−iE₀T} |Ω >< Ω|0 > + Σ_{n≠0} e^{−iEₙT} |n >< n|0 >    (5.26)

Then consider T → ∞(1 − iϵ), where we take a slightly complex time in order to damp away the higher |n > modes: then e^{−iEₙT} → e^{−iEₙ∞} × e^{−Eₙ∞ϵ}, and since e^{−Eₙ∞ϵ} ≪ e^{−E₀∞ϵ}, we can drop the terms with n ≠ 0. Thus we obtain

|Ω > = lim_{T→∞(1−iϵ)} e^{−iH(T+t₀)} |0 > / (e^{−iE₀(T+t₀)} < Ω|0 >)
     = lim_{T→∞(1−iϵ)} e^{−iH(T+t₀)} e^{iH₀(T+t₀)} |0 > / (e^{−iE₀(T+t₀)} < Ω|0 >)
     = lim_{T→∞(1−iϵ)} U_I(t₀, −T) |0 > / (e^{−iE₀(T+t₀)} < Ω|0 >)    (5.27)

where in the first line we isolated T → T + t0 , in the second line we have introduced for free
a term, due to H0 |0 >= 0, and in the third line we have formed the evolution operator.
Similarly, we obtain for < Ω|,

< Ω| = lim_{T→∞(1−iϵ)} < 0| U_I(T, t₀) / (e^{−iE₀(T−t₀)} < 0|Ω >)    (5.28)

We now have all the ingredients to calculate the two-point function.


Consider in the case x0 > y 0 > t0 the quantity

< Ω|ϕ(x)ϕ(y)|Ω > = lim_{T→∞(1−iϵ)} [< 0| U_I(T, t₀) / (e^{−iE₀(T−t₀)} < 0|Ω >)] U_I†(x⁰, t₀) ϕ_I(x) U_I(x⁰, t₀)
                   × U_I†(y⁰, t₀) ϕ_I(y) U_I(y⁰, t₀) [U_I(t₀, −T) |0 > / (e^{−iE₀(T+t₀)} < Ω|0 >)]
                 = lim_{T→∞(1−iϵ)} < 0| U_I(T, x⁰) ϕ_I(x) U_I(x⁰, y⁰) ϕ_I(y) U_I(y⁰, −T) |0 > / (e^{−iE₀(2T)} | < 0|Ω > |²)    (5.29)

where we have used U † (t, t′ ) = U −1 (t, t′ ) = U (t′ , t) and U (t1 , t2 )U (t2 , t3 ) = U (t1 , t3 ).
On the other hand, multiplying the expression we have for |Ω > and < Ω|, we get

1 = < Ω|Ω > = < 0| U_I(T, t₀) U_I(t₀, −T) |0 > / (| < 0|Ω > |² e^{−iE₀(2T)})    (5.30)

We divide then (5.29) by it, obtaining

< Ω|ϕ(x)ϕ(y)|Ω > = lim_{T→∞(1−iϵ)} < 0| U_I(T, x⁰) ϕ_I(x) U_I(x⁰, y⁰) ϕ_I(y) U_I(y⁰, −T) |0 > / < 0| U_I(T, −T) |0 >    (5.31)

We now observe that, since we used x0 > y 0 > t0 in our calculation, both the left hand
side and the right hand side are time ordered, so we can generalize this relation with T
symbols on both the lhs and the rhs. But then, since T(AB) = T(BA) (the time ordering operator orders temporally, independently of the initial order inside it), we can permute the operators inside T() as we want, in particular extracting the field operators to the left and then multiplying the resulting evolution operators, U_I(T, x⁰) U_I(x⁰, y⁰) U_I(y⁰, −T) = U_I(T, −T).
We then use the exponential expression for the evolution operator, to finally get our desired
result
< Ω|T{ϕ(x)ϕ(y)}|Ω > = lim_{T→∞(1−iϵ)} < 0|T{ϕ_I(x) ϕ_I(y) exp[−i ∫_{−T}^{T} dt H_I(t)]}|0 > / < 0|T{exp[−i ∫_{−T}^{T} dt H_I(t)]}|0 >    (5.32)

This is called Feynman’s theorem and we can generalize it to a product of any number of
ϕ’s, and in fact any insertions of Heisenberg operators OH , so we can write
< Ω|T{O_H(x₁)...O_H(xₙ)}|Ω > = lim_{T→∞(1−iϵ)} < 0|T{O_I(x₁)...O_I(xₙ) exp[−i ∫_{−T}^{T} dt H_I(t)]}|0 > / < 0|T{exp[−i ∫_{−T}^{T} dt H_I(t)]}|0 >    (5.33)
Wick’s theorem
From Feynman’s theorem, we see we need to calculate < 0|T {ϕI (x1 )...ϕI (xn )}|0 > (from
explicit insertions, or the expansion of the exponential of HI (t)’s). To do that, we split

ϕ_I(x) = ϕ_I⁺(x) + ϕ_I⁻(x)    (5.34)

where

ϕ_I⁺ = ∫ d³p/(2π)³ (1/√(2E_p)) a_p⃗ e^{ip·x}
ϕ_I⁻ = ∫ d³p/(2π)³ (1/√(2E_p)) a†_p⃗ e^{−ip·x}    (5.35)

Note that then ϕ_I⁺|0 > = 0 = < 0|ϕ_I⁻. We defined the normal order, : () :, which we will denote here by N() (since this is how it is usually denoted in the context of Wick's theorem), meaning ϕ_I⁻ to the left of ϕ_I⁺.
Consider then, in the x0 > y 0 case,
T(ϕ_I(x)ϕ_I(y)) = ϕ_I⁺(x)ϕ_I⁺(y) + ϕ_I⁻(x)ϕ_I⁻(y) + ϕ_I⁻(x)ϕ_I⁺(y) + ϕ_I⁻(y)ϕ_I⁺(x) + [ϕ_I⁺(x), ϕ_I⁻(y)]
                = N(ϕ_I(x)ϕ_I(y)) + [ϕ_I⁺(x), ϕ_I⁻(y)]    (5.36)

In the y 0 > x0 case, we need to change x ↔ y.
We now define the contraction (denoted in the original by a line connecting the two fields),

contraction of ϕ_I(x) with ϕ_I(y) = [ϕ_I⁺(x), ϕ_I⁻(y)], for x⁰ > y⁰
                                  = [ϕ_I⁺(y), ϕ_I⁻(x)], for y⁰ > x⁰    (5.37)

Then we can write

T[ϕ_I(x)ϕ_I(y)] = N[ϕ_I(x)ϕ_I(y) + (contraction of ϕ_I(x) with ϕ_I(y))]    (5.38)
But since [ϕ+ , ϕ− ] is a c-number, we have

[ϕ+ , ϕ− ] =< 0|[ϕ+ , ϕ− ]|0 >=< 0|ϕ+ ϕ− |0 >= D(x − y) (5.39)

where we also used < 0|ϕ− = ϕ+ |0 >= 0. Then we get

ϕ(x)ϕ(y) = DF (x − y) (5.40)

Generalizing (5.38), we get Wick’s theorem,

T {ϕI (x1 )...ϕI (xn )} = N {ϕI (x1 )...ϕI (xn ) + all possible contractions} (5.41)

where all possible contractions means not only full contractions, but also partial contractions.
Note that, for instance, if we contract 1 with 3 inside the normal order, we obtain

N(ϕ₁ϕ₂ϕ₃ϕ₄) with ϕ₁, ϕ₃ contracted = D_F(x₁ − x₃) N(ϕ₂ϕ₄)    (5.42)

Why is the Wick theorem useful? Because < 0|N(anything nontrivial)|0 > = 0, as < 0|ϕ_I⁻ = ϕ_I⁺|0 > = 0, so only the fully contracted terms (c-numbers) survive in < 0|N()|0 >, giving a simple result. Consider for instance the simplest nontrivial term,

T(ϕ₁ϕ₂ϕ₃ϕ₄) = N(ϕ₁ϕ₂ϕ₃ϕ₄)
+ [the six single contractions: D_F(x₁−x₂)N(ϕ₃ϕ₄) + D_F(x₁−x₃)N(ϕ₂ϕ₄) + D_F(x₁−x₄)N(ϕ₂ϕ₃)
  + D_F(x₂−x₃)N(ϕ₁ϕ₄) + D_F(x₂−x₄)N(ϕ₁ϕ₃) + D_F(x₃−x₄)N(ϕ₁ϕ₂)]
+ [the three full contractions: D_F(x₁−x₂)D_F(x₃−x₄) + D_F(x₁−x₃)D_F(x₂−x₄) + D_F(x₁−x₄)D_F(x₂−x₃)]    (5.43)

which then gives, under the vacuum expectation value,

< 0|T (ϕ1 ϕ2 ϕ3 ϕ4 )|0 > = DF (x1 − x2 )DF (x3 − x4 ) + DF (x1 − x4 )DF (x2 − x3 )
+DF (x1 − x3 )DF (x2 − x4 ) (5.44)
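The counting in (5.44) is just the counting of complete pairings: 2n fields have (2n−1)!! full contractions, so 3 for four fields and 15 for six. A small enumeration sketch (not part of the lectures):

```python
def pairings(labels):
    """Enumerate all complete pairings (full contractions) of a list of labels."""
    if not labels:
        return [[]]
    first, rest = labels[0], labels[1:]
    out = []
    for i in range(len(rest)):  # choose a contraction partner for the first field
        remaining = rest[:i] + rest[i + 1:]
        out.extend([[(first, rest[i])] + sub for sub in pairings(remaining)])
    return out

print(len(pairings([1, 2, 3, 4])))        # 3 full contractions, as in (5.44)
print(len(pairings([1, 2, 3, 4, 5, 6])))  # 15 = 5!! for six fields
print(pairings([1, 2, 3, 4]))             # the pairings (12)(34), (13)(24), (14)(23)
```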

For the proof of Wick's theorem, we use induction. We have proved the initial step, for n = 2, so it remains to prove the step for n, assuming the result for n − 1. This is left as an exercise.

Important concepts to remember

• Unitary operators transform states and operators in quantum mechanics, changing the
picture. The new Hamiltonian is different than the transformed Hamiltonian.

• In QFT, we use the Dirac or interaction picture, where operators evolve in time with
H0 , and states evolve with Hi,I .

• Perturbation theory is based on the existence of asymptotically free states (at t = ±∞


and/or spatial infinity) where we can use canonical quantization of the free theory, and
use the vacuum |0 > to perturb around, while the full vacuum is |Ω >.

• In the interacting theory, the interaction picture field ϕI (t, ⃗x) has the same expansion
as the Heisenberg field in the free KG case studied before.

• The time evolution operator of the interaction picture is written as UI (t, t0 ) = T exp[−i dtHi,I (t)].

• Feynman’s theorem expresses correlators (n-point functions) of the full theory as the
ratio of VEV in the free vacuum of T of interaction picture operators, with an insertion
of UI (T, −T ), divided by the same thing without the insertion of operators.

• the contraction of two fields gives the Feynman propagator.

• Wick’s theorem relates the time order of operators with the normal order of a sum of
terms: the operators, plus all their possible contractions.

• In < 0|T (...)|0 > that appears in Feynman’s theorem as the thing to calculate, only
the full contractions give a nonzero result.

Further reading: See chapters 4.2,4.3 in [2] and 2.5 in [1].

Exercises, Lecture 5

1) For an interaction Hamiltonian



H_I = ∫ d³x (λ/4!) ϕ_I⁴,    (5.45)

write down the explicit form of the Feynman theorem for the operators O_I(x)O_I(y)O_I(z), where O_I(x) = ϕ_I²(x), to order λ² in the numerator, and then the Wick theorem for the numerator (not all the terms, just a few representative ones).

2) Complete the induction step of Wick’s theorem and then write all the nonzero terms
of the Wick theorem for
< 0|T {ϕ1 ϕ2 ϕ3 ϕ4 ϕ5 ϕ6 }|0 > (5.46)

6 Lecture 6: Feynman rules for λϕ4 from the operator
formalism
We saw that the Feynman theorem relates the correlators to a ratio of vacuum expectation
values (VEVs) in the free theory, of time orderings of the interaction picture fields, i.e. things
like
< 0|T{ϕ(x₁)...ϕ(xₙ)}|0 >    (6.1)
which can then be evaluated using Wick’s theorem, as the sum of all possible contractions
(products of Feynman propagators).
This includes insertions of [−i ∫ dt H_{i,I}(t)]ⁿ    (6.2)

inside the time ordering.


We can make a diagrammatic representation for
< 0|T (ϕ1 ϕ2 ϕ3 ϕ4 )|0 > = DF (x1 − x2 )DF (x3 − x4 ) + DF (x1 − x4 )DF (x2 − x3 )
+DF (x1 − x3 )DF (x2 − x4 ) (6.3)
as the sum of 3 terms: one where the pairs of points (12), (34) are connected by lines, the
others for (13), (24) and (14), (23), as in Fig.9. We call these the Feynman diagrams, and we
will see later how to make this in general.


Figure 9: Feynman diagrams for free processes for the 4-point function. We can join the
external points in 3 different ways.

These quantities that we are calculating are not yet physical; we will later relate them to physical scattering amplitudes, but we can already use a physical interpretation of particles propagating, being created and annihilated.
We will later define more precisely the so-called S-matrices that relate to physical scatter-
ing amplitudes, and define their relation to the correlators we compute here, but for the mo-
ment we will just note that we have something like Sf i =< f |S|i >, where S = UI (+∞, −∞).

So in some sense this corresponds to propagating the state |i > from −∞ to +∞, and then
computing the probability to end up in the state |f >. For now, we will just study the
abstract story, and leave the physical interpretation for later.
Let us consider a term with a Hi,I in it, for instance the first order term (order λ term)
in Hi,I for the 2-point functions,
{ [ ∫ ]}
< 0|T ϕ(x)ϕ(y) − i dtHi,I |0 > (6.4)
where ∫ dt H_{i,I}(t) = ∫ d⁴z (λ/4!) ϕ_I⁴. We then write this term as

< 0|T{ϕ(x)ϕ(y) (−iλ/4!) ∫ d⁴z ϕ(z)ϕ(z)ϕ(z)ϕ(z)}|0 >    (6.5)
In this, there are 2 independent Wick contractions we can consider:
-contracting x with y and z’s among themselves: ϕ(x)ϕ(y) and ϕ(z)ϕ(z)ϕ(z)ϕ(z)
-contracting x with z and y with z: ϕ(x)ϕ(z), ϕ(y)ϕ(z), ϕ(z)ϕ(z).
The first contraction can be done in 3 ways, because we can choose in 3 ways which is
the contraction partner for the first ϕ(z) (and the last contraction is then fixed).
The second contraction can be done in 4 × 3 = 12 ways, since we have 4 choices for the
ϕ(z) to contract with ϕ(x), then 3 remaining choices for the ϕ(z) to contract with ϕ(y). In
total we have 3 + 12 = 15 terms, corresponding to having the 6 ϕ’s, meaning 5 ways to make
the first contraction between them, 3 ways to make the second, and one way to make the
last.
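This 3 + 12 = 15 counting can be confirmed by brute-force enumeration of the pairings of the six fields ϕ(x), ϕ(y) and the four ϕ(z)'s (a sketch, not from the lectures; the z's are labeled individually so all six fields are distinct):

```python
def pairings(labels):
    """All complete pairings of a list of distinct labels."""
    if not labels:
        return [[]]
    first, rest = labels[0], labels[1:]
    out = []
    for i in range(len(rest)):
        out.extend([[frozenset((first, rest[i]))] + sub
                    for sub in pairings(rest[:i] + rest[i + 1:])])
    return out

fields = ['x', 'y', 'z1', 'z2', 'z3', 'z4']
all_pairings = pairings(fields)
with_xy = [p for p in all_pairings if frozenset(('x', 'y')) in p]

# 15 total = 3 with x-y contracted (first contraction type) + 12 without (second type)
print(len(all_pairings), len(with_xy), len(all_pairings) - len(with_xy))  # 15 3 12
```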
The result is then
< 0|T{ϕ(x)ϕ(y)[−i ∫ dt H_{i,I}]}|0 > = 3(−iλ/4!) D_F(x − y) ∫ d⁴z D_F(z − z) D_F(z − z)
                                     + 12(−iλ/4!) ∫ d⁴z D_F(x − z) D_F(y − z) D_F(z − z)    (6.6)
The corresponding Feynman diagrams for these terms are: a line between x and y, and a
figure eight with the middle point being z, and for the second term, a line between x and y
with a z in the middle, and an extra loop starting and ending again at z, as in Fig.10.
Consider now a more complicated contraction, in the O(λ³) term

< 0|T{ϕ(x)ϕ(y) (1/3!)(−iλ/4!)³ ∫ d⁴z ϕ(z)ϕ(z)ϕ(z)ϕ(z) ∫ d⁴w ϕ(w)ϕ(w)ϕ(w)ϕ(w) ∫ d⁴u ϕ(u)ϕ(u)ϕ(u)ϕ(u)}|0 >    (6.7)

contract ϕ(x) with a ϕ(z), ϕ(y) with a ϕ(w), a ϕ(z) with a ϕ(w), two ϕ(w)'s with two ϕ(u)'s, the remaining ϕ(z)'s between themselves, and the remaining ϕ(u)'s between themselves. This gives the result

(1/3!)(−iλ/4!)³ ∫ d⁴z d⁴w d⁴u D_F(x−z) D_F(z−z) D_F(z−w) D_F(w−u) D_F(w−u) D_F(u−u) D_F(w−y)    (6.8)


Figure 10: The two Feynman diagrams at order λ in the expansion (first order in the
interaction vertex).

The corresponding Feynman diagram then has the points x and y with a line in betwen
them, on which we have the points z and w, with an extra loop starting and ending at z,
two lines forming a loop between w and an extra point u, and an extra loop starting and
ending at u, as in Fig.11.


Figure 11: A Feynman diagram at order λ3 , forming 3 loops.

Let us count the number of identical contractions for this diagram: there are 3! ways of choosing which internal point to call z, w, u (permutations of 3 objects). Then in ∫ d⁴z ϕϕϕϕ, we can choose the contractions with ϕ(x) and with ϕ(w) in 4 × 3 ways; in ∫ d⁴w ϕϕϕϕ we can choose the contractions with z, y and u, u in 4 × 3 × 2 × 1 ways; and then we can choose the contractions in ∫ d⁴u ϕϕϕϕ with ϕ(w)ϕ(w) in 4 × 3 ways. But then we have overcounted in choosing the two w − u contractions, by counting a 2 in both w and u, so we must divide by 2, for a total of

3! × (4 × 3) × (4 × 3 × 2) × (4 × 3) × 1/2 = 3! × (4!)³ / 8    (6.9)
We note that this almost cancels the 1/(3! × (4!)3 ) in front of the Feynman diagram result.
We therefore define the symmetry factor, in general

S = denominator / (number of contractions) = p!(4!)^p / (# of contractions)    (6.10)
which in fact equals the number of symmetries of the Feynman diagram. Here the symmetry
factor is
2×2×2=8 (6.11)

which comes because we can interchange the two ends of DF (z − z) in the Feynman diagram,
obtaining the same diagram, then we can interchange the two ends of DF (u − u) in the same
way, and we can also interchange the two DF (w − u) obtaining the same diagram.
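A quick arithmetic cross-check of (6.9)–(6.11) (a sketch, not part of the lectures):

```python
from math import factorial

p = 3  # number of interaction vertices
contractions = factorial(3) * (4 * 3) * (4 * 3 * 2) * (4 * 3) // 2  # the counting in (6.9)
prefactor = factorial(p) * factorial(4) ** p                        # the p!(4!)^p of the expansion
print(contractions, prefactor // contractions)  # 10368 8, i.e. the symmetry factor S = 8
```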
Let us see a few more examples. Consider the one loop diagram with a line between x
and y, with a point z in the middle, with an extra loop starting and ending at the same z,
see Fig.12a. We can interchange the two ends of DF (z − z) obtaining the same diagram,
thus the symmetry factor is S = 2. Consider then the figure eight diagram (vacuum bubble)
with a central point z, see Fig.12b. It has a symmetry factor of S = 2 × 2 × 2 = 8, since
we can interchange the ends of one of the DF (z − z), also of the other DF (z − z), and we
can also interchange the two DF (z − z) between them. Consider then the "setting sun"
diagram, see Fig.12c, with DF (x − z), three DF (z − w)'s, and then DF (w − y). We can
permute in 3! ways the three DF (z − w)’s, obtaining the same diagram, thus S = 3! = 6.
Finally, consider the diagram with DF (x − z), DF (z − y), DF (z − u), DF (z − w) and three
DF (u − w)’s, see Fig.12d. It has a symmetry factor of S = 3! × 2 = 12, since there are 3!
ways of interchanging the three DF (u − w)’s, and we can rotate the subfigure touching at
point z, thus interchanging DF (z − u) with DF (z − w) and u with w.


Figure 12: Examples of symmetry factors. For diagram a, S = 2, for diagram b, S = 8, for
diagram c, S = 6 and for diagram d, S = 12.

x-space Feynman rules for λϕ4


We are now ready to state the x-space Feynman rules for the numerator of the Feynman theorem,

< 0|T{ϕ_I(x₁)...ϕ_I(xₙ) exp[−i ∫ dt H_{i,I}(t)]}|0 >,    (6.12)

see Fig.13.


Figure 13: Pictorial representations for Feynman rules in x space.

-For the propagator we draw a line between x and y, corresponding to DF (x − y).

-For the vertex we draw a point z with 4 lines coming out of it, and it corresponds to (−iλ) ∫ d⁴z.
-For the external point (line) at x, we draw a line ending at x, corresponding to a factor of 1 (nothing new).
-After drawing the Feynman diagram according to the above, we divide the result by the
symmetry factor.
-Then summing over all possible diagrams of all orders in λ, we get the full result for the
numerator above.
Let us now turn to p-space.
The Feynman propagator is written as

D_F(x − y) = ∫ d⁴p/(2π)⁴ [−i/(p² + m² − iϵ)] e^{ip·(x−y)}    (6.13)

so that the p-space propagator is


D_F(p) = −i/(p² + m² − iϵ)    (6.14)


Figure 14: Examples of momenta at a vertex, for definition of convention.

We must choose a direction for the propagator, and we choose arbitrarily to be from x
to y (note that DF (x − y) = DF (y − x), so the order doesn’t matter). Then for instance the
line going into a point y corresponds to a factor of e−ipy (see the expression above). Consider
that at the 4-point vertex z, for instance p4 goes out, and p1 , p2 , p3 go in, as in Fig.14. Then
we have for the factors depending on z

∫ d⁴z e^{−ip₁·z} e^{−ip₂·z} e^{−ip₃·z} e^{+ip₄·z} = (2π)⁴ δ⁽⁴⁾(p₁ + p₂ + p₃ − p₄)    (6.15)

i.e., momentum conservation.


p-space Feynman rules
Thus the Feynman rules in p-space are (see Fig.15)
-For the propagator we write a line with an arrow in the direction of the momentum,
corresponding to DF (p).
-For the vertex, we write 4 lines going into a point, corresponding to a factor of −iλ.
-For the external point (line), we write an arrow going into the point x, with momentum
p, corresponding to a factor of e−ip·x .

-We then impose momentum conservation at each vertex.
-We integrate over the internal momenta, ∫ d⁴p/(2π)⁴.
-We divide by the symmetry factor.


Figure 15: Pictorial representation of Feynman rules in p space.

We should note something: why do we have the Feynman propagator, i.e. the Feynman contour? The Feynman contour avoids the +E_p pole from above and the −E_p pole from below, and is equivalent to the −iϵ prescription, which takes p⁰ ∝ 1 + iϵ, see Fig.16.


Figure 16: The Feynman contour can be understood in two different ways. On the rhs, the
contour is a straight line.
It is needed since without it, due to T → ∞(1 − iϵ), in ∫_{−T}^{T} dz⁰ ∫ d³z e^{i(p₁+p₂+p₃−p₄)·z} we would have a factor of e^{±(p₁⁰+p₂⁰+p₃⁰−p₄⁰)ϵ·∞}, which blows up on one of the sides (positive or negative p⁰). But with it, we have in the exponent (1 + iϵ)(1 − iϵ) = 1 + O(ϵ²), so we have no factor blowing up at order ϵ.
Cancelling of the vacuum bubbles in numerator vs. denominator
Vacuum bubbles give infinities, for instance the p-space double figure eight vacuum di-
agram, with momenta p1 and p2 between points z and w, and also a loop of momentum
p3 going out and then back inside z and another loop of momentum p4 going out and then
back inside w, as in Fig.17. Then momentum conservation at z and w, together with the
integration over p1 gives
∫ d⁴p₁/(2π)⁴ (2π)⁴δ⁴(p₁ + p₂) × (2π)⁴δ⁴(p₁ + p₂) = (2π)⁴δ⁴(0) = ∞    (6.16)

which can also be understood as ∫ d⁴w (const) = 2T · V (time T → ∞ and volume V → ∞).
But these unphysical infinities cancel out between the numerator and denominator of the
Feynman theorem: they factorize and exponentiate in the same way in the two. Let us see
that.


Figure 17: Vacuum bubble diagram, giving infinities.

In the numerator, for the two point function with ϕ(x) and ϕ(y), we have the connected
pieces, like the setting sun diagram, factorized, multiplied by a sum of all the possible vacuum
bubbles (i.e., with no external points x and y): terms like the figure eight bubble, that we
will call V1 , a product of two figure eights, (V1 )2 , the same times a setting sun bubble (4
propagators between two internal points) called V2 , etc., see Fig.18.


Figure 18: The infinite vacuum bubbles factorize in the calculation of n-point functions
(here, 2-point function).

Then in general we have (the 1/ni ! is for permutations of identical diagrams)


lim_{T→∞(1−iϵ)} < 0|T{ϕ_I(x) ϕ_I(y) exp[−i ∫_{−T}^{T} dt H_I(t)]}|0 >
= connected piece × Σ_{all {nᵢ} sets} Πᵢ (1/nᵢ!) (Vᵢ)^{nᵢ}
= connected pieces × (Σ_{n₁} (1/n₁!) (V₁)^{n₁}) (Σ_{n₂} (1/n₂!) (V₂)^{n₂}) × ...
= connected pieces × Πᵢ (Σ_{nᵢ} (1/nᵢ!) (Vᵢ)^{nᵢ})
= connected pieces × e^{Σᵢ Vᵢ},    (6.17)

as in Fig.19.
But the same thing happens for the denominator, obtaining
lim_{T→∞(1−iϵ)} < 0|T{exp[−i ∫_{−T}^{T} dt H_I(t)]}|0 > = e^{Σᵢ Vᵢ}    (6.18)


Figure 19: The final result for the factorization of the vacuum bubble diagrams in the 2-point
function, with an exponential that cancels against the same one in the denominator.

and cancels with the numerator, obtaining that

< Ω|T {ϕ(x)ϕ(y)}|Ω > (6.19)

is the sum of only the connected diagrams.
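The factorization and exponentiation step in (6.17) is the elementary identity Σ_{n₁,n₂,...} Πᵢ Vᵢ^{nᵢ}/nᵢ! = e^{Σᵢ Vᵢ}, which can be checked numerically with a few stand-in bubble values (a sketch; the Vᵢ values are arbitrary):

```python
import math
from itertools import product

V = [0.3, -0.2, 0.05]  # arbitrary stand-ins for the vacuum bubble values V_i
N = 25                 # truncation order per bubble type

# sum over all multiplicity sets {n_i}, each term being prod_i V_i^{n_i} / n_i!
total = sum(
    math.prod(v ** n / math.factorial(n) for v, n in zip(V, ns))
    for ns in product(range(N), repeat=len(V))
)
assert abs(total - math.exp(sum(V))) < 1e-12
```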


Thus in the same way, in general,

< Ω|T {ϕ(x1 )...ϕ(xn )}|Ω >= sum of all connected diagrams (6.20)

This is the last thing to add to the Feynman rules: for the correlators, take only connected diagrams.

Important concepts to remember

• Feynman diagrams are pictorial representations of the contractions of the perturbative


terms in the Feynman theorem. They have the interpretation of particle propagation,
creation and annihilation, but at this point, it is a bit abstract, since we have not yet
related to the physical S-matrices for particle scattering.

• The symmetry factor is given by the number of symmetric diagrams (by a symmetry,
we get the same diagram).

• The Feynman rules in x-space have D_F(x − y) for the propagator, −iλ ∫ d⁴z for the vertex and 1 for the external point, dividing by the symmetry factor at the end.

• The Feynman rules in p-space have DF (p) for the propagator, −iλ for the vertex, e−ip·x
for the external line going in x, momentum conservation at each vertex, integration
over internal momenta, and divide by the symmetry factor.

• Vacuum bubbles factorize, exponentiate and cancel between the numerator and de-
nominator of the Feynman theorem, leading to the fact that the n-point functions
have only connected diagrams.

Further reading: See chapter 4.4 in [2].

Exercises, Lecture 6

1) Apply the x-space Feynman rules to write down the expression for the Feynman
diagram in Fig.20.


Figure 20: x-space Feynman diagram.

2) Idem, for the p-space diagram in Fig.21.


Figure 21: p space Feynman diagram.

7 Lecture 7: The driven (forced) harmonic oscillator
We have seen that for a quantum mechanical system, we can write the transition amplitudes
as path integrals, via

F(q′, t′; q, t) ≡ _H< q′, t′|q, t >_H = < q′|e^{−iĤ(t′−t)}|q >
               = ∫ Dp(t) Dq(t) exp{i ∫_{t₀}^{t_{n+1}} dt [p(t)q̇(t) − H(p(t), q(t))]}    (7.1)

and, if the Hamiltonian is quadratic in momenta, H(p, q) = p2 /2 + V (q), then



F(q′, t′; q, t) = N ∫ Dq e^{iS[q]}    (7.2)

where N is a constant. The important objects to calculate in general (in quantum mechanics F(q′, t′; q, t) would be sufficient, but not in quantum field theory) are the correlators or n-point functions,

G_N(t̄₁, ..., t̄_N) = < q′, t′|T{q̂(t̄₁)...q̂(t̄_N)}|q, t > = ∫ Dq(t) e^{iS[q]} q(t̄₁)...q(t̄_N)    (7.3)

We can compute all the correlators from their generating functional,


Z[J] = ∫ Dq e^{iS[q;J]} ≡ ∫ Dq e^{iS[q] + i∫dt J(t)q(t)}    (7.4)

by
G_N(t₁, ..., t_N) = [δ/(iδJ(t₁))] ... [δ/(iδJ(t_N))] Z[J]|_{J=0}    (7.5)
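A finite-dimensional toy version of (7.5): replace J(t) by a vector J_a and Z[J] by a gaussian Z(J) = exp(−½ JᵀDJ), the discrete analogue of the result derived below for the driven oscillator; then (1/i ∂/∂J_a)(1/i ∂/∂J_b) Z|_{J=0} reproduces the "propagator" D_ab. A sketch (not from the lectures; D is an arbitrary symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
D = M @ M.T                           # symmetric stand-in "propagator" matrix
Z = lambda J: np.exp(-0.5 * J @ D @ J)

h = 1e-4
a, b = 1, 3
e = np.eye(4)
# mixed second derivative of Z at J = 0 by central differences
d2 = (Z(h * (e[a] + e[b])) - Z(h * (e[a] - e[b]))
      - Z(h * (-e[a] + e[b])) + Z(-h * (e[a] + e[b]))) / (4 * h**2)
G2 = -d2                              # the (1/i)^2 = -1 from the two functional derivatives
assert abs(G2 - D[a, b]) < 1e-4       # the 2-point function is the propagator
```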
In the above, the object J(t) was just a mathematical artifice, useful only to obtain the correlators through derivatives of Z[J]. But actually, considering that we have the action S[q; J] = S[q] + ∫ dt J(t)q(t), we see that a nonzero J(t) acts as a source term for the classical q(t), i.e. an external driving force for the harmonic oscillator, since its equation of motion is now
0 = δS[q; J]/δq(t) = δS[q]/δq(t) + J(t)    (7.6)
So in the presence of nonzero J(t) we have a driven harmonic oscillator.
We can consider then as a field theory primer the (free) harmonic oscillator driven by an
external force, with action
S[q; J] = ∫ dt [½(q̇² − ω²q²) + J(t)q(t)]    (7.7)
which is quadratic in q, therefore the path integral is gaussian, of the type we already
performed in Lecture 2 (when we went from the phase space path integral to the configuration
space path integral). That means that we will be able to compute the path integral exactly.

But there is an important issue of boundary conditions for q(t). We will first make a
naive treatment, then come back to do it better.
Sloppy treatment
If we can partially integrate q̇ 2 /2 in the action without boundary terms (not quite correct,
see later), then we have
S[q; J] = ∫ dt {−½ q(t)[d²/dt² + ω²] q(t) + J(t)q(t)}    (7.8)
so then the path integral is of the form

Z[J] = N ∫ Dq e^{−½ q·∆⁻¹·q + i J·q}    (7.9)

where we have defined iS = −½ q·∆⁻¹·q + ..., so that


∆⁻¹ q(t) ≡ i [d²/dt² + ω²] q(t)
J · q ≡ ∫ dt J(t) q(t)    (7.10)

Here ∆ is the propagator. Then, remembering the general gaussian integration formula,
S = ½ xᵀAx + bᵀx ⇒ ∫ dⁿx e^{−S(x)} = (2π)^{n/2} (det A)^{−1/2} e^{½ bᵀA⁻¹b}    (7.11)

where now b = −iJ(t) and A = ∆−1 , we obtain

Z[J] = N′ e^{−½ J·∆·J}    (7.12)

where N ′ contains besides N and factors of 2π, also (det ∆)1/2 , which is certainly nontrivial,
however it is J-independent, so it is put as part of the overall constant. Also,
J · ∆ · J ≡ ∫ dt ∫ dt′ J(t) ∆(t, t′) J(t′)    (7.13)
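The gaussian formula (7.11) is easy to verify numerically in one dimension (a sketch; A and b are arbitrary):

```python
import numpy as np

# 1D case of (7.11): int dx e^{-(1/2) A x^2 - b x} = sqrt(2 pi / A) * e^{b^2/(2A)}
A, b = 1.7, 0.4
x = np.linspace(-30.0, 30.0, 400001)
f = np.exp(-0.5 * A * x**2 - b * x)
lhs = np.sum(f) * (x[1] - x[0])  # simple Riemann sum; integrand vanishes at the ends
rhs = np.sqrt(2 * np.pi / A) * np.exp(b**2 / (2 * A))
assert abs(lhs - rhs) < 1e-6
```

The multi-dimensional case follows by diagonalizing A and completing the square, exactly as used in the text.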

Here we find

∆(t, t′) = i ∫ dp/(2π) e^{−ip(t−t′)}/(p² − ω²)    (7.14)
since
∆⁻¹ ∆(t, t′) = i[d²/dt² + ω²] · i ∫ dp/(2π) e^{−ip(t−t′)}/(p² − ω²) = −∫ dp/(2π) [(−p² + ω²)/(p² − ω²)] e^{−ip(t−t′)}
             = ∫ dp/(2π) e^{−ip(t−t′)} = δ(t − t′)    (7.15)

But we note that there is a singularity at p² = ω². In fact, we have seen before that we avoided
these singularities using a certain integration contour in complex p0 space, giving the various
propagators. Let’s think better what this means. The question is, in the presence of this
singularity, is the operator ∆−1 invertible, so that we can write down the above formula for
∆? An operator depends also on the space of functions on which it is defined, i.e. in the
case of quantum mechanics, on the Hilbert space of the theory.
If we ask whether ∆−1 is invertible on the space of all the functions, then the answer is
obviously no. Indeed, there are zero modes, i.e. eigenfunctions with eigenvalue zero for ∆−1 ,
namely ones satisfying
[d²/dt² + ω²] q₀(t) = 0    (7.16)
and since ∆−1 q0 = 0, obviously on q0 we cannot invert ∆−1 . Moreover, these zero modes are
not even pathological, so that we can say we neglect them, but rather these are the classical
solutions for the free oscillator!
So in order to find an invertible operator ∆−1 we must exclude these zero modes from
the Hilbert space on which ∆−1 acts (they are classical solutions, so obviously they exist in
the theory), by imposing some boundary conditions that exclude them.
We will argue that the correct result is
∆_F(t, t′) = i ∫ dp/(2π) e^{−ip(t−t′)}/(p² − ω² + iϵ)    (7.17)
which is the Feynman propagator (in 0 + 1 dimensions, i.e. with only time, but no space).
Indeed, check that, with p · p = −(p0 )2 , and p0 called simply p in quantum mechanics, we
get this formula from the previously defined Feynman propagator.
Note that if ∆_F⁻¹ has eigenfunctions {qᵢ(t)} with eigenvalues λᵢ ≠ 0, such that the eigenfunctions are orthonormal, i.e.

qᵢ · qⱼ ≡ ∫ dt qᵢ(t)* qⱼ(t) = δᵢⱼ    (7.18)

then we can write

∆_F⁻¹(t, t′) = Σᵢ λᵢ qᵢ(t) qᵢ(t′)*    (7.19)

inverted to

∆_F(t, t′) = Σᵢ (1/λᵢ) qᵢ(t) qᵢ(t′)*    (7.20)
More generally, for an operator A with eigenstates |q > and eigenvalues a_q, i.e. A|q > = a_q|q >, with < q|q′ > = δ_{qq′}, we have

A = Σ_q a_q |q >< q|    (7.21)

Our operator
∑ ∫ is roughly of this form, but not quite. We have something like qi (t) ∼
−ipt
e , i ∼ dp/(2π) and λi ∼ (p − ω 2 + iϵ), but we must be more precise. Note that
2

if as t → +∞ all the qᵢ(t) approach the same function, then ∆_F(t → +∞) does as well, and analogously for t → −∞.
We can in fact do the integral in ∆F as in previous lectures, since we have poles at
p = ±(ω − iϵ), corresponding to the Feynman contour for the p integration. For t − t′ > 0,
we can close the contour below, since we have exponential decay in the lower half plane, and
pick up the pole at +ω, whereas for t − t′ < 0 we close the contour above and pick up the
pole at −ω, all in all giving
∆_F(t, t′) = (1/(2ω)) e^{−iω|t−t′|}   (7.22)
We see that for t → ∞, ∆F ∼ e−iωt , whereas for t → −∞, ∆F ∼ e+iωt , therefore the
boundary conditions for the q(t)’s are

q(t) ∼ e−iωt , t → ∞
q(t) ∼ e+iωt , t → −∞ (7.23)

Note that [d2 /dt2 + ω 2 ]e±iωt = 0, though of course [d2 /dt2 + ω 2 ]q(t) ̸= 0 (since it is nonzero
at finite time).
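The contour result (7.22) can also be verified by direct numerical integration (a sketch with assumed values ω = 1, t − t′ = 1.5 and a small finite ϵ, so the agreement holds only up to O(ϵ) corrections):

```python
import numpy as np

# Numerically evaluate i * int dp/(2pi) e^{-ipt}/(p^2 - omega^2 + i*eps)
# and compare with the contour result e^{-i omega |t|}/(2 omega).
# Grid values are assumptions chosen so the pole region is well resolved.
omega, t, eps, dp = 1.0, 1.5, 0.05, 0.001
p = np.arange(-400.0, 400.0, dp)
integrand = 1j * np.exp(-1j * p * t) / (p**2 - omega**2 + 1j * eps) / (2 * np.pi)
numeric = np.sum(integrand) * dp
exact = np.exp(-1j * omega * abs(t)) / (2 * omega)
print(numeric, exact)   # agree up to O(eps) and cutoff corrections
```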
That means that we must define the space of functions as functions satisfying (7.23), and
we do the path integral on this space.
So, a better definition of the path integral is in terms of

q(t) = qcl (t; J) + q̃(t) (7.24)

where the classical solution qcl (t; J) satisfies the correct boundary conditions, and the quan-
tum fluctuation q̃(t) satisfies zero boundary conditions, so that it doesn’t modify the ones of
qcl (t; J). But we know that if we have
S(x; J) = (1/2) A x² + J · x ⇒
S(x; J) = S(x₀; J) + (1/2) A (x − x₀)² = S(x₀; J) + S(x − x₀; 0)   (7.25)
where x0 = x0 (J) is the extremum. Therefore in our case we have

S[q; J] = S[qcl ; J] + S[q̃; 0] (7.26)

and the path integral is


Z[J] = ∫ Dq e^{iS[q;J]} = e^{iS[q_cl;J]} ∫ Dq̃ e^{iS[q̃;0]}   (7.27)

and the path integral over q̃ is now J-independent, i.e. part of N . The classical equation of
motion
∆−1 qcl (t; J) = iJ(t) (7.28)
is solved by
qcl (t; J) = qcl (t; 0) + i(∆ · J)(t) (7.29)

where qcl (t; 0) is a zero mode, satisfying

∆−1 qcl (t; 0) = 0 (7.30)

Note the appearance of the zero mode qcl (t; 0) that doesn’t satisfy the correct boundary
conditions. But the quantum solutions q(t) on which we invert ∆−1 F have correct boundary
conditions instead, and do not include these zero modes. This is so since q_cl(t; J) has the
boundary conditions of the i(∆ · J)(t) = i ∫ dt′ ∆(t, t′)J(t′) term, i.e. of ∆(t, t′) (the only
nontrivial t dependence), as opposed to q_cl(t; 0), which has trivial boundary conditions.
Then we have
δ_full S[q_cl; J] / δ_full J(t) = ∫ dt′ (δS/δq(t′))|_{q=q_cl} (δq_cl(t′; J)/δJ(t)) + δS/δJ(t)   (7.31)

and the first term is zero by the classical equations of motion, and the second is equal to
qcl (t; J), giving

δ_full S[q_cl; J] / δ_full J(t) = q_cl(t; J) = q_cl(t; 0) + i(∆ · J)(t) ⇒
S[q_cl(J); J] = S[q_cl(0); 0] + q_cl(0) · J + (i/2) J · ∆ · J   (7.32)
where in the second line we have integrated over J the first line. Then we obtain for the
path integral
Z[J] = N″ e^{−(1/2) J·∆·J + i q_cl(0)·J}   (7.33)
where the second term is new, and it depends on the fact that we have nontrivial boundary
conditions. Indeed, this was a zero mode that we needed to introduce in qcl , as it has
the correct boundary conditions, as opposed to the quantum fluctuations which have zero
boundary conditions.
All of the above however was more suggestive than rigorous, as we were trying to define
the boundary conditions for our functions, but we didn’t really need the precise boundary
conditions.
Correct treatment: harmonic phase space
The correct treatment gives the path integral in terms of a modified phase space path
integral, called harmonic phase space. Let us derive it.
The classical Hamiltonian of the free harmonic oscillator is

H(p, q) = p²/2 + ω²q²/2   (7.34)
Making the definitions

q(t) = (1/√(2ω)) [a(t) + a†(t)]
p(t) = −i √(ω/2) [a(t) − a†(t)],   (7.35)

inverted as

a(t) = (1/√(2ω)) [ωq + ip] = a(0) e^{−iωt}
a†(t) = (1/√(2ω)) [ωq − ip] = a†(0) e^{+iωt},   (7.36)

the Hamiltonian is
H(a, a† ) = ωa† a (7.37)
In quantum mechanics, we quantize by [â, ↠] = 1, and we write the Fock space representa-
tion, in terms of states
|n > = (1/√(n!)) (â†)ⁿ |0 >   (7.38)
which are eigenstates of the Hamiltonian.
But we can also define coherent states

|α > = e^{αâ†} |0 > = Σ_{n≥0} (αⁿ/n!) (â†)ⁿ |0 >   (7.39)

which are a linear combination of all the Fock space states. These states are eigenstates of â, since

â|α > = [â, e^{αâ†}]|0 > = α e^{αâ†}|0 > = α|α >   (7.40)
where we have used

[â, e^{αâ†}] = Σ_{n≥0} (αⁿ/n!) ( [â, â†](â†)^{n−1} + â†[â, â†](â†)^{n−2} + ... + (â†)^{n−1}[â, â†] )
= α Σ_{n≥1} (α^{n−1}/(n−1)!) (â†)^{n−1} = α e^{αâ†}   (7.41)

Thus these coherent states are eigenstates of â, with eigenvalue α. Similarly, defining the bra state

< α*| ≡ < 0| e^{α*â}   (7.42)

we have

< α*| ↠= < α*| α*   (7.43)
For the inner product of coherent states, we obtain

< α*|α > = < 0| e^{α*â} |α > = e^{α*α} < 0|α > = e^{α*α}   (7.44)

since < 0|α > = < 0|e^{αâ†}|0 > = < 0|0 > = 1, as < 0|↠= 0. We also have the completeness relation

1 = ∫ (dα dα*/2πi) e^{−αα*} |α >< α*|   (7.45)

which is left as an exercise. (Note that the constant depends on the definition of the complex integration measure dz dz̄.)

We will compute the transition amplitude between the Heisenberg states |α, t >_H and _H< α*, t′|,

F(α*, t′; α, t) = _H< α*, t′|α, t >_H = < α*| e^{−iĤ(t′−t)} |α >   (7.46)

where |α > and < α*| are Schrödinger states.
The Hamiltonian in the presence of a driving force is
H(a† , a; t) = ωa† a − γ(t)a† − γ̄(t)a (7.47)
The sources (driving forces) γ, γ̄ are related to the J defined before in the Lagrangean formalism as

γ(t) = J(t)/√(2ω);   γ̄(t) = J̄(t)/√(2ω)   (7.48)
¯ the coupling is γ(t)a† + γ̄(t)a = J(t)q(t).
Indeed, we can check that for real J, i.e. J = J,
We need one more formula before we compute F(α*, t′; α, t), namely

< α*| Ĥ(â†, â; t) |β > = H(α*, β; t) < α*|β > = H(α*, β; t) e^{α*β}   (7.49)
Now, as in the case of the phase space path integral, we divide the path into N + 1 small pieces, with ϵ ≡ (t′ − t)/(N + 1) and times t₀ = t, t₁, ..., t_N, t_{N+1} = t′. We then insert the identity as the completeness relation (7.45) at each point tᵢ, i = 1, ..., N, dividing e^{−iĤ(t′−t)} = e^{−iϵĤ} × ... × e^{−iϵĤ}. We obtain
F(α*, t′; α, t) = ∫ Π_i [ dα(tᵢ) dα*(tᵢ)/(2πi) e^{−α*(tᵢ)α(tᵢ)} ] < α*(t′)|e^{−iϵĤ}|α(t_N) > ×
× < α*(t_N)|e^{−iϵĤ}|α(t_{N−1}) > ... < α*(t₁)|e^{−iϵĤ}|α(t) >   (7.50)
Since

< α*(t_{i+1})| e^{−iϵĤ} |α(tᵢ) > = e^{−iϵH(α*(t_{i+1}), α(tᵢ))} e^{α*(t_{i+1})α(tᵢ)}   (7.51)
when we collect all the terms we obtain

∫ Π_i [ dα(tᵢ) dα*(tᵢ)/(2πi) ] exp[ α*(t′)α(t_N) − α*(t_N)α(t_N) + α*(t_N)α(t_{N−1}) − α*(t_{N−1})α(t_{N−1}) + ...
+ ... + α*(t₁)α(t) − i ∫_t^{t′} dτ H(α*(τ), α(τ); τ) ]   (7.52)

We see that we have pairs with alternating signs, with α*(t_{i+1})α(tᵢ) − α*(tᵢ)α(tᵢ) being the discrete version of ∫ dτ α̇*(τ)α(τ), and the last term remaining uncoupled, giving finally

∫ Π_i [ dα(tᵢ) dα*(tᵢ)/(2πi) ] exp{ ∫_t^{t′} dτ [α̇*(τ)α(τ) − iH(α*(τ), α(τ); τ)] + α*(t)α(t) }   (7.53)

so that we obtain

F(α*, t′; α, t) = ∫ DαDα* exp{ i ∫_t^{t′} dτ [ (α̇*(τ)/i) α(τ) − H ] + α*(t)α(t) }   (7.54)

Important concepts to remember

• The source term J in the generating functional Z[J] acts as a driving force, or source, for the classical action.

• At the naive level, the Feynman propagator appears as the inversion of the kinetic
operator, but we need to avoid the singularities.

• To avoid the singularities, we need to define boundary conditions for the space of
functions. We find that the correct boundary conditions are e−iωt at +∞ and e+iωt at
−∞, which means that the classical driven field has these boundary conditions, but
the quantum fluctuations have zero boundary conditions. In this way, the zero modes
do not appear in the space on which we invert ∆−1 .

• With the boundary conditions, we also get a term linear in J in the exponential, Z[J] = N″ e^{−(1/2) J·∆·J + i q_cl(0)·J}.

• The correct treatment is in harmonic phase space, in terms of coherent states |α >,
which are eigenstates of â.

• The transition amplitude between Heisenberg states of |α > is written as a path integral
over α(t) and α∗ (t).

Further reading: See chapters 1.4 in [4] and 2.3 in [3].

Exercises, Lecture 7

1) Using the Z[J] for the free driven harmonic oscillator calculated in the lecture, calculate
the 4-point function G4 (t1 , t2 , t3 , t4 ) for the free driven harmonic oscillator.

2) Prove that

1 = ∫ (dα dα*/2πi) e^{−αα*} |α >< α*|   (7.55)
and calculate

e^{i[â†â + λ(â + â†)³]} |α >   (7.56)

8 Lecture 8. Euclidean formulation and finite temperature field theory
In this lecture we will define the Euclidean formulation of quantum mechanics, to be extended
next lecture to quantum field theory, and the finite temperature version associated with it,
making the connection to statistical mechanics.
But first, let us finish the discussion of the path integral in harmonic phase space. We
have seen that the transition amplitude between Heisenberg states |α, t > and < α∗ , t′ | is
given as a path integral,
F(α*, t′; α, t) = ∫ DαDα* exp{ i ∫_t^{t′} dτ [ (α̇*(τ)/i) α(τ) − H ] + α*(t)α(t) }   (8.1)
But we need to understand the boundary conditions for the path integral, related to the
boundary conditions for the transition amplitude.
Classically, if we have two variables, like α(t) and α∗ (t), obeying linear differential equa-
tions, or one variable obeying a quadratic differential equation, we could choose to impose
their values (or the values of the function and its derivative in the second case) at a given
initial time t, or the same at a final time t′ . But in quantum mechanics, it is not so sim-
ple. We know that the eigenvalues of two operators that don’t commute cannot be given
(measured with infinite precision) at the same time (in the same state).
The precise statement is that if we have operators  and B̂ such that [Â, B̂] ̸= 0, then
defining ∆A by (∆A)² = < ψ|(Â − < Â > 1)²|ψ >, where |ψ > is some given state, we have
(a theorem in quantum mechanics) the generalized Heisenberg uncertainty principle

(∆A)(∆B) ≥ (1/2) | < ψ| i[Â, B̂] |ψ > |   (8.2)
For instance, since [q̂, p̂] = i~, we get (∆q)(∆p) ≥ ~/2 in a given state, for instance at a
given time t for a single particle in its evolution.
That means that now, since [â, ↠] = 1, we can’t specify their eigenvalues α and α∗ at
the same time (with infinite precision). So the only possibility for boundary conditions is
something that from the point of view of classical mechanics looks strange, but is fine in quantum mechanics, namely to define
-at time t, α(t) = α, and α*(t) is unspecified (free), and
-at time t′, α*(t′) = α*, and α(t′) is unspecified (free).
The equations of motion, i.e. the equations obtained from the stationarity of the path
integral, by varying the exponential with respect to α(τ ) and α∗ (τ ), are
α̇ + i ∂H/∂α* = 0 → α̇ + iωα − iγ = 0
α̇* − i ∂H/∂α = 0 → α̇* − iωα* + iγ̄ = 0   (8.3)
The classical solution of this system of equations is

α_cl(τ) = α e^{iω(t−τ)} + i ∫_t^τ e^{iω(s−τ)} γ(s) ds
α*_cl(τ) = α* e^{iω(τ−t′)} + i ∫_τ^{t′} e^{iω(τ−s)} γ̄(s) ds   (8.4)

On these solutions (i.e., using the above equations of motion), we can compute that the object in the exponent in the path integral is

E = α*(t)α(t) + i ∫_t^{t′} dτ γ(τ) α*(τ)
= α*α e^{−iω(t′−t)} + i ∫_t^{t′} ds [ α e^{iω(t−s)} γ̄(s) + α* e^{iω(s−t′)} γ(s) ]
− (1/2) ∫_t^{t′} ds ∫_t^{t′} ds′ γ(s) γ̄(s′) e^{−iω|s′−s|}   (8.5)

In the above we have skipped some steps, which are left as an exercise.
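One of the omitted steps is that (8.4) really solves the equations of motion; a finite-difference check (a sketch added here, with an assumed driving γ(s) and assumed parameter values) confirms α̇ + iωα − iγ = 0 and the boundary condition α_cl(t) = α:

```python
import numpy as np

# Check that alpha_cl(tau) = alpha e^{i omega(t-tau)}
#   + i int_t^tau e^{i omega(s-tau)} gamma(s) ds  satisfies (8.3).
omega, alpha0, t0 = 2.0, 0.7 - 0.2j, 0.0
gamma = lambda s: 0.3 * np.exp(-s) * (1 + 0.5j)   # assumed driving

tau = np.linspace(t0, 5.0, 20001)
dtau = tau[1] - tau[0]

# cumulative trapezoid integral of e^{i omega s} gamma(s)
f = np.exp(1j * omega * tau) * gamma(tau)
F = np.concatenate([[0], np.cumsum(0.5 * (f[1:] + f[:-1])) * dtau])
acl = alpha0 * np.exp(1j * omega * (t0 - tau)) + 1j * np.exp(-1j * omega * tau) * F

# equation of motion residual at interior points (central differences)
adot = (acl[2:] - acl[:-2]) / (2 * dtau)
resid = adot + 1j * omega * acl[1:-1] - 1j * gamma(tau[1:-1])
print("max residual:", np.max(np.abs(resid)), " alpha_cl(t) =", acl[0])
```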
Then, as we did in the last lecture for the configuration space path integral, we can do the gaussian integral by shifting α(t) = α_cl(t) + α̃(t) and α*(t) = α*_cl(t) + α̃*(t), where α_cl(t), α*_cl(t) are the above classical solutions, and the tilde quantities are quantum fluctuations. Then, by shifting the path integral, we get that (due to the quadratic nature of the exponential) the object in the exponent satisfies

E[α(t), α*(t); γ, γ̄] = E[α_cl(t), α*_cl(t); γ, γ̄] + E[α̃(t), α̃*(t); 0]   (8.6)

resulting in the path integral being a constant times the exponential of the classical exponent,

N e^{E[α_cl(t), α*_cl(t); γ, γ̄]}   (8.7)

We see that this object has the same structure as (7.33), since we can absorb the constant
in (8.5) in N , and then we are left with a term linear in J and a term quadratic in J in the
exponent, like in the case of the configuration space path integral.
But we still need to relate the above harmonic phase space path integral with the config-
uration space path integral of the last lecture. First, we would need to have vacuum states

at the two ends. Since |α > = e^{αâ†}|0 >, choosing α = 0 gives |α > = |0 >, and choosing
α∗ = 0 gives < α∗ | =< 0|. Next, we need to take t → −∞ and t′ → ∞, since this is the same
that was required for the configuration space path integral. In this case, in the exponent in
(8.5), the constant and linear terms disappear, and we are left with the object
Z[J] ≡ < 0, +∞|0, −∞ >_J = exp{ −(1/2) J · ∆_F · J } < 0|0 >_0   (8.8)
where |0, −∞ > means |α = 0, t = −∞ >, and we have written the overall constant as
< 0|0 >0 , since it is indeed what we obtain for the path integral if we put J = 0. Note that
here we have
∆_F = (1/(2ω)) e^{−iω|s−s′|}   (8.9)

in the quadratic piece of the exponent, as we can see by substituting γ = J/√(2ω).
This is, as we saw, the Feynman propagator.

Now the boundary condition is α(t) = 0 at t = −∞, and α∗ (t) free, in other words, pure
creation part (α∗ is eigenvalue of a† ) at t = −∞, and α∗ (t) = 0 at t = +∞, and α(t) free, in
other words, pure annihilation part (α is eigenvalue of a). Since
q(t) = (1/√(2ω)) [ a e^{−iωt} + a† e^{+iωt} ]   (8.10)
we see that this is consistent with the previously defined (unrigorous) boundary condition,
q(t = −∞) ∼ e+iωt and q(t = +∞) ∼ e−iωt . But now we have finally defined rigorously the
boundary condition and the resulting path integral.
Wick rotation to Euclidean time
But while the path integral is well defined now, its calculation in relevant cases of in-
terest is not. A useful approximation for any path integral is what is known as ”saddle
point approximation”, which is the gaussian integral around a classical solution, which is an
extremum of the action,
S = S_cl + (1/2) δqᵢ Sᵢⱼ δqⱼ + O((δq)³)   (8.11)

where S_cl = S[q_cl], with δS/δq|_{q=q_cl} = 0. If we have a free action, this is exact, and not an
approximation. We have performed the gaussian integration as if it were correct, but in reality it is only correct for real integrals, which decay exponentially, ∫_{−∞}^{+∞} dx e^{−αx²}, whereas for imaginary integrals, when the integral is over a phase (purely oscillatory), ∫ dx e^{−iαx²} is much less well defined (since if we take ∫_{−Λ}^{+Λ} or ∫_{−Λ−C}^{+Λ+C}, where Λ → ∞ and C is finite, the two integrals differ by a finite contribution that is highly oscillatory in the value of C, ∼ 2∫_0^C dx e^{−2iαΛx}). And when it is a path integral instead of a single integral, it becomes even more obvious that this is not so well defined.
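The cutoff sensitivity described here is easy to exhibit numerically (an illustration with assumed cutoffs, not part of the notes): the real Gaussian integral is insensitive to the cutoff Λ, while the purely oscillatory one keeps oscillating around the Fresnel value √π e^{−iπ/4} with an O(1/Λ) amplitude.

```python
import numpy as np

# Compare cutoff dependence of int_{-L}^{L} e^{-x^2} dx (decaying)
# and int_{-L}^{L} e^{-i x^2} dx (purely oscillatory, Fresnel-type).
def cut_integral(f, L, dx=1e-4):
    x = np.arange(-L, L, dx)
    return np.sum(f(x)) * dx

Ls = [10.0, 11.0, 12.0, 13.0]
spread_real = max(abs(cut_integral(lambda x: np.exp(-x**2), L) - np.sqrt(np.pi))
                  for L in Ls)
fresnel = np.sqrt(np.pi) * np.exp(-1j * np.pi / 4)   # L -> infinity value
spread_osc = max(abs(cut_integral(lambda x: np.exp(-1j * x**2), L) - fresnel)
                 for L in Ls)
print("real Gaussian cutoff error:", spread_real)
print("oscillatory cutoff error:", spread_osc)
```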
So if we could have somehow instead of eiS , e−S , that would solve our problems. Luckily,
this is what happens when we go to Euclidean space.
Consider a time independent Hamiltonian Ĥ, with a complete set of eigenstates {|n >} (so that 1 = Σ_n |n >< n|), and eigenvalues E_n > 0. Then the transition amplitude can be written as

_H< q′, t′|q, t >_H = < q′| e^{−iĤ(t′−t)} |q > = Σ_n Σ_m < q′|n >< n| e^{−iĤ(t′−t)} |m >< m|q >
= Σ_n < q′|n >< n|q > e^{−iE_n(t′−t)} = Σ_n ψ_n(q′) ψ_n*(q) e^{−iE_n(t′−t)}   (8.12)

where we have used < n|e^{−iĤ(t′−t)}|m > = δ_{nm} e^{−iE_n(t′−t)} and < q|n > = ψ_n(q). This expression is analytic in ∆t = t′ − t.
Now consider the analytical continuation to Euclidean time, called Wick rotation, ∆t → −iβ. Then we obtain

< q′, β|q, 0 > = Σ_n ψ_n(q′) ψ_n*(q) e^{−βE_n}   (8.13)

Then we note that if we specialize to the case q ′ = q and integrate over this value of q,
we obtain the statistical mechanics partition function of a system at a temperature T , with
kT = 1/β, thus obtaining a relation to statistical mechanics!

Indeed, then we obtain

Z[β] = ∫ dq < q, β|q, 0 > = ∫ dq Σ_n |ψ_n(q)|² e^{−βE_n} = Tr{e^{−βĤ}}   (8.14)

This corresponds in the path integral to taking closed paths of Euclidean time length β = 1/(kT), since q′ ≡ q(t_E = β) = q(t_E = 0) ≡ q.
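For the harmonic oscillator with the Hamiltonian H = ω↠â of (7.37) (energies E_n = ωn), the trace can be evaluated explicitly, which gives a quick sanity check of Z[β] = Tr{e^{−βĤ}} (β, ω and the truncation are assumed values):

```python
import numpy as np

# Tr e^{-beta H} over the Fock basis for H = omega a^dag a is a
# geometric series, summing to 1/(1 - e^{-beta omega}).
beta, omega = 0.8, 1.3
n = np.arange(2000)                      # truncation, assumed large enough
Z_trace = np.sum(np.exp(-beta * omega * n))
Z_exact = 1.0 / (1.0 - np.exp(-beta * omega))
print(Z_trace, Z_exact)
```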
Let’s see how we write the path integral. The Minkowski space Lagrangean is
L(q, q̇) = (1/2) (dq/dt)² − V(q)   (8.15)

meaning the exponent in the path integral becomes


iS[q] = i ∫_0^{t_E=β} (−i dt_E) [ (1/2) (dq/d(−it_E))² − V(q) ] ≡ −S_E[q]   (8.16)

where then by definition, the Euclidean action is


S_E[q] = ∫_0^β dt_E [ (1/2) (dq/dt_E)² + V(q) ] = ∫ dt_E L_E(q, q̇)   (8.17)

We finally obtain the Feynman-Kac formula,



Z(β) = Tr{e^{−βĤ}} = ∫ Dq e^{−S_E[q]} |_{q(t_E+β)=q(t_E)}   (8.18)

where the path integral is then taken over all closed paths of Euclidean time length β.
As we know, the partition function in statistical mechanics contains all the relevant in-
formation about the system, so using this formalism we can extract any statistical mechanics
quantity of interest.
To do so, we can introduce currents J(t) as usual and calculate the correlation functions.
That is, we define

Z[β; J] = ∫ Dq e^{−S_E(β) + ∫_0^β J_E(τ) q_E(τ) dτ}   (8.19)

Note that here, the current term is the usual we encountered before, since
∫ ∫ ∫
i dtJ(t)q(t) = i d(−itE )J(−itE )q(−itE ) ≡ dtE JE (tE )qE (tE ) (8.20)

Let’s see a simple, but very useful quantity, the propagator in imaginary (Euclidean)
time. We immediately find in the usual way

(1/Z(β)) δ²Z[β; J] / δJ(τ₁)δJ(τ₂) = (1/Z(β)) ∫ Dq(τ) q(τ₁) q(τ₂) e^{−S_E(β)}   (8.21)

Note that here in J(τ1 ), J(τ2 ), q(τ1 ), q(τ2 ) we have Euclidean time (τ ), but since the formula
for Z[β; J] was obtained by analytical continuation, this is equal to
< Ω| T{q̂(−iτ₁) q̂(−iτ₂)} |Ω >_β = (1/Z(β)) Tr[ e^{−βĤ} T{q̂(−iτ₁) q̂(−iτ₂)} ]   (8.22)

Also note that here Z(β) is a normalization constant, since it is J-independent (now we
have a new parameter, β, but except for this dependence, it is constant). In the above, the
Heisenberg operators are also Wick rotated, i.e.

q̂(t) = eiĤt q̂e−iĤt ⇒


q̂(−iτ ) = eĤτ q̂e−Ĥτ (8.23)

We could ask: why is the Euclidean correlator not simply the VEV of time-ordered Euclidean operators? The answer is that in Euclidean space, space and time are treated on the same footing, so we cannot single out a "time ordering operator". In a certain sense, what we obtain is then just the VEV of the product of q's in Euclidean space, for the above < Ω|q̂(τ₁)q̂(τ₂)|Ω >, except that it is better defined as the continuation from Minkowski space of the VEV of the time-ordered product.
We can now extend the definition of the free propagator ∆(τ ) to the interval [−β, β] by
the periodicity of q(τ ), q(τ + β) = q(τ ). We can then define

∆(τ ) =< Ω|T {q̂(−iτ )q̂(0)}|Ω >β (8.24)

such that
∆(τ − β) = ∆(τ ) (8.25)
We could of course compute it from the path integral in principle, but consider reversely,
the following problem. The propagator equation
[ −d²/dτ² + ω² ] K(τ, τ′) = δ(τ − τ′)   (8.26)
with the above periodicity, where K(τ, τ′) = ∆_free(τ − τ′), has a unique solution: if τ ∈ [0, β], the solution is

∆_free(τ) = (1/(2ω)) [ (1 + n(ω)) e^{−ωτ} + n(ω) e^{ωτ} ]   (8.27)

where

n(ω) = 1/(e^{β|ω|} − 1)   (8.28)
is the Bose-Einstein distribution. The proof of the above statement is left as an exercise.
We also can check that for β → ∞,

∆f ree (τ > 0) → ∆F (τ > 0) (8.29)

So we obtain the Feynman propagator in the limit of zero temperature (β = 1/(kT ) → ∞).
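The closed form (8.27) can be cross-checked against the standard Matsubara-sum representation of the periodic Green's function, ∆(τ) = (1/β) Σ_n e^{−iω_n τ}/(ω_n² + ω²) with ω_n = 2πn/β (this representation is standard but not derived in the text; the parameter values below are assumptions):

```python
import numpy as np

# Compare the closed form (8.27) with a truncated Matsubara sum,
# and check continuity/periodicity at the endpoints of [0, beta].
beta, omega, tau = 2.0, 1.3, 0.7
n_bose = 1.0 / (np.exp(beta * omega) - 1.0)   # Bose-Einstein n(omega)
closed = ((1 + n_bose) * np.exp(-omega * tau)
          + n_bose * np.exp(omega * tau)) / (2 * omega)

n = np.arange(-200000, 200001)
w_n = 2 * np.pi * n / beta
matsubara = np.sum(np.exp(-1j * w_n * tau) / (w_n**2 + omega**2)).real / beta
print(matsubara, closed)

# Delta(0) = Delta(beta), as required by periodicity
closed_0 = (1 + 2 * n_bose) / (2 * omega)
closed_beta = ((1 + n_bose) * np.exp(-omega * beta)
               + n_bose * np.exp(omega * beta)) / (2 * omega)
print(closed_0, closed_beta)
```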

Moreover, in this zero temperature limit, in which we have an infinite period in Euclidean
time, we have

< q′, β|q, 0 > = Σ_n ψ_n(q′) ψ_n*(q) e^{−βE_n} → ψ₀(q′) ψ₀*(q) e^{−βE₀}   (8.30)
n

that is, we only get the vacuum contribution.


In conclusion, we can define the sources J(t) to be nonzero on a finite time interval, and
take infinitely long periodic Euclidean time. Then we obtain a definition of the vacuum
functional, where the initial and final states are the vacuum |Ω >, like we defined before in
Minkowski space, and this functional is given by the path integral in Euclidean space.
Because of the statistical mechanics connection, we call Z[J] = Z[β → ∞; J] the partition function, and we will continue to use this name from now on. We will also later call −ln Z[J] = W[J] the free energy, since the same name is used in statistical mechanics.
Note that all we said here was for quantum mechanics, and the corresponding statistical
mechanics by the Feynman-Kac formula. But we can generalize this formalism to quantum
field theory, and we will do so next lecture. We can also generalize statistical mechanics to
finite temperature field theory.
Driven harmonic oscillator
Let us now return to our basic example for everything, the harmonic oscillator. The
Euclidean partition function is

Z_E[J] = ∫ Dq exp{ −(1/2) ∫ dt [ (dq/dt)² + ω²q² ] + ∫ dt J(t)q(t) }   (8.31)

We saw that in Minkowski space the partial integration of the kinetic term can introduce problematic boundary terms related to the boundary conditions. But now, in Euclidean space, we have only closed paths, so there are no boundary terms!
Therefore we can write as before,

Z_E[J] = ∫ Dq exp{ −(1/2) ∫ dt q(t) [ −d²/dt² + ω² ] q(t) + ∫ dt J(t)q(t) }
= N exp{ (1/2) ∫ ds ∫ ds′ J(s) ∆_E(s, s′) J(s′) }   (8.32)

where the Euclidean propagator is defined by −S_E = −(1/2) q ∆_E⁻¹ q + ..., giving

∆_E(s, s′) = ( −d²/ds² + ω² )⁻¹(s, s′) = ∫ dE_E/(2π) e^{−iE_E(s−s′)} / (E_E² + ω²)   (8.33)

Note first that the gaussian integration above is now well defined, as we explained, since we don't have oscillatory terms anymore, but rather the usual integral of a real decaying exponential, ∫ dx e^{−αx²}. Second, the Euclidean propagator is well defined, since in the integral we don't have any singularities anymore, as E_E² + ω² > 0, so we don't need to choose a particular integration contour avoiding them, as in the case of the Minkowski space propagator.
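Since the Euclidean integrand has no singularities, (8.33) can be evaluated by brute force; a numerical sketch (grid values are assumptions) reproduces the known Fourier transform of the Lorentzian, e^{−ω|s−s′|}/(2ω):

```python
import numpy as np

# Directly integrate int dE/(2pi) e^{-iEs}/(E^2 + omega^2); no contour
# choice is needed because the denominator never vanishes.
omega, s, dE = 1.7, 0.9, 0.01
E = np.arange(-2000.0, 2000.0, dE)
vals = np.exp(-1j * E * s) / (E**2 + omega**2) / (2 * np.pi)
delta_E = np.sum(vals) * dE
print(delta_E.real, np.exp(-omega * abs(s)) / (2 * omega))
```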

Let us summarize what we did: we Wick rotated the Minkowski space theory to Euclidean
space in order to better define it. In Euclidean space we have no ambiguities anymore: the
path integral is well defined, Gaussian integration and the propagators are also well defined.
Then, in order to relate to physical quantities, we need to Wick rotate the calculations in
Euclidean space back to Minkowski space.
Let’s see that for the propagator. We have t = −is, where s is Euclidean time. But then
we want to have the object Et be Wick rotation invariant, Et = EE s, so EE = −iE.
The Euclidean space propagator is well defined, but of course when we do the above
Wick rotation to Minkowski space, we find the usual poles, since the Euclidean propagator
has imaginary poles EE = ±iω, corresponding to poles at E = ±ω. Therefore we can’t do
the full Wick rotation, corresponding to a rotation of the integration contour with π/2 in
the complex energy plane, since then we cross (or rather touch) the poles. To avoid that, we
must rotate the contour only by π/2 − ϵ,

E_E → e^{−i(π/2−ϵ)} E = −i(E + iϵ′)   (8.34)

Doing this Wick rotation, we obtain

∆_E(s = it) = −i ∫_{−∞}^{+∞} dE/(2π) e^{−iEt} / (−E² + ω² − iϵ) = ∆_F(t)   (8.35)
that is, the Feynman propagator. We see that the π/2 − ϵ Wick rotation corresponds to
the Feynman prescription for the integration contour to avoid the poles. It comes naturally
from the Euclidean space construction (any other integration contour could not be obtained
from any smooth deformation of the Euclidean contour), one more reason to consider the
Feynman propagator as the relevant object for path integrals.
In conclusion, let us mention that the purist view of quantum field theory is that the Minkowski space theory is not well defined, and that to define it well, we must work in Euclidean space and then analytically continue back to Minkowski space.
Also, we saw that we can have a statistical mechanics interpretation of the path integral,
which makes the formalism for quantum field theory and statistical mechanics the same. A
number of books take this seriously, and treat the two subjects together, but we will continue
with just quantum field theory.

Important concepts to remember

• Relating the harmonic phase space path integral with the configuration space path
integral, we obtain the boundary conditions that we have a pure creation part at
t = −∞ and a pure annihilation part at t = +∞, compatible with the configuration
space boundary condition q(t = −∞) ∼ eiωt , q(t = +∞) ∼ e−iωt . This defines
rigorously the path integral.
• The Feynman-Kac formula relates the partition function in statistical mechanics, Z(β) =
Tr{e−β Ĥ }, with the Euclidean space path integral over all closed paths in the Euclidean
time, with length β.

• The correlators in imaginary time (like the imaginary time propagator) are the ana-
lytical continuation of Minkowski space correlators.

• The free Euclidean propagator at periodicity β involves the Bose-Einstein distribution.

• The vacuum functional, or partition function Z[J], is given by the Euclidean path
integral of infinite periodicity.

• For the harmonic oscillator in Euclidean space, we can rigorously do the gaussian
integration to obtain the exact solution for Z[J], in terms of a well-defined Euclidean
propagator.

• The analytical continuation (Wick rotation) of the Euclidean propagator, by smoothly


deforming the integration contour to avoid the poles in the complex plane, uniquely
selects the Feynman propagator.

Further reading: See chapters 1.5 in [4] and 3.7 in [3].

Exercises, Lecture 8

1) Complete the omitted steps in the proof for going from


F(α*, t′; α, t) = ∫ DαDα* exp{ i ∫_t^{t′} dτ [ (α̇*(τ)/i) α(τ) − H ] + α*(t)α(t) }   (8.36)

to

Z[J] ≡ < 0, +∞|0, −∞ >_J = exp{ −(1/2) J · ∆_F · J } < 0|0 >_0   (8.37)

2) Prove that
[ −d²/dτ² + ω² ] K(τ, τ′) = δ(τ − τ′),   (8.38)

where K(τ, τ′) = ∆_free(τ − τ′), and ∆(τ − β) = ∆(τ), has a unique solution: if τ ∈ [0, β], the solution is

∆_free(τ) = (1/(2ω)) [ (1 + n(ω)) e^{−ωτ} + n(ω) e^{ωτ} ]   (8.39)

where

n(ω) = 1/(e^{β|ω|} − 1)   (8.40)

9 Lecture 9. The Feynman path integral for a scalar
field
In this lecture we generalize what we have learned from quantum mechanics to the case of
quantum field theory.
We have seen that in order to better define the theory, we must do a Wick rotation
to Euclidean space, t = −itE . If we choose closed (periodic) paths, q(tE + β) = q(β) in
Euclidean time, we obtain∫ the statistical mechanics partition function, Z(β) = Tr{e−β Ĥ },
given as a path integral Dqe−SE [q] |q(tE +β)=q(tE ) (Feynman-Kac formula). The Euclidean
action is positive definite if the Minkowski space Hamiltonian was. To obtain the vacuum
functional (transition between vacuum states), we take β → ∞. In Euclidean space we
obtain well defined path integrals; for instance, the driven harmonic oscillator can now be
easily solved. Partial integration has no problems, since periodic paths have no boundary,
and we obtain Z_E[J] = N e^{(1/2) J·∆_E·J}, where the Euclidean propagator is well defined (has
no poles). Wick rotation however will give the same problem: a full Wick rotation with
π/2 rotation of the integration contour will touch the poles, so we must rotate only with
π/2 − ϵ, obtaining the Feynman propagator. No other propagator can arise from a smooth
deformation of the integration contour.
To generalize to field theory, as usual we think of replacing qi (t) → ϕ⃗x (t) ≡ ϕ(⃗x, t) ≡ ϕ(x).
Of course, there are issues with the correct regularization involved in this (some type of
discretization of space, we have discussed that a bit already), but we will ignore this, since
we only want to see how to translate quantum mechanics results into quantum field theory
results.
In Minkowski space, the action is
S[ϕ] = ∫ d⁴x [ −(1/2) ∂_μϕ ∂^μϕ − (1/2) m²ϕ² − V(ϕ) ]   (9.1)
and the Minkowski space n-point functions, or Green's functions, are

G_n(x₁, ..., x_n) = < 0| T{ϕ̂(x₁)...ϕ̂(x_n)} |0 > = ∫ Dϕ e^{iS[ϕ]} ϕ(x₁)...ϕ(x_n)   (9.2)

To define the theory better, we go to Euclidean space, where we take only periodic paths
with infinite period, obtaining the vacuum functional. The Euclidean action is
S_E[ϕ] = ∫ d⁴x [ (1/2) ∂_μϕ ∂_μϕ + (1/2) m²ϕ² + V(ϕ) ]   (9.3)
where, since we are in Euclidean space, aµ bµ = aµ bµ = aµ bν δ µν , and time is defined as
tM ≡ x0 = −x0 = −itE , tE = x4 = x4 , and so x4 = ix0 .
The Euclidean space Green’s functions are

G_n^{(E)}(x₁, ..., x_n) = ∫ Dϕ e^{−S_E[ϕ]} ϕ(x₁)...ϕ(x_n)   (9.4)

We can write their generating functional, the partition function, so called because of
the connection with statistical mechanics which stays the same: at finite periodicity β in
Euclidean time,

Z[β, J] = Tr{e^{−βĤ_J}} = ∫ Dϕ e^{−S_E[ϕ]+J·ϕ} |_{ϕ(⃗x, t_E+β)=ϕ(⃗x, t_E)},   (9.5)

so in the vacuum,

Z[J] = ∫ Dϕ e^{−S_E[ϕ]+J·ϕ} ≡ _J< 0|0 >_J   (9.6)

where in d dimensions

J · ϕ ≡ ∫ d^d x J(x) ϕ(x)   (9.7)

So the Green's functions are obtained from derivatives as usual,

G_n(x₁, ..., x_n) = (δ/δJ(x₁)) ... (δ/δJ(x_n)) ∫ Dϕ e^{−S_E+J·ϕ} |_{J=0}   (9.8)
Note however the absence of the factors of i in the denominators, since we now have J · ϕ
instead of iJ · ϕ. The partition function sums the Green’s functions as usual,
Z[J] = Σ_{n≥0} (1/n!) ∫ Π_{i=1}^n d^d x_i G_n(x₁, ..., x_n) J(x₁)...J(x_n)   (9.9)

Perturbation theory
We now move to analyzing the perturbation theory, i.e. when S[ϕ] = S0 [ϕ] + SI [ϕ],
where S0 is a free (quadratic) part and SI is an interaction piece. As before, we will try
to perturb in the interaction piece. We can generalize the harmonic oscillator case and
calculate transition amplitudes between states. But even though in physical situations we
need to calculate transitions between physical states, which are usually wave functions over
particle states of given momentum, it is much easier from a theoretical point of view to
calculate the Green’s functions (transitions between vacuum states).
We will see however that the two are related. Namely, we can consider the Green’s func-
tions in momentum space, and the S-matrix elements, of the general type Sf i =< f |S|i >,
where S = UI (+∞, −∞). Then the S-matrix elements will be related in a more precise man-
ner later with the residues at all the mass-shell poles (p2i = −m2i for external lines) of the
momentum space Green’s functions. The proof of this formula (which will be just written
down later in the course), called the LSZ formula, will be given in QFT II, since the proof
requires the use of renormalization.
The momentum space Green’s functions are

G̃_n(p₁, ..., p_n) = ∫ d^d x₁ ... d^d x_n e^{i(p₁x₁+...+p_n x_n)} G_n(x₁, ..., x_n)   (9.10)

But due to translational invariance (any good theory must be invariant under translations,
xi → xi + a),
Gn (x1 , ..., xn ) = Gn (x1 − X, x2 − X, ..., xn − X) (9.11)

Choosing X = x1 and changing integration variables xi → xi + X, for i = 2, ..., n, we get
G̃_n(p₁, ..., p_n) = [ ∫ d^d x₁ e^{ix₁(p₁+...+p_n)} ] ∫ d^d x₂ ... d^d x_n e^{i(x₂p₂+...+x_n p_n)} G_n(0, x₂, ..., x_n)
= (2π)^d δ^d(p₁ + ... + p_n) G_n(p₁, ..., p_n)   (9.12)

and we usually calculate Gn (p1 , ..., pn ), where these external momenta already satisfy mo-
mentum conservation.
As we said, it will be much easier theoretically to work with Green’s functions rather
than S-matrices, and the two are related.
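A discrete analogue of how translation invariance produces the overall momentum-conservation delta function (an illustration added here, with an assumed example function): on a periodic lattice, a two-point function of the form G(x₁, x₂) = f(x₁ − x₂) has a double Fourier transform supported only where p₁ + p₂ = 0 (mod 2π).

```python
import numpy as np

# G(x1, x2) = f(x1 - x2) on a periodic lattice of N sites; its 2d DFT
# is nonzero only on the "momentum conservation" diagonal p1 + p2 = 0.
N = 32
x = np.arange(N)
f = np.exp(-0.3 * np.minimum(x, N - x))          # assumed periodic function
G = f[(x[:, None] - x[None, :]) % N]             # translation invariant

Gt = np.fft.fft2(G)                              # momenta in 2*pi*k/N units
p1, p2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
on_shell = (p1 + p2) % N == 0
off = np.max(np.abs(Gt[~on_shell]))
on = np.max(np.abs(Gt[on_shell]))
print("max |G~| off momentum conservation:", off, " on:", on)
```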
Dyson’s formula
This is a formula that was initially derived in the operator formalism, but in the path
integral formalism it is kind of trivial. It is useful in order to define the perturbation theory.
Working in the Euclidean case, we define |0 > as the vacuum of the free theory, and
we consider VEVs in the free theory of some operators, which as usual are written as path
integrals,

< 0| O[{ϕ̂}] |0 > = ∫ Dϕ e^{−S₀[ϕ]} O[{ϕ}]   (9.13)

Now, considering the particular case of O = e−SI [ϕ] , we obtain



∫ Dϕ e^{−S₀[ϕ]−S_I[ϕ]} = < 0| e^{−S_I[ϕ̂]} |0 >   (9.14)

If the operator O contains also a product of fields, we obtain the Green’s functions, i.e.

G_n(x₁, ..., x_n) = < 0| ϕ̂(x₁)...ϕ̂(x_n) e^{−S_I[ϕ̂]} |0 > = ∫ Dϕ e^{−S₀[ϕ]} ϕ(x₁)...ϕ(x_n) e^{−S_I[ϕ]}   (9.15)

We note here again that in Euclidean space, there is no distinction between time and space,
so we cannot define an ”euclidean time ordering”, so in a certain sense we have the usual
product. Just when we analytically continue (Wick rotate) from Minkowski space we can
define the time ordering implicitly (via the Minkowski time ordering).
Finally, for the generating functional of the Green’s functions, the partition function, we
can similarly write, by summing the Green’s functions,
Z[J] = < 0|e^{−S_I[ϕ̂]} e^{∫ d^d x J(x)ϕ̂(x)}|0 > = ∫ Dϕ e^{−S_0[ϕ]+J·ϕ} e^{−S_I[ϕ]}   (9.16)
which is called Dyson’s formula, and as we can see, it is rather trivial in this path integral
formulation, but for S-matrices in the operator formalism in Minkowski space, where it was
originally derived, it is less so.
Solution of the free field theory
We now solve for the free field theory, S_I[ϕ] = 0. We have
Z_0[J] = ∫ Dϕ e^{−S_0[ϕ]+J·ϕ} = ∫ Dϕ exp{ −(1/2) ∫ d^d x [∂_μϕ ∂_μϕ + m²ϕ²] + J·ϕ }   (9.17)
In Euclidean space we have no boundary terms, so we can partially integrate the kinetic
term to obtain
Z_0[J] = ∫ Dϕ exp{ −(1/2) ∫ d^d x ϕ[−∂_μ∂_μ + m²]ϕ + J·ϕ },   (9.18)
where [−∂_μ∂_μ + m²] ≡ ∆^{−1}, and we can then shift the integration variable by first writing
Z_0[J] = ∫ Dϕ exp{ −(1/2)[ϕ − J·∆] ∆^{−1} [ϕ − ∆·J] + (1/2) J·∆·J }
      = e^{(1/2) J·∆·J} ∫ Dϕ' e^{−(1/2) ϕ'·∆^{−1}·ϕ'}   (9.19)
and now the path integral that remains is just a normalization constant, giving finally
Z_0[J] = e^{(1/2) J·∆·J} < 0|0 >_0   (9.20)
and we can set Z_0[0] = < 0|0 >_0 to 1. Note that here the propagator comes from ∆^{−1} = −∂_μ∂_μ + m² and is given by
∆(x, y) = ∫ d^d p/(2π)^d · e^{ip·(x−y)}/(p² + m²)   (9.21)
and now it is nonsingular, since in the denominator p² + m² > 0. But again, analytical continuation back to Minkowski space (Wick rotating back) can lead to the usual poles at |p| = ±im, so we need to rotate the integration contour in the complex p⁰ plane by π/2 − ϵ instead of π/2. Then we obtain the Feynman propagator, since p_E² + m² → p_M² + m² − iϵ. Note that here p_E² = p_μ^E p_ν^E δ^{μν}, but p_M² = p_μ^M p_ν^M η^{μν} = −(p⁰)² + p⃗².
rotation (smooth deformation of the contour),
∆(s = it, x⃗) = ∆_F(t, x⃗)   (9.22)
is uniquely chosen.
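As a quick numerical sanity check of the propagator formula (an illustration of ours, not part of the lectures), in the simplest case d = 1 the momentum integral has the closed form ∆(x) = e^{−m|x|}/(2m), and the p-integral is indeed nonsingular for m > 0. The code assumes numpy and scipy are available; the function names are ours:

```python
import numpy as np
from scipy.integrate import quad

def propagator_numeric(x, m):
    """Delta(x) = int dp/(2 pi) e^{i p x}/(p^2 + m^2) in d = 1; the sine part is odd and drops."""
    # quad with weight='cos' uses a dedicated Fourier-integral routine on [0, inf)
    val, _ = quad(lambda p: 1.0 / (p**2 + m**2), 0, np.inf, weight='cos', wvar=x)
    return val / np.pi   # even integrand: 2 * val / (2 pi)

def propagator_exact(x, m):
    """Closed form in d = 1: e^{-m|x|}/(2m), nonsingular for m > 0."""
    return np.exp(-m * abs(x)) / (2 * m)

for x in (0.5, 1.0, 2.0):
    assert abs(propagator_numeric(x, 1.0) - propagator_exact(x, 1.0)) < 1e-8
```

In Minkowski signature the same integrand would hit the poles, which is exactly why the iϵ contour rotation above is needed.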
Wick’s theorem
In the path integral formalism, Wick’s theorem is both simple to state, and easy to prove,
though it will take us some examples (next class) to figure out why it is the same one as in
the operator formalism.
Let’s start out with an example. Consider, e.g., the function F[{ϕ}] = ϕ²(x_1)ϕ(x_2)ϕ⁴(x_3). Then its VEV is a path integral, as we saw, which we can write as
< 0|F[{ϕ}]|0 > = ∫ Dϕ e^{−S_0[ϕ]} ϕ²(x_1)ϕ(x_2)ϕ⁴(x_3)
             = (δ/δJ(x_1))² (δ/δJ(x_2)) (δ/δJ(x_3))⁴ ∫ Dϕ e^{−S_0+J·ϕ}|_{J=0}   (9.23)
or (as we can easily see) in general, for an arbitrary function F, and in the presence of an arbitrary source J,
< 0|F[{ϕ}]|0 >_J = F[{δ/δJ}] Z_0[J]|_J   (9.24)
In particular, we can use Dyson’s formula, i.e. apply it for
F[{ϕ}] = e^{−∫ d^d x V(ϕ(x))}   (9.25)
to obtain
Z[J] = e^{−∫ d^d x V(δ/δJ(x))} Z_0[J] = e^{−∫ d^d x V(δ/δJ(x))} e^{(1/2) J·∆·J}   (9.26)
which is the form of Wick’s theorem in the path integral formalism.
This formula seems very powerful. Indeed, we seem to have solved completely the in-
teracting theory, as we have a closed formula for the partition function, from which we can
derive everything.
Of course, we can’t get something for free: if it was tricky in the operator formalism, it must be tricky here as well. The point is that this formal expression is in general not well
defined: there are divergences (infinite integrals that will appear), and apparent singularities
(like what happens for instance when several derivatives at the same point act at the same
time). The above formula then has to be understood as a perturbative expansion: we expand
the exponential, and then calculate the derivatives term by term. We will do some examples
first in the next class, then formulate Feynman rules, which will be the same ones as we have
found in the operator formalism.
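To see concretely that the formal expression is just bookkeeping for the perturbative expansion, one can test it in a 0-dimensional toy model, where ∫Dϕ becomes an ordinary integral and ∆ = 1, so Z_0[J] = e^{J²/2}. The following sympy sketch (ours, with the measure normalized so that Z_0[0] = 1) checks the formula at order λ for V = λϕ⁴/4!:

```python
import sympy as sp

J, phi, lam = sp.symbols('J phi lam')

# Free 0d partition function (propagator Delta = 1): Z0[J] = e^{J^2/2}
Z0 = sp.exp(J**2 / 2)

# Wick's theorem expanded to O(lambda): Z[J] ~ (1 - (lam/4!) d^4/dJ^4) Z0
Z_wick = Z0 - (lam / sp.factorial(4)) * sp.diff(Z0, J, 4)

# Direct computation from explicit Gaussian moments <phi^n> = (n-1)!!, term by term in J
def moment(n):
    val = sp.integrate(phi**n * sp.exp(-phi**2 / 2), (phi, -sp.oo, sp.oo))
    return val / (sp.sqrt(2) * sp.sqrt(sp.pi))

order = 6   # compare Taylor coefficients in J up to J^6
Z_direct = sum(J**n / sp.factorial(n) * (moment(n) - lam / sp.factorial(4) * moment(n + 4))
               for n in range(order + 1))

mismatch = sp.series(Z_wick, J, 0, order + 1).removeO() - Z_direct
assert sp.simplify(mismatch) == 0
```

The same comparison order by order in λ and J is exactly what the expansion of the exponential of derivatives produces.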

Important concepts to remember

• The generalization from quantum mechanics to quantum field theory is straightforward: the vacuum functional is the path integral in Euclidean space over periodic paths of infinite period, with a positive definite Euclidean action.

• Green’s functions are obtained from a partition function with sources.

• In perturbation theory, we will study Green’s functions, because they are simpler, but
one can relate them to the S-matrix elements (real scattering amplitudes) via the LSZ
formula, whose precise form will be given later.

• For momentum space Green’s functions, one usually factorizes the overall momentum
conservation, and talks about the Green’s functions with momenta that already satisfy
momentum conservation.

• Dyson’s formula relates < 0|e^{−S_I[ϕ̂]} e^{J·ϕ̂}|0 > to the partition function (path integral over the full action, with sources).
• In free field theory, Z_0[J] = e^{(1/2) J·∆·J}.

• Wick’s theorem gives the partition function of the interacting theory as the formal expression Z[J] = e^{−∫ V[δ/δJ]} Z_0[J].

Further reading: See chapters 2.1,2.2 in [4] and 3.1,3.2 in [3].

Exercises, Lecture 9

1) Using the formulas given in this class, compute the 4-point function in momentum
space, in free field theory.

2) If SI = λϕ4 /4!, write down a path integral expression for the 4-point function
G4 (x1 , ..., x4 ) up to (including) order λ2 .

10 Lecture 10. Wick theorem for path integrals and
Feynman rules part I
In the previous class we have seen the Euclidean formulation of scalar field theories. We
have written Dyson’s formula for the partition function,
Z[J] = < 0|e^{−S_I[ϕ̂]} e^{∫ d^d x J(x)ϕ̂(x)}|0 > = ∫ Dϕ e^{−S_0[ϕ]+J·ϕ} e^{−S_I[ϕ]}   (10.1)
calculated the partition function of the free theory,
Z_0[J] = e^{(1/2) J·∆·J} < 0|0 > = e^{(1/2) J·∆·J}   (10.2)
where the Euclidean propagator is
∆(x, y) = ∫ d^d p/(2π)^d · e^{ip·(x−y)}/(p² + m²)   (10.3)
And finally we wrote Wick’s theorem for path integrals, which is
Z[J] = e^{−∫ d^d x V(δ/δJ(x))} Z_0[J] = e^{−∫ d^d x V(δ/δJ(x))} e^{(1/2) J·∆·J}   (10.4)
It is not completely obvious why this is the same Wick’s theorem from the operator formal-
ism, so we will see this by doing explicit examples.
Let’s consider the theory at zeroth order in the coupling constant, i.e. the free theory,
Z0 [J].
We will denote the Green’s functions by G_n^{(p)}(x_1, ..., x_n) for the n-point function at order p in the coupling.
We start with the one-point function. At nonzero J,
G_1^{(0)}(x_1)_J = δ/δJ(x_1) e^{(1/2) J·∆·J} = ∆·J(x_1) e^{(1/2) J·∆·J}   (10.5)

where ∆·J(x_1) ≡ ∫ d^d x ∆(x_1, x)J(x). Putting J = 0, the one-point function is zero,
G_1^{(0)}(x_1) = 0   (10.6)

We can easily see that this generalizes to all the odd n-point functions: since we have even numbers of J’s in Z_0[J], by taking an odd number of derivatives and then putting J = 0 we get zero,
G_{2k+1}^{(0)}(x_1, ..., x_{2k+1}) = 0   (10.7)
Next is the 2-point function,
G_2^{(0)}(x_1, x_2)_J = δ/δJ(x_1) δ/δJ(x_2) e^{(1/2) J·∆·J} = δ/δJ(x_1) [∆·J(x_2) e^{(1/2) J·∆·J}]
                = [∆(x_1, x_2) + (∆·J(x_2))(∆·J(x_1))] e^{(1/2) J·∆·J}   (10.8)
and by putting J = 0 we find
G_2^{(0)}(x_1, x_2) = ∆(x_1, x_2)   (10.9)

The corresponding Feynman diagram is obtained by drawing a line connecting the external
points x1 and x2 .
For the next nontrivial case, the 4-point function is
G_4^{(0)}(x_1, x_2, x_3, x_4) = ∏_{i=1}^{4} δ/δJ(x_i) e^{(1/2) J·∆·J}|_{J=0}
  = ∆(x_1, x_2)∆(x_3, x_4) + ∆(x_1, x_3)∆(x_2, x_4) + ∆(x_1, x_4)∆(x_2, x_3)   (10.10)

which can be represented as the sum of Feynman diagrams with lines connecting (12) and
(34); (13) and (24); (14) and (23), as in Fig.22. The (simple) details are left as an exercise.


Figure 22: Free contractions for the 4-point function.


We then get the general rule for computing G_n^{(0)}(x_1, ..., x_n): write all the Feynman diagrams by connecting pairwise all external points in all possible ways.
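This rule is just the combinatorics of perfect matchings. A minimal enumeration (our own illustration, not part of the lectures) reproduces the three pairings of Fig.22 for n = 4 and the (2k − 1)!! count for general n = 2k:

```python
from math import prod

def pairings(points):
    """All ways of connecting the given points pairwise (the free-theory contractions)."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        for sub in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + sub

# n = 4 reproduces the three diagrams (12)(34), (13)(24), (14)(23) of Fig.22
assert list(pairings(['x1', 'x2', 'x3', 'x4'])) == [
    [('x1', 'x2'), ('x3', 'x4')],
    [('x1', 'x3'), ('x2', 'x4')],
    [('x1', 'x4'), ('x2', 'x3')],
]

# in general there are (2k-1)!! = 1*3*...*(2k-1) pairings: 3, 15, 105 for n = 4, 6, 8
for k in (2, 3, 4):
    n_pairings = len(list(pairings([f'x{i}' for i in range(2 * k)])))
    assert n_pairings == prod(range(1, 2 * k, 2))
```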
We now move to the first nontrivial example of interaction. Consider the theory with
V (ϕ) = λϕ3 /3!. Of course, such a theory is not so good: the Hamiltonian is unbounded from
below (the energy becomes arbitrarily negative for large enough negative ϕ), so the system
is unstable. But we just want to use this as a simple example of how to calculate Feynman
diagrams.
Consider the theory at first order in λ, i.e. we replace
e^{−∫ d^d x (λ/3!) (δ/δJ(x))³} → −∫ d^d x (λ/3!) (δ/δJ(x))³   (10.11)

Then we have
Z^{(1)}[J] = −∫ d^d x (λ/3!) (δ/δJ(x))³ e^{(1/2) J·∆·J} = −∫ d^d x (λ/3!) (δ/δJ(x))² [∆·J(x) e^{(1/2) J·∆·J}]
  = −∫ d^d x (λ/3!) (δ/δJ(x)) [∆(x, x) + (∆·J(x))²] e^{(1/2) J·∆·J}
  = −∫ d^d x (λ/3!) [3∆(x, x)(∆·J)(x) + (∆·J(x))³] e^{(1/2) J·∆·J}   (10.12)
From this expression, which contains only odd powers of J, we see that the even n-point functions at order 1 are zero,
G_{2k}^{(1)}(x_1, ..., x_{2k}) = 0   (10.13)
The one-point function is
G_1^{(1)}(x_1) = δ/δJ(x_1) Z^{(1)}[J]|_{J=0} = −(λ/2) ∫ d^d x ∆(x, x)∆(x, x_1)   (10.14)

which diverges due to ∆(x, x), and has as Feynman diagram a propagator from the external
point x1 to the point x to be integrated over, followed by a loop starting and ending at x,
as in Fig.23.


Figure 23: One point function tadpole diagram.

Next, we calculate the 3-point function,
G_3^{(1)}(x_1, x_2, x_3) = δ/δJ(x_1) δ/δJ(x_2) δ/δJ(x_3) Z^{(1)}[J]|_{J=0}
  = −λ ∫ d^d x {∆(x, x_1)∆(x, x_2)∆(x, x_3) + (1/2) ∆(x, x)[∆(x, x_1)∆(x_2, x_3)
    + ∆(x, x_2)∆(x_1, x_3) + ∆(x, x_3)∆(x_1, x_2)]}   (10.15)

which can be represented as 4 Feynman diagrams: one with a 3-vertex connected to the 3 external points; one with x_2 connected to x_3 and a G_1^{(1)}(x_1) contribution, i.e. a line with a loop at the end connected to x_1; and the 2 permutations of it, as in Fig.24.


Figure 24: Diagrams for the 3-point function at order λ.

We now exemplify the diagrams at second order with the 0-point function (vacuum bubble) at order O(λ²), i.e. Z^{(2)}[J = 0]. We have
Z^{(2)}[J] = (1/2!) ∫ d^d x (λ/3!) (δ/δJ(x))³ ∫ d^d y (λ/3!) (δ/δJ(y))³ e^{(1/2) J·∆·J}   (10.16)
But when we put J = 0, only terms with 6 J’s contribute, so we have
Z^{(2)}[J = 0] = (1/2!) ∫ d^d x (λ/3!) (δ/δJ(x))³ ∫ d^d y (λ/3!) (δ/δJ(y))³ (1/3!) (1/2)³ ∫ d^d z_1 d^d z_1' d^d z_2 d^d z_2' d^d z_3 d^d z_3'
  × J(z_1)∆(z_1, z_1')J(z_1') J(z_2)∆(z_2, z_2')J(z_2') J(z_3)∆(z_3, z_3')J(z_3')|_{J=0}   (10.17)
When we do the derivatives, we obtain a result that is the sum of two Feynman diagrams,
Z^{(2)}[J = 0] = (λ²/2³) ∫ d^d x d^d y ∆(x, x)∆(x, y)∆(y, y) + (λ²/(2·3!)) ∫ d^d x d^d y ∆³(x, y)   (10.18)
the two Feynman diagrams being one with two loops (circles) connected by a line, and one
with 3 propagators connecting the same two points (integrated over), x and y, as in Fig.25.
The details (which derivatives give which diagram) are left as an exercise.

Figure 25: Diagrams for the vacuum bubble (zero point function) at order λ2 .
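Assuming the coefficients λ²/2³ and λ²/(2·3!) in (10.18), a 0-dimensional check is easy (our own sympy illustration, with ∫Dϕ → ∫dϕ and ∆ = 1): there the two diagrams are just numbers, and their sum must equal the O(λ²) term of the ordinary integral of e^{−λϕ³/3!}:

```python
import sympy as sp

phi, lam = sp.symbols('phi lam')

# Direct: the O(lambda^2) term of Z[0] = <e^{-lam phi^3/3!}> in the 0d Gaussian measure,
# i.e. <(lam phi^3/3!)^2>/2!
weight = sp.exp(-phi**2 / 2) / (sp.sqrt(2) * sp.sqrt(sp.pi))
direct = sp.integrate(weight * (lam * phi**3 / 6)**2 / sp.factorial(2), (phi, -sp.oo, sp.oo))

# Diagrams of eq. (10.18) with Delta = 1: dumbbell lam^2/2^3 plus setting-sun lam^2/(2*3!)
diagrams = lam**2 / 2**3 + lam**2 / (2 * sp.factorial(3))

assert sp.simplify(direct - diagrams) == 0   # both equal 5 lam^2/24
```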

Wick’s theorem: second form
We now formulate a second form of the Wick theorem,
Z[J] = e^{(1/2) (δ/δϕ)·∆·(δ/δϕ)} { e^{−∫ d^d x V(ϕ)+J·ϕ} }|_{ϕ=0}
    = exp[ (1/2) ∫ d^d x d^d y ∆(x − y) δ/δϕ(x) δ/δϕ(y) ] { e^{−∫ d^d x V(ϕ)+J·ϕ} }|_{ϕ=0}   (10.19)
To prove it, we use the previous form of the Wick theorem, and the following
Lemma (Coleman)
Consider two functions of multi-variables,
F(x) = F(x_1, ..., x_n);  G(y) = G(y_1, ..., y_n)   (10.20)
Then we have
F(∂/∂x) G(x) = { G(∂/∂y) F(y) e^{x·y} }|_{y=0}   (10.21)

To prove it, because of the Fourier decomposition theorem, we only have to prove the lemma for
F(x) = e^{a·x};  G(y) = e^{b·y}   (10.22)
Let’s do that, first on the left hand side of the lemma,
F(∂/∂x) G(x) = e^{a·∂/∂x} e^{b·x} = Σ_m (1/m!) (a·∂/∂x)^m e^{b·x} = Σ_m (1/m!) (a·b)^m e^{b·x} = e^{a·b} e^{b·x} = e^{b·(a+x)}   (10.23)
and then on the right hand side of the lemma,
G(∂/∂y) { F(y) e^{x·y} }|_{y=0} = e^{b·∂/∂y} { e^{a·y} e^{x·y} }|_{y=0}   (10.24)

Using the result for the lhs of the lemma, we have
G(∂/∂y) { F(y) e^{x·y} }|_{y=0} = e^{b·(a+x)} e^{(a+x)·y}|_{y=0} = e^{b·(a+x)}   (10.25)

which is the same as the left hand side of the lemma. Q.E.D.
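The lemma is easy to test symbolically for polynomial F and G (for which the Fourier argument applies term by term). A small sympy sketch of ours, with arbitrarily chosen test polynomials:

```python
import sympy as sp

x, y = sp.symbols('x y')

def act(op_poly, op_var, expr, diff_var):
    """Apply the differential operator op_poly(d/d diff_var) to expr,
    where op_poly is a polynomial in op_var."""
    out = sp.S(0)
    for (n,), c in sp.Poly(op_poly, op_var).terms():
        out += c * sp.diff(expr, diff_var, n)
    return out

F = 2 * x**2 + x + 1     # F(x), an arbitrary test polynomial
G = x**3 - 3 * x         # G(x)

# left side of (10.21): F(d/dx) G(x)
lhs = act(F, x, G, x)
# right side of (10.21): [ G(d/dy) ( F(y) e^{x y} ) ]_{y=0}
rhs = act(G.subs(x, y), y, F.subs(x, y) * sp.exp(x * y), y).subs(y, 0)

assert sp.simplify(lhs - rhs) == 0
```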
We can now apply this lemma to the previous form of Wick’s theorem, with x → J(x) and y → ϕ(x), generalizing from a discrete and finite number of variables to a continuum of variables. Then we obtain
Z[J] = e^{−∫ d^d x V[δ/δJ]} e^{(1/2) J·∆·J} = e^{(1/2) (δ/δϕ)·∆·(δ/δϕ)} { e^{−∫ d^d x V(ϕ)+J·ϕ} }|_{ϕ=0}   (10.26)

Q.E.D.
Feynman rules in x-space
Consider the general polynomial potential
V(ϕ) = λϕ^p   (10.27)
We write down the Feynman rules for the n-point functions at order λ^N. Note that for now we will count Feynman diagrams that give the same result as distinct. This is an algorithmic method, so when we can’t easily compute the symmetry factors of diagrams, we can use this construction. We will return to the usual Feynman rules in the next class.
The rules are: we write down the n external points, x1 , ..., xn , each with a line (leg)
sticking out, and then N vertices, i.e. internal (to be integrated over) points y1 , ..., yN , each
with p legs sticking out. We then connect all the lines sticking out in all possible ways by
propagators, thus constructing all the Feynman diagrams. Then
-For each p-legged vertex, we write a factor of −λ, see Fig.26. (Note the missing factor
of i with respect to Minkowski space. But now we are in Euclidean space.)
-For each propagator from z to w, we write a ∆(z − w).
-The resulting expression I_D(x_1, ..., x_n; y_1, ..., y_N) is integrated over the vertices to obtain the result for the Feynman diagram,
F_D^{(N)}(x_1, ..., x_n) = ∫ d^d y_1 ... d^d y_N I_D(x_1, ..., x_n; y_1, ..., y_N)   (10.28)


Figure 26: Feynman diagram for the p-vertex.

-Then the n-point function is given by
G_n(x_1, ..., x_n) = Σ_{N≥0} G_n^{(N)}(x_1, ..., x_n) = Σ_{N≥0} (1/N!) Σ_D F_D^{(N)}(x_1, ..., x_n)   (10.29)

As an example, we take the 2-point function at order λ¹ for a potential with p = 4. Then we draw the 2 points x_1, x_2, each with a line sticking out, and a vertex y with 4 legs sticking out, as in Fig.27a.


Figure 27: We can draw all the possible Feynman diagrams at order λ in the 2-point function by drawing the vertex and the external lines (a). Then we can connect them in all possible ways, obtaining diagrams b,c,d,e.

We can connect x_1 with x_2, and we have 3 ways of connecting the 4 legs of the vertex to each other, as in Figs.27b,c,d, leading to
F_{D_1} = F_{D_2} = F_{D_3} = −λ ∫ d^d y ∆(x_1 − x_2)∆(y − y)∆(y − y)   (10.30)

We can also connect each of x_1 and x_2 with one of the legs of the vertex, and the remaining two legs of the vertex to each other, as in Fig.27e. That gives 12 diagrams, since we have 6 ways of choosing 2 lines out of the 4, and 2 ways of choosing which of the two to connect with x_1. Thus we have
F_{D_4} = ... = F_{D_{15}} = −λ ∫ d^d y ∆(x_1 − y)∆(x_2 − y)∆(y − y)   (10.31)

Important concepts to remember

• In free field theory (zeroth order in perturbation), the odd n-point functions are zero, G_{2k+1}^{(0)}(x_1, ..., x_{2k+1}) = 0.

• For G_{2k}^{(0)}(x_1, ..., x_{2k}), we write all the Feynman diagrams by connecting pairwise all external legs in all possible ways.

• We can write down a second form of the Wick theorem, using Coleman’s lemma, as Z[J] = e^{(1/2)(δ/δϕ)·∆·(δ/δϕ)} e^{−∫ V(ϕ)+J·ϕ}|_{ϕ=0}.

• The Feynman rules in x-space for G_n^{(N)} in λϕ^p theory, the long form, are: write n external points, each with a leg sticking out, and N vertices, each with p legs sticking out, then connect all the legs in all possible ways. For a vertex write −λ, for a propagator write ∆(x, y). Then integrate over the vertices, sum over diagrams (now many diagrams give the same result) and sum over N with 1/N!.

• This prescription is equivalent to the usual one of writing only inequivalent diagrams,
and writing a symmetry factor S.

Further reading: See chapters 2.3 and 4.1 in [4].

Exercises, Lecture 10

1) Prove that G_4^{(0)}(x_1, ..., x_4) is the sum of the three diagrams with connections of (12),(34); (13),(24); (14),(23) in Fig.22, and then generalize to write down G_6^{(0)}(x_1, ..., x_6).

2) Explain which derivatives give which Feynman diagram in the calculation of the 0-point function to O(λ²) in λϕ³ theory (two circles connected by a line; and two points connected by 3 propagators, as in Fig.25), and then write down the Feynman diagrams for G_4^{(2)}(x_1, ..., x_4) in λϕ⁴ theory.

11 Lecture 11. Feynman rules in x-space and p-space
Proof of the Feynman rules
Let’s now compute the Green’s functions,
G(x_1, ..., x_n) = δ^n Z[J]/δJ(x_1)...δJ(x_n)|_{J=0} = e^{(1/2)(δ/δϕ)·∆·(δ/δϕ)} {ϕ(x_1)...ϕ(x_n) e^{−∫ d^d x [V(ϕ(x))−J(x)ϕ(x)]}}|_{ϕ=0}|_{J=0}   (11.1)
Note that these Green’s functions correspond to < 0|T{ϕ(x_1)...ϕ(x_n) e^{−i ∫ dt H_int}}|0 > (in the free vacuum of the theory) in Minkowski space, and as a result they still contain vacuum bubbles. The vacuum bubbles will be seen later to factorize exactly as in the operator formalism, and then we will calculate the Green’s functions corresponding to < Ω|T{ϕ(x_1)...ϕ(x_n)}|Ω > from W = −ln Z[J].
We next note that in the above we can drop the J · ϕ term, since by taking all the
derivatives we will have only terms linear in J from this contribution, and putting J = 0
these terms vanish. The result is then

G(x_1, ..., x_n) = e^{(1/2)(δ/δϕ)·∆·(δ/δϕ)} {ϕ(x_1)...ϕ(x_n) e^{−∫ d^d x V(ϕ)}}|_{ϕ=0}   (11.2)

This means that at order N (i.e. O(λ^N)) we find
G^{(N)}(x_1, ..., x_n) = e^{(1/2)(δ/δϕ)·∆·(δ/δϕ)} {ϕ(x_1)...ϕ(x_n) ((−λ)^N/N!) ∫ d^d y_1 ... d^d y_N ϕ^p(y_1)...ϕ^p(y_N)}|_{ϕ=0}   (11.3)
Consider now that there are Q = n + pN ϕ’s, on which we act with the derivatives from the exponential. If we have more derivatives than fields, we obviously get zero; but we also get zero if we have fewer derivatives, because at the end we must put ϕ = 0, so if there are ϕ’s left over, we also get zero. Therefore we only get a nonzero result if Q = 2q (even), and then the result is
G^{(N)}(x_1, ..., x_n) = (1/(q! 2^q)) ∫ d^d z_1 d^d w_1 ... d^d z_q d^d w_q ×
  × δ/δϕ(z_1) ∆(z_1 − w_1) δ/δϕ(w_1) ... δ/δϕ(z_q) ∆(z_q − w_q) δ/δϕ(w_q) ×
  × {ϕ(x_1)...ϕ(x_n) ((−λ)^N/N!) ∫ d^d y_1 ... d^d y_N ϕ^p(y_1)...ϕ^p(y_N)}   (11.4)

Then, when acting with a factor of ∫ d^d z d^d w (δ/δϕ(z)) ∆(z − w) (δ/δϕ(w)) on some ϕ(x)ϕ(y), we obtain a Wick contraction, i.e. we replace the pair with ∆(x − y), but with a factor of 2 in front, since it comes from ∫ d^d z d^d w [δ(z − x)δ(w − y) + δ(z − y)δ(w − x)]∆(z − w). Since all the q factors above give the same factor of 2, in total we get a 2^q cancelling the 1/2^q. Also, we have q such factors with which we can act on the Q = 2q ϕ’s, and by permuting them we get the same contribution; the resulting factor of q! then cancels the 1/q! in front.
Statistical weight factor (symmetry factor)
Statistical weight factor (symmetry factor)

As explained before, from most diagrams we get many identical contributions, giving almost a p! factor for each vertex and an overall N!, so it is customary to redefine the coupling as λ = λ_p/p!, so as to (almost) cancel this factor. Then we have in front of the result for a Feynman diagram a factor of 1/S, where by definition the statistical weight factor, or symmetry factor, S is
S = N!(p!)^N / (# of equivalent diagrams)   (11.5)
Then we can construct only the topologically inequivalent diagrams, and associate to each a statistical weight factor (symmetry factor) S, corresponding to all possible symmetries of the diagram that leave it invariant. If there is no symmetry, then S = 1.
For tree diagrams, always S_tree = 1. For instance, we can write a Feynman diagram in ϕ⁴ theory like in Fig.28.

Figure 28: Example of a tree diagram. It has S = 1.

There are N! ways to label the vertices as y_1, ..., y_N. Many topologically equivalent diagrams, obtained by connecting the p = 4 legs on the vertices with other legs from vertices or from the external lines, can be thought of as just the above relabelling of the vertices. Then there are p! = 4! ways of attaching given propagators at the p = 4 legs of each vertex. In total, we cancel the 1/(N!(p!)^N) factor in front, giving S_tree = 1.
We can compute more symmetry factors. The figure-eight vacuum bubble diagram of Fig.29a has a symmetry factor of 8, which matches the counting of diagrams: one vertex with 4 legs has 3 ways of connecting them 2 by 2, giving 3 × 1/(1!(4!)¹) = 1/8.
The propagator from x_1 to x_2 with a loop at an intermediate point x, as in Fig.29b, has a symmetry factor of 2: from the vertex there are 4 ways to connect one leg to x_1, then 3 ways to connect another leg to x_2, and one way to connect the remaining 2 legs between themselves, for a total of 4·3/(1!(4!)¹) = 1/2.
The ”setting sun” diagram, propagating from x_1 to x_2, with 3 propagators between two intermediate points x and y, as in Fig.29c, has a symmetry factor of 3!, from permuting the 3 propagators between x and y. On the other hand, we draw 2 vertices with 4 legs each, and two external lines. For the first contraction there are 8 ways to connect x_1 to a vertex leg, then 4 ways to connect x_2 with a leg of the other vertex, and then 3! = 3 × 2 × 1 ways to connect the 3 internal propagators, for a total of 8·4·3!/(2!(4!)²) = 1/3!, as expected.
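The counting of equivalent contractions can also be done by brute force: enumerate all perfect matchings of the labelled legs and keep those with the setting sun topology. A sketch of ours (node and function names are ours):

```python
from math import factorial

def matchings(items):
    """All perfect matchings (Wick contractions) of a list of labelled legs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i in range(len(rest)):
        for sub in matchings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + sub

# setting sun in phi^4 theory: external legs x1, x2 and two 4-leg vertices y1, y2
legs = ['x1', 'x2'] + ['y1'] * 4 + ['y2'] * 4

def is_setting_sun(match):
    # exactly 3 propagators between the two vertices; this forces x1 and x2
    # onto different vertices, since an x1-x2 edge would leave a 4th y1-y2 edge
    edges = [tuple(sorted(e)) for e in match]
    return edges.count(('y1', 'y2')) == 3

count = sum(is_setting_sun(m) for m in matchings(legs))
assert count == 8 * 4 * factorial(3)           # 192 equivalent contractions

N, p = 2, 4
S = factorial(N) * factorial(p)**N // count    # statistical weight factor (11.5)
assert S == factorial(3)                       # S = 3!, as stated
```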


Figure 29: Computing the symmetry factors for 3 loop diagrams (a,b,c).

Feynman rules in p space


As we already explained, we can define the Fourier transform of the x-space Green’s
function, but by translational invariance we can always factor out a delta function, with the
remaining Green’s function, already satisfying momentum conservation, being the one we
usually calculate,
G̃(p_1, ..., p_n) = ∫ ∏_i d^d x_i e^{i Σ_j x_j·p_j} G(x_1, ..., x_n) = (2π)^d δ^d(p_1 + ... + p_n) G(p_1, ..., p_n)   (11.6)

where in G(p1 , ..., pn ) the momenta already satisfy momentum conservation.


Since the Euclidean propagator is
∆(x − y) = ∫ d^d p/(2π)^d · e^{ip·(x−y)}/(p² + m²)   (11.7)
in momentum space we have
∆(p) = 1/(p² + m²)   (11.8)
By convention, a momentum p going out of a point x means a factor of eipx and going in
means a factor of e−ipx , as in Fig.30.


Figure 30: Momentum convention for external lines.

We now write a version of the Feynman rules where momentum conservation is already included. For that, we need to know the number of loops, i.e. independent loops (circles) in our diagrams, or equivalently the number of integrations left over after we use the momentum conservation delta functions.
We denote by V the number of vertices, I the number of internal lines, E the number of external lines, and L the number of loops. There are I momentum variables (one for each internal line), but there are V constraints on them (momentum conservation delta functions). However, one of these is the overall conservation, which we put outside G to form G̃. So for the calculation of G, we have L = I − V + 1 loops (independent integrations). But we must express this in terms of quantities that are independent of the diagram, and depend only on having G^{(N)}(p_1, ..., p_n), so we must find another formula. If we cut all propagators (or equivalently, consider the diagram before we make the connections of the p legs on the vertices with other legs on them and on external lines), the legs on the vertices count as follows: for each external line, one of the two pieces of propagator stays on the external point, but one comes from a vertex; for internal lines, both pieces come from vertices. All in all, E + 2I = pV. Substituting this formula in the one for L above, we finally find
L = V(p/2 − 1) − E/2 + 1   (11.9)
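A trivial consistency check of the two expressions for L on the diagrams discussed so far (our own sketch):

```python
def loops(p, V, E):
    """Independent loop integrations of a phi^p diagram with V vertices, E external lines."""
    assert (p * V - E) % 2 == 0, "legs must pair up into propagators"
    I = (p * V - E) // 2                       # internal lines, from E + 2I = pV
    L = I - V + 1                              # I momenta minus (V - 1) delta functions
    assert L == V * (p / 2 - 1) - E / 2 + 1    # the closed formula (11.9)
    return L

assert loops(p=4, V=2, E=2) == 2   # the "setting sun" diagram
assert loops(p=4, V=1, E=0) == 2   # the figure-eight vacuum bubble
assert loops(p=3, V=2, E=4) == 0   # a tree diagram: L = 0
```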
So after using all the momentum conservation delta functions, we are left with L independent integrations, and the Feynman rules become:
-draw all the topologically inequivalent Feynman diagrams with n external lines and N vertices.
-label the external momenta p_1, ..., p_n with arrows for them, introduce l_1, ..., l_L independent loop momenta, integrated over with ∫ d^d l_1/(2π)^d ... d^d l_L/(2π)^d, and label the propagator lines by q_j, using the l_i and momentum conservation.
-for an external line (between two points), put a propagator 1/(p_i² + m²).
-for an internal line (between two points), put a propagator 1/(q_j² + m²).
-for a vertex, we have a factor of −λ_p.
-calculate the statistical weight factor (symmetry factor) S and multiply the diagram by 1/S.
-sum over Feynman diagrams.
Let’s understand the p-space rules better. Consider the x-space ”setting sun” diagram considered above, and Fourier transform it to p space. This gives G̃(p_1, p_2), with two external points, but without x labels, so it has a ∆(p_1) and a ∆(p_2) factor. The external line rule for a momentum p going out of an external point labelled by x, giving a factor of e^{ipx}, is just a wavefunction for a more physical case: the only way to introduce an x dependence into a p-space object is an explicit wavefunction for the external states of momenta p_i. If there is no label x, we have no wavefunction. And if there is no external point (what is known as an amputated diagram, to be discussed later in the course), we have no corresponding external propagator. The final thing to observe is that, in general, we are not interested in G̃(p_1, p_2), but in G(p_1, p_2) = G(p_1, −p_1) ≡ G(p_1), in which momentum conservation is already taken into account.
For instance, consider the free propagator, a line between the two external points. We
could in principle consider momentum p1 going in at one point, and p2 going in at the other.

But then
G̃(p_1, p_2) = ∫ d^d x_1 d^d x_2 e^{i(p_1·x_1+p_2·x_2)} G(x_1, x_2) = ∫ d^d x_1 d^d x_2 e^{i(p_1·x_1+p_2·x_2)} ∫ d^d p/(2π)^d · e^{ip·(x_1−x_2)}/(p² + m²)
  = ∫ d^d x_1 e^{ip_1·x_1} e^{ip_2·x_1}/(p_2² + m²) = (2π)^d δ^d(p_1 + p_2) · 1/(p_2² + m²) = (2π)^d δ^d(p_1 + p_2) G(p_2)   (11.10)
Most general bosonic field theory
We now deal with a general bosonic field theory. Consider fields ϕr (x), where r is some
general label, which could signify some Lorentz index, like on a vector field Aµ (x), or some
label for several scalar fields, or a combination of both.
Consider the kinetic term
S_0 = (1/2) ∫ d^d x Σ_{r,s} ϕ_r(x) ∆^{−1}_{r,s} ϕ_s(x)   (11.11)

where ∆−1 r,s is some differential operator. We invert this object, obtaining the propagator ∆r,s .
If it is not possible, it means that there are redundancies, and we must find the independent
fields. For instance, for a gauge field Aµ (x), there is a gauge invariance, δAµ = ∂µ λ(x), which
means that we must introduce a gauge fixing (and also introduce ”ghosts”, to be defined
later) in order to define independent fields. Then we can invert the propagator.
Then we can derive a generalized Wick’s theorem for a general interaction term
S_{r_1...r_p} = ∫ d^d x A_{r_1...r_p} ϕ_{r_1}(x)...ϕ_{r_p}(x)   (11.12)

where Ar1 ...rp can contain couplings and derivatives, so it’s more general than the construction
of ϕp theory above.
Then, the vertex is
− ∫ d^d x_1 ... d^d x_p e^{i(k_1·x_1+...+k_p·x_p)} δ/δϕ_{r_1}(x_1) ... δ/δϕ_{r_p}(x_p) S_{r_1...r_p}   (11.13)
(Note that we don’t really need to define A_{r_1...r_p}; all we need is the form of the interaction term in the action, S_{r_1...r_p}.) Let’s check that this gives the correct result for the λ_4 ϕ⁴/4! theory. Indeed, then we have the interaction term
S_I = (λ_4/4!) ∫ d^d x ϕ⁴(x)   (11.14)
We easily find that applying the above formula we get
−(2π)^d δ^d(k_1 + ... + k_4) λ_4   (11.15)
which is the correct vertex (it has the −λ factor, as well as the momentum conservation at the vertex that we need to take into account).

Important concepts to remember

• We can write the Feynman rules in p space directly in terms of the independent integrations, i.e. loop momenta. The number of loops, or independent integrations, is L = V(p/2 − 1) − E/2 + 1.

• The Feynman rules then consist of labelling the internal lines using the loop momenta and momentum conservation, with ∆(p_i) for an external line (ending at an unlabelled point) and ∆(q_j) for an internal line, and −λ_p for vertices. Then we must divide by the statistical weight factor (symmetry factor) of the diagram.

• For the most general bosonic theory, we need to invert the kinetic term to find the propagator (if there are redundancies, like for gauge fields, we must first find the correct kinetic term for the independent fields). For the interaction term S_{r_1...r_p}, we find the vertex as the Fourier transform of −δ^p/(δϕ_{r_1}...δϕ_{r_p}) S_{r_1...r_p}.

Further reading: See chapters 2.4.2, 2.4.3, 2.7 in [4], 4.1 in [3] and 3.4 in [1].

Exercises, Lecture 11

1) Write down the statistical weight factor (symmetry factor) for the Feynman diagram
in Fig.31 in x-space, and then write down an integral expression for it, applying the Feynman
rules.

Figure 31: x-space Feynman diagram.

2) Consider the interaction term
S_I = ∫ d^d x Σ_{i_1,i_2=1}^{N} (∂²ϕ_{i_1})(∂_μ ϕ_{i_2}) ϕ_{i_1} (∂^μ ϕ_{i_2})   (11.16)

for N scalars ϕi , i = 1, ..., N , each with the usual massless kinetic term. Write down the
Feynman rules for this model.

12 Lecture 12. Quantization of the Dirac field and
fermionic path integral
The Dirac equation
We start by reviewing the classical Dirac field.
Spinors are fields in a Hilbert space acted on by gamma matrices. The gamma matrices
are objects satisfying the Clifford algebra

{γ^μ, γ^ν} = 2g^{μν} 1   (12.1)

Therefore the spinors are representations of the Clifford algebra (Hilbert spaces on which we
represent the abstract algebra). Since then
S_{μν} = (1/4) [γ_μ, γ_ν]   (12.2)
satisfy the Lorentz algebra

[J^{μν}, J^{ρσ}] = i[g^{νρ}J^{μσ} − g^{μρ}J^{νσ} − g^{νσ}J^{μρ} + g^{μσ}J^{νρ}],   (12.3)

the spinors are also a representation of the Lorentz algebra, called spinor representation.
For instance, in 3 Euclidean dimensions, the gamma matrices are the same as the Pauli matrices, γ^i = σ^i, since indeed then {γ^i, γ^j} = 2δ^{ij}. Remember that the Pauli matrices (rows separated by semicolons),
σ¹ = (0 1; 1 0);  σ² = (0 −i; i 0);  σ³ = (1 0; 0 −1)   (12.4)

satisfy the relation
σ^i σ^j = δ^{ij} + iϵ^{ijk} σ^k,   (12.5)
from which follows that they indeed satisfy the Clifford algebra in 3 Euclidean dimensions.
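This is straightforward to verify numerically (an illustration of ours, assuming numpy):

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

eps = np.zeros((3, 3, 3))                       # Levi-Civita symbol
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1, -1

for i in range(3):
    for j in range(3):
        # sigma^i sigma^j = delta^{ij} 1 + i eps^{ijk} sigma^k, eq. (12.5)
        rhs = (i == j) * np.eye(2) + 1j * sum(eps[i, j, k] * sigma[k] for k in range(3))
        assert np.allclose(sigma[i] @ sigma[j], rhs)
        # hence the Clifford algebra {sigma^i, sigma^j} = 2 delta^{ij} 1
        anti = sigma[i] @ sigma[j] + sigma[j] @ sigma[i]
        assert np.allclose(anti, 2 * (i == j) * np.eye(2))
```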
We can make linear transformations on the spinor space (Hilbert space for the Clifford
algebra), via ψ → Sψ and γ µ → Sγ µ S −1 , and thus choose various representations for the
gamma matrices. A particularly useful one such representation is called the Weyl (or chiral)
representation, which is defined by
γ⁰ = −i (0 1; 1 0);  γ^i = −i (0 σ^i; −σ^i 0);  i = 1, 2, 3,   (12.6)

where 1 and 0 are 2 × 2 matrices. For this representation, we can immediately check that (γ⁰)† = −γ⁰ and (γ^i)† = γ^i. We can also define the 4-vector of 2 × 2 matrices
σ^μ = (1, σ^i);  σ̄^μ = (1, −σ^i),   (12.7)

such that in the Weyl representation we can write
γ^μ = −i (0 σ^μ; σ̄^μ 0)   (12.8)

We can also define the matrix
γ₅ = −iγ⁰γ¹γ²γ³,   (12.9)
which in the Weyl representation becomes just
γ₅ = (1 0; 0 −1).   (12.10)
This is not a coincidence, in reality the Weyl representation was chosen such that γ5 , the
product of gamma matrices, has this form. Also, the notation γ5 is not coincidental: if we
go to 5 dimensions, the first 4 gamma matrices are the same, and the 5th is γ5 .
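The algebra of the Weyl representation can be verified numerically. Note that with the conventions above, (γ⁰)² = −1, so the Clifford algebra is realized with the mostly-plus metric g = diag(−1, 1, 1, 1); the final check of p̸p̸ = p²·1 anticipates the relation ∂̸∂̸ = ∂² used below. A sketch of ours, assuming numpy:

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),       # sigma^1
     np.array([[0, -1j], [1j, 0]], dtype=complex),    # sigma^2
     np.array([[1, 0], [0, -1]], dtype=complex)]      # sigma^3
id2, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

# Weyl representation, eq. (12.6)
gamma = [-1j * np.block([[zero, id2], [id2, zero]])] + \
        [-1j * np.block([[zero, s[i]], [-s[i], zero]]) for i in range(3)]

# Clifford algebra with mostly-plus metric g = diag(-1, 1, 1, 1)
g = np.diag([-1.0, 1.0, 1.0, 1.0])
for mu in range(4):
    for nu in range(4):
        anti = gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu]
        assert np.allclose(anti, 2 * g[mu, nu] * np.eye(4))

# hermiticity and gamma_5, eqs. (12.9)-(12.10)
assert np.allclose(gamma[0].conj().T, -gamma[0])
g5 = -1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
assert np.allclose(g5, np.diag([1, 1, -1, -1]))

# the Dirac operator squares to the KG operator: (gamma . p)^2 = p^2 * 1
p = np.array([0.3, -1.2, 0.7, 2.0])                   # an arbitrary test momentum
pslash = sum(p[mu] * gamma[mu] for mu in range(4))
assert np.allclose(pslash @ pslash, (p @ g @ p) * np.eye(4))
```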
Also, as a side remark, the pattern of obtaining the gamma matrices as tensor products
of σ i and 1 continues to be valid in all dimensions. The gamma matrices in 2 Euclidean
dimensions can be chosen to be σ 1 , σ 2 (note that then σ 3 = −iσ 1 σ 2 ) and in higher dimensions
we can always build the gamma matrices in terms of them.
We now write down the Dirac equation for a Dirac spinor,
(γ^μ ∂_μ + m)ψ = 0.   (12.11)
We can define the Dirac conjugate,
ψ̄ = ψ†β;  β = iγ⁰   (12.12)
which is defined such that ψ̄ψ is a Lorentz invariant. Note that there are several conventions
possible for β (for instance, others use β = iγ0 = −iγ 0 , or β = γ 0 or γ0 ). With these
conventions, the Dirac action is written as
S_ψ = − ∫ d⁴x ψ̄(γ^μ ∂_μ + m)ψ.   (12.13)

We will use the notation ∂/ ≡ γ µ ∂µ , which is common. Since the Dirac field ψ is complex, by
varying the action with respect to ψ̄ (considered as independent from ψ), we get the Dirac
equation. Note that shortly we will find for Majorana (real) spinors, for which ψ̄ is related
to ψ, there is a factor of 1/2 in front of the action, since by varying with respect to ψ we get
two terms (one where we vary ψ proper, and one where we vary the ψ from ψ̄, giving the
same result).
Weyl spinors
In 4d Minkowski space, the Dirac representation is reducible as a representation of the Lorentz algebra (or of the Clifford algebra). The irreducible representation (irrep) is found by imposing a constraint on it, and the resulting spinors are called Weyl (or chiral) spinors. In the Weyl representation for the gamma matrices, the Dirac spinor splits simply as
ψ_D = (ψ_L; ψ_R)   (12.14)
which is why the Weyl representation for the gamma matrices was chosen as such. In general, we have
ψ_L = (1 + γ₅)/2 ψ_D ⇒ (1 − γ₅)/2 ψ_L = 0;
ψ_R = (1 − γ₅)/2 ψ_D ⇒ (1 + γ₅)/2 ψ_R = 0   (12.15)
and we note that we chose the Weyl representation for gamma matrices such that
( ) ( )
1 + γ5 1 0 1 − γ5 0 0
= ; = (12.16)
2 0 0 2 0 1
Another possible choice for irreducible representation, completely equivalent (in 4 Minkowski
dimensions) to the Weyl representation, is the Majorana representation. Majorana spinors
satisfy the reality condition
ψ̄ = ψ^C ≡ ψ^T C    (12.17)
where ψ C is called Majorana conjugate (thus the reality condition is ”Dirac conjugate equals
Majorana conjugate”), and C is a matrix called charge conjugation matrix. Note that since
ψ T is just another ordering of ψ, whereas ψ̄ contains ψ † = (ψ ∗ )T , this is indeed a reality
condition ψ ∗ = (...)ψ.
The charge conjugation matrix in 4 Minkowski dimensions satisfies
C T = −C; Cγ µ C −1 = −(γ µ )T (12.18)
In other dimensions and/or signatures, the definition is more complicated, and it can involve
other signs on the right hand side of the two equations above.
In the Weyl representation, we can choose
C = ( −ϵ^{αβ}    0     )
    (    0     ϵ_{αβ} )    (12.19)
where we have
ϵ^{αβ} = (  0  1 ) = iσ²    (12.20)
         ( −1  0 )
and so
C = ( −iσ²   0  ) = −iγ⁰γ²    (12.21)
    (   0   iσ² )
Then we have also
C⁻¹ = ( iσ²    0   ) = −C    (12.22)
      (  0   −iσ² )
We can now check explicitly that this C is indeed a representation for the C-matrix, i.e. that
it satisfies (12.18).
As we mentioned, the action for Majorana fields, with ψ̄ related to ψ, is

Sψ = −(1/2) ∫ d⁴x ψ̄(∂̸ + m)ψ    (12.23)
The Dirac equation implies the Klein-Gordon (KG) equation; more precisely, the Dirac
equation is a sort of square root of the KG equation, which is roughly how Dirac thought
about deriving it. Let’s see this. First, we note that
∂̸∂̸ = γ^µγ^ν ∂_µ∂_ν = (1/2){γ^µ, γ^ν} ∂_µ∂_ν = g^{µν} ∂_µ∂_ν = □    (12.24)
Then, it follows that
(∂̸ − m)(∂̸ + m)ψ = (□ − m²)ψ,    (12.25)
so indeed, the KG operator is the product of the Dirac operator with +m and the Dirac
operator with −m.
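In momentum space this "square root" property reads p̸p̸ = p²·1, which can be checked directly. The explicit gamma matrices below are an assumed mostly-plus Weyl-representation choice (γ^µ = −i(0, σ^µ; σ̄^µ, 0), σ^µ = (1, σⁱ), σ̄^µ = (1, −σⁱ)), and the numerical momentum is arbitrary:

```python
import numpy as np

s0 = np.eye(2)
s = [np.array([[0, 1], [1, 0]], complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], complex)]
sigma = [s0] + s
sigmabar = [s0] + [-si for si in s]

def gamma(mu):
    # assumed mostly-plus Weyl representation
    g = np.zeros((4, 4), dtype=complex)
    g[:2, 2:] = -1j * sigma[mu]
    g[2:, :2] = -1j * sigmabar[mu]
    return g

gmet = np.diag([-1.0, 1.0, 1.0, 1.0])
p_lo = np.array([2.0, 0.3, -0.7, 1.1])          # arbitrary p_mu (lower index)
pslash = sum(gamma(mu) * p_lo[mu] for mu in range(4))
p2 = p_lo @ gmet @ p_lo                          # g^{mu nu} p_mu p_nu

# pslash squared is p^2 times the identity
assert np.allclose(pslash @ pslash, p2 * np.eye(4))
```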
Solutions of the free Dirac equation
Since a solution of the Dirac equation is also a solution of the KG equation, solutions of the Dirac equation
have to be of the form e^{±ip·x}, with p² + m² = 0, times some matrices (column vectors) depending on p.
One set of solutions can then be written as
ψ(x) = u(p) e^{ip·x}    (12.26)
where p² + m² = 0 and p⁰ > 0. Since this is basically a Fourier transform, u(p) satisfies the
Fourier-transformed equation,
(ip̸ + m) u(p) = 0    (12.27)
where p̸ ≡ γ^µ p_µ (note that p_µ = g_{µν} p^ν, and for the same p^µ, p_µ changes by a sign between
our "mostly plus" metric convention and the "mostly minus" one). The two solutions of
the above equation for u(p) can be written compactly and formally as
u^s(p) = ( √(−p·σ) ξ^s )
         ( √(−p·σ̄) ξ^s )    (12.28)
where s = 1, 2 and ξ¹ = (1, 0)^T, ξ² = (0, 1)^T. Here √(−p·σ) = √(−p_µ σ^µ) is understood in the matrix
sense (if A² = B, then A = √B). If the matrix is diagonal, we can take the square
root of the diagonal elements, but in general we can't. For instance, in the rest frame, where
p⃗ = 0 and p⁰ = m,
u^s(p) ∝ (1, 0, 1, 0)^T or (0, 1, 0, 1)^T    (12.29)
The other two solutions are of the type
ψ(x) = v(p) e^{−ip·x}    (12.30)
where p² + m² = 0, p⁰ > 0, and v(p) satisfies
(−ip̸ + m) v(p) = 0    (12.31)
The two v^s(p) can be similarly written as
v^s(p) = (  √(−p·σ) ξ^s )
         ( −√(−p·σ̄) ξ^s )    (12.32)
and in the rest frame we have
v^s(p) ∝ (0, 1, 0, −1)^T or (1, 0, −1, 0)^T    (12.33)
The normalization conditions for the u(p) and v(p) are written in Lorentz-invariant form
as
ū^r(p) u^s(p) = 2m δ^{rs}
v̄^r(p) v^s(p) = −2m δ^{rs}    (12.34)
or, using u† and v†, as
u^{r†}(p) u^s(p) = 2E_p δ^{rs}
v^{r†}(p) v^s(p) = 2E_p δ^{rs}    (12.35)
The u and v solutions are orthogonal, i.e.
ū^r(p) v^s(p) = v̄^r(p) u^s(p) = 0    (12.36)
Note that now
u^{r†}(p) v^s(p) ≠ 0;  v^{r†}(p) u^s(p) ≠ 0    (12.37)
however, we also have
u^{r†}(p⃗) v^s(−p⃗) = v^{r†}(−p⃗) u^s(p⃗) = 0    (12.38)
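These normalization and orthogonality relations can be verified numerically from the block form of u^s and v^s in (12.28) and (12.32). The concrete identifications −p·σ = E·1 − p⃗·σ⃗, −p·σ̄ = E·1 + p⃗·σ⃗ (i.e. p₀ = −E in the mostly-plus metric) and the block-off-diagonal form of iγ⁰ are assumptions about conventions the lecture leaves implicit:

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], complex)]

m = 1.0
pvec = np.array([0.4, -0.2, 0.9])        # arbitrary 3-momentum
E = np.sqrt(m**2 + pvec @ pvec)
pdots = sum(pvec[i] * pauli[i] for i in range(3))
Mp = E * np.eye(2) - pdots               # -p.sigma     (assumed convention p_0 = -E)
Mm = E * np.eye(2) + pdots               # -p.sigmabar

def sqrth(M):
    # square root of a Hermitian positive-definite matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.conj().T

xi = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
u = [np.concatenate([sqrth(Mp) @ x, sqrth(Mm) @ x]) for x in xi]
v = [np.concatenate([sqrth(Mp) @ x, -(sqrth(Mm) @ x)]) for x in xi]

igamma0 = np.block([[np.zeros((2, 2)), np.eye(2)], [np.eye(2), np.zeros((2, 2))]])
bar = lambda w: w.conj() @ igamma0       # wbar = w† (i gamma^0)

for r in range(2):
    for t in range(2):
        d = 1.0 if r == t else 0.0
        assert np.isclose(bar(u[r]) @ u[t], 2 * m * d)     # (12.34)
        assert np.isclose(bar(v[r]) @ v[t], -2 * m * d)
        assert np.isclose(u[r].conj() @ u[t], 2 * E * d)   # (12.35)
        assert np.isclose(bar(u[r]) @ v[t], 0)             # (12.36)
```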
Quantization of the Dirac field
The spin-statistics theorem says that fields of spin S = (2k + 1)/2 (k ∈ N) obey Fermi
statistics, which means that in their quantization we must use anticommutators, {, }, instead
of commutators [, ]. On the other hand, for S = k, we have the Bose-Einstein statistics, with
commutators [, ] for their quantization.
The Lagrangean for the Dirac field is
L = −ψ̄γ µ ∂µ ψ − mψ̄ψ + ... = −ψ † iγ 0 γ µ ∂µ ψ + ... = +iψ † ∂0 ψ + ... (12.39)
(since (γ 0 )2 = −1). Then the canonical conjugate to ψ is
p_ψ = ∂L/∂ψ̇ = iψ†    (12.40)
Then the Hamiltonian is

H = ∫ d³x p_ψ ψ̇ − L = i ∫ d³x ψ†(+γ⁰γⁱ∂_i + mγ⁰)ψ    (12.41)

When we quantize, we write anticommutation relations at equal time, { , }_{P.B.} → (1/iħ){ , },
namely (after cancelling the i) the equal time anticommutation relations
{ψ_α(x⃗, t), ψ†_β(y⃗, t)} = δ³(x⃗ − y⃗) δ_{αβ}
{ψ_α(x⃗, t), ψ_β(y⃗, t)} = {ψ†_α(x⃗, t), ψ†_β(y⃗, t)} = 0    (12.42)

The quantization proceeds exactly as for the complex scalar field, just that instead of the
properly normalized solution of the KG equation, e±ip·x , we have the above solutions of the
Dirac equation, and again, their coefficients are harmonic oscillator operators. Therefore we
have

ψ(x) = ∫ d³p/(2π)³ (1/√E_p) Σ_s (a^s_p⃗ u^s(p) e^{ip·x} + b^{s†}_p⃗ v^s(p) e^{−ip·x})
ψ̄(x) = ∫ d³p/(2π)³ (1/√E_p) Σ_s (b^s_p⃗ v̄^s(p) e^{ip·x} + a^{s†}_p⃗ ū^s(p) e^{−ip·x})    (12.43)

where the a’s, a† ’s, and b’s and b† ’s are the annihilation/creation operators obeying the
anticommutation relations of the fermionic harmonic oscillator, i.e.

{a^r_p⃗, a^{s†}_q⃗} = {b^r_p⃗, b^{s†}_q⃗} = (2π)³ δ³(p⃗ − q⃗) δ^{rs}    (12.44)

and the rest of the anticommutators are zero. There is a Fock vacuum |0 >, satisfying
a^s_p⃗ |0 > = b^s_p⃗ |0 > = 0    (12.45)

and then Fock states created by acting with a^{s†}_p⃗ and b^{s†}_p⃗ on it. But, due to the anticommutation
relations, which mean in particular that (a^{s†}_p⃗)² = (b^{s†}_p⃗)² = 0, we can't have two
excitations in the same Fock state,
(a^{s†}_p⃗)² |ψ > = (b^{s†}_p⃗)² |ψ > = 0    (12.46)

which is a manifestation of the Pauli exclusion principle.


The Hamiltonian is then written as

H = ∫ d³p/(2π)³ Σ_s E_p (a^{s†}_p⃗ a^s_p⃗ − b^s_p⃗ b^{s†}_p⃗)
  = ∫ d³p/(2π)³ Σ_s E_p (a^{s†}_p⃗ a^s_p⃗ + b^{s†}_p⃗ b^s_p⃗ − {b^s_p⃗, b^{s†}_p⃗})    (12.47)

where the anticommutator in the integral is = 1, giving a negative infinite constant. As in


the bosonic case, the infinite constant is removed by normal ordering. The difference is that
now, since a^r_p⃗ a^s_q⃗ = −a^s_q⃗ a^r_p⃗, a^{r†}_p⃗ a^{s†}_q⃗ = −a^{s†}_q⃗ a^{r†}_p⃗ and similarly for the b's, we have
: a^r_p⃗ a^s_q⃗ : = a^r_p⃗ a^s_q⃗ = −a^s_q⃗ a^r_p⃗, etc.    (12.48)

and so also
: a^r_p⃗ a^{s†}_q⃗ : = −a^{s†}_q⃗ a^r_p⃗    (12.49)
So the net effect of normal ordering is to remove the anticommutator terms, moving all the
creation operators to the left with the anticommutation rules. Note that since the fermions

have a negative infinite constant, whereas the bosons have a positive infinite constant, the
infinite constant can cancel in a theory with an equal number of bosonic and fermionic
modes, which is called a supersymmetric theory. These supersymmetric theories therefore
possess a better behaviour with respect to unphysical infinities.
Also, note that we can interpret a^†_p⃗ as creating positive energy particles and b_p⃗ as a creation
operator for a negative energy particle, as the second term in the first line of (12.47) shows
(compare with the first). Dirac introduced the concept of the Dirac sea, the idea that there is
a full "sea" of occupied states of negative energy (remember that for fermions a state can
only be occupied by a single particle). Then b^†_p⃗ destroys a particle of negative energy, which
is equivalent to creating a "hole" in the Dirac sea (an unoccupied state in a sea of occupied
states), which acts as a state of positive energy (consider the second line in (12.47)).
The fermionic path integral
How do we write a path integral? The path integral is used to describe quantum mechan-
ics, but it involves classical objects (functions), just going over all possible paths instead of
just the classical path. These classical objects are obtained in the classical limit of quan-
tum operators, i.e. when ~ → 0. For bosons, for instance for the harmonic oscillators
[a, a† ] = ~ → 0, so in the path integral over the harmonic phase space we used the classical
objects α and α∗ , for the quantum objects a and a† .
But now, we have anticommutation relations, in particular for the fermionic harmonic
oscillators, {â, ↠} = ~ → 0. So in the classical limit for these operators we don’t obtain
the usual functions, but we obtain objects that anticommute, forming what is known as a
Grassmann algebra. Considering the a, a† of a fermionic harmonic oscillator, we have

{a, a† } = {a, a} = {a† , a† } = 0 (12.50)

The objects a, a† are known as the ”odd” part of the algebra, since they anticommute,
whereas products of odd objects are called ”even”, since they commute. For instance, we
see that [aa† , aa† ] = 0. Thus in general we have bose × bose = bose, f ermi × f ermi = bose
and bose × f ermi = f ermi, where bose stands for even and f ermi stands for odd.
However, what is the meaning of a ”classical” Grassmann field if even in the classical ~ →
0 limit we can’t put more than one particle in the same state: {a† , a† } = 0 ⇒ a† a† |0 >= 0,
where a† is an object that appears in the expansion of the ”classical” Grassmann field ψ(x).
The meaning is not clear, but this is one of those instances where in quantum mechanics
there are questions which are meaningless. The point is that this is a formalism that works
for the path integral (i.e., we obtain the right results, which agree with experiments and
also with the operator formalism of quantum field theory), and the path integral is quantum
mechanical in any case, even if it is written as an integral over ”classical” Grassmann fields.
Definitions
Let us now define the calculus with Grassmann fields more precisely. The general Grassmann
algebra of N objects x_i, i = 1, ..., N, is defined by {x_i, x_j} = 0, together with the element 1, which
commutes with the rest, [x_i, 1] = 0, and with complex number coefficients (i.e., it is an
algebra over C).
Since (x_i)² = 0, the Fourier expansion of a general function stops after a finite number
of terms, specifically after N terms:
F({x_i}) = F^{(0)} + Σ_i f_i^{(1)} x_i + Σ_{i<j} f_{ij}^{(2)} x_i x_j + Σ_{i<j<k} f_{ijk}^{(3)} x_i x_j x_k + ... + f_{12...N}^{(N)} x_1 ... x_N    (12.51)

Only for an infinite number of generators xi is the Taylor expansion infinite in extent.
Note that here, since we cannot add bosons (even objects) with fermions (odd objects), the
functions must be either even, in which case there is always an even number of x’s in the
expansion, or odd, in which case there is an odd number of x’s in the expansion.
However, we will often think of only a subset of the x’s in making the expansion. For
instance, we can expand in terms of a single x (even though there are more objects in the
Grassmann algebra), and write for a general even function of x the expansion

f (x) = a + bx (12.52)

where, since x is odd, a is even and b is odd (fermionic). This is possible, since there is
at least one other Grassmann object, for instance called y, so we could have b = cy, i.e.
f (x) = a + cyx.
In fact, we will often consider (at least) an even number of x’s, half of which are used
for the expansion, and half for the coefficients, allowing us to write the general expansion
(12.51) with coefficients f (2k) being even (bosonic) but f (2k+1) being odd (fermionic). (This
is what one does in the case of supersymmetry, where one expands a ”superfield” Φ in terms
of auxiliary θ’s like the x’s above, with odd coefficients which are still Grassmann functions
corresponding to spinor fields.)
We define differentiation by first writing
(∂/∂x_i) x_j = δ_{ij}    (12.53)

but since differential operators must also be Grassmann objects, we have


(∂/∂x_i)(x_j ...) = δ_{ij}(...) − x_j (∂/∂x_i)(...)    (12.54)
Note that for bosons we would have a plus sign in the second term, but now ∂/∂xi anticom-
mutes past xj .
As an example, we have

(∂/∂x_i)(x₁x₂...x_n) = δ_{i1} x₂x₃...x_n − δ_{i2} x₁x₃...x_n + δ_{i3} x₁x₂x₄...x_n + ... + (−)^{n−1} δ_{in} x₁x₂...x_{n−1}    (12.55)
Then, for instance, differentiation of a product of even functions f (x) and g(x) is done
in the usual way,
(∂/∂x_i)(f(x)g(x)) = ((∂/∂x_i) f(x)) g(x) + f(x) (∂/∂x_i) g(x)    (12.56)

but if f (x) is an odd function, we have the modified rule
(∂/∂x_i)(f(x)g(x)) = ((∂/∂x_i) f(x)) g(x) − f(x) (∂/∂x_i) g(x)    (12.57)

As another example, consider the function e^{Σ_i x_i y_i}, which is defined in the usual way, just
that now we substitute (x_i)² = 0 in the Taylor expansion of the exponential. Then we have
(∂/∂x_k) e^{Σ_i x_i y_i} = y_k e^{Σ_i x_i y_i}
(∂/∂y_k) e^{Σ_i x_i y_i} = −x_k e^{Σ_i x_i y_i}    (12.58)
Next, we define integration, which is not defined as a Riemann sum in the usual way
for functions over the complex numbers. Indeed, due to the Grassmann properties, we cannot do that;
rather, we must define integration as a linear operator, and in particular we can only define the
indefinite integral, not a definite one.
Since the two basic elements of the algebra generated by a single x are 1 and x, we must define
what happens for these two elements. We define
∫ dx 1 = 0;  ∫ dx x = 1    (12.59)

For several xi ’s, we define the anticommutation relations for i ̸= j,

{dxi , dxj } = 0; {xi , dxj } = 0 (12.60)

For instance, we have


∫ dx₁ dx₂ x₁x₂ = − ∫ dx₁ x₁ ∫ dx₂ x₂ = −1    (12.61)

Then the integral operation is translationally invariant, since
∫ dx f(x + a) = ∫ dx [f⁰ + f¹(x + a)] = ∫ dx f¹ x = ∫ dx f(x)    (12.62)

In view of the above relations, we see that integration is the same as differentiation
(satisfies the same rules). For instance,
∫ dx₂ x₁x₂x₃ = − ∫ dx₂ x₂x₁x₃ = −x₁x₃    (12.63)
the same as
(∂/∂x₂) x₁x₂x₃ = −(∂/∂x₂) x₂x₁x₃ = −x₁x₃    (12.64)
We now define the delta function on the Grassmann space. We have

δ(x) = x (12.65)

To prove this, consider a general even function of x, f (x) = f 0 + f 1 x, where f 0 is even and
f 1 is odd. Then
∫ dx δ(x − y) f(x) = ∫ dx (x − y)(f⁰ + f¹x) = ∫ dx (x f⁰ − y f¹ x)
= f⁰ + y ∫ dx f¹ x = f⁰ − y f¹ = f⁰ + f¹ y = f(y)    (12.66)

Finally, let us consider the change of variables from the Grassmann variable x to the
variable y = ax, where a ∈ C. Then we must have
1 = ∫ dx x = ∫ dy y = a ∫ dy x    (12.67)
so it follows that
dy = dx/a    (12.68)
which is not like the usual integration, but rather like differentiation.
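All of these rules can be checked mechanically with a toy symbolic implementation; the class and helper names below (Grassmann, merge, deriv, berezin) are invented purely for illustration:

```python
def merge(k1, k2):
    # concatenate two ordered generator tuples; bubble-sort while tracking the
    # sign picked up from each transposition of anticommuting generators
    seq, sign = list(k1 + k2), 1
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return tuple(seq), sign

class Grassmann:
    """Element of a Grassmann algebra: {sorted tuple of generator indices: coefficient}."""
    def __init__(self, terms=None):
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def gen(i):
        return Grassmann({(i,): 1})

    def __add__(self, other):
        other = other if isinstance(other, Grassmann) else Grassmann({(): other})
        t = dict(self.terms)
        for k, v in other.terms.items():
            t[k] = t.get(k, 0) + v
        return Grassmann(t)
    __radd__ = __add__

    def __rmul__(self, c):          # scalar * element
        return Grassmann({k: c * v for k, v in self.terms.items()})

    def __mul__(self, other):
        if not isinstance(other, Grassmann):
            return self.__rmul__(other)
        t = {}
        for k1, v1 in self.terms.items():
            for k2, v2 in other.terms.items():
                if set(k1) & set(k2):
                    continue        # (x_i)^2 = 0 kills repeated generators
                k, sign = merge(k1, k2)
                t[k] = t.get(k, 0) + sign * v1 * v2
        return Grassmann(t)

def deriv(f, i):
    # left derivative d/dx_i: anticommute past the generators standing before x_i
    t = {}
    for k, v in f.terms.items():
        if i in k:
            pos = k.index(i)
            red = k[:pos] + k[pos + 1:]
            t[red] = t.get(red, 0) + ((-1) ** pos) * v
    return Grassmann(t)

berezin = deriv     # Grassmann integration obeys the same rules as differentiation

x1, x2, x3 = Grassmann.gen(1), Grassmann.gen(2), Grassmann.gen(3)
assert (x1 * x1).terms == {}                         # (x)^2 = 0
assert (x1 * x2 + x2 * x1).terms == {}               # {x_i, x_j} = 0
assert deriv(x1 * x2 * x3, 2).terms == {(1, 3): -1}  # cf. (12.64)
assert berezin(x1, 1).terms == {(): 1}               # int dx x = 1
assert berezin(Grassmann({(): 1}), 1).terms == {}    # int dx 1 = 0
# delta(x) = x: int dx1 (x1 - x2) f(x1) = f(x2), with f(x) = f0 + f1*x and odd f1 = x3
f0 = 5
f = f0 + x3 * x1
assert berezin((x1 + (-1) * x2) * f, 1).terms == (f0 + x3 * x2).terms
```

The last assertion is exactly the delta-function proof (12.66), carried out by brute force.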

Important concepts to remember

• Spinor fields are representations of the Clifford algebra {γ µ , γ ν } = 2g µν , and spinorial


representations of the Lorentz algebra.

• In the Weyl (chiral) representation, γ5 is diagonal and has ±1 on the diagonal.



• The Dirac action is − ∫ d⁴x ψ̄(∂̸ + m)ψ, with ψ̄ = ψ†iγ⁰.

• The irreducible spinor representations are, in 4d Minkowski space, either Weyl spinors,
or Majorana spinors. Weyl spinors satisfy (1 ± γ5 )/2ψ = 0 and Majorana spinors
satisfy ψ̄ = ψ T C, with C the charge conjugation matrix.

• The Dirac operator is a kind of square root of the KG operator, since (∂̸ − m)(∂̸ + m) = □ − m².

• The solutions of the Dirac equation are u(p)e^{ip·x} and v(p)e^{−ip·x}, with u(p) and v(p)
orthonormal; in canonical quantization we expand in these solutions, with
fermionic harmonic oscillator coefficients.

• The momentum conjugate to ψ is iψ†, giving the nontrivial anticommutator
{ψ_α(x⃗, t), ψ†_β(y⃗, t)} = δ³(x⃗ − y⃗)δ_{αβ}.

• The infinite constant (zero point energy) in the Hamiltonian is negative.

• We can interpret as follows: a† creates positive energy states, b creates negative energy states,
and b† destroys negative energy states, thus effectively creating positive energy "holes" in the
Dirac sea.

• For the fermionic path integral, we use Grassmann algebra-valued objects.

• The Taylor expansion of a function of Grassmann-algebra objects ends at order N, the
number of x_i's.

• Grassmann differentiation anticommutes with other Grassmann derivatives and with the x's.

• Grassmann integration is the same as Grassmann differentiation and δ(x) = x for


Grassmann variables.

Further reading: See chapters 3.1.1, 3.1.2, 3.3 in [4], 7.1 in [1], 3.2 and 3.3 in [2] and
5.2 and 5.3 in [3].

Exercises, Lecture 12

1) Consider the Rarita-Schwinger action for a vector-spinor field ψ_{µα} in Minkowski space,
S = −(1/2) ∫ d⁴x ψ̄_µ γ^{µνρ} ∂_ν ψ_ρ    (12.69)
where (ψ_µ)_α is a Majorana spinor and γ^{µνρ} = γ^{[µ}γ^νγ^{ρ]}. Calculate the variation δS under a
variation δψ, and then the equations of motion, in x and p space.

2) Consider the action for a Dirac field in Minkowski space,
S = − ∫ d⁴x [ψ̄(∂̸ + m)ψ + α(ψ̄ψ)²]    (12.70)
and use the free field quantization in the lecture. Compute the quantum Hamiltonian in
terms of a, a†, b, b†, and then the normal ordered Hamiltonian.

13 Lecture 13. Wick theorem, gaussian integration
and Feynman rules for fermions
Last lecture we saw that the path integral for fermions is done using integrals of objects
in a Grassmann algebra. We defined the calculus on the Grassmann algebra, defining an
anticommuting differential operator, and an integral operator (defined as a linear operation,
not a Riemann sum) which turned out to have the same properties as the differential operator.
We then saw that changing variables in the Grassmann algebra, between x and y = ax,
where a is a c-number, gives us dy = dx/a, as for the differential operator. The obvious
generalization to an n-dimensional Grassmann algebra is: y = A · x implies dⁿy = dⁿx/det A.
Gaussian integration - the real case.
We now define Gaussian integration over a Grassmann algebra over real numbers.
Theorem 1
Consider a real n × n antisymmetric matrix {Aij }, such that the (symmetric) matrix A2
has negative, nonzero eigenvalues. Then n = 2m, and for x1 , ..., xn Grassmann variables,
∫ dⁿx e^{xᵀAx} = 2^m √(det A)    (13.1)

Proof:
Consider first the case m = 1, i.e. n = 2. Then
A = (  0  λ )  ⇒  A² = ( −λ²   0  )    (13.2)
    ( −λ  0 )          (  0   −λ² )
and det A = λ² > 0, whereas xᵀAx = λ(x₁x₂ − x₂x₁) = 2λx₁x₂. Since (x_i)² = 0, we then have
e^{xᵀAx} = 1 + 2λx₁x₂  ⇒  ∫ d²x e^{xᵀAx} = ∫ dx₂ dx₁ [1 + 2λx₁x₂] = 2λ    (13.3)

Note that here we had to choose a specific order for the integrals in d2 x, namely dx2 dx1 , in
order to get a plus sign.
Next, we consider a version of the Jordan lemma for the Grassmann algebra, namely that there is a
transformation matrix B (real, orthogonal) acting on the Grassmann algebra space such that BᵀAB
is block-diagonal in Jordan blocks, i.e.
 
BᵀAB = diag( ( 0   λ₁ ),  ( 0   λ₂ ),  ...,  ( 0   λ_m ) )    (13.4)
             ( −λ₁  0 )   ( −λ₂  0 )         ( −λ_m  0 )

which then gives det A = det(BᵀAB) = λ₁² ... λ_m².

Proof of lemma:
We need to show that there is a basis of vectors {e⃗₁, e⃗₋₁, e⃗₂, e⃗₋₂, ..., e⃗_m, e⃗₋_m} such that
A e⃗_k = λ_k e⃗₋_k;  A e⃗₋_k = −λ_k e⃗_k    (13.5)
Then we have A² e⃗_{±k} = −λ_k² e⃗_{±k}. We see then that it is sufficient to take a basis of eigenvectors
of A², since we already know A² has nonzero negative eigenvalues. We need to check that
there is a two-fold degeneracy for the basis, which is true, since if e⃗ has eigenvalue −λ², so
does e⃗′ = A e⃗/λ, as we can easily check: A² e⃗′ = −λ A e⃗ = −λ² e⃗′. The basis vectors are linearly
independent, since if there were a number f such that e⃗′ = f e⃗, then on one hand, writing out e⃗′,
we would obtain A e⃗ = λf e⃗, and applying A twice we would obtain A² e⃗ = λ²f² e⃗. On the
other hand, we know that A² e⃗ = −λ² e⃗, but since f is real, f² > 0, giving a contradiction.
Therefore e⃗ and e⃗′ are linearly independent, and we have found the Jordan basis for the
matrix A. Finally, e⃗ · e⃗′ = (1/λ) e⃗ · (A e⃗) = 0, since A is antisymmetric, hence the eigenvectors are
also orthogonal. q.e.d. lemma.
Then we can write
∫ dⁿx e^{xᵀAx} = ∫ dⁿx′ e^{(Bᵀx)ᵀ(BᵀAB)(Bᵀx)} = Π_{i=1}^{m} [m = 1 case]_i = 2^m √(det A)    (13.6)
where x′ = Bᵀx. q.e.d. theorem 1.


Theorem 2. Complex gaussian integration.
We write independent complex variables xi , yi , without putting yi = x∗i , and an n × n
antisymmetric matrix A, and then we have

∫ dⁿx dⁿy e^{yᵀAx} = det A    (13.7)
We consider this since in the path integral we have something exactly of this form, for
instance ∫ Dψ Dψ̄ e^{−ψ̄(∂̸+m)ψ}.
Proof:
In e^{yᵀAx}, the terms that are nonzero under the integration are only terms of order n
(since (x_i)² = 0, so for n + 1 terms necessarily one of the x's is repeated, and for < n terms
we get some ∫ dx_i 1 = 0). Specifically, these terms will be:
(1/n!) Σ_P Σ_Q (y_{P(1)} A_{P(1)Q(1)} x_{Q(1)}) ... (y_{P(n)} A_{P(n)Q(n)} x_{Q(n)})
= (1/n!) Σ_P Σ_Q (y₁ A_{1,QP⁻¹(1)} x_{QP⁻¹(1)}) ... (y_n A_{n,QP⁻¹(n)} x_{QP⁻¹(n)})
= Σ_{Q′} (y₁ A_{1,Q′(1)} x_{Q′(1)}) ... (y_n A_{n,Q′(n)} x_{Q′(n)})
= ϵ (y₁...y_n)(x₁...x_n) Σ_Q ϵ_Q A_{1Q(1)} ... A_{nQ(n)}
= ϵ (y₁...y_n)(x₁...x_n) det A    (13.8)

where in the first line we wrote the sums over permutations P, Q of 1, 2, ..., n; in the second
we have rewritten the sum over permutations, such that the sum over P is done trivially as a factor
of n!; then we commuted the x's and the y's, obtaining y₁x₁y₂x₂...y_nx_n = ϵ y₁...y_n x₁...x_n,
with ϵ a sign that depends on n, but not on Q. We can fix the sign of the integral
∫ dⁿx dⁿy (y₁...y_n)(x₁...x_n) to be ϵ by properly defining the order of the dx's and dy's, in which
case we obtain just det A for the gaussian integral. q.e.d. theorem 2.
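The last two lines of the proof rest on the Leibniz formula det A = Σ_Q ϵ_Q A_{1Q(1)} ... A_{nQ(n)}, which is easy to confirm numerically (the 4×4 random matrix here is an arbitrary choice for the check):

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))

def perm_sign(p):
    # sign from the cycle decomposition: each even-length cycle flips the sign
    sign, seen = 1, [False] * len(p)
    for i in range(len(p)):
        if seen[i]:
            continue
        j, length = i, 0
        while not seen[j]:
            seen[j] = True
            j = p[j]
            length += 1
        if length % 2 == 0:
            sign = -sign
    return sign

# sum over permutations Q of sign(Q) * A[0,Q(0)] * ... * A[n-1,Q(n-1)]
total = sum(perm_sign(q) * math.prod(A[i, q[i]] for i in range(n))
            for q in itertools.permutations(range(n)))
assert np.isclose(total, np.linalg.det(A))
```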
Real vs. complex integration
We can also check that complex Gaussian integration is consistent with the real integration
case. If the complex objects x_i and y_i are written as x_i = a_i + ib_i and y_i = a_i − ib_i, then
yᵀAx = aᵀAa + bᵀAb. The Jacobian of the transformation dⁿx dⁿy = J dⁿa dⁿb is J = 2⁻ⁿ
(for usual complex numbers we can check that it would be 2ⁿ, and the Grassmann integral
works as the inverse). Then the result of the integral done as the double real integral over
dⁿa dⁿb is J (2^m √(det A))² = J 2ⁿ det A = det A (using n = 2m), as we want.

The fermionic harmonic oscillator


For the fermionic harmonic oscillator, the quantum Hamiltonian is
Ĥ_F = ω (b̂†b̂ − 1/2)    (13.9)
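This algebra is realized on the two-state space spanned by |0 > and |1 >; a minimal sketch (the explicit 2×2 matrix for b̂ is the standard representation, an assumption not spelled out in the text):

```python
import numpy as np

# b acts on the basis {|0>, |1>}: b|1> = |0>, b|0> = 0
b = np.array([[0.0, 1.0], [0.0, 0.0]])
bd = b.T                                          # b-dagger

assert np.allclose(b @ bd + bd @ b, np.eye(2))    # {b, b†} = 1
assert np.allclose(b @ b, 0) and np.allclose(bd @ bd, 0)

omega = 2.0                                       # arbitrary frequency for the check
H = omega * (bd @ b - 0.5 * np.eye(2))
# the spectrum is ±omega/2
assert np.allclose(np.sort(np.linalg.eigvalsh(H)), [-omega / 2, omega / 2])
```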

We consider then, as in the bosonic case, coherent states |β > defined by (β is a Grassmann-valued
number)
|β > = e^{b̂†β}|0 > = (1 + b̂†β)|0 > = (1 − β b̂†)|0 >  ⇒  b̂|β > = β|0 > = β(1 − β b̂†)|0 > = β|β >    (13.10)
that satisfy the completeness relation
1 = ∫ dβ* dβ |β >< β*| e^{−β*β}    (13.11)

Then consider the Hamiltonian with currents

H(b† , b; t) = ωb† b − b† η(t) − η̄(t)b (13.12)

and define as in the bosonic case the transition amplitude

F (β ∗ , t′ ; β, t) =< β ∗ , t′ |β, t > (13.13)

for which we do the same steps as for the bosonic case and find the path integral
F(β*, t′; β, t) = ∫ Dβ* Dβ exp{ i ∫_t^{t′} dτ [−iβ̇*(τ)β(τ) − H] + β*(t)β(t) }    (13.14)

The equations of motion of the Hamiltonian with sources are

β̇ ∗ − iωβ ∗ + iη̄ = 0
β̇ + iωβ − iη = 0 (13.15)

with solutions
β(τ) = β e^{iω(t−τ)} + i ∫_t^τ e^{iω(s−τ)} η(s) ds
β*(τ) = β* e^{iω(τ−t′)} + i ∫_τ^{t′} e^{iω(τ−s)} η̄(s) ds    (13.16)
For the case β = β* = 0, with t → −∞ and t′ → +∞, we obtain the vacuum functional
(vacuum expectation value).
In the usual manner, we obtain
Z[η, η̄] = < 0|0 > exp{ − ∫_{−∞}^{+∞} dτ ∫_τ^{+∞} ds e^{iω(τ−s)} η̄(s) η(τ) }    (13.17)


The details are left as an exercise. Renaming β → ψ and β* → ψ̄ for easy generalization to
the field theory case, and writing ∫ β̇*β → ∫ ∂_t ψ̄ ψ = − ∫ ψ̄ ∂_t ψ, the path integral becomes
Z[η, η̄] = ∫ Dψ̄ Dψ exp{ i ∫ dt [ψ̄(i∂_t − ω)ψ + ψ̄η + η̄ψ] }
= Z[0, 0] exp{ − ∫ ds dτ η̄(s) D_F(s, τ) η(τ) }    (13.18)

Here we have defined the Feynman propagator such that (as usual) iS^M = −ψ̄Δ⁻¹ψ, i.e.
D_F(s, τ) = (−i(i∂_t − ω))⁻¹(s, τ) = i ∫ dE/(2π) e^{−iE(s−τ)}/(E − ω + iϵ) = θ(s − τ) e^{−iω(s−τ)}    (13.19)
where we have first written the Fourier transform, then done the complex integral
with the residue theorem: there is now a single pole, at E = ω − iϵ. We get a nonzero result
only if we close the contour below (in the lower half plane), which we can do only if s − τ > 0,
so that −iE(s − τ) = −|Im E|(s − τ) + imaginary; otherwise the result is zero. Substituting
this D_F(s, τ), we indeed obtain (13.17).
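The residue-theorem result can also be checked by brute-force numerical integration; the values ω = 1, ϵ = 0.05 and the cutoff ±400 below are arbitrary choices for this sketch (for finite ϵ the exact answer carries an extra damping e^{−ϵt}):

```python
import numpy as np

omega, eps = 1.0, 0.05                 # hypothetical values for the check
dE = 0.001
E = np.arange(-400.0, 400.0, dE)

def DF(t):
    # i * integral dE/(2 pi) e^{-iEt} / (E - omega + i eps), rectangle rule
    f = 1j / (2 * np.pi) * np.exp(-1j * E * t) / (E - omega + 1j * eps)
    return np.sum(f) * dE

# closing the contour below picks up the pole at E = omega - i*eps for t > 0
t = 2.0
assert abs(DF(t) - np.exp(-1j * (omega - 1j * eps) * t)) < 1e-2
assert abs(DF(-t)) < 1e-2              # theta(s - tau): vanishes for s - tau < 0
```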
In the Euclidean case, we write t_M = −it_E as usual, and substituting in Z[η, η̄] we obtain
Z_E[η, η̄] = ∫ Dψ̄ Dψ exp[ − ∫ dt_E ψ̄(∂/∂t_E + ω)ψ + ∫ dt_E (ψ̄η + η̄ψ) ]
= Z[0, 0] exp{ ∫ dτ ds η̄(s) D(s, τ) η(τ) }    (13.20)
where we define −S_E = −ψ̄D⁻¹ψ + ..., i.e.
D(s, τ) = (∂_{τ_E} + ω)⁻¹ = i ∫ dE/(2π) e^{−iE(s−τ)}/(E + iω)    (13.21)
and as usual this Euclidean propagator is well defined.

The Wick rotation to Minkowski space also proceeds as usual, namely τ_E → iτ_M, and in
order to have the same Eτ product, E_E = (−i + ϵ)E_M, where we only rotate by π/2 − ϵ in order to
avoid crossing the Minkowski space pole. Then we obtain the Feynman propagator as usual,
D(is_M, iτ_M) = i ∫ dE_M/(2π) e^{−iE_M(s_M−τ_M)}/(E_M − ω + iϵ′) = D_F(s_M, τ_M)    (13.22)
We are now ready to generalize to field theory. The Minkowski space Lagrangean is
L_F^{(M)} = −ψ̄(∂̸ + m)ψ    (13.23)
The Euclidean action is defined by the Wick rotation t_E = it_M, with iS^M = −S^E as usual,
and ∫ dt_M = −i ∫ dt_E, thus giving
L_F^{(E)} = +ψ̄(∂̸ + m)ψ    (13.24)
Note however that since the gamma matrices must satisfy the Clifford algebra {γ^µ, γ^ν} =
2g^{µν}, in the Minkowski case we had (γ⁰)² = −1, but in the Euclidean case we have (γ⁴)² = +1,
which is satisfied by choosing the Wick rotation γ⁴ = iγ⁰ (the same rotation as for t), and as a
result we also write
ψ̄ = ψ†iγ⁰ = ψ†γ⁴    (13.25)
Also, in Euclidean space with γ⁴ = iγ⁰, we can check that γ^µ = γ_µ = (γ^µ)†.
Then the free fermionic partition function is
Z_F^{(0)}[η̄, η] = ∫ Dψ̄ Dψ exp{ − ∫ d⁴x ψ̄(∂̸ + m)ψ + ∫ d⁴x (η̄ψ + ψ̄η) }
= Z[0, 0] e^{η̄(∂̸+m)⁻¹η}    (13.26)

The Euclidean propagator is defined by −S_E = −ψ̄Δ⁻¹ψ, giving
S_F(x, y) = (∂̸ + m)⁻¹ = i ∫ d⁴p/(2π)⁴ e^{ip·(x−y)}/(−p̸ + im)    (13.27)
Note that then
(∂̸ + m)_x S_F = ∫ d⁴p/(2π)⁴ (ip̸ + m) (i/(−p̸ + im)) e^{ip·(x−y)} = δ⁴(x − y)    (13.28)
and note also that
1/(−p̸ + im) = (−p̸ − im)/(p² + m²)    (13.29)
The relation to Minkowski space operators is also as in the bosonic case; namely, consider
the general N + M point function
< 0|T{ψ̂_{α₁}(x₁)...ψ̂_{α_N}(x_N) ψ̄̂_{β₁}(y₁)...ψ̄̂_{β_M}(y_M)}|0 >
= ∫ Dψ̄ Dψ e^{iS_F^{(M)}} {ψ_{α₁}(x₁)...ψ_{α_N}(x_N) ψ̄_{β₁}(y₁)...ψ̄_{β_M}(y_M)}    (13.30)
In particular, the propagator in the free theory is
S_F^{(M)}(x − y) = < 0|T{ψ(x)ψ̄(y)}|0 > = i ∫ d⁴p/(2π)⁴ e^{ip·(x−y)}/(−ip̸ − m + iϵ)
= ∫ d⁴p/(2π)⁴ ((−p̸ − im)/(p² + m² − iϵ)) e^{ip·(x−y)}    (13.31)
As usual, by Wick rotation (t_E = it_M, E_E = (−i + ϵ)E_M), we get
S_F^{(E)}(it_M − it′_M; x⃗ − y⃗) = i S_F^{(M)}(x − y)    (13.32)

The Wick theorem for Fermi fields, for instance for some interaction with a scalar ϕ
with source J and interaction term S_I[ψ̄, ψ, ϕ], is
Z[η̄, η, J] = e^{−S_I(−δ/δη, δ/δη̄, δ/δJ)} Z_F^{(0)}[η̄, η] Z_ϕ^{(0)}[J]    (13.33)
Note that the only difference from the bosonic case is the minus sign for δ/δη, due to the
fact that we have the source term ∫ d⁴x [η̄ψ + ψ̄η + ϕJ], so we need to commute the δ/δη
past ψ̄ to act on η.
Coleman’s lemma for fermions is also the same, except for the same minus sign, i.e.
F(−δ/δη, δ/δη̄) Z[η̄, η] = Z[−δ/δψ, δ/δψ̄] (F(ψ̄, ψ) e^{ψ̄η+η̄ψ})|_{ψ̄=ψ=0}    (13.34)

Then applying it to the first form of the Wick theorem, we get the second form of the Wick
theorem,
Z[η̄, η, J] = e^{−S_I(−δ/δη, δ/δη̄, δ/δJ)} e^{η̄ S_F η} Z_ϕ^{(0)}[J]
= e^{−(δ/δψ) S_F (δ/δψ̄)} e^{−S_I(ψ̄, ψ, ϕ) + ψ̄η + η̄ψ} Z_ϕ^{(0)}[J]|_{ψ=ψ̄=ϕ=0}    (13.35)
Feynman rules for Yukawa interaction
We are now ready to write down the Feynman rules for the Yukawa interaction between
two fermions and a scalar,
LY = g ψ̄ψϕ (13.36)
Then the second form of the Wick theorem is
Z[η̄, η, J] = e^{(1/2)(δ/δϕ)Δ(δ/δϕ)} e^{−(δ/δψ) S_F (δ/δψ̄)} e^{−g ∫d⁴x ψ̄ψϕ + ∫d⁴x [ψ̄η + η̄ψ + Jϕ]}|_{ψ=ψ̄=ϕ=0}    (13.37)

For the free two-point function, we must take two derivatives, and consider only the order
1 = g⁰ term, i.e.
< 0|ψ(x)ψ̄(y)|0 > = (δ/δη̄(x)) (−δ/δη(y)) Z[η̄, η, J]|_{η=η̄=J=0}
= e^{−(δ/δψ) S_F (δ/δψ̄)} {ψ(x)ψ̄(y)}|_{ψ=ψ̄=ϕ=0} = S_F(x − y)    (13.38)

Therefore the first Feynman rule is that for the propagator we have S_F(x − y), represented
by a solid line with an arrow between y (ψ̄(y)) and x (ψ(x)), see Fig.32. Note that now,

Figure 32: x-space Feynman rules for Yukawa interaction: the propagator S_F(x − y) (a solid line with an arrow from y to x), the vertex factor −g, and a factor −1 for a fermion loop.

because fermions anticommute, the order matters, and if we interchange two fermions, for
instance these ψ and ψ̄ (interchange the order of the arrow), then we have a minus sign.
A scalar is now represented by a dotted line without an arrow on it.
Also, the vertex is standard. Since all the fields in the interaction term are different (no
field is repeated), we have just a factor of −g for the vertex (in the pure scalar case, for a
ϕn /n! interaction we have a factor of n! in front, which comes from permuting various ϕ’s,
but now we don’t have this).
The interchange of two fermions is also the only real difference from the bosonic
rules. If we have a fermion loop with N lines sticking out of it (in this case, ϕ lines), it
means that we must act with −(δ/δψ) S_F (δ/δψ̄) on factors of ψψ̄, obtaining S_F's as above. But the
factors of ψψ̄ come from expanding e^{−g ∫ ψ̄ψϕ}, and since the −g is part of the vertex, the
important part is
(−(δ/δψ) S_F (δ/δψ̄)) ... (−(δ/δψ) S_F (δ/δψ̄)) [(ψ̄ψ)...(ψ̄ψ)]    (13.39)
But in order to form the ψ ψ̄ combinations giving us the SF propagators, which by our
assumption are cyclically linked (form a loop) as SF (xN − x1 )SF (x1 − x2 )...SF (xN −1 − xN ),
we must commute the last ψ(xN ) past the other 2(N − 1) + 1 fermions, to put it in front,
giving a minus sign.
Therefore a fermion loop gives a minus sign.
The p-space Feynman rules are then:
-the scalar propagator is as usual
1/(p² + M²)    (13.40)
-the fermion propagator is
1/(ip̸ + m) = i/(−p̸ + im) = (−ip̸ + m)/(p² + m²)    (13.41)

-the vertex between two fermions and a scalar is −g.


-there is a minus sign for a fermion loop, and for interchanging a ψ with a ψ̄.
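The algebraic identity behind the fermion propagator rule (13.41), 1/(ip̸ + m) = (−ip̸ + m)/(p² + m²), follows from p̸p̸ = p²·1 and can be confirmed numerically; the explicit Weyl-representation matrices (mostly-plus metric) and momentum values below are assumptions made only for this check:

```python
import numpy as np

s0 = np.eye(2)
s = [np.array([[0, 1], [1, 0]], complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], complex)]
sigma = [s0] + s
sigmabar = [s0] + [-si for si in s]

def gamma(mu):
    # assumed mostly-plus Weyl representation
    g = np.zeros((4, 4), dtype=complex)
    g[:2, 2:] = -1j * sigma[mu]
    g[2:, :2] = -1j * sigmabar[mu]
    return g

gmet = np.diag([-1.0, 1.0, 1.0, 1.0])
p_lo = np.array([1.3, 0.2, -0.5, 0.8])            # arbitrary p_mu
m = 1.0
p2 = p_lo @ gmet @ p_lo
pslash = sum(gamma(mu) * p_lo[mu] for mu in range(4))

lhs = np.linalg.inv(1j * pslash + m * np.eye(4))  # 1/(i pslash + m)
rhs = (-1j * pslash + m * np.eye(4)) / (p2 + m**2)
assert np.allclose(lhs, rhs)
```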

Then for a general interaction between fermions and a number of scalars, with
interaction term
S_I(ψ̄, ψ, ϕ₁, ..., ϕ_n)    (13.42)
the vertex is found exactly as in the pure bosonic case, with the exception of the usual minus
sign in the functional derivative for fermions, namely
∫ d⁴x d⁴y d⁴z₁ ... d⁴z_n e^{−i(px + p′y + q₁z₁ + ... + q_nz_n)} ×
× (δ/δϕ₁(z₁)) ... (δ/δϕ_n(z_n)) (−δ/δψ̄(y)) (δ/δψ(x)) {−S_I[ψ̄, ψ, ϕ₁, ..., ϕ_n]}    (13.43)

Figure 33: p-space Feynman rules in Minkowski space: the scalar propagator −i/(p² + M² − iϵ), the fermion propagator −i(−ip̸ + m)/(p² + m² − iϵ), and the vertex −ig.

The Minkowski space Feynman rules in p-space are (see Fig.33):
-The scalar propagator is
−i/(p² + M² − iϵ)    (13.44)
-The fermion propagator is
−i(−ip̸ + m)/(p² + m² − iϵ)    (13.45)
-The vertex is −ig.
-There is a minus sign for a fermion loop or for changing a ψ with ψ̄.

Important concepts to remember



• Real gaussian integration over n Grassmann variables gives 2^{n/2}√(det A), and complex
gaussian integration, ∫ dⁿx dⁿy e^{yᵀAx}, gives det A.

• For the fermionic harmonic oscillator, the vacuum functional (vacuum expectation
value) gives the same expression as in the bosonic case, just in terms of the fermion
propagator DF (s, τ ).

• Again the Euclidean propagator is Wick rotated to the Minkowski space propagator.

• The generalization to field theory contains no new facts.

• We can now have N + M point functions of ψ’s and ψ̄’s, again given by the usual path
integrals.

• In the Wick theorem for fermions the only new thing is that we replace ψ̄ by −δ/δη,
due to the anticommuting nature of the sources and fields.

• In the Feynman rules for Yukawa theory, the only new thing is the fact that for a
fermion loop we have a minus sign, as we have for interchanging a ψ with a ψ̄, or
equivalently, the direction of the arrow. The vertex factor for −g ∫ ψ̄ψϕ is −g.

Further reading: See chapters 3.1.3, 3.1.4, 3.3.2, 3.3.3 in [4], 7.2 in [1], 5.3 in [3] and
4.7 in [2].

Exercises, Lecture 13

1) a) Consider θ_α, α = 1, 2 Grassmann variables and the even function
Φ(x, θ) = ϕ(x) + 2θ^α ψ_α(x) + θ² F(x)    (13.46)
where θ² ≡ ϵ^{αβ} θ_α θ_β. Calculate
∫ d²θ (a₁Φ + a₂Φ² + a₃Φ³),    (13.47)
where
∫ d²θ = −(1/4) ∫ dθ^α dθ^β ϵ_{αβ}    (13.48)
and x, a₁, a₂, a₃ ∈ R.
b) Fill in the details missed in the text for the calculation of
Z[η̄, η] = Z[0, 0] exp{ −i ∫ dτ ds η̄(s) D_F(s, τ) η(τ) }    (13.49)

2) Write down an expression for the Feynman diagram in Yukawa theory in Fig.34.

Figure 34: Feynman diagram in Yukawa theory, with momenta p₁, p₂, p₃, p₄.

Note: there are no external points on the pi lines.

14 Lecture 14. Spin sums, Dirac field bilinears and
C,P,T symmetries for fermions
In the previous 2 lectures we have defined the quantization, path integrals and Feynman
rules for fermions. This lecture is dedicated to some technical details for fermions that we
will use later on.
Often when doing Feynman diagrams we will need to do sums over polarizations of
fermions. For instance, if we do an experiment where we can’t measure the spin of external
particles, we should sum over this spin. It is therefore important to compute the relevant
spin sums.
The first one is

Σ_{s=1,2} u^s(p) ū^s(p) = Σ_s ( √(−p·σ)ξ^s ; √(−p·σ̄)ξ^s ) ( ξ^{s†}√(−p·σ̄) , ξ^{s†}√(−p·σ) )
= ( √(−p·σ)√(−p·σ̄)   √(−p·σ)√(−p·σ) ; √(−p·σ̄)√(−p·σ̄)   √(−p·σ̄)√(−p·σ) )
= ( m   −p·σ ; −p·σ̄   m ) = −ip/ + m    (14.1)

(matrices written in 2 × 2 block form, with ';' separating block rows), where we have used ū = u†iγ⁰, Σ_{s=1,2} ξ^s ξ^{s†} = 1 and {σ_i, σ_j} = 2δ_ij, implying {σ^µ, σ̄^ν} = −2η^µν, and so

(p·σ)(p·σ̄) = (1/2) p_µ p_ν {σ^µ, σ̄^ν} = −p² = m²    (14.2)

The second sum is over the v's, giving similarly

Σ_s v^s(p) v̄^s(p) = −ip/ − m    (14.3)
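These spin sums can be verified numerically. The sketch below encodes my reading of the conventions in these notes (mostly-plus metric, γ⁰ = −i(0 1; 1 0) in 2×2 blocks, ū = u†iγ⁰) and checks the u and v sums for a sample boosted momentum; treat it as a consistency check, not a definition:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sig  = [I2, sx, sy, sz]            # sigma^mu = (1, vec sigma)
sigb = [I2, -sx, -sy, -sz]         # bar sigma^mu = (1, -vec sigma)

# gamma^mu = -i [[0, sigma^mu], [bar sigma^mu, 0]]  (so (gamma^0)^2 = -1, mostly-plus)
gam = [-1j * np.block([[np.zeros((2, 2)), s], [sb, np.zeros((2, 2))]])
       for s, sb in zip(sig, sigb)]

def sqrtm_pd(A):
    """Square root of a Hermitian positive-definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(A)
    return v @ np.diag(np.sqrt(w)) @ v.conj().T

m, p3 = 1.0, 0.7
E = np.sqrt(m**2 + p3**2)
p_lo = np.array([-E, 0.0, 0.0, p3])          # p_mu, with eta = diag(-1,+1,+1,+1)

p_sig  = sum(p_lo[mu] * sig[mu]  for mu in range(4))   # p . sigma
p_sigb = sum(p_lo[mu] * sigb[mu] for mu in range(4))   # p . bar sigma
pslash = sum(p_lo[mu] * gam[mu]  for mu in range(4))

xis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
sum_uubar = np.zeros((4, 4), dtype=complex)
sum_vvbar = np.zeros((4, 4), dtype=complex)
for xi in xis:
    u = np.concatenate([sqrtm_pd(-p_sig) @ xi,  sqrtm_pd(-p_sigb) @ xi])
    v = np.concatenate([sqrtm_pd(-p_sig) @ xi, -sqrtm_pd(-p_sigb) @ xi])
    sum_uubar += np.outer(u, u.conj() @ (1j * gam[0]))   # u u-bar, with u-bar = u^dag i gamma^0
    sum_vvbar += np.outer(v, v.conj() @ (1j * gam[0]))

assert np.allclose(sum_uubar, -1j * pslash + m * np.eye(4))   # (14.1)
assert np.allclose(sum_vvbar, -1j * pslash - m * np.eye(4))   # (14.3)
```

Changing m and p3 (with E = √(m² + p⃗²)) leaves both assertions true, as it should for an identity.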

Dirac field bilinears


A single fermion is classically unobservable (it is a Grassmann object, which cannot
be measured experimentally). Classically observable objects are fermion bilinears, which are
commuting variables, and can have a vacuum expectation value (VEV) that one can measure
experimentally. Even in the quantum theory, the fermion bilinears are objects that appear
very often, so it is important to understand their properties, in particular their properties
under Lorentz transformations.
We have defined ψ̄ by the condition that ψ̄ψ is a Lorentz scalar, so that is given. The spin 1/2 spinorial representations are the fundamental one, to which ψ belongs, and its conjugate, to which ψ̄ belongs, such that ψ̄ψ is a scalar. The
Lorentz transformation of ψ is

x′^µ = Λ^µ_ν x^ν ⇒ ψ′^α = S^α_β(Λ) ψ^β    (14.4)

The next bilinear is ψ̄γ µ ψ. γ µ is a vector under Lorentz transformations, meaning that

S^α_γ (γ^µ)^γ_δ (S^{−1})^δ_β = Λ^µ_ν (γ^ν)^α_β    (14.5)

and therefore ψ̄γ µ ψ is a vector, i.e.

ψ̄γ µ ψ → Λµ ν ψ̄γ ν ψ (14.6)

What other bilinears can we form? The most general would be ψ̄Γψ, where Γ is a general
4 × 4 matrix, which can be decomposed in a basis of 16 4 × 4 matrices. Such a basis is given
by
O_i = {1, γ^µ, γ₅, γ^µγ₅, γ^µν = (1/2)γ^[µ γ^ν]}    (14.7)
which is a total number of 1 + 4 + 1 + 4 + 4 · 3/2 = 16 matrices. Then this basis is normalized
as
Tr{Oi Oj } = 4δij (14.8)
since for instance Tr{1 × 1} = 4 and
Tr{γ^µ γ^ν} = (1/2) Tr{γ^µ, γ^ν} = 4η^µν    (14.9)
Moreover, this basis is complete, satisfying the completeness relation
δ_αβ δ_γδ = (1/4) Σ_i (O_i)_δα (O_i)_βγ    (14.10)
which is indeed a completeness relation, since by multiplying by Mβγ we get

M_δα = (1/4) Σ_i Tr(M O_i) (O_i)_δα    (14.11)
which is indeed the expansion in a complete basis. The coefficient is correct, since by
multiplying again with Oj αδ we get an identity.
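Both the orthogonality and the completeness of this basis can be checked numerically. Since with a Lorentzian metric the diagonal traces come out as ±4 rather than uniformly +4, the sketch below (using my assumed gamma conventions) expands a random matrix using the inverse Gram matrix of the basis:

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]
sig  = [I2] + paulis
sigb = [I2] + [-p for p in paulis]
gam = [-1j * np.block([[np.zeros((2, 2)), s], [sb, np.zeros((2, 2))]])
       for s, sb in zip(sig, sigb)]
g5 = -1j * gam[0] @ gam[1] @ gam[2] @ gam[3]     # gamma_5 as in the text

O = [np.eye(4, dtype=complex)]                                  # 1
O += gam                                                        # gamma^mu
O += [g5]                                                       # gamma_5
O += [g @ g5 for g in gam]                                      # gamma^mu gamma_5
O += [0.5 * (gam[m] @ gam[n] - gam[n] @ gam[m])                 # gamma^{mu nu}
      for m, n in combinations(range(4), 2)]
assert len(O) == 16

G = np.array([[np.trace(a @ b) / 4 for b in O] for a in O])     # Gram matrix
assert np.allclose(G, np.diag(np.diag(G)))                      # off-diagonal traces vanish
assert np.allclose(np.abs(np.diag(G)), 1)                       # diagonal entries are +-1

# Completeness: M = (1/4) sum_ij (G^-1)_ij Tr(M O_j) O_i for any 4x4 matrix M
rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Ginv = np.linalg.inv(G)
M_rec = sum(Ginv[i, j] * np.trace(M @ O[j]) / 4 * O[i]
            for i in range(16) for j in range(16))
assert np.allclose(M, M_rec)
```

Since the Gram matrix is diagonal with entries ±1, the expansion coefficients are just ±Tr(M O_i)/4, which is the content of (14.11) up to the metric signs.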
Now by multiplying (14.10) by arbitrary spinors χβ , ψ̄ γ and N ϕδ , where N is a matrix,
we obtain the relation (after multiplying it by yet another matrix M )
M χ (ψ̄ N ϕ) = −(1/4) Σ_j M O_j N ϕ (ψ̄ O_j χ)    (14.12)

which is called the Fierz identity, or Fierz recoupling. Note the minus sign, which appears because we interchanged the order of the fermions. The Fierz identity is useful for simplifying fermionic terms, allowing us to "recouple" the fermions and hopefully obtain something simpler.
We might think that we could introduce other Lorentz structures made up from gamma
matrices, but for instance
γ[µνρσ] = γ[µ γν γρ γσ] ∝ ϵµνρσ γ5 (14.13)
since γ5 = −iγ 0 γ 1 γ 2 γ 3 , and the gamma matrix products are antisymmetric by construction,
since gamma matrices anticommute. Then we also have

γ[µνρ] ∝ ϵµνρσ γ σ γ5 (14.14)

and there are no higher antisymmetric bilinears, since the antisymmetric product of five γ_µ's vanishes.
As terminology, 1 is a scalar, γ₅ is a pseudoscalar, γ^µ is a vector, γ^µγ₅ is a pseudovector, and γ^µν is an antisymmetric tensor. Here pseudoscalar means that it transforms like a scalar under usual Lorentz transformations, except that under parity it gets an extra minus sign. Similarly, pseudo-something transforms like the corresponding object under Lorentz transformations, except for an extra minus sign under parity.
If ψ(x) satisfies the Dirac equation, then the vector current j µ = ψ̄(x)γ µ ψ(x) is conserved:

∂µ j µ = (∂µ ψ̄)γ µ ψ + ψ̄γ µ ∂µ ψ = (mψ̄)ψ + ψ̄(−mψ) = 0 (14.15)

and for the axial vector current j^µ5 = ψ̄(x)γ^µγ₅ψ(x),

∂_µ j^µ5 = (∂_µψ̄)γ^µγ₅ψ + ψ̄γ^µγ₅∂_µψ = mψ̄γ₅ψ − ψ̄γ₅∂/ψ = 2mψ̄γ₅ψ    (14.16)

so the axial vector current is conserved, ∂_µ j^µ5 = 0, only if m = 0.
C,P,T symmetries for fermions
The continuous Lorentz group L↑+ , the proper orthochronous group, continuously con-
nected with the identity, is a symmetry. But we have also the parity and time reversal
transformations

P : (t, ⃗x) → (t, −⃗x)


T : (t, ⃗x) → (−t, ⃗x) (14.17)

which can also be symmetries, but not necessarily (there is no physical principle which
demands it).
Besides L↑+, there are 3 other disconnected components, L↑−, L↓+ and L↓−. L+ is called the proper group and L− the improper one, the two being related by P, whereas L↑ is called orthochronous and L↓ nonorthochronous, the two being related by T.
Besides these transformations, we also have C, the charge conjugation transformation,
which transforms particles into antiparticles.
For a long time, it was thought that C, P, T must be symmetries of physics, for instance
since:
-gravitational, electromagnetic and strong interactions respect C, P, T .
-weak interactions however break C and P separately, but preserve CP. The Nobel prize was awarded for the discovery of this breaking of parity, which had been thought to be fundamental.
-by now there is good experimental evidence for CP breaking also, which would point to
interactions beyond the Standard Model.
-However, there is a theorem, that under very general conditions (like locality and uni-
tarity), CP T must be preserved always, so CP breaking implies T breaking.
In the quantum theory, the action of the C, P, T symmetries is encoded in operators
C, P, T that act on other operators. The c-numbers are generally not modified (though see
T below), even if they contain t, ⃗x.
Parity

Parity is defined as being the same as reflection in a mirror. If we look in a mirror at a
momentum vector with a spin, or helicity, represented by a rotation around the momentum
axis, we see the momentum inverted, but not the spin, hence parity flips the momentum p⃗,
but not the spin s, see Fig.35.

Figure 35: Parity is reflection in a mirror. The momentum gets inverted, but not the spin.

That means that the action of P on the annihilation/creation operators in the expansion
of the quantum field ψ is
P a^s_p⃗ P^{−1} = η_a a^s_{−p⃗} ,  P b^s_p⃗ P^{−1} = η_b b^s_{−p⃗}    (14.18)
where η_a and η_b are possible phases. Since two parity operations return the system to its original state, we have P² = 1, so P^{−1} = P, and then also η_a² = η_b² = ±1, since observables are
fermion bilinears, so a ± sign doesn’t make a difference.
Defining the momentum p̃ = (p⁰, −p⃗), we have

u(p) = ( √(−p·σ)ξ ; √(−p·σ̄)ξ ) = ( √(−p̃·σ̄)ξ ; √(−p̃·σ)ξ ) = iγ⁰ u(p̃)
v(p) = ( √(−p·σ)ξ ; −√(−p·σ̄)ξ ) = ( √(−p̃·σ̄)ξ ; −√(−p̃·σ)ξ ) = −iγ⁰ v(p̃)    (14.19)

where
γ⁰ = −i ( 0 1 ; 1 0 )    (14.20)
Then

P ψ(x)P = ∫ d³p/(2π)³ · 1/√(2E_p⃗) Σ_s [ η_a a^s_{−p⃗} u^s(p) e^{ip·x} + η_b* b^{s†}_{−p⃗} v^s(p) e^{−ip·x} ]
        = ∫ d³p̃/(2π)³ · 1/√(2E_p̃) Σ_s [ η_a a^s_p̃ iγ⁰ u^s(p̃) e^{ip̃·(t,−x⃗)} − η_b* b^{s†}_p̃ iγ⁰ v^s(p̃) e^{−ip̃·(t,−x⃗)} ]    (14.21)

where in the second line we have changed the integration variable from p to p̃, and then used the relation between u(p) and u(p̃), and similarly for v. Finally, we see that we need to have η_b* = −η_a in order to relate the result to another ψ field. In that case, we obtain

P ψ(x)P = η_a iγ⁰ ψ(t, −x⃗)    (14.22)

Then we have also

(P ψ(x)P )† = −iηa∗ ψ † (t, −⃗x)γ 0† = iηa∗ ψ † (t, −⃗x)γ 0 (14.23)

and so
P ψ̄(x)P = (P ψ(t, ⃗x)P )† iγ 0 = iηa∗ ψ̄(t, −⃗x)γ 0 (14.24)
We can now also compute the behaviour of fermion bilinears under parity. We obtain

P ψ̄ψP = |ηa |2 ψ̄(t, −⃗x)γ 0 (−γ 0 )ψ(t, −⃗x) = ψ̄ψ(t, −⃗x) (14.25)

i.e., it transforms as a scalar. Here we have used that ηa is a phase, so |ηa |2 = 1, and
−(γ 0 )2 = 1.
Next, we have

P ψ̄γ5 ψP = ψ̄(t, −⃗x)γ 0 γ5 (−γ 0 )ψ(t, −⃗x) = −ψ̄γ5 ψ(t, −⃗x) (14.26)

so is a pseudoscalar. Also,

P ψ̄γ µ ψP = ψ̄γ 0 γ µ (−γ 0 )ψ(t, −⃗x) = (−1)µ ψ̄γ µ ψ(t, −⃗x) (14.27)

where (−1)µ = +1 for µ = 0 and = −1 for µ ̸= 0, so it transforms like a vector. Further,

P ψ̄γ^µγ₅ψ P = ψ̄γ⁰γ^µγ₅(−γ⁰)ψ(t, −x⃗) = −(−1)^µ ψ̄γ^µγ₅ψ(t, −x⃗)    (14.28)

so it transforms as a pseudovector.
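The matrix identities behind (14.25)-(14.28) — conjugation of the bilinear matrix by iγ⁰ producing the (−1)^µ and γ₅ signs — can be checked directly; the gamma conventions below are my reading of the text:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
sig, sigb = [I2] + pauli, [I2] + [-p for p in pauli]
gam = [-1j * np.block([[np.zeros((2, 2)), s], [sb, np.zeros((2, 2))]])
       for s, sb in zip(sig, sigb)]
g5 = -1j * gam[0] @ gam[1] @ gam[2] @ gam[3]

def parity_conj(M):
    # under parity, psi-bar M psi picks up the matrix -gamma^0 M gamma^0, cf. (14.25)
    return -gam[0] @ M @ gam[0]

signs = [1, -1, -1, -1]                                  # (-1)^mu
for mu in range(4):
    assert np.allclose(parity_conj(gam[mu]), signs[mu] * gam[mu])            # vector
    assert np.allclose(parity_conj(gam[mu] @ g5), -signs[mu] * gam[mu] @ g5) # pseudovector
assert np.allclose(parity_conj(np.eye(4)), np.eye(4))    # scalar
assert np.allclose(parity_conj(g5), -g5)                 # pseudoscalar
```

The four assertions reproduce exactly the four parity behaviours (14.25)-(14.28), with the scalar and vector unchanged apart from (−1)^µ, and the γ₅ bilinears picking up the extra minus sign.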
Time reversal
Consider now the time reversal transformation. Under it, the momentum changes di-
rection (time flows opposite), and also the spin changes, since it rotates back, see Fig.36.
However, time reversal cannot be made into a linear operator like P , but it can be made
into an antilinear operator,

T (c − number) = (c − number)∗ T (14.29)

Figure 36: Time reversal inverts both momentum and spin.

Then we have

T a^s_p⃗ T = a^{−s}_{−p⃗} ,  T b^s_p⃗ T = b^{−s}_{−p⃗}    (14.30)

and since the T operator is antilinear, for the full Dirac equation modes we have

T a^s_p⃗ u^s(p) e^{ip·x} T = a^{−s}_{−p⃗} [u^s(p)]* e^{−ip·x}
T b^{s†}_p⃗ v^s(p) e^{−ip·x} T = b^{−s†}_{−p⃗} [v^s(p)]* e^{ip·x}    (14.31)
We now consider a general spin basis: instead of (1; 0) and (0; 1), corresponding to spin oriented along the z direction, we take a spin direction of arbitrary spherical angles θ, ϕ, giving

ξ(↑) = R(θ, ϕ) (1; 0) = ( cos θ/2 ; e^{iϕ} sin θ/2 )
ξ(↓) = R(θ, ϕ) (0; 1) = ( −e^{−iϕ} sin θ/2 ; cos θ/2 )    (14.32)

and we have ξ s = (ξ(↑), ξ(↓)), and then (⃗σ σ 2 = σ 2 (−⃗σ ∗ ) ⇒ if ⃗n · ⃗σ ξ = +ξ, then (⃗n ·
⃗σ )(−iσ 2 ξ ∗ ) = −(−iσ 2 ξ ∗ ))

ξ −s = (ξ(↓), −ξ(↑)) = −iσ 2 (ξ s )∗ (14.33)
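The spin-flip relation (14.33), and the fact that ξ(↑) is the +1 eigenvector of n⃗·σ⃗ for the direction (θ, ϕ), are easy to verify numerically (a quick sketch):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def xi_updown(theta, phi):
    """The rotated spin basis (14.32)."""
    up = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
    dn = np.array([-np.exp(-1j * phi) * np.sin(theta / 2), np.cos(theta / 2)])
    return up, dn

for theta, phi in [(0.3, 1.1), (2.0, -0.7), (1.2, 3.0)]:
    up, dn = xi_updown(theta, phi)
    # (14.33): xi^{-s} = (xi_down, -xi_up) = -i sigma^2 (xi^s)^*
    assert np.allclose(-1j * s2 @ up.conj(), dn)
    assert np.allclose(-1j * s2 @ dn.conj(), -up)
    # xi(up) is the +1 eigenvector of n.sigma for the unit vector n(theta, phi)
    n = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])
    ndots = n[0] * s1 + n[1] * s2 + n[2] * s3
    assert np.allclose(ndots @ up, up)
```
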

Analogously, we write for the annihilation operators a^{−s}_p⃗ = (a²_p⃗, −a¹_p⃗) and b^{−s}_p⃗ = (b²_p⃗, −b¹_p⃗).
For the column vectors we have

u^{−s}(p̃) = ( √(−p̃·σ)(−iσ²ξ^{s*}) ; √(−p̃·σ̄)(−iσ²ξ^{s*}) ) = ( −iσ²√(−p·σ*)ξ^{s*} ; −iσ²√(−p·σ̄*)ξ^{s*} )
          = −i ( σ² 0 ; 0 σ² ) [u^s(p)]* = γ¹γ³ [u^s(p)]*    (14.34)

Here we have used the relation √(p̃·σ) σ² = σ² √(p·σ*), which we can prove by expanding the square root for small p⃗, together with (p⃗·σ⃗)σ₂ = −σ₂(p⃗·σ⃗*), true since σ₂* = −σ₂ and σ₁,₃* = σ₁,₃.
Then we can compute the effect on the quantum field operator, and obtain

T ψ(t, ⃗x)T = −γ 1 γ 3 ψ(−t, ⃗x) (14.35)

For the fermion bilinears, we then obtain

T ψ̄ψ T = +ψ̄ψ(−t, x⃗)
T ψ̄γ₅ψ T = −ψ̄γ₅ψ(−t, x⃗)
T ψ̄γ^µψ T = (−1)^µ ψ̄γ^µψ(−t, x⃗)    (14.36)

Charge conjugation
Charge conjugation is the operation that takes particles to antiparticles. On the classical
spinor space, it acts not on the classical (t, ⃗x), but by the C-matrix (charge conjugation
matrix) defined before. We need to define how it acts on the quantum field operators.
As we said before, due to the fact that ψ ∼ a^s_p⃗ u^s(p)e^{ip·x} + b^{s†}_p⃗ v^s(p)e^{−ip·x}, and in the energy we have ∫ ω(a†a − bb†), we can understand a† as a creation operator for fermions of positive energy (e.g. electrons e⁻) and b† as an annihilation operator for fermions of negative energy
in the fully occupied "Dirac sea", equivalent to a creation operator for a "hole" of effective positive energy, or antiparticle (e.g. positron e⁺).
Therefore the C operator maps a’s into b’s,

C a^s_p⃗ C^{−1} = b^s_p⃗ ,  C b^s_p⃗ C^{−1} = a^s_p⃗    (14.37)

Note that we have ignored possible phases here. Then we use that
γ²(v^s(p))* = ( 0 −iσ² ; iσ² 0 ) (v^s(p))* = ( √(−p·σ)ξ^s ; √(−p·σ̄)ξ^s )    (14.38)

where again we used √(−p·σ) σ² = σ² √(−p·σ̄*). Then we have for the column vectors

u^s(p) = γ²(v^s(p))*
v^s(p) = γ²(u^s(p))*    (14.39)

Finally, using the linearity property of the C operator and the above properties, we can
calculate the effect on the quantum field operator,

Cψ(x)C −1 = γ 2 ψ ∗ (x) = γ 2 (ψ † )T (14.40)

Therefore the charge conjugation operator interchanges ψ and ψ † up to some matrix.


For the fermion bilinears we find

C ψ̄ψC −1 = ψ̄ψ
C ψ̄γ5 ψC −1 = ψ̄γ5 ψ
C ψ̄γ µ ψC −1 = −ψ̄γ µ ψ
C ψ̄γ µ γ5 ψC −1 = ψ̄γ µ γ5 ψ (14.41)

The full transformations under P, T, C and CP T , for various bilinears and for the deriva-
tives ∂µ , are

ψ̄ψ : +1, +1, +1, +1
ψ̄γ₅ψ : −1, −1, +1, +1
ψ̄γ^µψ : (−1)^µ, (−1)^µ, −1, −1
ψ̄γ^µγ₅ψ : −(−1)^µ, (−1)^µ, +1, −1
ψ̄γ^µνψ : (−1)^µ(−1)^ν, −(−1)^µ(−1)^ν, −1, +1
∂_µ : (−1)^µ, −(−1)^µ, +1, −1    (14.42)

where of course under P and T we also transform the position for the fields.

Important concepts to remember


• The spin sums are Σ_s u^s(p)ū^s(p) = −ip/ + m and Σ_s v^s(p)v̄^s(p) = −ip/ − m.

• The complete basis of 4×4 matrices is Oi = {1, γ µ , γ5 , γ µ γ5 , γ µν }, and fermion bilinears
are ψ̄Oi ψ.

• The presence of γ₅ means a pseudoscalar/pseudovector, etc., i.e. an extra minus sign under parity.

• Fierz recoupling follows from the completeness of the Oi basis.

• Vector currents are conserved, and axial vector currents are conserved only if m = 0.

• T,C,P act on quantum operators.

• Time reversal T is antilinear and charge conjugation C changes particles to antiparticles.

Further reading: See chapters 3.4 and 3.6 in [2].

Exercises, Lecture 14

1) Using the Fierz identity, prove that we have for Majorana spinors λa

(λ̄a γµ λc )(ϵ̄γ µ λb )fabc = 0 (14.43)

where fabc is totally antisymmetric, and also using

γµ γρ γ µ = −2γρ
γµ γρσ γ µ = 0 (14.44)

and that Majorana spinors satisfy

ϵ̄χ = χ̄ϵ
ϵ̄γµ χ = −χ̄γµ ϵ
ϵ̄γ5 χ = χ̄γ5 ϵ
ϵ̄γµ γ5 χ = χ̄γµ γ5 ϵ (14.45)

2) Prove that the transformations of ψ̄γ µ γ5 ψ under T, C and CP T are with (−1)µ , +1
and −1 respectively.

15 Lecture 15. Dirac quantization of constrained sys-
tems
In this lecture we will describe the quantization of constrained systems developed by Dirac.
I will largely follow the book by P.A.M. Dirac, ”Lectures on Quantum Mechanics”, which is
a series of 4 lectures given at Yeshiva University, published in 1964, and is possibly one of
the most influential books in physics, per number of pages. The first 2 lectures describe the
procedure now known as Dirac quantization, which is very important for modern theoretical
physics, and even now, there is little that needs to be changed in the presentation of Dirac.
So what is the problem that we want to solve? We want to quantize gauge fields (we will
do so next lecture), which have an action

S = −(1/4) ∫ d⁴x F_µν²    (15.1)
where the field strength is written in terms of the gauge field Aµ as
Fµν = ∂µ Aν − ∂ν Aµ , (15.2)
therefore we have a gauge invariance
δAµ = ∂µ Λ (15.3)
which leaves F_µν, and therefore the action, invariant. This means that there is a redundancy in the system: there are fewer degrees of freedom than variables used to describe it (we can use Λ to set one component, for instance A₀, to zero).
physical degrees of freedom, we must fix a gauge, for instance the Lorentz, or covariant,
gauge ∂ µ Aµ = 0. That imposes a constraint on the system, so we must learn how to
quantize in the presence of constraints, which is the subject that Dirac tackled. We can deal
with this in the Hamiltonian formalism, leading to the Dirac quantization of constrained
systems. Even though we will not use much of the Dirac formalism later, it is important
theoretically, to understand the concepts involved better, hence we started with it.
In the Lagrangean formulation of classical mechanics, constraints are introduced in the
Lagrangean by multiplying them with Lagrange multipliers. We will see that in the Hamil-
tonian formulation, we also add the constraints to the Hamiltonian with some coefficients,
with some notable differences.
In the Hamiltonian formalism, we first define the conjugate momenta as

p_n = ∂L/∂q̇_n    (15.4)

and then the Hamiltonian is

H = Σ_n p_n q̇_n − L    (15.5)

The Hamilton equations of motion are

q̇_n = ∂H/∂p_n ;  ṗ_n = −∂H/∂q_n    (15.6)

We define the Poisson bracket of two functions of p and q, f(p, q) and g(p, q), as

{f, g}_P.B. = Σ_n [ (∂f/∂q_n)(∂g/∂p_n) − (∂f/∂p_n)(∂g/∂q_n) ]    (15.7)

Then the Hamilton equations of motion are

q̇n = {qn , H}P.B. ; ṗn = {pn , H}P.B. (15.8)

and more generally, we can write for time evolution (equation of motion) of some function
on phase space, g(q, p),
ġ = {g, H}P.B. (15.9)
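As a concrete illustration (not in the text), the canonical Poisson bracket and the evolution equation (15.9) can be realized symbolically; for the harmonic oscillator H = (p² + q²)/2 this reproduces q̇ = p, ṗ = −q:

```python
import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g, coords=((q, p),)):
    """Canonical Poisson bracket {f,g} = sum_n (df/dq_n dg/dp_n - df/dp_n dg/dq_n)."""
    return sum(sp.diff(f, qn) * sp.diff(g, pn) - sp.diff(f, pn) * sp.diff(g, qn)
               for qn, pn in coords)

H = (p**2 + q**2) / 2
assert pb(q, H) == p               # qdot = {q, H}
assert pb(p, H) == -q              # pdot = {p, H}
assert pb(q * p, H) == p**2 - q**2 # gdot = {g, H} for g = q p
```

The same pb function extends to several degrees of freedom by passing more (q_n, p_n) pairs in coords.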
But now assume that there are M constraints on phase space,

ϕm (q, p) = 0, m = 1, ..., M (15.10)

We will call these the primary constraints of the Hamiltonian formalism, we will see shortly
why.
We will define ≈, called weak equality, as equality only after using the constraints ϕm = 0.
Note that we must use the constraints only at the end of a calculation, for instance when using Poisson brackets, since the Poisson brackets are defined assuming a set of independent variables q_n, p_n (partial derivatives are taken keeping all the other variables fixed; this would not be possible if there were a relation among all the q's and p's). So only when all the dynamical calculations are done can we use ϕ_m = 0.
So the weak equality means that by definition

ϕm ≈ 0 (15.11)

since otherwise ϕm is some function of q’s and p’s, for which we can calculate Poisson brackets,
and is not identically zero.
Note that as far as the time evolution is concerned, there is no difference between H and
H + um ϕm (indistinguishable), since ϕm ≈ 0. Then the equations of motion are
q̇_n = ∂H/∂p_n + u_m ∂ϕ_m/∂p_n ≈ {q_n, H + u_m ϕ_m}_P.B.
ṗ_n = −∂H/∂q_n − u_m ∂ϕ_m/∂q_n ≈ {p_n, H + u_m ϕ_m}_P.B.    (15.12)
∂qn ∂qn
and in general we write
ġ ≈ {g, HT }P.B. (15.13)
where the total Hamiltonian HT is

HT = H + um ϕm (15.14)

Note that in the above we have used the fact that the Poisson bracket of the coefficient um
(which is in principle a function of (q, p)) does not contribute, since it is multiplied by ϕm ,
so {g, um }ϕm ≈ 0.

Let us now apply the above time evolution to the particular case of the constraints ϕm
themselves. Physically, we know that if ϕm are good constraints, the time evolution should
keep us within the constraint hypersurface, so we must have ϕ̇m ≈ 0 (which means that the
time variation of ϕm must be proportional to some ϕ’s itself). That gives the equations

{ϕm , H}P.B. + un {ϕm , ϕn }P.B. ≈ 0 (15.15)

This in turn will give other, potentially independent, constraints, called secondary con-
straints. We can then apply the time variation again, and gain new constraints, etc., until
we get nothing new, finally obtaining the full set of secondary constraints, that we will call
ϕk , k = M + 1, ..., M + K. Together, all the constraints, primary and secondary, form the
set
ϕj ≈ 0, j = 1, ..., M + K (15.16)
While there is not much difference between primary and secondary constraints, as we
will see, they act both in the same way, there is one distinction that is useful. We will call
a function R(p, q) on phase space first class if

{R, ϕj }P.B. ≈ 0(i.e., = rjj ′ ϕj ′ ) (15.17)

for all j = 1, ..., M + K, and second class if {R, ϕj }P.B. is not ≈ 0 for some ϕj . In particular,
we have for the time evolution of all the constraints, by definition (all the set of constraints
satisfies that their time evolution is also a constraint)

{ϕj , HT }P.B. ≈ {ϕj , H}P.B. + um {ϕj , ϕm }P.B. ≈ 0 (15.18)

(here again we dropped the term {um , ϕj }ϕm ≈ 0). These equations tell us first that HT is
first class, and second, give us a set of M + K equations for the M coefficients um , which
are in principle functions of q and p, i.e. um (q, p). Even if the system looks overconstrained,
these equations need to have at least a solution from physics consistency, since if not, it would
mean that there is no possible consistent time evolution of constraints, which is clearly wrong.
But in general the solution is not unique. If Um is a particular solution of (15.18), then the
general solution is
um = Um + va Vam (15.19)
where va are arbitrary numbers, and Vam satisfy the equation

Vam {ϕj , ϕm }P.B. ≈ 0 (15.20)

Then we can split the total Hamiltonian, which is as we saw first class, as

HT = H + Um ϕm + va Vam ϕm
= H ′ + va ϕ a (15.21)

where
H ′ = H + Um ϕm (15.22)

is first class, by the definition of Um as a particular solution of (15.18), and

ϕa = Vam ϕm (15.23)

are first class primary constraints, because of the definition of Vam as satisfying (15.20).
Theorem. If R and S are first class, i.e. {R, ϕj }P.B. = rjj ′ ϕj ′ , {S, ϕj }P.B. = sjj ′ ϕj ′ , then
{R, S}P.B. is first class.
Proof: We first use the Jacobi identity for antisymmetric brackets (like the commutator
and the Poisson brackets), which is an identity (type 0=0) proved by writing explicitly the
brackets,
{{R, S}, P } + {{P, R}, S} + {{S, P }, R} = 0 (15.24)
to write

{{R, S}_P.B., ϕ_j}_P.B. = {{R, ϕ_j}_P.B., S}_P.B. − {{S, ϕ_j}_P.B., R}_P.B.
= {r_jj′ ϕ_j′, S}_P.B. − {s_jj′ ϕ_j′, R}_P.B.
= −r_jj′ s_j′j′′ ϕ_j′′ + {r_jj′, S}_P.B. ϕ_j′ − s_jj′ (−r_j′j′′ ϕ_j′′) − {s_jj′, R}_P.B. ϕ_j′ ≈ 0    (15.25)

q.e.d.
Finally, we can add the first class secondary constraints ϕa′ as well to the Hamiltonian,
since there is no difference between the first class primary and first class secondary con-
straints, obtaining the extended Hamiltonian HE ,

HE = HT + va′ ′ ϕa′ (15.26)

However, the second class constraints ϕw are different. We cannot add them to the
Hamiltonian, since {ϕw , ϕj }P.B is not ≈ 0, so the time evolution of ϕj , ϕ̇j = {ϕj , H}P.B. ,
would be modified.
Therefore, we have the physical interpretation that first class constraints generate motion
tangent to the constraint hypersurface (adding them to the Hamiltonian, the new term
is ≈ 0, i.e. in the constraint hypersurface ϕm = 0). On the other hand, second class
constraints generate motion away from the constraint hypersurface, since adding them to
the Hamiltonian adds terms which are not ≈ 0 to the time evolution of ϕj .
Quantization
Now we finally come to the issue of quantization. Normally, we substitute the Poisson bracket {,}_P.B. with (1/iℏ)[,] when quantizing. But now there is a subtlety, namely the constraints.
The simplest possibility is to impose the constraints on states (wavefunctions), i.e. to
put
ϕ̂j |ψ >= 0. (15.27)
But that leads to a potential problem. For consistency, applying this twice we get [ϕ̂j , ϕ̂j ′ ]|ψ >=
0. But since the ϕj are supposed to be ALL the constraints, the commutator must be a linear
combination of the constraints themselves, i.e.

[ϕ̂j , ϕ̂j ′ ] = cjj ′ j ′′ ϕ̂j ′′ (15.28)

Note the fact that in quantum mechanics the c’s could in principle not commute with the
ϕ̂j , so we must put the c’s to the left of ϕ̂j ’s, to have the commutator on the left hand side
equal zero.
Then the time evoluton of the constraints should also be a constraint, so

[ϕ̂j , H] = bjj ′ ϕ̂j ′ (15.29)

If there are no second class constraints, then {ϕj , ϕj ′ }P.B. ≈ 0 (the condition for all the
constraints to be first class), which in the quantum case turns into the algebra of constraints
(15.28), and we have no problem.
But if there are second class constraints, we have a problem, since then the fact that
{ϕw , ϕj ′ }P.B. is not ≈ 0 contradicts the consistency condition (15.28).
Let’s see what happens in an example. Consider a system with several degrees of freedom,
and the constraints q1 ≈ 0 and p1 ≈ 0. But then their Poisson brackets give {q1 , p1 }P.B. =
1 ̸= 0, which means that they are not first class. But then we can’t quantize the constraints
by imposing them on states as
q̂1 |ψ >= 0 = p̂1 |ψ > (15.30)
since then on one hand applying the constraints twice we get [q̂1 , p̂1 ]|ψ >= 0, but on the
other hand the commutator gives

[q̂₁, p̂₁]|ψ> = iℏ|ψ>    (15.31)

which is a contradiction. Therefore one of the two assumptions we used is invalid: either ϕ̂_j|ψ> = 0 is not a good way to impose the constraint, or the quantization map {,}_P.B. → (1/iℏ)[,] is invalid. It turns out that the latter is the case: we need to modify the Poisson bracket in order to map to (1/iℏ)[,]. In the example above, the solution is obvious. Since the constraints are q₁ = p₁ = 0, we can just drop them from the phase space, and write the modified Poisson bracket
{f, g}_P.B.′ = Σ_{n=2}^{N} [ (∂f/∂q_n)(∂g/∂p_n) − (∂f/∂p_n)(∂g/∂q_n) ]    (15.32)

But what do we do in general?


In general we must introduce the Dirac brackets as follows. We consider the indepen-
dent second class constraints (i.e., taking out the linear combinations which are first class,
remaining only with constraints which have all the linear combinations being second class),
called χs , and we define css′ by

css′ {χs′ , χs′′ }P.B. = δss′′ , (15.33)

i.e., we have the inverse matrix of Poisson brackets, css′ = {χs , χs′ }−1
P.B. . Then we define the
Dirac brackets as

[f, g]D.B. = {f, g}P.B. − {f, χs }P.B. css′ {χs′ , g}P.B. (15.34)

Then if we use the Dirac brackets instead of the Poisson brackets, the time evolution is not
modified, since

[g, HT ]D.B. = {g, HT }P.B. − {g, χs }css′ {χs′ , HT }P.B. ≈ {g, HT }P.B. ≈ ġ (15.35)

where we have used the fact that HT is first class, so {χs′ , HT }P.B. ≈ 0.
On the other hand, the Dirac bracket of the second class constraints with any function
on the phase space is zero, since

[f, χs′′ ]D.B. = {f, χs′′ }P.B. − {f, χs }P.B css′ {χs′ , χs′′ }P.B. = 0 (15.36)

(equal to 0 strongly!) where we have used the definition of css′ as the inverse matrix of
constraints. That means that in classical mechanics we can put χs = 0 strongly if we use
Dirac brackets. In quantum mechanics, we again impose it on states as

χ̂s |ψ >= 0 (15.37)

but now we don’t have any more contradictions, since the Dirac bracket, turning into a
commutator in quantum mechanics, is zero for the second class constraints with anything.
Therefore [fˆ, χ̂s ]|ψ >= 0 gives no contradictions now.
Finally, note that in quantum mechanics the difference between the primary and sec-
ondary constraints is irrelevant, whereas the difference between first and second class con-
straints is important, since second class constraints modify the Dirac bracket.
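To make the construction concrete, here is the toy example above (second class constraints χ₁ = q₁, χ₂ = p₁) pushed through the definitions (15.33)-(15.34) symbolically; the Dirac bracket indeed annihilates the constrained pair and leaves the remaining variables canonical:

```python
import sympy as sp

q1, p1, q2, p2 = sp.symbols('q1 p1 q2 p2')
coords = [(q1, p1), (q2, p2)]

def pb(f, g):
    """Canonical Poisson bracket on the (q1,p1,q2,p2) phase space."""
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
               for q, p in coords)

chi = [q1, p1]                                          # second class: {q1, p1} = 1
M = sp.Matrix(2, 2, lambda s, t: pb(chi[s], chi[t]))    # matrix of constraint brackets
c = M.inv()                                             # c_{ss'} of eq. (15.33)

def db(f, g):
    """Dirac bracket, eq. (15.34)."""
    corr = sum(pb(f, chi[s]) * c[s, t] * pb(chi[t], g)
               for s in range(2) for t in range(2))
    return sp.expand(pb(f, g) - corr)

assert db(q1, p1) == 0              # the constrained pair drops out (strongly)
assert db(q2, p2) == 1              # the remaining pair stays canonical
assert db(q1, q2*p1 + p2**2) == 0   # a constraint has zero Dirac bracket with anything
assert db(q2, p2) == pb(q2, p2)     # unconstrained brackets are unchanged
```

This is exactly the statement that with Dirac brackets we may set χ_s = 0 strongly, and it reproduces the modified bracket (15.32) on the remaining variables.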
Example: electromagnetic field
We now return to the example which started the discussion, the electromagnetic field
with action

S = −∫ d⁴x F_µν²/4 = −∫ [ (1/4)F_ij F^ij + (1/2)F_0i F^0i ]    (15.38)
where Fµν = ∂µ Aν − ∂ν Aµ . We also denote derivatives as ∂µ B = B,µ . Then the momenta
conjugate to the fields A_µ are

P^µ = δL/δA_µ,0 = F^µ0    (15.39)
Since F 00 = 0 by antisymmetry, it means we have the primary constraint

P0 ≈ 0 (15.40)

The basic Poisson brackets of the fields Aµ with their conjugate momenta are as usual

{Aµ (⃗x, t), P ν (⃗x′ , t)}P.B. = δµν δ 3 (⃗x − ⃗x′ ) (15.41)

The Hamiltonian is then

H = ∫ d³x P^µ A_µ,0 − L = ∫ d³x [ F^i0 A_i,0 + (1/4)F_ij F^ij + (1/2)F^i0 F_i0 ]
  = ∫ d³x [ F^i0 A_0,i + (1/4)F_ij F^ij − (1/2)F^i0 F_i0 ]
  = ∫ d³x [ (1/4)F_ij F^ij + (1/2)P^i P^i − A_0 P^i,_i ]    (15.42)

where in the second equality we used F_i0 = A_0,i − A_i,0, and in the third we used P^i = F^i0 and partial integration.
Next we compute the secondary constraints. First we compute the time evolution of the
primary constraint, giving a secondary constraint

{P 0 , H}P.B. = P i ,i ≈ 0 (15.43)

The time evolution of the secondary constraint is in turn trivial,

{P i ,i , H}P.B. = 0 (15.44)

so there are no other secondary constraints.


All the constraints are first class, since we have

{P 0 , P 0 }P.B. = 0 = {P 0 , P i ,i }P.B. = {P i ,i , P j ,j }P.B. = 0 (15.45)

That means that there is no need for Dirac brackets. We have H = H ′ , and we add the first
class primary constraint P 0 with an arbitrary coefficient, giving the total Hamiltonian
H_T = H′ + ∫ vP⁰ = ∫ [ (1/4)F_ij F^ij + (1/2)P^i P^i − A_0 P^i,_i ] + ∫ v(x)P⁰(x)    (15.46)

The extended Hamiltonian is found by adding also the secondary first class constraint with
an arbitrary coefficient,

H_E = H_T + ∫ d³x u(x) P^i,_i(x)    (15.47)

But A0 and P 0 contain no relevant information. We can put P 0 = 0 in HT since that only
makes the time evolution of A0 trivial, Ȧ0 = 0, and then we can get rid of A0 by redefining
u′ (x) = u(x) − A0 (x), obtaining the Hamiltonian
H_E = ∫ d³x [ (1/4)F_ij F^ij + (1/2)P^i P^i ] + ∫ d³x u′(x) P^i,_i(x)    (15.48)

Finally, we should note that the Dirac formalism is very important in modern theoretical
physics. It is the beginning of more powerful formalisms: BRST quantization, the Batalin-Vilkovisky (BV) formalism, the field-antifield formalism, used for gauge theories and string
theory, among others. However, we will not explain them here.

Important concepts to remember

• Dirac quantization of constrained system starts in the Hamiltonian formalism with the
primary constraints ϕm (p, q) = 0.

• We write weak equality F (q, p) ≈ 0 if at the end of the calculation we use ϕm = 0.

• The primary constraints are added to the Hamiltonian, forming the total Hamiltonian
HT = H + um ϕm .

• The secondary constraints are obtained from the time evolution of the primary con-
straints, {ϕm , HT }P.B. and iterated until nothing new is found. In total, ϕj , j =
1, ..., M + K are all the constraints.

• First class quantities R satisfy {R, ϕj } ≈ 0, and second class quantities don’t.

• H_T is written as H_T = H′ + v_a ϕ_a, where H′ is first class, ϕ_a are first class primary constraints, and v_a are numbers.

• The extended Hamiltonian is obtained by adding the first class secondary constraints
HE = HT + va′ ′ ϕa′ .

• When quantizing, we must impose the constraints on states, ϕ̂j |ψ >= 0, but to
avoid inconsistencies in the presence of second class constraints χs , we must intro-
duce Dirac brackets (to be replaced by the quantum commutators), by subtracting
{f, χs }P.B. {χs , χs′ }−1
P.B. {χs′ , g}P.B. .

• The electromagnetic field has primary constraint P⁰ ≈ 0 and secondary constraint P^i,_i = F^i0,_i ≈ 0, but in the extended Hamiltonian we can drop A₀ and P⁰.

Further reading: See Dirac’s book [5].

Exercises, Lecture 15

Dirac spinor
1) Consider the Dirac action for a Dirac spinor in Minkowski space,

S_cl = ∫ d⁴x (−ψ† iγ⁰ γ^µ ∂_µ ψ)    (15.49)

with 8 independent variables, ψ^A and ψ*_A = (ψ^A)*, where A = 1, ..., 4 is a spinor index.


Calculate p_A and p*_A and the 8 resulting primary constraints. Note that classical fermions are anticommuting, so we define p by taking the derivatives from the left, e.g. ∂/∂ψ (ψχ) = χ, so that {p^A, q_B} = −δ^A_B. In general, we must define

{f, g}_P.B. = −(∂f/∂p^α)(∂/∂q_α)g + (−)^{fg} (∂g/∂p^α)(∂/∂q_α)f    (15.50)

where ∂f /∂pα is the right derivative, e.g. ∂/∂ψ(χψ) = χ, and (−)f g = −1 for f and g
being both fermionic and +1 otherwise (if f and/or g is bosonic). This bracket is antisym-
metric for f, g being bose-bose or bose-fermi, and symmetric if f and g are both fermionic.
Then compute HT by adding to the classical Hamiltonian H the 8 primary constraints with
coefficients um . Check that there are no secondary constraints, and then from

{ϕm , HT }P.B. ≈ 0 (15.51)

solve for uA , vA .

2) (continuation) Show that all constraints are second class, thus finding that

HT = H ′ = HE (15.52)

Write Dirac brackets. Using the Dirac brackets we can now put p∗ = 0 and replace iψ ∗ by
−p. Show that finally, we have

H_T = −∫ d³x (p γ⁰ γ^k ∂_k ψ)    (15.53)

and
[ψ A , pB ]D.B. = {ψ A , pB }P.B. = −δAB δ 3 (⃗x − ⃗y ) (15.54)
(Observation. Note that the analysis for Majorana (real) spinors is somewhat different,
and there one finds
[ψ^A, p_B]_D.B. = −(1/2) δ^A_B δ³(x⃗ − y⃗).)    (15.55)

16 Lecture 16. Quantization of gauge fields, their path
integral, and the photon propagator
As we saw last lecture, for gauge fields there are redundancies in the description due to the
gauge invariance δAµ = ∂µ λ, which means that there are components which are not degrees
of freedom. For instance, since λ(x) is an arbitrary function, we could put A0 = 0 by a gauge
transformation. So we must impose a gauge in order to quantize.
-The simplest choice perhaps would be to impose a physical gauge condition, like the
Coulomb gauge ∇ ⃗ ·A ⃗ = 0, where only the physical modes propagate (no redundant modes
present), and impose quantization of only these modes.
-But we will see it is better to impose a covariant gauge condition, like the Lorentz
gauge ∂ µ Aµ = 0, and quantize all the modes, imposing some constraints. This will be called
covariant quantization.
Physical gauge
Let’s start with the discussion of the quantization in Coulomb, or radiation gauge. The
equation of motion for the action

S = −(1/4) ∫ d⁴x F_{µν}² = −(1/2) ∫ d⁴x [(∂_µA_ν)∂^µA^ν − (∂_µA_ν)∂^νA^µ]   (16.1)
is
□A^ν − ∂^ν(∂^µA_µ) = 0   (16.2)
and is difficult to solve. However, we see that in the Lorentz, or covariant gauge ∂ µ Aµ = 0,
the equation simplifies to just the Klein-Gordon (KG) equation,

□A_µ = 0   (16.3)

which has solution of the type Aµ ∼ ϵµ (k)e±ik·x , with k 2 = 0 and k µ ϵµ (k) = 0 (from the
gauge condition ∂ µ Aµ = 0 in momentum space). But note that the Lorentz gauge does not
fix completely the gauge, there is a residual gauge symmetry, namely δAµ = ∂µ λ, with λ
satisfying (since ∂^µδA_µ = 0)

∂^µδA_µ = □λ = 0   (16.4)

We can use this λ to also fix the gauge A₀ = 0, by transforming with ∂₀λ = −A₀, since the KG equation for A_µ, (16.3), gives

□A₀ = 0 ⇒ ∂₀□λ = 0   (16.5)

so it is OK to use a λ restricted to □λ = 0 only. If A₀ = 0, the Lorentz gauge reduces to



∇⃗ · A⃗ = 0.
Note that this A₀ = 0 gauge is consistent only with J^µ = 0 (no current, since otherwise A₀ ∼ ∫ J₀), but we will consider such a case (no classical background). Therefore we consider the full gauge condition as

A₀ = 0 and ∇⃗ · A⃗ = 0   (16.6)

known as the radiation, or Coulomb, gauge, familiar from electrodynamics (see e.g. Jackson's book). In this gauge there are only two propagating modes, the physical degrees of
freedom (corresponding to the two polarizations of the electromagnetic field, transverse to
the direction of propagation, or circular and anti-circular), since we removed A0 and ∂ µ Aµ
from the theory.
The classical solution is
A⃗(x) = ∫ d³k/((2π)³√(2E_k)) Σ_{λ=1,2} ⃗ε^{(λ)}(k) [e^{ik·x} a^{(λ)}(k) + a^{(λ)†}(k) e^{−ik·x}]   (16.7)

and with k 2 = 0 (from the KG equation) and ⃗k·⃗ϵ(λ) (k) = 0 (from the Lorentz gauge condition
in momentum space). The normalization for the solutions was chosen, as in previous cases, so
that under quantization, a and a† act as creation and annihilation operators for the harmonic
oscillator. We can also chose the polarization vectors ⃗ϵ(λ) (k) such that they are orthogonal,
⃗ε^{(λ)}(⃗k) · ⃗ε^{(λ′)}(⃗k) = δ^{λλ′}   (16.8)
Quantization in physical gauge
We now turn to quantization of the system. As we saw last lecture, in Dirac formalism,
we can drop A0 and P 0 from the system entirely, which is now needed since we want to use
the gauge A0 = 0, and impose the remaining gauge condition ∇ ⃗ ·A⃗ = 0 as an operatorial
condition.
Consider the conjugate momenta
P^i (= Π^i) = F^{0i} = E^i   (16.9)
We would be tempted to impose the canonical equal time quantization conditions

[A^i(⃗x, t), E^j(⃗x′, t)] = i ∫ d³k/(2π)³ δ^{ij} e^{i⃗k·(⃗x−⃗x′)} = i δ^{ij} δ³(⃗x − ⃗x′)   (16.10)
but note that if we do so, applying ∇_{⃗x,i} to the above relation does not give zero as it should (since we should have ∇⃗ · A⃗ = 0). So we generalize δ^{ij} → ∆^{ij} and, from the condition k_i∆^{ij} = 0, we get
[A^i(⃗x, t), E^j(⃗x′, t)] = i ∫ d³k/(2π)³ ∆^{ij} e^{i⃗k·(⃗x−⃗x′)} = i ∫ d³k/(2π)³ (δ^{ij} − k^i k^j/⃗k²) e^{i⃗k·(⃗x−⃗x′)}
= i (δ^{ij} − ∂_i∂_j/∇⃗²) δ³(⃗x − ⃗x′)   (16.11)

and the rest of the commutators are zero,


[Ai (⃗x, t), Aj (⃗x′ , t)] = [E i (⃗x, t), E j (⃗x′ , t)] = 0 (16.12)

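A quick numerical check of the modified commutator (an editor's sketch using numpy with an arbitrary sample momentum; not part of the original notes): the matrix ∆^{ij} = δ^{ij} − k^ik^j/⃗k² is indeed transverse, and it is a projector onto the 2 physical polarizations.

```python
import numpy as np

rng = np.random.default_rng(0)
k = rng.normal(size=3)                     # arbitrary spatial momentum vector
k2 = k @ k
Delta = np.eye(3) - np.outer(k, k) / k2    # the transverse delta Delta^{ij}

# transversality: k_i Delta^{ij} = 0, needed for consistency with div A = 0
assert np.allclose(k @ Delta, 0.0)
# Delta is a projector: Delta^2 = Delta
assert np.allclose(Delta @ Delta, Delta)
# its trace counts the physical polarizations: 3 - 1 = 2
assert np.isclose(np.trace(Delta), 2.0)
```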
Replacing the mode decomposition of A ⃗ in the above commutators, we obtain the usual
harmonic oscillator creation/annihilation operator algebra

[a^{(λ)}(k), a^{(λ′)†}(k′)] = (2π)³ δ_{λλ′} δ^{(3)}(⃗k − ⃗k′)
[a^{(λ)}(k), a^{(λ′)}(k′)] = 0 = [a^{(λ)†}(k), a^{(λ′)†}(k′)]   (16.13)

We can then compute the energy and obtain the same sum over harmonic oscillator hamil-
tonians as in previous cases,

E = H = (1/2) ∫ d³x (E⃗² + B⃗²) = Σ_λ ∫ d³k/(2π)³ (k⁰/2) [a^{(λ)†}(k) a^{(λ)}(k) + a^{(λ)}(k) a^{(λ)†}(k)]   (16.14)

The proof is similar to previous cases, so is left as an exercise. Again as in previous cases,
we define normal ordering and find for the normal ordered Hamiltonian
: H : = Σ_λ ∫ d³k/(2π)³ k⁰ a^{(λ)†}(k) a^{(λ)}(k)   (16.15)

The quantization described here is correct, but cumbersome: while the classical theory was Lorentz invariant, the quantization procedure isn't manifestly so, since we did it using the Lorentz-symmetry-breaking radiation gauge condition (16.6). That means that in principle quantum corrections (in quantum field theory) could break Lorentz invariance, so we would need to check Lorentz invariance explicitly at each step.
Therefore a better choice is to use a formalism that doesn’t break manifest Lorentz
invariance, like the
Lorentz gauge (covariant) quantization
If we only impose the KG equation, but no other condition, we still have all the 4
polarization modes, so we have the classical solution
A_µ(x) = ∫ d³k/((2π)³√(2E_k)) Σ_{λ=0}^{3} ε_µ^{(λ)}(k) [e^{ik·x} a^{(λ)}(k) + a^{(λ)†}(k) e^{−ik·x}]   (16.16)

where again k 2 = 0 from the solution of the KG equation, but since the KG equation
is obtained only in the Lorentz gauge, we still need to impose the Lorentz gauge in an
operatorial fashion, as we will see later.
Let us fix the coordinate system such that the momentum solving the KG equation k 2 = 0
is in the 3rd direction, k µ = (k, 0, 0, k), and define the 4 polarizations in the directions of
the 4 coordinate axes, i.e. ε_µ^{(λ)} = δ_µ^λ, or

ε^{(λ)} = (1, 0, 0, 0); (0, 1, 0, 0); (0, 0, 1, 0); (0, 0, 0, 1) for λ = 0, 1, 2, 3   (16.17)

Then for λ = 1, 2 we obtain transverse polarizations, i.e. physical polarizations, transverse


to the direction of the momentum,
k^µ ε_µ^{(1,2)} = 0   (16.18)

and for the unphysical modes λ = 0 (timelike polarization) and λ = 3 (longitudinal polarization, i.e. parallel to the momentum),

k^µ ε_µ^{(0)}(k) = E_k = k^µ ε_µ^{(3)}(k)   (16.19)
These modes are unphysical since they don’t satisfy the gauge condition ∂ µ Aµ = 0, which
would also be needed in order to obtain the KG equation for them. That means that these
modes must somehow cancel out of the physical calculations, by imposing some operatorial
gauge condition on physical states.
The natural guess would be to impose the Lorentz gauge condition on states, ∂^µA_µ|ψ⟩ = 0, which is what Fermi tried to do first. But in view of the mode expansion (16.16), this is not a good idea, since ∂^µA_µ contains also creation operators, so it would not be zero even on the vacuum |0⟩, since a†|0⟩ ≠ 0 (there are some details, but we can check that it is impossible). Instead, the good condition was found by Gupta and Bleuler, namely that only
the positive frequency part of the Lorentz condition is imposed on states,
∂^µ A_µ^{(+)}(x)|ψ⟩ = 0   (16.20)
(note the somewhat confusing notation: the plus means positive frequency, not dagger, in
fact only the annihilation part must be zero on states), known as Gupta-Bleuler quantization
condition.
Therefore the equal time commutation relations in the covariant gauge are
[A_µ(⃗x, t), Π_ν(⃗x′, t)] = i g_{µν} δ^{(3)}(⃗x − ⃗x′)
[A_µ(⃗x, t), A_ν(⃗x′, t)] = [Π_µ(⃗x, t), Π_ν(⃗x′, t)] = 0   (16.21)
Note that since from the Lagrangean for electromagnetism we found Π0 = 0, this contradicts
the commutation relation above (and within this context we can’t impose Π0 = 0 on states,
and even if we do, we would still obtain a contradiction with the commutation relation). One
solution is to change the Lagrangean, and we will see later that we can in fact do exactly
that.
Substituting the expansion (16.16) at the quantum level, we obtain almost the usual
commutation relations for harmonic oscillator creation and annihilation operators, namely
[a^{(λ)}(k), a^{(λ′)†}(k′)] = g^{λλ′} (2π)³ δ^{(3)}(⃗k − ⃗k′)
[a^{(λ)}(k), a^{(λ′)}(k′)] = [a^{(λ)†}(k), a^{(λ′)†}(k′)] = 0   (16.22)
For the λ = 1, 2, 3 modes that is OK, but for the λ = 0 mode not, since
[a(0) (k), a(0)† (k ′ )] = −(2π)3 δ (3) (⃗k − ⃗k ′ ) (16.23)
This means that these modes have negative norm, since

‖a†|0⟩‖² ≡ ⟨0|a a†|0⟩ = ⟨0|a†a|0⟩ − ⟨0|0⟩ = −1   (16.24)
so they are clearly unphysical. The Gupta-Bleuler condition means, substituting the form of A_µ and using k^µε_µ^{(1,2)} = 0,

(k^µ ε_µ^{(0)}(k) a^{(0)}(k) + k^µ ε_µ^{(3)}(k) a^{(3)}(k))|ψ⟩ = 0 ⇒
[a^{(0)}(k) + a^{(3)}(k)]|ψ⟩ = 0   (16.25)

Let’s see how this looks on a state which is a linear combination of one one (0) and one (3)
state:

[a^{(0)}(k) + a^{(3)}(k)] (a^{(0)†}(k) + a^{(3)†}(k)) |0⟩ = ([a^{(0)}(k), a^{(0)†}(k)] + [a^{(3)}(k), a^{(3)†}(k)]) |0⟩ = 0   (16.26)

so it is indeed a physical state. In general, we see that we need to have the same number of
(0) and (3) modes in the state (check).
So in general the contribution of a(0) ’s (timelike modes) and a(3) ’s (longitudinal modes)
cancel each other; for instance, in loops, these modes will cancel each other, as we will see
in QFT II.
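The indefinite algebra (16.23) and the resulting negative norm can be made concrete in a small matrix model (an editor's numerical illustration, not from the lectures; the Fock-space truncation and the metric η = diag((−1)ⁿ) are choices made here). Defining the η-adjoint a‡ = η a† η of the standard annihilation matrix reproduces [a, a‡] = −1, and the "timelike" one-particle state has η-norm −1, as in (16.24):

```python
import numpy as np

N = 6                                      # Fock-space truncation
n = np.arange(N)
b = np.diag(np.sqrt(n[1:]), k=1)           # standard annihilation matrix: b|n> = sqrt(n)|n-1>
eta = np.diag((-1.0) ** n)                 # indefinite inner product <u|v>_eta = u.eta.v

a = b
a_cross = eta @ b.T @ eta                  # eta-adjoint; here it equals -b^dagger

# commutator [a, a_cross] = -1 (away from the truncation boundary)
comm = a @ a_cross - a_cross @ a
assert np.allclose(comm[:N-1, :N-1], -np.eye(N-1))

# the one-particle state created by a_cross has negative eta-norm
vac = np.zeros(N); vac[0] = 1.0
state = a_cross @ vac
assert np.isclose(state @ eta @ state, -1.0)
```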
Note that by imposing only the positive frequency part of the Lorentz gauge condition
on |ψ >’s, we impose the full gauge condition on expectation values,

⟨ψ|∂^µA_µ|ψ⟩ = ⟨ψ|(∂^µA_µ^{(+)} + ∂^µA_µ^{(−)})|ψ⟩ = 0   (16.27)

since (∂^µA_µ^{(+)}|ψ⟩)† = ⟨ψ|∂^µA_µ^{(−)}.

We have defined the above two methods of quantization, since they were historically first.
But now, there are also other methods of quantization:
-in other gauges, like light cone gauge quantization.
-more powerful (modern) covariant quantization: BRST quantization. To describe it
however, we would need to understand BRST symmetry, and this will be done in QFT II.
-path integral procedures.
• We can write a Hamiltonian path integral based on Dirac’s formalism from last lecture.
This path is taken for instance in Ramond’s book, but we will not continue it here.

• We can use a procedure more easily generalized to the nonabelian case, namely Faddeev-Popov quantization. This is what we will describe here.
First, we come back to the observation that we need to modify the Lagrangean in order
to avoid getting Π0 = 0. It would be nice to find just the KG equation (what we have in the
covariant gauge) from this Lagrangean, for instance. We try
L = −(1/4) F_{µν}² − (λ/2)(∂_µA^µ)²   (16.28)
and indeed, its equations of motion are

□A^µ − (1 − λ) ∂^µ(∂_λA^λ) = 0   (16.29)

and so for λ = 1 we obtain the KG equation □A^µ = 0. λ = 1 is called the "Feynman gauge", though it is a misnomer, since it is not really a gauge, just a choice. The Lagrangean above was just something that gives the right equation of motion in the Lorentz gauge, but we will see that it is what we obtain in the Faddeev-Popov quantization procedure.

Faddeev-Popov path integral quantization
To properly define the path integral quantization, we must work in Euclidean space, as
we saw. The Wick rotation is x0 = t → −ix4 . But now, since we have the vector Aµ that
should behave like x_µ, we must also Wick rotate A₀ like x₀. Then we obtain

E_i^{(E)} = F_{4i} = (∂/∂x₄)A_i − (∂/∂x_i)A₄ = −i E_i^{(M)}   (16.30)

since F_{0i} → iF_{4i}, like x⁰ → ix⁴. Since the action is Wick rotated as iS^{(M)} → −S^{(E)}, and we have i∫dt(−F_{ij}²) = −∫dx₄ F_{ij}², we obtain for the Lagrangean in Euclidean space

L_em^{(E)}(A) = +(1/4) F_{µν}^{(E)} F_{µν}^{(E)} = (1/2)((E_i^{(E)})² + (B_i^{(E)})²)   (16.31)
and the action

S_em^{(E)} = ∫ d⁴x L_em^{(E)}(A)   (16.32)

Since we are in Euclidean space, we can do partial integrations without boundary terms (as
we saw, the Euclidean theory is defined for periodic time, on a circle with an infinite radius,
hence there are no boundary terms), so we can write

S_em[A] = (1/2) ∫ dᵈx A_µ(x)(−∂²δ_{µν} + ∂_µ∂_ν)A_ν(x)   (16.33)
So we would be tempted to think that we can write as usual
Z[J] = ∫ DA_µ(x) exp{−(1/2)∫ dᵈx [A_µ(−∂²δ_{µν} + ∂_µ∂_ν)A_ν] + ∫ dᵈx J_µA_µ}
" = " Z[0] exp{(1/2) ∫ dᵈx dᵈy J_µ(x) G_{µν}(x, y) J_ν(y)}   (16.34)
where Gµν is the inverse of the operator (−∂ 2 δµν + ∂µ ∂ν ). However, that is not possible,
since the operator has zero modes, Aµ = ∂µ χ, since

(−∂²δ_{µν} + ∂_µ∂_ν) ∂_ν χ = 0,   (16.35)

and an operator with zero modes (zero eigenvalues) cannot be inverted. This is a consequence
of gauge invariance, since these zero modes are exactly the pure gauge modes, δAµ = ∂µ χ.
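This is easy to exhibit in momentum space, where −∂²δ_{µν} + ∂_µ∂_ν becomes k²δ_{µν} − k_µk_ν (an editor's numerical sketch with numpy; the sample momentum is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
k = rng.normal(size=4)                    # a sample Euclidean momentum k_mu
k2 = k @ k
M = k2 * np.eye(4) - np.outer(k, k)       # the kinetic operator in momentum space

# the pure-gauge direction A_mu ~ k_mu (i.e. d_mu chi) is a zero mode
assert np.allclose(M @ k, 0.0)
# so the operator is singular and cannot be inverted
assert np.isclose(np.linalg.det(M), 0.0)
```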
So in order to be able to invert the operator, we need to get rid of these zero modes, or
gauge transformations, from the path integral. Since physical quantities are related to ratios
of path integrals, it will suffice if we can factorize the integration over the gauge invariance,
V ol(G_inv) = ∏_{x∈ℝᵈ} ∫ dχ(x)   (16.36)

or the volume of the local U(1) symmetry group,

G_inv = ∏_{x∈ℝᵈ} U_x(1)   (16.37)


Figure 37: The gauge fixed configuration is at the intersection of the orbit of A, the gauge
transformations of a gauge field configuration A, and the space M of all possible gauge
conditions.

To do this, we will consider the set of more general covariant gauge conditions

∂µ Aµ = c(x) (16.38)

instead of the Euclidean Lorentz gauge ∂µ Aµ = 0.


Consider an arbitrary gauge field A and the orbit of A, Or(A), obtained from all possible gauge transformations of this A, see Fig.37. Then consider also the space M of all possible gauge conditions. We should have only one point at the intersection of Or(A) and M, i.e. there should be a unique gauge transformation of A, with parameter χ^(A), that takes us onto the gauge condition. We will suppose this is the case in the following (in fact, there is some issue with this assumption, at least in the nonabelian case: it fails for large nonabelian gauge transformations, where there exist so-called Gribov copies, arising for large differences between A's; but we will ignore this here, since we deal with the abelian case anyway). Consider then the definition of χ^(A),

∂ 2 χ(A) (x) = −∂µ Aµ (x) + c(x) (16.39)

There is a unique solution in Euclidean space, considering that c(x) falls off sufficiently fast
at infinity, and similarly for A and χ(A) .
Define the gauge field A transformed by some gauge transformation χ as
^χA_µ(x) ≡ A_µ(x) + ∂_µχ(x)   (16.40)

Then we have the following


Lemma:

∫ ∏_{x∈ℝᵈ} dχ(x) ∏_{y∈ℝᵈ} δ(−∂_µ(^χA_µ(y)) + c(y)) = 1/det(−∂²)   (16.41)

Proof:
We have
−∂_µ(^χA_µ) + c = −∂²χ − ∂_µA_µ + c = −∂²χ + ∂²χ^{(A)}   (16.42)


so we shift the integration variable from χ to χ − χ^{(A)}, absorbing the second term above. Then we obtain

∫ ∏_x dχ(x) ∏_y δ(−∂_µ(^χA_µ(y)) + c(y)) = ∫ ∏_x dχ(x) ∏_y δ(−∂²χ(y))   (16.43)

But note that this is a continuum version of the discrete relation


∫ ∏_{i=1}^{n} dχ_i ∏_{j=1}^{n} δ(∆_{ij}χ_j) = 1/det ∆   (16.44)

for the operator


−∂²(x, y) ≡ −∂_x² δ^{(d)}(x − y)   (16.45)
so we have proved the lemma. q.e.d.
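The discrete relation (16.44) can be verified numerically by smearing each delta function into a narrow Gaussian (an editor's sketch, not part of the notes; the 2×2 matrix ∆, the width ε and the integration grid are arbitrary choices):

```python
import numpy as np

Delta = np.array([[2.0, 0.5], [0.3, 1.5]])      # a sample invertible "operator" Delta_{ij}
eps = 1e-4                                      # width of the regularized delta functions

xs = np.linspace(-0.3, 0.3, 1201)
dx = xs[1] - xs[0]
X1, X2 = np.meshgrid(xs, xs, indexing="ij")     # integration grid for (chi_1, chi_2)
U1 = Delta[0, 0]*X1 + Delta[0, 1]*X2            # (Delta chi)_1
U2 = Delta[1, 0]*X1 + Delta[1, 1]*X2            # (Delta chi)_2

# product of two Gaussian-regularized deltas, integrated over chi
integrand = np.exp(-(U1**2 + U2**2) / (2*eps)) / (2*np.pi*eps)
integral = integrand.sum() * dx**2

assert np.isclose(integral, 1/abs(np.linalg.det(Delta)), rtol=1e-2)
```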
We can then write the lemma with the determinant on the other side as
"1" = ∫ ∏_x dχ(x) det(−∂²) ∏_y δ(−∂_µ(^χA_µ(y)) + c(y))   (16.46)

and we can do a gaussian integration over the gauge conditions c(x) of this result as
"1(α)" = ∫ Dc(x) e^{−(1/2α)∫dᵈx c²(x)} "1"
= ∫ ∏_x dχ(x) det(−∂²) e^{−(1/2α)∫dᵈx (∂_µ(^χA_µ(x)))²}   (16.47)

where in the second equality we used the lemma and used the delta function in it. Note that
”1(α)” means that it is a function of the arbitrary number α only, so it is not important,
since this will cancel out in the ratio of path integrals which defines observables.
Then in the path integral for an observable O(A), we obtain
∫ DA e^{−S[A]} O[A] "1(α)" = det(−∂²) ∫ ∏_x dχ(x) ∫ DA e^{−S[A] − (1/2α)∫dᵈx (∂(^χA))²} O(A)   (16.48)

Now we change variables in the path integral from A to A − ∂_µχ, so as to turn ∂_µ(^χA_µ) → ∂_µA_µ. Then the dependence on χ disappears from inside the integral, so the integration over the gauge modes finally factorizes,

∫ DA e^{−S[A]} O[A] "1(α)" = det(−∂²) [∫ ∏_x dχ(x)] ∫ DA e^{−S[A] − (1/2α)∫dᵈx (∂A)²} O(A)   (16.49)

That means that in correlators, defined as ratios of the path integral with insertions to the one without (so as to cancel the vacuum bubble diagrams), the integration over the gauge modes cancels,

⟨O(A...)⟩ = ∫ D(A...) O(A...) e^{−S(A)} "1(α)" / ∫ D(A...) e^{−S(A)} "1(α)"
= ∫ D(A...) O(A...) e^{−S_eff(A,...)} / ∫ D(A...) e^{−S_eff(A...)}   (16.50)
Here the effective Lagrangean is
L_eff = (1/4) F_{µν}² + (1/2α)(∂_µA_µ)²   (16.51)
where the extra term is called the gauge fixing term, as it is not gauge invariant, so the
remaining path integral is without zero modes.
Photon propagator
We can again partially integrate and obtain for the effective action

S_eff(A) = (1/2) ∫ dᵈx A_µ (−∂²δ_{µν} + (1 − 1/α) ∂_µ∂_ν) A_ν   (16.52)

where the operator in brackets, (G^{(0)})⁻¹_{µν}, is now invertible. We write in momentum space

S_eff(A) = (1/2) ∫ dᵈk/(2π)ᵈ A_µ(−k) (G^{(0)})⁻¹_{µν}(k) A_ν(k)   (16.53)
Since

(G^{(0)})⁻¹_{µν}(k) = k² δ_{µν} − (1 − 1/α) k_µk_ν,   (16.54)

we have

(k² δ_{µν} − (1 − 1/α) k_µk_ν) G^{(0)}_{νλ} = δ_{µλ}   (16.55)
meaning that finally the photon propagator is
G^{(0)}_{µν}(k) = (1/k²)(δ_{µν} − (1 − α) k_µk_ν/k²)   (16.56)
Then in the case α = 1 (λ = 1/α = 1), the "Feynman gauge" (a misnomer, as we said), we have the Klein-Gordon propagator

G^{(0)}_{µν}(k; α = 1) = δ_{µν}/k²   (16.57)

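As a numerical check of the inversion (16.55) (an editor's sketch with numpy; the momentum and the α values are arbitrary samples), the gauge-fixed operator and the propagator (16.56) multiply to the identity for any α:

```python
import numpy as np

rng = np.random.default_rng(2)
k = rng.normal(size=4)                    # sample Euclidean momentum
k2 = k @ k
kk = np.outer(k, k)

for alpha in (0.5, 1.0, 3.0):
    K = k2 * np.eye(4) - (1 - 1/alpha) * kk          # (G^(0))^{-1}_{mu nu}(k), eq. (16.54)
    G = (np.eye(4) - (1 - alpha) * kk / k2) / k2     # propagator of eq. (16.56)
    assert np.allclose(K @ G, np.eye(4))             # eq. (16.55)

# for alpha = 1 the propagator is just the KG one, delta_{mu nu}/k^2
G_feyn = np.eye(4) / k2
assert np.allclose((k2 * np.eye(4)) @ G_feyn, np.eye(4))
```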
Important concepts to remember

• The physical gauge is A₀ = 0 and ∂^µA_µ = 0, or ∇⃗ · A⃗ = 0, and here the equation of motion is KG.

• Quantization in the physical gauge is done for only the two physical transverse polar-
izations, for which we expand in usual harmonic oscillator creation and annihilation
operators.

• Covariant gauge quantization is done keeping all 4 polarizations of the gauge field, but imposing ∂^µA_µ^{(+)}|ψ⟩ = 0 on physical states.

• Timelike modes have negative norm, but they cancel in calculations against longitu-
dinal modes. In particular, the physical condition says the longitudinal and timelike
modes should match inside physical states.

• Faddeev-Popov path integral quantization factorizes the gauge modes, and leaves an effective action which contains a gauge fixing term.

• The photon propagator in Feynman gauge is the KG propagator.

Further reading: See chapters 6.1,6.2,6.3.1 in [4], 7.3 in [1] and 7.1 in [3].

Exercises, Lecture 16

1) a) Prove that for quantization in physical gauge the quantum energy is


E = (1/2) ∫ d³x (E⃗² + B⃗²) = Σ_{λ=1,2} ∫ d³k/(2π)³ (k⁰/2) [a^{(λ)†}(k) a^{(λ)}(k) + a^{(λ)}(k) a^{(λ)†}(k)]   (16.58)

b) Consider the state in covariant quantization


|ψ⟩ = (a^{(1)†}(k₁) − a^{(2)†}(k₃)) (a^{(0)†}(k₂) + a^{(3)†}(k₄)) |0⟩   (16.59)

Is it physical? Why?

2) Using the effective action Sef f (with gauge fixing term), for k µ = (k, 0, 0, k), write
down the equation of motion in momentum space, separately for the longitudinal, timelike
and transverse modes, at arbitrary ”gauge” α.

17 Lecture 17. Generating functional for connected
Green’s functions and the effective action (1PI di-
agrams)
Generating functional of connected Green’s functions
As we saw, Z[J] is the generating functional of the full (connected and disconnected)
Green’s functions.
But now we will prove that the generating functional of connected Green’s functions is
−W [J], where
Z[J] = e−W [J] (17.1)
We define the n-point Green’s functions in the presence of a nonzero source J as
G_n(x₁, ..., xₙ)_J = δⁿ/(δJ(x₁)...δJ(xₙ)) Z[J]   (17.2)

(before, we had defined the Green's functions at J = 0). We will denote them by a box
with a J inside it, and n lines ending on points x1 , ..., xn coming out of it, as in Fig.38. A
box without any lines from it is the 0-point function, i.e. Z[J], see Fig.38.


Figure 38: Notation for Green’s functions, followed by the diagrammatic expansion for the
1-point function with source J (a), which can be seen diagrammatically to factorize into the
connected part times the vacuum bubbles (b).

For the 1-point function in ϕ4 theory, writing the perturbative expansion as we already

did (before we put J = 0), we have
G(x)_J = ∫ dᵈy ∆(x−y)J(y) − g ∫ dᵈz dᵈy₁ dᵈy₂ dᵈy₃ ∆(x−z)∆(y₁−z)∆(y₂−z)∆(y₃−z) J(y₁)J(y₂)J(y₃) + ...   (17.3)
and the Feynman diagrams that correspond to it are: line from x to a cross representing
the source, line from x to a vertex from which 3 other lines end on crosses. There is also:
line from x to a cross with a loop on it (setting sun), etc., see Fig.38 a for the diagrammatic
form of the above
∫ equation.∑
A cross is J(x), or i Ji in the discretized
∫ version. A propagator is a line, and so a
line between a point and a cross is ∆ij Jj = dd y∆(x − y)J(y), see Fig.38.
As we said, a box with J inside and a line ending on an external point is the 1-point
function G(x)J . We also draw the connected piece by replacing the box with a circle (Fig.38).
Then the full one-point function is the connected one-point function times the 0-point
function (vacuum bubbles, or Z[J]), in exactly the same way as we showed it in the operator
formalism, see Fig.38b. Note that this only works for the one-point functions, since for
instance for the 2-point function we also have contributions like x1 connected with a cross,
and x2 connected with another cross, as in Fig.39.


Figure 39: For the two-point function with a source, the factorization doesn’t work anymore,
since we can have disconnected pieces without vacuum bubbles, as exemplified here.

The above diagrammatic equation is written as

δZ[J]/δJ(x) = −(δW[J]/δJ(x)) Z[J]   (17.4)

where here by definition −W [J] is the generating functional of connected diagrams. The
solution of this equation is
Z[J] = N e−W [J] (17.5)
as we said it should be. Here W [J] is called free energy, since exactly like in thermodynamics,
the partition function is the exponential of minus the free energy.
Effective action
Another functional of interest is the effective action, which we will define soon, and which we will find is the generating functional of one-particle irreducible (1PI) Green's functions. One-particle irreducible means that the diagrams cannot be separated into two

disconnected pieces by cutting a single propagator. An example of a one-particle reducible
diagram would be two lines between two points, then one line to another, and then another
two to the last point, see Fig.40. To make it 1PI we would need to add another line for the
points in the middle.

1PR 1PI

Figure 40: One particle reducible graph (left): the middle propagator can be cut, separating
the graph into two pieces, versus one particle irreducible graph (right): one cannot separate
the graph into 2 pieces by cutting a single propagator.

The effective action we will find is a Legendre transform of the free energy, exactly like we
take a Legendre transform of the free energy in thermodynamics to obtain (Gibbs) potentials
like G = F − QΦ.
We need to define first the object conjugate to the source J (just like the electric potential
Φ is conjugate to the charge Q for G above). This is the classical field ϕcl [J] in the presence
of an external current J. As the name suggests, it is the properly normalized VEV of the
quantum field, which replaces the field of classical field theory at the quantum level,

ϕ_cl[J] ≡ ⟨0|ϕ̂(x)|0⟩_J / ⟨0|0⟩_J = (1/Z[J]) ∫ Dϕ e^{−S[ϕ]+J·ϕ} ϕ(x) = (1/Z[J]) δZ[J]/δJ(x) = δ(−W[J])/δJ(x) = G₁^c(x; J)   (17.6)
that is, the connected one-point function in the presence of the external source.
e.g. free scalar field theory in the discretized version
Consider the action

S₀ − J·ϕ = ∫ dᵈx [(1/2)(∂_µϕ)² + (1/2)m²ϕ²] − J·ϕ = (1/2) ϕᵢ ∆⁻¹ᵢⱼ ϕⱼ − Jₖϕₖ   (17.7)
The classical equation of motion and its solution are

∆⁻¹ᵢⱼ ϕⱼ − Jᵢ = 0 ⇒ ϕᵢ = ∆ᵢⱼ Jⱼ   (17.8)

which is the Coulomb law.
On the other hand, we saw that the free partition function is Z₀ = e^{Jᵢ∆ᵢⱼJⱼ/2}, so that the free energy is

−W₀[J] = Jᵢ∆ᵢⱼJⱼ/2   (17.9)

implying that in the free theory the classical field is
ϕᵢ^{cl(0)} = (δ/δJᵢ)(−W₀[J]) = ∆ᵢⱼ Jⱼ   (17.10)
that is, the same as the solution of the classical equation of motion.
This is so since free fields are classical: no interactions means there are no quantum fluctuations.
The classical field is represented by a propagator from the point i to a circle with a J
inside it.
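This can be checked numerically in the discretized free theory (an editor's numpy sketch, not part of the notes; the matrix ∆ and source J are random samples, and the derivative of −W₀ is taken by finite differences):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.normal(size=(n, n))
Kinv = A @ A.T + n * np.eye(n)       # a sample positive-definite Delta^{-1}
Delta = np.linalg.inv(Kinv)
J = rng.normal(size=n)

# solution of the classical equation of motion, phi = Delta J ("Coulomb law")
phi_classical = Delta @ J

# phi_cl from the free energy: -W0[J] = J.Delta.J/2, phi_cl_i = d(-W0)/dJ_i
h = 1e-6
minus_W0 = lambda JJ: 0.5 * JJ @ Delta @ JJ
phi_cl = np.array([(minus_W0(J + h*e) - minus_W0(J - h*e)) / (2*h) for e in np.eye(n)])

assert np.allclose(phi_cl, phi_classical, atol=1e-5)
```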
1PI Green’s function
Some diagrams for the classical field in ϕ4 theory are as follows: line from i to a cross;
line from i to a vertex, then from the vertex 3 lines to crosses; the same just adding a setting
sun on propagators; the vertex diagram with a circle surrounding the vertex and crossing all
4 propagators; etc., see Fig.41.


Figure 41: Diagrammatic expansion of the classical field. It can be reorganized into a self-
consistent equation, in terms of the classical field and the 1PI n-point functions.

If we think about how to write this in general, and also what to write for a general theory
instead of ϕ4 , denoting 1PI diagrams by a shaded blob, the relevant diagrams can be written
in a self-consistent sort of way as: line from i to a cross; plus propagator from i to a 1PI blob
with one external point; plus a propagator from i to a 1PI blob with two external points,
and the external point connected with the same classical field, plus 1/2 times the same with
3-point 1PI with 2 external classical fields; etc, see Fig.41.
We then obtain the following self-consistency equation for the classical field
ϕᵢ^{cl} = ∆ᵢⱼ [Jⱼ − (Γⱼ + Πⱼₖ ϕₖ^{cl} + (1/2) Γⱼₖₗ ϕₖ^{cl}ϕₗ^{cl} + ... + (1/(n−1)!) Γⱼᵢ₁...ᵢₙ₋₁ ϕᵢ₁^{cl}...ϕᵢₙ₋₁^{cl} + ...)]   (17.11)

Note that only the Πjk was not called Γjk , the rest of the 1PI n-point functions were called
Γi1 ...in , we will see shortly why.
We define the generating functional of 1PI Green’s functions
Γ̂(ϕ^{cl}) = Γᵢ ϕᵢ^{cl} + (1/2) Πᵢⱼ ϕᵢ^{cl}ϕⱼ^{cl} + (1/3!) Γᵢⱼₖ ϕᵢ^{cl}ϕⱼ^{cl}ϕₖ^{cl} + ...   (17.12)
such that the 1PI n-point functions are given by its multiple derivatives
Γᵢ₁...ᵢₙ = δⁿ/(δϕᵢ₁^{cl}...δϕᵢₙ^{cl}) Γ̂(ϕ^{cl})|_{ϕ^{cl}=0}   (17.13)

(only for two points we have Πij ). Then the self-consistency equation (17.11) is written as
−δW/δJᵢ ≡ ϕᵢ^{cl} = ∆ᵢⱼ (Jⱼ − δΓ̂/δϕⱼ^{cl})   (17.14)

In turn, this is written as

δΓ̂/δϕᵢ^{cl} + ∆⁻¹ᵢⱼ ϕⱼ^{cl} = Jᵢ ⇒ (δ/δϕᵢ^{cl}) [Γ̂(ϕ^{cl}) + (1/2) ϕₖ^{cl} ∆⁻¹ₖⱼ ϕⱼ^{cl}] = Jᵢ   (17.15)
That means that we can define the effective action
Γ(ϕ^{cl}) = Γ̂(ϕ^{cl}) + (1/2) ϕₖ^{cl} ∆⁻¹ₖⱼ ϕⱼ^{cl}   (17.16)
such that we have
δΓ(ϕ^{cl})/δϕᵢ^{cl} = Jᵢ   (17.17)
just like the classical equation of motion is
δS/δϕᵢ = Jᵢ   (17.18)

In other words, ϕcl plays the role of the field ϕ in the classical field theory, and instead of the
classical action S, we have the effective action Γ, which contains all the quantum corrections.
For the effective action we define
Γ(ϕ^{cl}) = Σ_{N≥1} (1/N!) Γᵢ₁...ᵢ_N ϕᵢ₁^{cl}...ϕᵢ_N^{cl}   (17.19)

where the only difference from the generating functional of the 1PI diagrams is in the 2-point
function, where
Γᵢⱼ = Πᵢⱼ + ∆⁻¹ᵢⱼ   (17.20)
which does not have a diagrammatic interpretation (we have ∆−1 , not ∆), see Fig.42. Oth-
erwise, the effective action contains only 1PI diagrams.


Figure 42: The effective action is not technically diagrammatic, because of the inverse free
propagator.

Theorem: The effective action is the Legendre transform of the free energy, i.e.

e^{−Γ(ϕ^{cl}) + Jᵢϕᵢ^{cl}} = ∫ Dϕ e^{−S[ϕ] + Jᵢϕᵢ} = e^{−W[J]}   (17.21)

or

Γ[ϕ^{cl}] = W[J] + Jₖ ϕₖ^{cl}   (17.22)
Proof: We take the derivative with respect to ϕᵢ^{cl} of the needed relation above, and obtain

δΓ/δϕᵢ^{cl} = (δW/δJₖ)(δJₖ/δϕᵢ^{cl}) + (δJₖ/δϕᵢ^{cl}) ϕₖ^{cl} + Jᵢ = −ϕₖ^{cl} (δJₖ/δϕᵢ^{cl}) + (δJₖ/δϕᵢ^{cl}) ϕₖ^{cl} + Jᵢ = Jᵢ   (17.23)

i.e. we obtain an identity, as wanted. q.e.d.


The effective action contains all the information about the full quantum field theory, so in
principle, if we were able to find the exact effective action, it would be equivalent to solving
the quantum field theory.
The connected two-point function
The connected one-point function is

Gᵢ^C[J] = −δW[J]/δJᵢ = ϕᵢ^{cl}[J]   (17.24)
and we obtain the connected two-point function by taking a derivative,

Gᵢⱼ^C[J] = −(δ/δJᵢ)(δ/δJⱼ) W[J] = δϕᵢ^{cl}/δJⱼ   (17.25)

Then we can substitute (17.11) and write

Gᵢⱼ^C[J] = ∆ᵢⱼ − ∆ᵢₖ Πₖₗ (δϕₗ^{cl}/δJⱼ) + ...   (17.26)

where the terms that were dropped vanish when J = 0, with the assumption that ϕcl i [J =
0] = 0, which is a reasonable requirement in quantum field theory (classically, at zero source
we should have no field, and so quantum mechanically under the same condition we should

have no field VEV, or rather, this should be removed by renormalization, of which we will
learn next semester). Then at J = 0, the two point function obeys the equation

Gᵢⱼ^C = ∆ᵢⱼ − ∆ᵢₖ Πₖₗ Gₗⱼ^C   (17.27)

which gives

(δᵢₖ + ∆ᵢₗΠₗₖ) Gₖⱼ^C = ∆ᵢⱼ ⇒ (∆⁻¹ₘₖ + Πₘₖ) Gₖⱼ^C = δₘⱼ   (17.28)

which means that the connected two-point function is the inverse of the two-point term in
the effective action,
Γᵢₖ Gₖⱼ^C = δᵢⱼ   (17.29)
or explicitly

G^C = (1 + ∆Π)⁻¹ ∆ = (1 − ∆Π + ∆Π∆Π − ∆Π∆Π∆Π + ...) ∆ ⇒
Gᵢⱼ^C = ∆ᵢⱼ − ∆ᵢₗ Πₗₖ ∆ₖⱼ + ∆ᵢₗ Πₗₖ ∆ₖₘ Πₘₙ ∆ₙⱼ − ...   (17.30)

So the connected two-point function is a propagator; minus a propagator, two-point 1PI,


propagator again; plus a propagator, two-point 1PI, propagator, two-point 1PI, propagator;
minus..., as in Fig.43.


Figure 43: Diagrammatic equation relating the connected 2-point function and the 1PI 2-
point function.

So apart from signs, it is obvious from diagrammatics.
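The geometric series (17.30) is also easy to check numerically in the discretized setting (an editor's numpy sketch; the sample ∆ and Π are chosen small enough for the series to converge):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
A = rng.normal(size=(n, n))
Delta = np.linalg.inv(A @ A.T + n * np.eye(n))   # sample positive-definite propagator
Pi = 0.1 * np.eye(n)                             # small sample 1PI self-energy

# exact connected 2-point function from (17.28): (Delta^{-1} + Pi)^{-1}
G_exact = np.linalg.inv(np.linalg.inv(Delta) + Pi)

# the series Delta - Delta Pi Delta + Delta Pi Delta Pi Delta - ...
G_series = np.zeros((n, n))
term = Delta.copy()
for _ in range(60):
    G_series += term
    term = -term @ Pi @ Delta

assert np.allclose(G_series, G_exact)
```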


Classical action as the generating functional of tree diagrams
In the classical ℏ → 0 limit, the effective action turns into the classical action (all the
quantum corrections vanish). In the same limit, all the 1PI diagrams disappear and we are
left with only the vertices in the action (e.g. the 4-point vertex in the case of the ϕ4 theory).
Therefore from all the diagrams we are left with only the tree diagrams.
But note that having only tree diagrams still describes quantum mechanics: we still calculate amplitudes for quantum transitions, probabilities, and can have interference, etc. The point however is that we don't have quantum field theory (second quantization), but we still have quantum mechanics. For the transition amplitudes we still have quantum mechanical rules, and we know for instance that some of the early successes of QED came from tree diagram calculations. The point is of course that when we talk about external particles, and transitions for them, that is a quantum mechanical process: the classical field is a collection of many particles, but if we have a single particle we use quantum mechanics.
As an example, let’s consider the classical ϕ4 theory,
S[ϕ] = (1/2) ϕᵢ ∆⁻¹ᵢⱼ ϕⱼ + (λ/4!) Σᵢ ϕᵢ⁴   (17.31)

with equation of motion


∆⁻¹ᵢⱼ ϕⱼ + (λ/3!) ϕᵢ³ = Jᵢ   (17.32)
3! i
with solution
ϕᵢ = ∆ᵢⱼ Jⱼ − (λ/3!) ∆ᵢⱼ ϕⱼ³   (17.33)
This is a self-consistent solution (like for ϕcl at the quantum level), written as the classical
field (line connected with a circle with J in the middle, at classical level)= line ending in
cross −λ/3!× line ending in vertex connected with 3 classical fields.
It can be solved perturbatively, by replacing the classical field first with the line ending
in cross, substituting on the right hand side of the equation, then the right hand side of the
resulting equation is reintroduced instead of the classical field, etc. In this way, we obtain
that the classical field (at classical level) equals the line ending in cross; −λ/3!× line ending
in vertex, with 3 lines ending in cross; +3(−λ/3!)2 × a tree with one external point and
5 crosses; etc. In this way we obtain all possible trees with one external point and many
external crosses. This is also what we would get from ϕcl by keeping only the tree diagrams.

Important concepts to remember

• The generating functional of connected diagrams is −W[J] = ln Z[J], known as the free energy by analogy with thermodynamics.

• The classical field, defined as the normalized VEV of the quantum field, is the connected
one-point function in the presence of a source J.

• The generating functional of 1PI diagrams is Γ̂(ϕcl ), and the effective action Γ is the
same, just in the two-point object we need to add ∆−1 , i.e. we have Γij = Πij + ∆−1ij .

• The effective action Γ[ϕcl ] is the Legendre transform of the free energy W [J].

• The connected two-point function at J = 0, Gᵢⱼ^C, is the inverse of the two-point effective action Γᵢⱼ.

• The classical action is the generating functional of tree diagrams. Tree diagrams still
can be used for quantum mechanics, but we have no quantum field theory.

Further reading: See chapters 2.5, 4.1 in [4] and 3.1, 3.3 in [3].

Exercises, Lecture 17

1) Consider the hypothetical effective action in p space (here ∆−1 = p2 , as usual for
massless scalars)

$$\Gamma(\phi^{cl}) = \int\frac{d^4p}{(2\pi)^4}\,\phi^{cl}(p)\,(p^2+Gp^4)\,\phi^{cl}(-p) + \exp\left\{\lambda\int\left(\prod_{i=1}^4\frac{d^4p_i}{(2\pi)^4}\right)\delta^4(p_1+p_2+p_3+p_4)\,\phi^{cl}(p_1)\phi^{cl}(p_2)\phi^{cl}(p_3)\phi^{cl}(p_4)\right\} \qquad (17.34)$$

a) Calculate the 1PI 4-point Green's function $\Gamma_{p_1p_2p_3p_4}$.

b) Calculate the 1PI 2-point function $\Pi_{p,-p}$ and the connected 2-point function.

2) Consider the theory


$$S[\phi] = \frac{1}{2}\phi_i\Delta^{-1}_{ij}\phi_j + \frac{\lambda}{3!}\sum_i\phi_i^3 \qquad (17.35)$$
in the presence of sources Ji . Write down the perturbative diagrams for the classical equation
of motion up to (including) order λ3 .

18 Lecture 18. Dyson-Schwinger equations and Ward
identities
At the classical level, we have the classical equations of motion

$$\frac{\delta S[\phi]}{\delta\phi_i} - J_i = 0 \qquad (18.1)$$
But we know from Ehrenfest’s theorem for quantum mechanics that in the quantum
theory these classical equations should hold on the quantum average, or VEV (note that
we are talking about the quantum average of the equation for the field, not the classical
equation for the quantum average of the field!). In the path integral formalism, these turn
out to be the Dyson-Schwinger equations in a very easy (almost trivial) way. Historically,
these equations were found in the operator formalism, and it was quite nontrivial to find
them.
Consider the following simple identity
$$\int_{-\infty}^{+\infty} dx\,\frac{d}{dx}f(x) = 0 \qquad (18.2)$$

if f (±∞) = 0, and generalize it to the path integral case. In the path integral in Euclidean
space, we have something similar, just that the boundary condition is even simpler: instead
of fields going to zero at t = ±∞, we use periodic boundary conditions (Euclidean time is
periodic, namely is a circle of radius R → ∞), hence we have

$$0 = \int\mathcal{D}\phi\,\frac{\delta}{\delta\phi_i}\,e^{-S[\phi]+J\cdot\phi} \qquad (18.3)$$
Writing explicitly, we have
$$0 = \int\mathcal{D}\phi\left[-\frac{\delta S[\phi]}{\delta\phi_i} + J_i\right]e^{-S[\phi]+J\cdot\phi} = \left[-\frac{\delta S}{\delta\phi_i}\bigg|_{\phi=\frac{\delta}{\delta J}} + J_i\right]Z[J] \qquad (18.4)$$


where in the second equality we used the usual fact that, for example, $\int\mathcal{D}\phi\,\phi_i\,e^{-S[\phi]+J\cdot\phi} = \int\mathcal{D}\phi\,(\delta/\delta J_i)e^{-S[\phi]+J\cdot\phi} = (\delta/\delta J_i)Z[J]$. This equation is the Dyson-Schwinger equation. Now it
appears trivial, but it was originally derived from an analysis of Feynman diagrams, where
it looks nontrivial. Now, we will derive the relation for Feynman diagrams from this.
To see it, we first specialize for a theory with quadratic kinetic term, which is a pretty
general requirement. Therefore consider
$$S[\phi] = \sum_{ij}\frac{1}{2}\phi_i\Delta^{-1}_{ij}\phi_j + S_I[\phi] \qquad (18.5)$$

Then we obtain (by differentiating and multiplying with ∆li )
$$\sum_i\Delta_{li}\frac{\delta S[\phi]}{\delta\phi_i} = \phi_l + \sum_i\Delta_{li}\frac{\delta S_I[\phi]}{\delta\phi_i} \qquad (18.6)$$

and substituting it in the Dyson-Schwinger equation (18.4), we get


$$\left[-\frac{\delta}{\delta J_l} - \sum_i\Delta_{li}\frac{\delta S_I}{\delta\phi_i}\bigg|_{\phi=\frac{\delta}{\delta J}} + \sum_i\Delta_{li}J_i\right]Z[J] = 0 \qquad (18.7)$$

so that we get the Dyson-Schwinger equation for Z[J],


$$\frac{\delta}{\delta J_l}Z[J] = \sum_i\Delta_{li}J_i\,Z[J] - \sum_i\Delta_{li}\frac{\delta S_I[\phi]}{\delta\phi_i}\bigg|_{\phi=\frac{\delta}{\delta J}}Z[J] \qquad (18.8)$$

From it, we can derive a Dyson-Schwinger equation for the full Green’s functions (with
a diagrammatic interpretation) if we choose a particular interaction term.
Let’s consider a ϕ3 plus ϕ4 interaction
$$S_I[\phi] = \frac{g_3}{3!}\sum_i\phi_i^3 + \frac{g_4}{4!}\sum_i\phi_i^4 \qquad (18.9)$$

Then the Dyson-Schwinger equation (18.8) becomes


$$\frac{\delta}{\delta J_l}Z[J] = \sum_i\Delta_{li}J_i\,Z[J] - \sum_i\Delta_{li}\left[\frac{g_3}{2!}\frac{\delta^2}{\delta J_i\delta J_i} + \frac{g_4}{3!}\frac{\delta^3}{\delta J_i\delta J_i\delta J_i}\right]Z[J] \qquad (18.10)$$

We put l = i1 and then take (δ/δJi2 )...(δ/δJin ) and finally put J = 0 to obtain a relation
between (full) Green’s functions,
$$G^{(n)}_{i_1\ldots i_n} = \Delta_{i_1i_2}G^{(n-2)}_{i_3\ldots i_n} + \Delta_{i_1i_3}G^{(n-2)}_{i_2i_4\ldots i_n} + \ldots + \Delta_{i_1i_n}G^{(n-2)}_{i_2i_3\ldots i_{n-1}} - \sum_i\Delta_{i_1i}\left[\frac{g_3}{2!}G^{(n+1)}_{iii_2\ldots i_n} + \frac{g_4}{3!}G^{(n+2)}_{iiii_2\ldots i_n}\right] \qquad (18.11)$$
We can write a diagrammatic form of this equation. The full Green’s function is represented
by a box with n external points, i1 , ..., in . We need to choose one special point, i1 above,
and we can write it on the left of the box, and the other on the right of the box. There are
n − 1 terms where we connect i1 with one of the other by a propagator, disconnected from
a box with the remaining n − 2 external points, minus 1/2!× a term where the propagator
from i1 reaches first an i, where it splits into two before reaching the box, minus 1/3!× a
term where the propagator from i1 reaches first an i where it splits into 3 before reaching
the box, see Fig.44 for more details.
The whole perturbative expansion can be obtained by iterating again the Dyson-Schwinger
equation.
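A quick sanity check of the Dyson-Schwinger relation is possible in zero dimensions, where the path integral becomes an ordinary integral. For $S = \phi^2/2 + g\,\phi^4/4!$ (so $\Delta = 1$ and only a quartic vertex), the relation between Green's functions reads $\langle\phi^{n+1}\rangle = n\langle\phi^{n-1}\rangle - (g/3!)\langle\phi^{n+3}\rangle$; the sketch below verifies it by quadrature (the coupling value is an arbitrary choice):

```python
import numpy as np

g = 0.3                                          # arbitrary illustrative coupling
phi = np.linspace(-8.0, 8.0, 200001)
weight = np.exp(-phi**2 / 2 - g * phi**4 / 24)   # e^{-S}, S = phi^2/2 + g*phi^4/4!

def moment(n):
    """<phi^n> from the ordinary integral replacing the path integral."""
    return float((phi**n * weight).sum() / weight.sum())

# Dyson-Schwinger relation in 0 dimensions (Delta = 1, only the phi^4 vertex):
# <phi^{n+1}> = n <phi^{n-1}> - (g/3!) <phi^{n+3}>
for n in (1, 3, 5):
    lhs = moment(n + 1)
    rhs = n * moment(n - 1) - g / 6.0 * moment(n + 3)
    print(n, abs(lhs - rhs) < 1e-6)
```

The relation holds exactly in the continuum; the small tolerance only absorbs the discretization error of the grid.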
Example We will see this in the example of the two-point function in the above theory.
We first write the Dyson-Schwinger equation for it as above,
$$G^{(2)}_{i_1i_2} = \Delta_{i_1i_2}G^{(0)} - \frac{g_3}{2!}\sum_i\Delta_{i_1i}G^{(3)}_{iii_2} - \frac{g_4}{3!}\sum_i\Delta_{i_1i}G^{(4)}_{iiii_2} \qquad (18.12)$$

Figure 44: Dyson-Schwinger equation for the n-point function for the ϕ3 plus ϕ4 theory.

and then substitute in it the Dyson-Schwinger equations for $G^{(3)}_{iii_2}$ and $G^{(4)}_{iiii_2}$, with the first index on each considered as the special one (note that we cannot write a DS equation for $G^{(0)}$),
$$G^{(3)}_{iii_2} = \Delta_{ii}G^{(1)}_{i_2} + \Delta_{ii_2}G^{(1)}_{i} - \frac{g_3}{2!}\sum_j\Delta_{ij}G^{(4)}_{jjii_2} - \frac{g_4}{3!}\sum_j\Delta_{ij}G^{(5)}_{jjjii_2}$$
$$G^{(4)}_{iiii_2} = \Delta_{ii}G^{(2)}_{ii_2} + \Delta_{ii}G^{(2)}_{ii_2} + \Delta_{ii_2}G^{(2)}_{ii} - \frac{g_3}{2!}\sum_j\Delta_{ij}G^{(5)}_{jjiii_2} - \frac{g_4}{3!}\sum_j\Delta_{ij}G^{(6)}_{jjjiii_2} \qquad (18.13)$$

These are represented in Fig.45.


We then obtain
$$G^{(2)}_{i_1i_2} = \Delta_{i_1i_2}G^{(0)} - \frac{g_3}{2!}\sum_i\Delta_{i_1i}\left[\Delta_{ii}G^{(1)}_{i_2} + \Delta_{ii_2}G^{(1)}_{i} - \frac{g_3}{2!}\sum_j\Delta_{ij}G^{(4)}_{jjii_2} - \frac{g_4}{3!}\sum_j\Delta_{ij}G^{(5)}_{jjjii_2}\right]$$
$$- \frac{g_4}{3!}\sum_i\Delta_{i_1i}\left[\Delta_{ii}G^{(2)}_{ii_2} + \Delta_{ii}G^{(2)}_{ii_2} + \Delta_{ii_2}G^{(2)}_{ii} - \frac{g_3}{2!}\sum_j\Delta_{ij}G^{(5)}_{jjiii_2} - \frac{g_4}{3!}\sum_j\Delta_{ij}G^{(6)}_{jjjiii_2}\right] \qquad (18.14)$$
This is represented in Fig.46. By iterating it further, we can obtain the whole perturbative
expansion.
Let's consider the expansion of the above second iteration to order $(g_3)^0(g_4)^1$. Since for $G^{(0)}$ the nontrivial terms need to have at least two vertices, it only contributes at order zero, i.e. as $G^{(0)}=1$. The first bracket does not contribute either, since all its terms have at least one $g_3$ in them. In the second bracket, the last two terms are of orders $(g_3)^1(g_4)^1$ and $(g_4)^2$ respectively, so they don't contribute either. Thus, besides the trivial propagator term, only the first 3 terms in the second bracket contribute. But in them, the iterated two-point functions must be trivial (i.e. replaced by the free propagator) in order to contribute at order $(g_3)^0(g_4)^1$ to $G^{(2)}_{ij}$.


Figure 45: Ingredients for iterating the Dyson-Schwinger equation (step 1). Consider the
first iteration for the 2-point function, and choose a specific leg on the rhs. The Dyson-
Schwinger equations for the 3-point and 4-point functions with a special leg as chosen above
are written below.

We then obtain that to order $(g_3)^0(g_4)^1$, the two-point function $G^{(2)}_{ij}$ is the propagator $\Delta_{ij}$ plus $3\times 1/3!\times(-g_4)$ times the propagator self-correction, i.e.
$$G^{(2)}_{ij} = \Delta_{ij} - g_4\,\frac{3}{3!}\sum_k\Delta_{ik}\Delta_{kk}\Delta_{kj} + \ldots \qquad (18.15)$$

This is written diagrammatically in Fig.47.


Note that in this way we have obtained the correct symmetry factor for the one-loop
diagram, S = 2, since 1/S = 3/3!. That means that the iterated Dyson-Schwinger equation is
an algorithmic way to compute the symmetry factors in cases where they are very complicated
and we are not sure of the algebra.
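The symmetry factor $S=2$ found above can be cross-checked by brute-force counting of Wick contractions: for the first-order tadpole correction to the propagator, the number of contractions of $\langle\phi_x\phi_y(\phi_z)^4\rangle$ with tadpole topology, divided by the $4!$ of the vertex, should equal $3/3! = 1/S$. A small enumeration (the point labels x, y, z1..z4 are hypothetical):

```python
from math import factorial

def pairings(points):
    """Enumerate all perfect matchings (Wick contractions) of a list of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i in range(len(rest)):
        for sub in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + sub

# Two external points x, y and one phi^4 vertex with legs z1..z4
points = ["x", "y", "z1", "z2", "z3", "z4"]

tadpole = 0
for m in pairings(points):
    partner = {}
    for a, b in m:
        partner[a], partner[b] = b, a
    # Tadpole topology: x and y each attach to a vertex leg
    # (the remaining two vertex legs then contract with each other)
    if partner["x"].startswith("z") and partner["y"].startswith("z"):
        tadpole += 1

print(tadpole)                      # 12 = 4*3 contractions
print(tadpole / factorial(4))       # 0.5 = 3/3! = 1/S, i.e. S = 2
```

This is exactly the algorithmic counting that the iterated Dyson-Schwinger equation automates.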
Note also that we wrote Dyson-Schwinger equations for the full Green’s functions, but
we can also write such equations for the connected and 1PI Green’s functions, and these are
also iterative and self-consistent. We will however not do it here.
Symmetries and Ward identities
Noether’s theorem
We will first review Noether's theorem. Consider a global symmetry
$$\delta\phi^i = \epsilon^a(iT^a)_{ij}\phi^j \qquad (18.16)$$

= −1/2! +

−1/2! −1/3!

−1/3!
+ +

+ −1/2! −1/3!

Figure 46: The second iteration of the Dyson-Schwinger equation, using the ingredients from
step 1.

where ϵa are constant symmetry parameters. We have


$$0 = \delta S = \int d^4x\left[\frac{\partial\mathcal{L}}{\partial\phi^i}\delta\phi^i + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\partial_\mu\delta\phi^i\right] = \int d^4x\left[\left(\frac{\partial\mathcal{L}}{\partial\phi^i} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\right)\delta\phi^i + \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\delta\phi^i\right)\right] \qquad (18.17)$$
If the equations of motion are satisfied, the first bracket is zero, and then substituting for δϕi
the transformation under the global symmetry, we obtain that classically (if the equations
of motion are satisfied)
$$(\delta\mathcal{L})^{\rm class}_{\rm symm} = i\sum_a\epsilon^a\,\partial_\mu\left[\sum_{ij}\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}(T^a)_{ij}\phi^j(x)\right] \qquad (18.18)$$

We then obtain that the current is


$$j^a_\mu = \sum_{ij}\frac{\partial\mathcal{L}}{\partial(\partial^\mu\phi^i(x))}(T^a)_{ij}\phi^j(x) \qquad (18.19)$$

and is classically conserved on-shell, i.e.


$$\partial^\mu j^a_\mu = 0 \qquad (18.20)$$


Figure 47: The result of iterating the Dyson-Schwinger equation for the 2-point function, at
order (g3 )0 (g4 )1 .


which is the statement of the Noether theorem. Since Qa = d3 xj0a , current conservation
means that dQ/dt = 0, i.e. the charge is conserved in time.
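As a standard illustration of (18.19) and (18.20) (a textbook example, not worked out in the text at this point), consider the U(1) symmetry $\delta\phi = i\epsilon\phi$, $\delta\phi^* = -i\epsilon\phi^*$ of a free complex scalar, in the mostly-plus conventions used in these notes:

```latex
% Free complex scalar with U(1) symmetry \delta\phi = i\epsilon\phi
\mathcal{L} = -\partial_\mu\phi^*\,\partial^\mu\phi - m^2\phi^*\phi
% Applying (18.19) to the pair (\phi,\phi^*) gives the Noether current
j_\mu = \frac{\partial\mathcal{L}}{\partial(\partial^\mu\phi)}(i\phi)
      + \frac{\partial\mathcal{L}}{\partial(\partial^\mu\phi^*)}(-i\phi^*)
      = -i\left(\phi\,\partial_\mu\phi^* - \phi^*\partial_\mu\phi\right)
% whose divergence vanishes on the equations of motion (\Box - m^2)\phi = 0:
\partial^\mu j_\mu = -i\left(\phi\,\Box\phi^* - \phi^*\Box\phi\right)
                   = -i\,m^2\left(\phi\,\phi^* - \phi^*\phi\right) = 0 .
```

The cross terms $\partial^\mu\phi\,\partial_\mu\phi^*$ cancel between the two pieces, so conservation holds only on-shell, as stated.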
Under a symmetry transformation, but off-shell (if the equations of motion are not sat-
isfied), we still have δS = 0, which in turn means that
$$0 = \delta\mathcal{L} = i\epsilon^a(T^a)_{ij}\left[\left(\frac{\partial\mathcal{L}}{\partial\phi^i} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\right)\phi^j + \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\phi^j\right)\right] \qquad (18.21)$$

We now promote the global transformation (18.16) to a local one, with parameter ϵa (x).
Obviously then, it is not a symmetry anymore, so under it, δS is not zero anymore, but
rather
$$\delta S = \int d^4x\,i\epsilon^a(x)(T^a)_{ij}\left[\left(\frac{\partial\mathcal{L}}{\partial\phi^i} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\right)\phi^j + \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^i(x))}\phi^j\right)\right] + i\int d^4x\sum_{a,i,j}(\partial_\mu\epsilon^a)\frac{\partial\mathcal{L}}{\partial(\partial^\mu\phi^i(x))}(T^a)_{ij}\phi^j(x)$$
$$= i\sum_a\int d^4x\,(\partial^\mu\epsilon^a(x))\,j^a_\mu(x) = -i\sum_a\int d^4x\,\epsilon^a(x)\,\partial^\mu j^a_\mu(x) \qquad (18.22)$$

where in the second equality we have used (18.21) and in the last step we have assumed
that ϵa (x) ”vanish at infinity” so that we can partially integrate without boundary terms.
Therefore we have the variation under the local version of the global symmetry,
$$\delta S = i\sum_a\int d^4x\,(\partial^\mu\epsilon^a(x))\,j^a_\mu(x) = -i\sum_a\int d^4x\,\epsilon^a(x)\,\partial^\mu j^a_\mu(x) \qquad (18.23)$$

This formula is valid off-shell as we said, so we can use it inside the path integral. This gives
a way to identify the current as the coefficient of ∂µ ϵa in the variation of the action under
the local version of the global symmetry.
Ward identities
Since classically (on the classical equations of motion) we have ∂ µ jµa = 0, by Ehrenfest’s
theorem we expect to have ∂ µ jµa = 0 as a VEV (quantum average), and that will be the
Ward identity.

However, it can happen that there are quantum anomalies, meaning that the classical
symmetry is not respected by quantum corrections, and we then get that the classical equa-
tion ∂ µ jµa = 0 is not satisfied as an average at the quantum level.
We consider the local version of the symmetry transformation,

$$\delta\phi^i(x) = i\sum_{a,j}\epsilon^a(x)(T^a)_{ij}\phi^j(x) \qquad (18.24)$$

where $\phi' = \phi + \delta\phi$. Changing integration variables from $\phi$ to $\phi'$ (really just renaming the variable) does nothing, so
$$\int\mathcal{D}\phi'\,e^{-S[\phi']} = \int\mathcal{D}\phi\,e^{-S[\phi]} \qquad (18.25)$$

However, now comes a crucial assumption: if the Jacobian from $\mathcal{D}\phi$ to $\mathcal{D}\phi'$ is 1, then $\mathcal{D}\phi' = \mathcal{D}\phi$, so
$$0 = \int\mathcal{D}\phi\left[e^{-S[\phi']} - e^{-S[\phi]}\right] = -\int\mathcal{D}\phi\,\delta S[\phi]\,e^{-S[\phi]} \qquad (18.26)$$

This assumption however is not true in general. In the path integral formalism, quantum
anomalies appear exactly as anomalous jacobians for the change of variables in the measure,
and lead to breakdown of the symmetries at the quantum level. However, these will be
described in the second semester of the course, so we will not discuss them further here.
Now we can use the previously derived form for δS under a local version of a global
symmetry, eq. (18.23), obtaining
$$0 = \int d^4x\,i\epsilon^a(x)\int\mathcal{D}\phi\,e^{-S[\phi]}\,\partial^\mu j^a_\mu(x) \qquad (18.27)$$

But since the parameters ϵa (x) are arbitrary, we can derive also that

$$\int\mathcal{D}\phi\,e^{-S[\phi]}\,\partial^\mu j^a_\mu(x) = 0 \qquad (18.28)$$

which is indeed the quantum-averaged version of the conservation law, namely the Ward identity.
Note that here $j^a_\mu$ is also a function of $\phi$ (the object integrated over), so it cannot be taken out of the path integral (this is an equation of the type $\int dx\,f(x) = 0$, from which nothing further can be deduced).
We can derive more general Ward identities by considering a general operator $A = A(\{\phi^i\})$, such that
$$\delta A(\{\phi^i\}) = \int d^4x\,\frac{\delta A(\{\phi^i\})}{\delta\epsilon^a(x)}\,\epsilon^a(x) \qquad (18.29)$$
Then analogously we get
$$0 = \int\mathcal{D}\phi\,\delta\left[e^{-S[\phi]}A(\{\phi^i\})\right] = i\int d^4x\,\epsilon^a(x)\int\mathcal{D}\phi\,e^{-S[\phi]}\left[\partial^\mu j^a_\mu(x)\,A - i\frac{\delta A(\{\phi^i\})}{\delta\epsilon^a(x)}\right] \qquad (18.30)$$

so that we have the more general Ward identities
$$\int\mathcal{D}\phi\,e^{-S[\phi]}\,(\partial^\mu j^a_\mu(x))\,A = i\int\mathcal{D}\phi\,e^{-S[\phi]}\,\frac{\delta A(\{\phi^i\})}{\delta\epsilon^a(x)} \qquad (18.31)$$

There are also other forms of Ward identities we can write. For instance, choosing $A = e^{J\cdot\phi}$, or in other words adding a current $J$, gives
$$0 = \int\mathcal{D}\phi\,\frac{\delta}{\delta\epsilon^a(x)}e^{-S+J\cdot\phi} = i(T^a)_{ij}\int\mathcal{D}\phi\left[-\frac{\delta S}{\delta\phi^i(x)}\phi^j(x) + J_i\,\phi^j(x)\right]e^{-S+J\cdot\phi} \qquad (18.32)$$

Now using the usual trick of replacing $\phi^i$ inside the path integral with $\delta/\delta J_i$, we obtain the Ward identities
$$i(T^a)_{ij}\left[-\frac{\delta S}{\delta\phi^i(x)}\bigg|_{\phi=\frac{\delta}{\delta J}} + J_i(x)\right]\frac{\delta}{\delta J_j(x)}Z[J] = 0 \qquad (18.33)$$

Taking further derivatives with respect to J and then putting J = 0 we obtain Ward identities
between the Green’s functions, in the same way it happened with the Dyson-Schwinger
equation. Also in this case we can write Ward identities for connected or 1PI Green’s
functions as well.
Ward identities play a very important role in the discussion of symmetries, since they
constrain the form of the Green’s functions.
For example, we will see later that in QED, the transversality condition k µ Aµ = 0, coming
from the Lorentz gauge condition, that fixes (part of) the (local) gauge invariance, gives a
similar constraint on Green’s functions. In particular, for the 1PI 2-point function Πµν we
can write
$$k^\mu\Pi_{\mu\nu} = 0 \qquad (18.34)$$
which implies that the Lorentz structure of Πµν is fixed, namely

$$\Pi_{\mu\nu}(k) = (k^2\delta_{\mu\nu} - k_\mu k_\nu)\,\Pi(k^2) \qquad (18.35)$$

This constraint then is a local analogue of the Ward identities for global symmetries.
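That the transverse structure (18.35) automatically satisfies (18.34) can be seen in one line, or checked numerically as below (the momentum components are arbitrary illustrative numbers):

```python
import numpy as np

k = np.array([1.3, -0.7, 2.1, 0.4])   # arbitrary Euclidean 4-momentum (illustrative)
k2 = float(k @ k)

# Pi_{mu nu} = (k^2 delta_{mu nu} - k_mu k_nu) Pi(k^2); take Pi(k^2) = 1
Pi = k2 * np.eye(4) - np.outer(k, k)

print(np.allclose(k @ Pi, 0.0))        # True: k^mu Pi_{mu nu} = 0 identically
```

The contraction $k^\mu\Pi_{\mu\nu} = k^2 k_\nu - (k\cdot k)k_\nu$ cancels exactly, for any $k$ and any $\Pi(k^2)$.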

Important concepts to remember

• The Dyson-Schwinger equation is the quantum version of the classical equation of motion $\delta S/\delta\phi_i = J_i$, namely its quantum average (under the path integral).

• It can be written as an operator acting on Z[J], and from it we can deduce a relation
between Green’s functions, for ϕk interactions relating G(n) with G(n−2) and G(n+k−2) .

• By iterating the Dyson-Schwinger equation we can get the full perturbative expansion,
with the correct symmetry factors. It therefore gives an algorithmic way to calculate
the symmetry factors.

• The Noether conservation gives on-shell conservation of the current jµa associated with
a global symmetry.

• By making $\epsilon\to\epsilon(x)$, the variation of the action is $\delta S = i\sum_a\int d^4x\,(\partial^\mu\epsilon^a)j^a_\mu$.

• Ward identities are quantum-averaged versions of the classical (on-shell) current conservation equations $\partial^\mu j^a_\mu = 0$, perhaps with an operator $A$ inserted, and its variation on the right hand side. However, there can be quantum anomalies which spoil them, manifesting themselves as anomalous Jacobians for the transformation of the path integral measure.

• Ward identities can also be written as an operator acting on Z[J], and from it deriving
relations between Green’s functions.

Further reading: See chapters 4.2, 4.3 in [4], 3.1 in [6] and 9.6 in [2]

Exercises, Lecture 18

1) Consider the scalar theory with interaction term


$$S_I(\phi) = \frac{g_5}{5!}\sum_i\phi_i^5 \qquad (18.36)$$

Write down the Dyson-Schwinger equation for the n-point function in this theory (equation
and diagrammatic). Iterate it once more for the two-point function.

2) Consider the Euclidean action for N real scalars ϕi ,


$$S = \int d^dx\left[\frac{1}{2}\sum_{j=1}^N(\partial_\mu\phi_j)^2 + \frac{\lambda}{4!}\Big(\sum_i\phi_i^2\Big)^2\right] \qquad (18.37)$$

invariant under rotations of the $N$ scalars. Write down the explicit forms of the two types of Ward identities:

- for $A = \sum_i(\phi_i)^2$;
- for the partition function $Z[J]$.

19 Lecture 19. Cross sections and the S-matrix
We now turn to experimentally measurable quantities.
We already said that the S-matrix is something like $S_{fi} = \langle f|S|i\rangle$, with $S = U_I(+\infty,-\infty)$ (the evolution operator in the interaction picture; this lecture we will use the Heisenberg picture, however, and the next one the interaction picture), and that the LSZ formula relates S-matrices to residues at all the $p_i^2 = -m_i^2$ external line poles of the momentum space Green's functions $\tilde G_n(p_1,\ldots,p_n)$. Now we define these notions more precisely and relate them to experimental observables, in particular the cross section.
Cross sections
In almost all experiments, we scatter two objects: usually projectiles off a fixed target ("laboratory reference frame"), or (like at the LHC) two particles colliding head-on in the center of mass frame.


Figure 48: Scattering usually involves shooting a moving projectile at a fixed target.

If we have a box of particles target A, with density ρA and length lA along the direction
of impact, and a box of projectiles B, with density ρB and length lB along the direction of
impact, and with common cross sectional area (area of impact) A, as in Fig.48, the number
of scattering events (when the projectiles are scattered) is easily seen to be proportional to
all these, ρA , lA , ρB , lB and A. We can then define the cross section σ as
$$\sigma = \frac{\text{Nr. of scatt. events}}{\rho_A\,l_A\,\rho_B\,l_B\,A} \qquad (19.1)$$
This is the definition used for instance by Peskin and Schroeder. But there are more intuitive ways to define it. Before that, let's note that the dimension of $\sigma$ is $[\sigma] = 1/[(L^{-2})(L^{-2})L^2] = L^2$, i.e. of area. The physical interpretation of the cross section is the effective (cross-sectional) area of interaction per target particle: within such an area around one target, incident projectiles are scattered.
The cross section should be familiar from classical mechanics, where one can do the
same. For instance, in the famous Rutherford experiment, proving that the atoms are
not continuous media (”jellium model”), but rather a nucleus surrounded by empty space

and then orbiting electrons, one throws charged particles (e.g. electrons) against a target,
and observes the scattered objects. The simple model which agrees with the experiment
(”Rutherford scattering”) is: one scatters the classical pointlike particle off a central potential
V (r) = Ze2 /r, and calculates the classical cross section.
The cross section definition for scattering on one single target in that case is, more
intuitively,
$$\sigma = \frac{\Delta N_{\rm scatt}/\Delta t}{\phi_0} = \frac{\Delta N_{\rm scatt}/\Delta t}{\Delta N_{\rm in}/(\Delta t\,A)} = \frac{\Delta N_{\rm scatt}}{n_B} \qquad (19.2)$$
i.e. the ratio of scattered particles per time, over the incident flux, or particles per time and
unit area. Here nB is the number of incident particles per area. Also, the flux is
$$\phi_0 = \frac{\Delta N_{\rm in}}{\Delta t\,A} = \frac{\rho_B(v\Delta t)A}{\Delta t\,A} = \rho_B\,v \qquad (19.3)$$
since the incident volume in time ∆t is (v∆t)A.
Now consider the case of N targets, like above, where N = ρA lA A. Then to find the cross
section (defined per target) we must divide by N , getting
$$\sigma = \frac{\Delta N_{\rm scatt}/\Delta t}{\phi_0 N} = \frac{\Delta N_{\rm scatt}}{(v\Delta t)\rho_B(\rho_A l_A A)} = \frac{\Delta N_{\rm scatt}}{\rho_B l_B\,\rho_A l_A\,A} \qquad (19.4)$$
as above. Note that in general, if both the target and projectiles are moving, v refers to the
relative velocity, v = vrel = |⃗v1 − ⃗v2 |.
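The equivalence of the two definitions, (19.1) and (19.4), can be illustrated with a toy numerical check, using $l_B = v\Delta t$ for the length of the projectile column passing in time $\Delta t$. All beam and target numbers below are hypothetical:

```python
# Hypothetical beam/target parameters (illustrative values only)
rho_A, l_A = 1e22, 0.01      # target density (cm^-3) and thickness (cm)
rho_B, v   = 1e8, 1e9        # projectile density (cm^-3) and relative speed (cm/s)
area, dt   = 1.0, 1.0        # impact area (cm^2) and observation time (s)
sigma      = 1e-24           # assumed cross section (cm^2), i.e. 1 barn

n_targets = rho_A * l_A * area
flux      = rho_B * v                        # projectiles per cm^2 per second
n_scatt   = sigma * flux * n_targets * dt    # definition (19.4) solved for N_scatt

l_B = v * dt                                 # projectile box length passing in dt
sigma_box = n_scatt / (rho_A * l_A * rho_B * l_B * area)   # definition (19.1)
print(abs(sigma_box / sigma - 1) < 1e-12)    # True: the two definitions agree
```

The common factors cancel exactly, which is why (19.1) and (19.4) describe the same quantity.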
The above total cross section is a useful quantity, but in experiments we generally measure the momenta (or at least the directions) of the scattered particles as well, so a more useful quantity is the differential cross section, given momenta $\vec p_1,\ldots,\vec p_n$ (almost) well defined in the final state,
$$\frac{d\sigma}{d^3p_1\ldots d^3p_n} \qquad (19.5)$$

An important case, for instance for the classical Rutherford scattering above, is of n = 2,
i.e. two final states. That means 6 momentum components, constrained by the 4-momentum
conservation delta functions, leaving 2 independent variables, which can be taken to be two
angles θ, ϕ. In the case of Rutherford scattering, these were the scattering angle θ relative
to the incoming momentum, and the angle ϕ of rotation around the incoming momentum.
Together, these two angles form a solid angle Ω. Therefore, in the case of 2 → 2 scattering,
after using the 4-momentum conservation, one defines the differential cross section
$$\frac{d\sigma}{d\Omega} \qquad (19.6)$$
Decay rate
There is one more process that is useful, besides the 2 → n scattering.
Consider the decay of an unstable particle at rest into a final state made up of several
particles. Then we define the decay rate
$$\Gamma = \frac{\#\ \text{decays/time}}{\#\ \text{of particles}} = \frac{dN}{N\,dt} \qquad (19.7)$$

The lifetime τ of a particle with decay rates Γi into various channels is
$$\sum_i\Gamma_i = \frac{1}{\tau} \qquad (19.8)$$

Consider an unstable atomic state in nonrelativistic quantum mechanics. In those cases,


we know that the decay of the unstable state is found as a Breit-Wigner resonance in the
scattering amplitude, i.e.
$$f(E) \propto \frac{1}{E - E_0 + i\Gamma/2} \qquad (19.9)$$
which means that the cross section for decay, or probability, is proportional to $|f|^2$ of the above,
$$\sigma \propto \frac{1}{(E-E_0)^2 + \Gamma^2/4} \qquad (19.10)$$
I.e., the resonance appears as a bell curve with width Γ, centered around E0 .
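That $\Gamma$ is precisely the full width at half maximum of this bell curve follows directly from (19.10), and can be confirmed numerically (the resonance position and width below are illustrative numbers):

```python
def breit_wigner(E, E0, gamma):
    """Resonance shape (19.10), up to an overall constant."""
    return 1.0 / ((E - E0) ** 2 + gamma ** 2 / 4)

E0, gamma = 91.2, 2.5          # illustrative resonance position and width
peak = breit_wigner(E0, E0, gamma)
half_up = breit_wigner(E0 + gamma / 2, E0, gamma)
half_dn = breit_wigner(E0 - gamma / 2, E0, gamma)

# sigma drops to half its peak value at E0 +- gamma/2, so FWHM = gamma
print(abs(half_up / peak - 0.5) < 1e-12, abs(half_dn / peak - 0.5) < 1e-12)
```

At $E = E_0 \pm \Gamma/2$ the denominator doubles, which is the half-maximum condition.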
In relativistic quantum mechanics, or quantum field theory, the same happens. Initial
particles combine to form an unstable (or metastable) particle, that then decays into others.
In the amplitude, this is reflected in a relativistically invariant generalization of the Breit-
Wigner formula, namely an amplitude proportional to
$$\frac{1}{p^2 + m^2 - im\Gamma} \simeq \frac{-1}{2E_p\left(p^0 - E_p + \frac{im}{E_p}\frac{\Gamma}{2}\right)} \qquad (19.11)$$

i.e., of the same form near the mass shell and for Γ small.
In-out states and the S-matrix
In quantum mechanics, we first need to define the states that scatter.
-We can consider states whose wavepackets are well isolated at t = −∞, so we can
consider them non-interacting then. But these (Heisenberg) states will overlap, and therefore
interact at finite $t$. We call these Heisenberg states the in-states,
$$|\{\vec p_i\}\rangle_{\rm in} \qquad (19.12)$$
-We can also consider states whose wavepackets are well isolated at t = +∞, so we can
consider them noninteracting there, but they will overlap at finite t. We call these Heisenberg
states the out-states,
$$|\{\vec p_i\}\rangle_{\rm out} \qquad (19.13)$$
Then the in-states, after the interaction, at t = +∞, look very complicated, and reversely,
the out states, tracked back to −∞, look very complicated. But all the in-states, as well as
all the out-states form complete sets,
$$\sum|\{\vec p_i\};{\rm in}\rangle\langle\{\vec p_i\};{\rm in}| = \sum|\{\vec p_i\};{\rm out}\rangle\langle\{\vec p_i\};{\rm out}| = 1 \qquad (19.14)$$

That means that we can expand one in the other, so the amplitude for an (isolated) out
state, given an (isolated) in state is

$${}_{\rm out}\langle\vec p_1,\vec p_2,\ldots|\vec k_A,\vec k_B\rangle_{\rm in} = \lim_{T\to\infty}\langle\vec p_1,\vec p_2,\ldots(T)|\vec k_A,\vec k_B(-T)\rangle = \lim_{T\to\infty}\langle\vec p_1,\vec p_2,\ldots|e^{-iH(2T)}|\vec k_A,\vec k_B\rangle \qquad (19.15)$$

where in the last form the states are defined at the same time, so we can define it as the
time when Heisenberg=Schrödinger, thus think of them as Schrödinger states.
Then the S-matrix is
$$\langle\vec p_1,\vec p_2,\ldots|S|\vec k_A,\vec k_B\rangle = {}_{\rm out}\langle\vec p_1,\vec p_2,\ldots|\vec k_A,\vec k_B\rangle_{\rm in} \qquad (19.16)$$

which means that
$$S = e^{-iH(2T)} \qquad (19.17)$$
is the Heisenberg picture evolution operator.
Wavefunctions
The one-particle states are always isolated, so in that case
$$|\vec p\rangle_{\rm in} = |\vec p\rangle_{\rm out} = |\vec p\rangle\ \left(=\sqrt{2E_p}\,a^\dagger_{\vec p}|0\rangle\right) \qquad (19.18)$$

where in brackets we wrote the free theory result, and we can construct the one-particle states with wavefunctions
$$|\phi\rangle = \int\frac{d^3k}{(2\pi)^3}\frac{1}{\sqrt{2E_k}}\,\phi(\vec k)\,|\vec k\rangle \qquad (19.19)$$
With this normalization, the wavefunctions give the probabilities, since

d3 k
|ϕ(⃗k)|2 = 1 (19.20)
(2π)3

and so < ϕ|ϕ >= 1. A wavefunction can be something like eik·⃗x , giving an ⃗x dependence.
In the case of 2-particle states, we can write the in-states as
$$|\phi_A\phi_B\rangle_{\rm in} = \int\frac{d^3k_A}{(2\pi)^3}\int\frac{d^3k_B}{(2\pi)^3}\frac{\phi_A(\vec k_A)\phi_B(\vec k_B)}{\sqrt{2E_A}\sqrt{2E_B}}\,e^{-i\vec b\cdot\vec k_B}\,|\vec k_A\vec k_B\rangle_{\rm in} \qquad (19.21)$$
Note that we could have absorbed the factor $e^{-i\vec k_B\cdot\vec b}$ in the B wavefunction, but we have
written it explicitly since if the wavefunctions are centered around a momentum, like in a
classical case, we have a separation between particles. In the Rutherford experiment, the
imaginary line of the non-deflected incoming projectile passes at a ⃗b minimum distance close
to the target. Note then that ⃗b is perpendicular to the collision direction, i.e. is transversal.
The out state of several momenta is defined as usual,
$${}_{\rm out}\langle\phi_1,\phi_2,\ldots| = \prod_f\int\frac{d^3p_f}{(2\pi)^3}\frac{\phi_f(p_f)}{\sqrt{2E_f}}\,{}_{\rm out}\langle p_1p_2\ldots| \qquad (19.22)$$

The S-matrix of states with wavefunctions is
$$S_{\beta\alpha} = \langle\beta_{\rm out}|\alpha_{\rm in}\rangle \qquad (19.23)$$

Because this matrix gives probabilities, and the S operator is, as we saw above, an evolution
operator, it corresponds to a unitary operator, so

$$SS^\dagger = S^\dagger S = 1 \qquad (19.24)$$

But the operator S contains the case where particles go on without interacting, i.e. the
identity 1. To take that out, we define

$$S = 1 + iT \qquad (19.25)$$

Note that the i is conventional. Moreover, amplitudes always contain a momentum conser-
vation delta function. Therefore, in order to define nonsingular and finite objects, we define
the invariant (or reduced) matrix element M by
$$\langle\vec p_1,\vec p_2,\ldots|iT|\vec k_A,\vec k_B\rangle = (2\pi)^4\delta^4\Big(k_A + k_B - \sum p_f\Big)\,i\mathcal{M}(k_A,k_B\to p_f) \qquad (19.26)$$

Reduction formula (Lehmann, Symanzik, Zimmermann)


The LSZ formula relates S-matrices to Green’s functions, as mentioned. We will not
prove it here, since for that we need renormalization, and that will be done in the second
semester. We will just state it.
We also introduce an operator $A$ to make the formula a bit more general, though it is not really needed. Define the momentum space Green's functions
$$\tilde G^{(A)}_{n+m}(p^\mu_i,k^\mu_j) = \prod_{i=1}^n\int d^4x_i\,e^{-ip_i\cdot x_i}\prod_{j=1}^m\int d^4y_j\,e^{ik_j\cdot y_j}\,\langle\Omega|T\{\phi(x_1)\ldots\phi(x_n)A(0)\phi(y_1)\ldots\phi(y_m)\}|\Omega\rangle \qquad (19.27)$$
Then we have
$${}_{\rm out}\langle\{p_i\}_n|A(0)|\{k_j\}_m\rangle_{\rm in} = \lim_{p_i^2\to-m_i^2,\,k_j^2\to-m_j^2}\frac{1}{(-i\sqrt{Z})^{m+n}}\prod_{i=1}^n(p_i^2+m_i^2-i\epsilon)\prod_{j=1}^m(k_j^2+m_j^2-i\epsilon)\,\tilde G^{(A)}_{n+m}(p^\mu_i,k^\mu_j) \qquad (19.28)$$

For A = 1, we obtain a formula for the S-matrix as the multiple residue at all the external
line poles of the momentum space Green’s functions, dividing by the extra factors of Z.
The full 2-point function behaves near the pole as
$$G_2(p) = \int d^4x\,e^{-ip\cdot x}\langle\Omega|T\{\phi(x)\phi(0)\}|\Omega\rangle \sim \frac{-iZ}{p^2+m^2-i\epsilon} \qquad (19.29)$$
In other words, to find the S-matrix, we put the external lines on shell, and divide by the full propagators corresponding to all the external lines (but note that $Z$ belongs to 2 external lines, hence the $\sqrt{Z}$). This implies a diagrammatic procedure called amputation, that will be explained next lecture.

Note that the factor $Z$ has a kind of physical interpretation, since we can define it as
$$Z \equiv |\langle\Omega|\phi(0)|\vec p\rangle|^2 \qquad (19.30)$$

In other words, it is the probability to create a state from the vacuum. Note that the factor
Z = 1 + O(g 2 ), but the ”correction” is an infinite loop correction (this is the oddity of
renormalization, to be understood next semester). However, at tree level, Z = 1.
Cross sections from $\mathcal{M}$
The probability to go from $|\phi_A\phi_B\rangle$ to a state within the infinitesimal interval $d^3p_1\ldots d^3p_n$ is
$$\mathcal{P}(AB\to 12\ldots n) = \left(\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}\right)\big|{}_{\rm out}\langle p_1p_2\ldots|\phi_A\phi_B\rangle_{\rm in}\big|^2 \qquad (19.31)$$

since it must be proportional to the infinitesimal interval and to the $|\ |^2$ of the amplitude, and the rest is the correct normalization.
For one target, i.e. NA = 1, and for nB particles coming in per unit of transverse area,
the number of scattered particles is

$$\Delta N = \int d^2b\,n_B\,\mathcal{P}(\vec b) \qquad (19.32)$$

If $n_B$ is constant, we can take it out of the integral. Then the cross section is (as we saw in (19.2))
$$d\sigma = \frac{\Delta N}{n_B} = \int d^2b\,\mathcal{P}(\vec b) \qquad (19.33)$$
Replacing the form of $\mathcal{P}$ and of the states in the amplitudes there, we get
$$d\sigma = \left(\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}\right)\int d^2b\prod_{i=A,B}\int\frac{d^3k_i}{(2\pi)^3}\frac{\phi_i(\vec k_i)}{\sqrt{2E_i}}\int\frac{d^3\bar k_i}{(2\pi)^3}\frac{\phi^*_i(\vec{\bar k}_i)}{\sqrt{2\bar E_i}}\,e^{i\vec b\cdot(\vec{\bar k}_B-\vec k_B)}\left({}_{\rm out}\langle\{p_f\}|\{k_i\}\rangle_{\rm in}\right)\left({}_{\rm out}\langle\{p_f\}|\{\bar k_i\}\rangle_{\rm in}\right)^* \qquad (19.34)$$

To compute it, we use
$$\int d^2\vec b\,e^{i\vec b\cdot(\vec{\bar k}_B-\vec k_B)} = (2\pi)^2\delta^{(2)}(k_B^\perp - \bar k_B^\perp)$$
$${}_{\rm out}\langle\{p_f\}|\{k_i\}\rangle_{\rm in} = i\mathcal{M}\,(2\pi)^4\delta^{(4)}\Big(\sum k_i - \sum p_f\Big)$$
$$\left({}_{\rm out}\langle\{p_f\}|\{\bar k_i\}\rangle_{\rm in}\right)^* = -i\mathcal{M}^*(2\pi)^4\delta^{(4)}\Big(\sum\bar k_i - \sum p_f\Big) \qquad (19.35)$$

We also use
$$\int d^3\bar k_A\int d^3\bar k_B\,\delta^{(4)}\Big(\sum\bar k_i - \sum p_f\Big)\delta^{(2)}(k_B^\perp - \bar k_B^\perp) = (\bar k_i^\perp = k_i^\perp)\times\int d\bar k_A^z\,d\bar k_B^z\,\delta\Big(\bar k_A^z + \bar k_B^z - \sum p_f^z\Big)\delta\Big(\bar E_A + \bar E_B - \sum E_f\Big)$$
$$= \int d\bar k_A^z\,\delta\Big(\sqrt{\bar k_A^2 + m_A^2} + \sqrt{\bar k_B^2 + m_B^2} - \sum E_f\Big)\bigg|_{\bar k_B^z = \sum p_f^z - \bar k_A^z} = \frac{1}{\Big|\frac{\bar k_A^z}{\bar E_A} - \frac{\bar k_B^z}{\bar E_B}\Big|} = \frac{1}{|v_A - v_B|} \qquad (19.36)$$
In the last line we have used that (since $E = \sqrt{k^2+m^2}$ and there we have $\bar k_B^z = \sum p_f^z - \bar k_A^z$)
$$\frac{d\bar E_A}{d\bar k_A^z} = \frac{\bar k_A^z}{\bar E_A};\qquad \frac{d\bar E_B}{d\bar k_B^z} = \frac{\bar k_B^z}{\bar E_B} = -\frac{d\bar E_B}{d\bar k_A^z} \qquad (19.37)$$
and the fact that $\int dx\,\delta(f(x)-f(x_0)) = 1/|f'(x_0)|$.
Putting everything together, we find
( )∫ ∫ 3
∏ d3 pf 1 d 3 kA d kB
dσ = 3 3
f
(2π) 2Ef (2π) (2π)3
|M(kA , kB → {pf })|2 ( ∑ )
|ϕA (kA )|2 |ϕB (kB )|2 (2π)4 δ (4) kA + kB − pf (19.38)
2EA 2EB |vA − vB |

For states with wavefunctions centered sharply on a given momentum, $\phi(\vec k)\sim\delta(\vec k-\vec p)$, we obtain
$$d\sigma = \frac{1}{2E_A\,2E_B\,|v_A-v_B|}\left(\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}\right)|\mathcal{M}(k_A,k_B\to\{p_f\})|^2\,(2\pi)^4\delta^{(4)}\Big(k_A+k_B-\sum p_f\Big) \qquad (19.39)$$
In this formula, $|\mathcal{M}|^2$ is Lorentz invariant,

$$d\Pi_n = \prod_f\frac{d^3p_f}{(2\pi)^3}\frac{1}{2E_f}\,(2\pi)^4\delta^{(4)}\Big(k_A+k_B-\sum p_f\Big) \qquad (19.40)$$

is relativistically invariant as well, and is called the relativistically invariant $n$-body phase space. However,
$$\frac{1}{E_AE_B|v_A-v_B|} = \frac{1}{|E_Bp_A^z - E_Ap_B^z|} = \frac{1}{|\epsilon_{\mu xy\nu}\,p_A^\mu p_B^\nu|} \qquad (19.41)$$
is not relativistically invariant, meaning the cross section is not relativistically invariant either. However, if $\vec k_A\parallel\vec k_B$ (for instance in the center of mass frame, or the laboratory frame), this can be written as
$$\frac{1}{\sqrt{(p_1\cdot p_2)^2 - m_1^2m_2^2}} \qquad (19.42)$$
So if we adopt this form in all reference frames, i.e. replace in the formula for the differential cross section
$$\frac{1}{E_AE_B|v_A-v_B|} \to \frac{1}{\sqrt{(p_1\cdot p_2)^2 - m_1^2m_2^2}} \qquad (19.43)$$
then we obtain the relativistically invariant cross section. This is a theoretical concept, which
is useful since we can write this cross section in simple way in terms of Mandelstam variables

s, t (to be defined later). But it is not measurable, so for comparison with experiment, we
should remember to go back to the usual form (or just use center of mass frame or laboratory
frame).
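For collinear beams the replacement (19.43) is actually an identity, $E_AE_B|v_A-v_B| = |E_Bk_A^z - E_Ak_B^z| = \sqrt{(p_A\cdot p_B)^2 - m_A^2m_B^2}$, which can be checked numerically; the masses and momenta below are arbitrary illustrative values:

```python
import math

def flux_factors(mA, kAz, mB, kBz):
    """Return (E_A E_B |v_A - v_B|, sqrt((p_A.p_B)^2 - mA^2 mB^2))
    for two particles moving along z."""
    EA = math.hypot(mA, kAz)             # sqrt(mA^2 + kAz^2)
    EB = math.hypot(mB, kBz)
    noninvariant = EA * EB * abs(kAz / EA - kBz / EB)
    # Minkowski product for collinear momenta; the sign convention is
    # irrelevant here since only its square enters
    pdot = EA * EB - kAz * kBz
    invariant = math.sqrt(pdot ** 2 - (mA * mB) ** 2)
    return noninvariant, invariant

a, b = flux_factors(0.9, 3.0, 0.5, -2.0)   # arbitrary illustrative values
print(abs(a - b) < 1e-9)                   # True: the two factors agree
```

For non-collinear momenta the two expressions differ, which is exactly why the replacement defines a distinct, purely theoretical "invariant cross section".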
To understand the relativistically invariant n-body phase space better, consider the im-
portant case of n = 2, i.e. 2 → 2 scattering, and consider working in the center of mass
frame, so p⃗total = 0. Then we have p⃗1 , p⃗2 , but the delta function over 3-momenta imposes
p⃗2 = −⃗p1 , so we are left with
$$\int d\Pi_2 = \int\frac{dp_1\,p_1^2\,d\Omega}{(2\pi)^3}\frac{1}{2E_1\,2E_2}\,2\pi\,\delta(E_{CM}-E_1-E_2) = \int d\Omega\,\frac{p_1^2}{16\pi^2E_1E_2}\,\frac{1}{\frac{p_1}{E_1}+\frac{p_1}{E_2}}\bigg|_{\vec p_1=-\vec p_2} = \int d\Omega\,\frac{|p_1|}{16\pi^2E_{CM}} \qquad (19.44)$$
Here $E_{CM} = E_1+E_2$ and $E_1 = \sqrt{p_1^2+m_1^2}$, $E_2 = \sqrt{p_1^2+m_2^2}$.
Therefore finally we obtain that in the center of mass frame
$$\left(\frac{d\sigma}{d\Omega}\right)_{CM} = \frac{1}{2E_A\,2E_B\,|v_A-v_B|}\frac{|\vec p_1|}{16\pi^2E_{CM}}\,|\mathcal{M}(p_A,p_B\to p_1,p_2)|^2 \qquad (19.45)$$
In the case of identical masses for all the particles (A, B, 1, 2), we have
$$E_A = E_B = \frac{E_{CM}}{2};\qquad |p_A| = |p_B| = |p_1| = |p_2| \qquad (19.46)$$
and substituting above (together with $E_AE_B|v_A-v_B| = |E_Ap_B^z - E_Bp_A^z|$), we get
$$\left(\frac{d\sigma}{d\Omega}\right)_{CM} = \frac{|\mathcal{M}|^2}{64\pi^2E_{CM}^2} \qquad (19.47)$$
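This equal-mass reduction can be verified numerically against the general center-of-mass formula (19.45); the mass, energy, and $|\mathcal{M}|^2$ values below are arbitrary illustrative choices:

```python
import math

def dsdo_general(M2, m, Ecm):
    """dsigma/dOmega in the CM frame from (19.45), all four particles of mass m."""
    E = Ecm / 2
    p = math.sqrt(E * E - m * m)
    vrel = abs(p / E - (-p / E))         # head-on collinear beams
    return M2 * p / (2 * E * 2 * E * vrel * 16 * math.pi ** 2 * Ecm)

def dsdo_equal(M2, Ecm):
    """The simplified equal-mass result (19.47)."""
    return M2 / (64 * math.pi ** 2 * Ecm ** 2)

general = dsdo_general(1.0, 0.5, 3.0)
simplified = dsdo_equal(1.0, 3.0)
print(abs(general / simplified - 1) < 1e-12)   # True: (19.45) reduces to (19.47)
```

The momentum $|\vec p_1|$ cancels between numerator and flux factor, which is why the equal-mass answer depends only on $E_{CM}$.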
Particle decay
We can now calculate particle decay very easily. Formally, all we have to do is keep only the $k_A$'s and drop the $k_B$'s in the calculation above. Dropping the integrals $\int d^3k_B/(2\pi)^3$ and $\int d^3\bar k_B/(2\pi)^3$ implies that we no longer have $\int d\bar k_B^z$ in (19.36), only
$$\int d\bar k_A^z\,\delta\Big(\bar k_A^z - \sum p_f^z\Big)\delta\Big(\bar E_A - \sum E_f\Big) = \delta\Big(\bar E_A - \sum E_f\Big) \qquad (19.48)$$

so in effect, we need to remove the factor $1/|v_A-v_B|$ from the calculation. We can then immediately write the result for the decay rate in the center of mass system,
$$d\Gamma\big|_{CM} = \frac{1}{2m_A}\left(\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}\right)|\mathcal{M}(m_A\to\{p_f\})|^2\,(2\pi)^4\delta^4\Big(p_A - \sum p_f\Big) \qquad (19.49)$$

There remains only a problem of interpretation: since the state is unstable, how do we define the asymptotic (noninteracting) state $A$? Nevertheless, the result above is correct.

Important concepts to remember

• The cross section is the number of scattering events per time, divided by the incoming
flux and the number of targets, and it measures the effective area of interaction around
a single target.

• The decay rate is the number of decays per unit time, divided by the number of particles
decaying. The sum of the decay rates gives the inverse of the lifetime.

• In a cross section, a resonance appears via a relativistic generalization of the Breit-Wigner formula, with amplitudes proportional to $1/(p^2+m^2-im\Gamma)$ for the resonance.

• In states are well separated states at t = −∞, out states are well separated states at
+∞, and their overlap gives the S-matrix.

• The S operator is 1 + iT , and extracting from the matrix elements of T the overall
momentum conservation delta functions, we obtain the finite reduced matrix element
M.

• The LSZ formula says that the S-matrix is the residue at all the external poles of the
momentum space Green’s function, divided by the Z factors, or the Green’s function
divided by the full external propagators near the mass shell.

• The differential cross section equals $|\mathcal{M}|^2$, times the relativistically invariant $n$-body phase space, times $1/(E_AE_B|v_A-v_B|)$. If we replace the last factor with the relativistically invariant formula $1/\sqrt{(p_1\cdot p_2)^2-m_1^2m_2^2}$, we obtain the relativistically invariant cross section, a useful theoretical concept.

Further reading: See chapters 4.5 in [2], 2.5, 4.2, 4.3 in [1].

Exercises, Lecture 19

1) Consider the scattering of a massive field off a massless field (m1 = m3 = m; m2 = m4 = 0) in a theory where iM ≃ iλ = constant (e.g. Fermi's 4-fermion theory for weak interactions). Calculate the total, relativistically invariant cross section σtot,rel as a function of the center of mass energy.

2) Consider the theory


L = −(1/2)(∂μϕ)² − (1/2)(∂μΦ)² − (1/2)M²Φ² − (1/2)m²ϕ² − μΦϕϕ   (19.50)

Then, if M > 2m, the first order amplitude for Φ to decay into 2 ϕ's is |M| = μ. Calculate the lifetime τ for Φ.

20 Lecture 20. S-matrix and Feynman diagrams
We now want to go from the Heisenberg picture used last lecture to the interaction picture,
the same way as we did for Green’s functions. Last lecture we wrote for the S-matrix

<p⃗1 p⃗2 ...|S|k⃗A k⃗B> = lim_{T→∞} <p⃗1 p⃗2 ...|e^{−iH(2T)}|k⃗A k⃗B>   (20.1)

where the states are in the full theory (not free states), i.e. are eigenfunctions of H, and are
defined at the same time. But as for the Green’s functions, we want to replace them by free
states, i.e. eigenfunctions of H0 . Before, we wrote

|Ω> = lim_{T→∞(1−iϵ)} (e^{−iE₀T} <Ω|0>)⁻¹ e^{−iHT}|0> = lim_{T→∞(1−iϵ)} (e^{−iE₀T} <Ω|0>)⁻¹ U_I(0, −T)|0>   (20.2)
(by introducing a complete set of full states |n >< n|, with |Ω > having H|Ω >= E0 |Ω >
and the T → ∞(1−iϵ) guaranteeing only the lowest energy mode, i.e. the vacuum, remains).
Now we want to write in a similar way

|k⃗A k⃗B> ∝ lim_{T→∞(1−iϵ)} e^{−iHT}|k⃗A k⃗B>₀   (20.3)

where we wrote a proportionality sign, since proving the relation and getting the constant is
now complicated: the external state is not a ground state anymore. Using this, we rewrite
the right hand side of (20.1) as

lim_{T→∞(1−iϵ)} ₀<p⃗1...p⃗n|e^{−iH(2T)}|p⃗A p⃗B>₀ ∝ lim_{T→∞(1−iϵ)} ₀<p⃗1...p⃗n|T exp[−i ∫_{−T}^{T} dt H_I(t)]|p⃗A p⃗B>₀   (20.4)

therefore
Sf i |int =< f |UI (+∞, −∞)|i > (20.5)
The proportionality factors cancel out in the Green’s functions case by using the Feynman
and Wick theorems, and only the connected pieces remain. Here a similar story happens,
though the proof is harder to do. We will explain later the result in terms of the LSZ
formalism. The correct formula is

<p⃗1...p⃗n|iT|p⃗A p⃗B> = (√Z)^{n+2} lim_{T→∞(1−iϵ)} (₀<p⃗1...p⃗n|T exp[−i ∫_{−T}^{T} dt H_I(t)]|p⃗A p⃗B>₀)_{connected, amputated}   (20.6)

We have iT instead of S, since the formula gives only the matrix elements where the initial and final states are different. To understand the meaning of "connected, amputated" we will do some examples. We first note that the factors of Z are equal to 1 at tree level; we will understand the need to put them there later, from the LSZ formula.
We first consider the free term on the right hand side of (20.6). It is given by

₀<p⃗1 p⃗2|p⃗A p⃗B>₀ ≡ √(2EA 2EB 2E1 2E2) <0|a1 a2 a†A a†B|0>
= 2EA 2EB (2π)⁶ (δ⁽³⁾(p⃗A − p⃗1)δ⁽³⁾(p⃗B − p⃗2) + δ⁽³⁾(p⃗A − p⃗2)δ⁽³⁾(p⃗B − p⃗1))   (20.7)

This has the diagrammatic interpretation of propagator lines connecting (1A)(2B) and (1B)(2A), see Fig.49, and corresponds to the 1 in S = 1 + iT, therefore it should be excluded from the right hand side of (20.6). It is excluded, as the two lines are not connected to each other.


Figure 49: Free term for the scattering matrix: it is excluded (only interacting pieces con-
sidered).

We next consider the first order term in λ,


₀<p⃗1 p⃗2|T{−i(λ/4!) ∫d⁴x ϕI⁴(x)}|p⃗A p⃗B>₀ = ₀<p⃗1 p⃗2|N{−i(λ/4!) ∫d⁴x ϕI⁴(x) + contractions}|p⃗A p⃗B>₀   (20.8)
where in the equality we used Wick’s theorem.
But now we don’t have |0 > anymore, but rather |⃗pA p⃗B >0 , so the ”non-contracted”
pieces give a nonzero result. Better said, there is another type of contraction. Consider the
annihilation part of ϕI , ϕ+ I , acting on an external state:
ϕI⁺(x)|p⃗>₀ = (∫ d³k/(2π)³ (1/√(2Ek)) a_k⃗ e^{ik·x}) (√(2Ep) a†_p|0>)
= ∫ d³k/(2π)³ (√(2Ep)/√(2Ek)) e^{ik·x} (2π)³ δ⁽³⁾(k⃗ − p⃗)|0> = e^{ip·x}|0>   (20.9)

Inside < ..|N (ϕn )|.. >=< ..|(ϕ− )n (ϕ+ )n |.. >, we have ϕ+ ’s acting on a state on the right and
ϕ− ’s acting on the left, so we define the contraction with an external state,

ϕI (x)|⃗p >= eip·x |0 > (20.10)

and the conjugate


< p⃗|ϕI (x) =< 0|e−ip·x (20.11)
We now analyze the first order term in the S matrix (20.8). It contains a term with no contractions among the four ϕ's, a term with a single ϕϕ contraction, and a term with two ϕϕ contractions.

Consider the last term, with two ϕϕ contractions. It is

−i(λ/4!) ∫d⁴x ϕϕ ϕϕ ₀<p⃗1 p⃗2|p⃗A p⃗B>₀   (20.12)

It gives the figure eight vacuum bubble times the free term (zeroth order) above in Fig.49,
so it is a disconnected piece where the initial state and final state are the same. Therefore
again this is a piece that contributes to 1, not to iT , and again is disconnected, consistent
with (20.6).
Next consider the second term, with a single contraction. After applying the normal
ordering N operator, we have ∼ a† a† + 2a† a + aa, but only the term with the same number
of a† and a contributes in the expectation value (e.g. < p1 p2 |a† a† |pA pB >∼< 0|aa(a† )4 |0 >∼
[a, a† ]2 < 0|a† a† |0 >= 0), i.e.
−i(λ/4!) ∫d⁴x 2 ϕϕ ₀<p⃗1 p⃗2|ϕ⁻ϕ⁺|p⃗A p⃗B>₀   (20.13)
Doing the contractions of the ϕ's with the external states we obtain the same zeroth order term with a loop on one of the legs (thus a total of 4 terms), see Fig.50. Contractions with external states are as in Fig.51. Since the ∫d⁴x integration gives a delta function, on the leg with a loop we also have the same momentum on both sides, so this is again a contribution to the identity 1, and not to iT, and is disconnected, consistent with (20.6).


Figure 50: Other disconnected terms, this time with nontrivial contractions (loops), excluded
from the scattering matrix.


Figure 51: Feynman diagrammatic representation for contractions with external states.

Finally, the only nonzero term is the one where there are no contractions between the
ϕ’s, and only contractions with the external states, of the type

₀<p⃗1 p⃗2|ϕ⁻ϕ⁻ϕ⁺ϕ⁺(x)|p⃗A p⃗B>₀   (20.14)

where all the ϕ’s are at the same point, so this gives a term ei(pA +pB −p1 −p2 )·x . But this comes
from N (ϕ4 (x)), with ϕ = ϕ+ + ϕ− , so there are 4! = 24 such terms as the above, since from
the first ϕ, we can pick either ϕ− or ϕ+ to contract with either one of the 4 external states,
p⃗1 , p⃗2 , p⃗A , p⃗B , the next ϕ can be contracted with either of the remaining (uncontracted) 3
external states, etc, giving 4 · 3 · 2 · 1 terms. The total result is then
4! × (−i λ/4!) ∫d⁴x e^{i(pA+pB−p1−p2)·x} = −iλ(2π)⁴ δ⁽⁴⁾(pA + pB − p1 − p2)   (20.15)

which is just a vertex connecting the 4 external lines, as in Fig.52a.
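The 4! counting above is just the number of ways to pair the four ϕ(x) factors with the four distinct external states; a trivial enumeration (not in the notes) confirms it:

```python
from itertools import permutations

externals = ["p1", "p2", "pA", "pB"]
# one contraction pattern = one assignment of the four phi(x) factors
# to the four external states, i.e. one permutation of the momenta
assert sum(1 for _ in permutations(externals)) == 24  # = 4!, cancels the 1/4!
```

This is why the coupling λ/4! appears without any leftover combinatorial factor in (20.15).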


Figure 52: a) The first order contribution to the S matrix. b) Even among the connected
diagrams there are some that do not contribute to the S matrix, like this one. It is one which
will be amputated.

Since iT is (2π)4 δ(...) × iM, we have M = −λ. Replacing in the formula for the
differential cross section in the center of mass frame from last lecture, we get
(dσ/dΩ)_CM = λ²/(64π² E²CM)   (20.16)

This is independent of the angles θ, ϕ, therefore it integrates trivially, by multiplying with Ω = 4π, giving finally

σtot = λ²/(32π E²CM)   (20.17)
Here we added a factor of 1/2 since we have identical particles in the final state, so it is
impossible to differentiate experimentally between 1 and 2.
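As a quick arithmetic check (a Python sketch with arbitrary test values, not part of the notes), integrating the constant (20.16) over the solid angle and including the 1/2 for identical particles indeed reproduces (20.17):

```python
import math

lam, E_cm = 0.5, 10.0                                   # arbitrary test values
dsigma_dOmega = lam**2 / (64 * math.pi**2 * E_cm**2)    # (20.16), angle-independent
sigma_tot = 0.5 * 4 * math.pi * dsigma_dOmega           # 1/2 for identical particles
assert math.isclose(sigma_tot, lam**2 / (32 * math.pi * E_cm**2))
```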
So we need to consider only fully connected diagrams, where all the external lines are
connected as well. But even among these diagrams there are some that need to be excluded.

To see this, consider the second order diagram obtained as follows: a vertex diagram with a
loop on one of the external states (pB ), with k in the loop and p′ after it, as in Fig.52b. It
gives
(1/2) ∫ d⁴p′/(2π)⁴ (−i/(p′² + m²)) ∫ d⁴k/(2π)⁴ (−i/(k² + m²)) (−iλ)(2π)⁴ δ⁽⁴⁾(pA + p′ − p1 − p2) × (−iλ)(2π)⁴ δ⁽⁴⁾(pB − p′)   (20.18)
where the delta functions come from the integrations over the vertex positions. But doing the p′ integration we get
1/(p′² + m²)|_{p′=pB} = 1/(pB² + m²) = 1/0   (20.19)
so the result is badly defined. That means that this diagram must be excluded from the
physical calculation, so we must find a procedure that excludes this kind of diagrams.
We now define amputation as follows. From the tip of an external leg, find the furthest
point in the connected diagram where we can disconnect the external line from the rest by
cutting just one propagator, and remove that whole piece (including the external line) from
the diagrams. See examples of diagrams in Fig.53.

Figure 53: Examples for the diagrammatic representation of amputation, to obtain the
Feynman diagram contributions to the S matrix.

Therefore we can say that


iM(2π)⁴ δ⁽⁴⁾(pA + pB − Σf pf) = (Σ connected, amputated Feynman diagrams) × (√Z)^{n+2}   (20.20)
These pieces of diagrams that we are excluding are contributions to the two-point func-
tions for the external lines (quantum corrections to the propagators). We will see the inter-
pretation of this shortly.
But first we consider the Feynman diagrams in x space. See Fig.54, x space.
-Propagator DF(x − y).
-Vertex −iλ ∫d⁴x.
-External line e^{ip·x}.
-Divide by the symmetry factor.
And the Feynman diagrams in p space. See Fig.54 p space.
-Propagator DF (p) = −i/(p2 + m2 − iϵ).
-Vertex −iλ.

Figure 54: Diagrammatic representation for the relevant Feynman rules for S-matrices in a) x-space and b) p-space.

-External line = 1.
-Momentum conservation at vertices.
-∫d⁴p/(2π)⁴ for loops.
-Divide by the symmetry factor.
Given the LSZ formula (which we only wrote, we didn’t derive), we can understand the
above rule (20.20). Indeed, we can rewrite the LSZ formula as
Sfi − δfi = <f|iT|i> = (2π)⁴ δ⁽⁴⁾(pA + pB − Σ pf) iM
= lim_{pi²+m²→0, kj²+m²→0} (√Z)^{m+n} ∏_{i=1}^{n} [(pi² + m² − iϵ)/(−iZ)] ∏_{j=1}^{m} [(kj² + m² − iϵ)/(−iZ)] G̃_{n+m}(pi, kj)   (20.21)

But in G̃n+m we have the connected Feynman diagrams, and the factors in the brackets are,
near their mass shell, the inverse of the full propagators for the external lines,
G₂(p) ∼ −iZ/(p² + m² − iϵ)   (20.22)
So amputation is just the removal of (division by) these full on-shell 2-point functions for the external lines. Thus we obtain exactly the rule (20.20).

Important concepts to remember

• In the interaction picture, Sf i ∝< f |UI (+∞, −∞)|i >.

• The matrix elements of iT (nontrivial S-matrices) are the sum over connected, ampu-
tated diagrams.

• Amputation means removing the parts that contribute to the (on-shell) 2-point func-
tions of the external lines.

• The S matrix diagrammatic rules follow from the LSZ formula.

Further reading: See chapters 4.6 in [2], 4.1 and parts of 4.4 in [1].

Exercises, Lecture 20

1) Consider the second order term in the S matrix


₀<p⃗1 p⃗2|T{(−iλ/4!)² ∫d⁴x ϕI⁴(x) ∫d⁴y ϕI⁴(y)}|p⃗A p⃗B>₀   (20.23)

Using the Wick contractions and Wick's theorem, find the only Feynman diagrams which contribute to the S-matrix.

2) Consider the Feynman diagram in Fig. 55 in λϕ4 /4! theory.

Figure 55: Example of Feynman diagram.

Write down the contribution corresponding to this diagram in ₀<p⃗1 p⃗2|S|p⃗3 p⃗4>₀ and write the integral expression for it using the Feynman rules.

21 Lecture 21. The optical theorem and the cutting
rules
The optical theorem is a straightforward consequence of the unitarity of the S-matrix. In
a good quantum theory, the operator defining the evolution of the system, in our case
Sf i =< f |U (+∞, −∞)|i >, has to be unitary, in order to conserve probability (there are no
”sources or sinks of probability”). Therefore, with S = 1 + iT , we have

S † S = 1 = SS † ⇒ −i(T − T † ) = T † T (21.1)

which is the essence of the optical theorem. But the point is that in our definition of quantum
field theory, and more importantly in our perturbative definition of S-matrices, there were
many unknowns and approximations, so it is by no means obvious that the S matrix is
unitary. We have to prove it perturbatively (order by order in perturbation theory), and from
the above, that is equivalent to proving the optical theorem.
To obtain the usual formulation of the optical theorem, we evaluate (21.1) between two
2-particle states, |⃗p1 p⃗2 > and |⃗k1⃗k2 >. Moreover, in between T † T , we introduce the identity
expressed by considering the completeness relation for external states,
1 = Σn ∏_{i=1}^{n} ∫ d³qi/((2π)³ 2Ei) |{q⃗i}><{q⃗i}|   (21.2)

obtaining
<p⃗1 p⃗2|T†T|k⃗1 k⃗2> = Σn ∏_{i=1}^{n} ∫ d³qi/((2π)³ 2Ei) <p⃗1 p⃗2|T†|{q⃗i}><{q⃗i}|T|k⃗1 k⃗2>   (21.3)

But
<{p⃗f}|T|k⃗1 k⃗2> = M(k1, k2 → {pf})(2π)⁴ δ⁽⁴⁾(k1 + k2 − Σ pf)   (21.4)

so by replacing in the above and taking out a common factor (2π)4 δ (4) (k1 + k2 − p1 − p2 ), we
obtain

−i[M(k1, k2 → p1, p2) − M*(p1, p2 → k1, k2)]
= Σn (∏_{i=1}^{n} ∫ d³qi/((2π)³ 2Ei)) M*(p1 p2 → {qi}) M(k1 k2 → {qi}) (2π)⁴ δ⁽⁴⁾(k1 + k2 − Σi qi)   (21.5)

or more schematically (here dΠf is the n-body relativistically invariant phase space defined
in the next to last lecture)
−i[M(a → b) − M*(b → a)] = Σf ∫ dΠf M*(b → f) M(a → f)   (21.6)

which is the general statement of the optical theorem on states.

An important particular case is when the initial and final states are the same (a = b),
or forward scattering, pi = ki , in which case the right hand side of (21.5) contains the total
cross section,
ImM(k1 k2 → k1 k2 ) = 2ECM pCM σtot (k1 k2 → anything) (21.7)
or the imaginary part of the forward scattering amplitude is proportional to the total cross
section for the two particles to go to anything, see Fig.56.


Figure 56: The optical theorem, diagrammatic representation.

But as we said, we assumed that the S-matrix is unitary, while a priori it might not be: we need to prove it in perturbative quantum field theory. It was proved for Feynman diagrams to all orders, first in λϕ⁴ theory by Cutkosky, then in QED by Feynman, and in gauge theories later. In particular, the proof for spontaneously broken gauge theories was done by 't Hooft and Veltman, which together with their proof of renormalizability of the same meant that spontaneously broken gauge theories are well-defined, and for which they got the Nobel prize. That proof was done using a more general formalism, of the largest time equation. Here we will just present the one-loop case in λϕ⁴.
Optical theorem at 1-loop in λϕ4 theory
The unitarity proof is done proving directly the optical theorem for Feynman diagrams.
Note that above the reduced amplitude M is defined formally, but we can define it purely
by Feynman diagrams, and that is the definition we will use below, since we want to prove
unitarity for Feynman diagrams. Then, M is defined at some center of mass energy, with
s = E²CM ∈ R₊. But we analytically continue to an analytic function M(s) defined on the
complex s plane.

Consider s0, the threshold energy for production of the lightest multiparticle state.
Then, if s < s0 and real, it means any possible intermediate state created from the initial

state cannot go on-shell, can only be virtual. In turn, that means that any propagators in
the expression of M in terms of Feynman diagrams are always nonzero, leading to an M(s)
real for s < s0 real, i.e.
M(s) = [M(s∗ )]∗ (21.8)
We then analytically continue to s > s0 real, and since both sides of the above relation are
analytical functions, we apply it for s ± iϵ with s > s0 real, giving (since now the propagators
can be on-shell, so we need to avoid the poles by moving s a bit away from the real axis)
ReM(s + iϵ) = ReM(s − iϵ); ImM(s + iϵ) = −ImM(s − iϵ) (21.9)
That means that for s > s0 and real there is a branch cut in s (by moving across it, we go between two Riemann sheets), with discontinuity
DiscM(s) = 2iImM(s + iϵ) (21.10)
It is much easier to compute only the discontinuity than to compute the full amplitude at complex s and then take its discontinuity.
The discontinuity of Feynman diagrams with loops, across cuts on loops (equal, from the above, to the imaginary parts of the same diagrams), will be related to ∫dΠ|M̄|², i.e. the optical theorem formula, see Fig.57. We will obtain the so-called "cutting rules", ironically found by Cutkosky, so also called Cutkosky rules.


Figure 57: On the lhs of the optical theorem, represent the imaginary part of diagrams as a
discontinuity (cut).

We consider the 2 → 2 one-loop diagram in λϕ4 where k1 , k2 go to two intermediate lines,


then to two final lines, called k3 , k4 . Define k = k1 + k2 , and in the two intermediate lines
define the momenta as k/2 + q and k/2 − q, where q is the (integrated over) loop momentum,
see Fig.58. Since this diagram has symmetry factor S = 2, we have for it (we call it δM
since it could be part of a larger diagram)

iδM = (λ²/2) ∫ d⁴q/(2π)⁴ (1/((k/2 − q)² + m² − iϵ)) (1/((k/2 + q)² + m² − iϵ))   (21.11)

This amplitude has a discontinuity for k 0 > 2m (in the physical region, since with center
of mass energy k 0 we can create an on-shell 2-particle intermediate state of energy 2m only
if k 0 > 2m), but we compute it for k 0 < 2m and then analytically continue.


Figure 58: The optical theorem is valid diagram by diagram. Consider this one-loop diagram,
to derive the Cutkovsky rules.

In the center of mass system, k = (k⁰, 0⃗), where s = (k⁰)², the poles of (21.11) are at (k⁰/2 ± q⁰)² = q⃗² + m² ≡ Eq², or at

q⁰ = k⁰/2 ± (Eq − iϵ) and q⁰ = −k⁰/2 ± (Eq − iϵ)   (21.12)


Figure 59: Closing the contour downwards we pick up the poles in the lower-half plane.

We now close the contour downwards, with a semi-circle in the lower-half plane, because of the prescription T = ∞(1 − iϵ), therefore picking up the poles in the lower-half plane, see Fig.59. We will see later that the pole at +k⁰/2 + Eq − iϵ does not contribute to the discontinuity, so only the pole at −k⁰/2 + Eq − iϵ does. But picking up the residue at the pole is equivalent to replacing
1/((k/2 + q)² + m² − iϵ) ≃ (−1/(2Eq)) (1/(q⁰ + k⁰/2 − Eq + iϵ))
→ 2πi δ((k/2 + q)² + m²) = 2πi δ(q⁰ + k⁰/2 − Eq)/(2Eq)   (21.13)
This gives

iδM = −2πi (λ²/2) ∫ d³q/(2π)⁴ (1/(2Eq)) 1/[(k⁰ − Eq)² − Eq²]
    = −2πi (λ²/2) (4π/(2π)⁴) ∫_m^∞ dEq Eq|q| (1/(2Eq)) 1/[k⁰(k⁰ − 2Eq)]   (21.14)

In the first line we used the above delta function, and (k/2 − q)² + m² = −[(k⁰ − Eq)² − Eq²], and in the second ∫d³q = ∫dΩ ∫q²dq, with ∫dΩ = 4π and q²dq = qEq dEq, since Eq² = q⃗² + m² (which also gives Eq ≥ m). Note that if k⁰ < 2m, the denominator in the last line has no pole on the integration path, as k⁰ < 2m < 2Eq, so δM is real (has no discontinuity). [Also note that the neglected pole in q⁰, at +k⁰/2 + Eq − iϵ, indeed does not contribute, since its residue then has no pole for the Eq integral, as we can easily check.] But if k⁰ > 2m, we have a pole on the path of integration, so there is a discontinuity between k² + iϵ and k² − iϵ. To find it, we can write
1/(k⁰ − 2Eq ± iϵ) = P 1/(k⁰ − 2Eq) ∓ iπδ(k⁰ − 2Eq)   (21.15)
where P stands for principal part. Therefore for the discontinuity, equivalently, we can also
replace the second propagator with the delta function
1/((k/2 − q)² + m² − iϵ) = 1/(−(k⁰ − Eq)² + Eq² − iϵ) = −1/(k⁰(k⁰ − 2Eq))
→ 2πi δ((k/2 − q)² + m²) = 2πi δ(k⁰(k⁰ − 2Eq)) = 2πi δ(k⁰ − 2Eq)/k⁰   (21.16)
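Relation (21.15) is the Sokhotski-Plemelj formula, and it is easy to verify numerically; a quick sketch (not in the notes), with a Gaussian test function so that the principal-value part vanishes by symmetry:

```python
import math

eps, dx, n = 1e-3, 1e-4, 200_000      # integrate x in [-10, 10], midpoint rule
s_re = s_im = 0.0
for i in range(n):
    x = -10.0 + (i + 0.5) * dx
    f = math.exp(-x * x)              # even test function, f(0) = 1
    d = x * x + eps * eps
    s_re += f * x / d * dx            # Re[1/(x + i*eps)] = x/(x^2 + eps^2)
    s_im += -f * eps / d * dx         # Im[1/(x + i*eps)] = -eps/(x^2 + eps^2)

assert abs(s_re) < 1e-6               # principal value of an odd integrand: 0
assert abs(s_im + math.pi) < 1e-2     # -pi*f(0), from the -i*pi*delta(x) term
```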
We now go back, to see what we did, and rewrite the original loop integration as the integration over momenta on the propagators, with a vertex delta function (k/2 − q = p1, k/2 + q = p2),

∫ d⁴q/(2π)⁴ = ∫ d⁴p1/(2π)⁴ ∫ d⁴p2/(2π)⁴ (2π)⁴ δ⁽⁴⁾(p1 + p2 − k)   (21.17)
In this formulation, the discontinuity is obtained by replacing the propagators with delta functions for them, i.e.

1/(pi² + mi² − iϵ) → 2πi δ(pi² + mi²)   (21.18)
So by doing the integrals over the zero components of the momenta in ∫d⁴p1 ∫d⁴p2, we put the momenta on-shell, and we finally get

Disc M(k) = 2i Im M(k) = (i/2) ∫ d³p1/(2π)³ (1/(2E1)) ∫ d³p2/(2π)³ (1/(2E2)) |M̃(k)|² (2π)⁴ δ⁽⁴⁾(p1 + p2 − k)   (21.19)

where the 1/2 factor is interpreted as the symmetry factor for 2 identical bosons, and |M̃(k)|² = λ² corresponds in general to M̃(k1, k2 → p1, p2)M̃*(k3, k4 → p1, p2). On the left hand side the discontinuity is in the one-loop amplitude M(k1, k2 → k3, k4) (we wrote above for forward scattering k1 = k3, k2 = k4), where we cut the intermediate propagators, putting them on-shell, i.e. we replace the propagator with the delta function for the on-shell condition, as in the lhs of Fig.57. On the rhs we have the rhs of Fig.57.
Therefore we have verified the optical theorem at one-loop, and by the discussion at the beginning of the lecture, we have thus verified unitarity at one-loop in λϕ⁴. We can also write some one-loop diagrams in QED, cut them, and write the corresponding optical theorem at one-loop, see Fig.60. One can prove unitarity for them equivalently to the above.
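As a numerical illustration (a Python sketch with arbitrary parameter values, not part of the lectures): evaluating the imaginary part of (21.14) with the prescription (21.15) at a small finite ϵ reproduces Im δM = λ²β/(32π), β = √(1 − 4m²/s), which is what (21.19) gives once the two-body phase space integral β/(8π) is inserted:

```python
import math

lam, m, k0 = 1.0, 1.0, 4.0                          # arbitrary test values, k0 > 2m
eps = 4e-3                                          # k0 - 2Eq -> k0 - 2Eq + i*eps
C = (lam**2 / 2) * 4 * math.pi / (2 * math.pi)**4   # prefactor of (21.14)

im_I = 0.0
dE, n = 1e-3, 200_000                               # midpoint rule, Eq in [m, ~200]
for i in range(n):
    Eq = m + (i + 0.5) * dE
    q = math.sqrt(Eq * Eq - m * m)
    # Im[1/(k0 - 2Eq + i*eps)] = -eps/((k0 - 2Eq)^2 + eps^2), as in (21.15)
    im_I += (q / 2) / k0 * (-eps / ((k0 - 2 * Eq) ** 2 + eps ** 2)) * dE

im_dM = -2 * math.pi * C * im_I                     # Im(deltaM) = Im[(i deltaM)/i]
beta = math.sqrt(1 - 4 * m * m / (k0 * k0))
expected = lam**2 * beta / (32 * math.pi)
assert abs(im_dM - expected) < 0.02 * expected
```

The 2% tolerance covers the finite-ϵ and finite-grid systematics; the agreement improves as ϵ → 0.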

2
2Im = d

2
2Im = d

Figure 60: Examples of the optical theorem at one-loop in QED.

Finally, Cutkosky gave the general cutting rules, true in general (at all loops).
1) Cut the diagram in all possible ways such that cut propagators can be put simultane-
ously on-shell.
2) For each cut, replace
1/(pi² + mi² − iϵ) → 2πi δ(pi² + mi²)   (21.20)

in the cut propagators, and then do the integral.


3) Sum over all possible cuts.
Then, he proved unitarity (equivalent to the optical theorem) in perturbation theory.
There exists a more general method, the method of the largest time equation (which is
a generalization of the above method), that was used by ’t Hooft and Veltman to prove the
perturbative unitarity of spontaneously broken gauge theories, but we will not do it here.

Important concepts to remember

• The optical theorem is the statement −i(T − T † ) = T † T on states, which follows from
unitarity of the S-matrix.

• The diagrammatic interpretation for it is that we cut over all possible intermediate
states the amplitude for 2 → 2 scattering, and this equals the cut amplitude parts
M(a → f )M∗ (b → f ), integrated over the final phase space.

• It implies that the imaginary part of the forward scattering amplitude is proportional
to the total cross section σtot (k1 k2 → anything).

• In λϕ4 at one-loop, the discontinuity for M(k) is found by replacing the intermediate
cut propagators with delta function of the mass shell condition, leading to the one-loop
optical theorem (defined diagrammatically), which is equivalent to one-loop unitarity.

• The general cutting rules say that we should cut a diagram in all possible ways such that the cut propagators can be put simultaneously on-shell, then replace the cut propagators with their mass shell delta function, and sum over all possible cuts.

• Using this method, and its generalization, the largest time equation, perturbative uni-
tarity was proven for λϕ4 theory, gauge theories, spontaneously broken gauge theories.

Further reading: See chapter 7.3 in [2].

Exercises, Lecture 21

1) ("Black disk eikonal") Consider 2 → 2 scattering of massless particles, k1, k2 → k3, k4, with all momenta defined as incoming, as in Fig.61, and s = −(k1 + k2)², t = −q⃗² = −(k1 + k3)², and the S-matrix

S = e^{iδ(b,s)}   (21.21)

where δ(b, s) satisfies

Re[δ(b, s)] = 0
Im[δ(b, s)] = 0 for b > bmax(s); Im[δ(b, s)] = ∞ for b < bmax(s)   (21.22)

and

M(s, t) = (1/s) ∫ d²b e^{iq⃗·b⃗} T(b, s)   (21.23)
where ⃗b is the impact parameter. Calculate σtot (k1 , k2 → anything).


Figure 61: 2 to 2 scattering with incoming momenta.

2) Consider the two-loop Feynman diagram in λϕ³ theory (two incoming external lines merge into an intermediate one, which splits into a circle with a vertical line in it, then mirror symmetric: intermediate line splitting into the final two external states) in Fig.62.

Figure 62: Two-loop diagram in ϕ3 theory.

Write down the unitarity relation for it using the cutting rules. Which cuts depend on
whether all lines have the same mass, and which ones don’t?

22 QED: Definition and Feynman rules; Ward-Takahashi
identities
We now turn to applications to the real world, in particular to QED (quantum electrody-
namics). It is given by a gauge field (electromagnetic field) Aµ and a fermion (the electron
field) ψ. For generality, we couple also to a complex scalar field ϕ.
The Lagrangean in Minkowski space is then

L_QED,M(A, ψ, ϕ) = −(1/4)F²μν − ψ̄(D̸ + m)ψ − (Dμϕ)* D^μϕ − V(ϕ*ϕ) = L(A) + L(ψ, A) + L(ϕ, A)   (22.1)

Here Fμν = ∂μAν − ∂νAμ, D̸ = Dμγ^μ and Dμ = ∂μ − ieAμ, and ψ̄ = ψ†iγ⁰, {γμ, γν} = 2gμν.

The Lagrangean possesses a local U (1) gauge invariance, under


ψ(x) → ψ ′ (x) = ψ(x)eieχ(x)
ϕ(x) → ϕ′ (x) = ϕ(x)eieχ(x)
Aµ (x) → A′µ (x) = Aµ (x) + ∂µ χ(x) (22.2)
We note that the Lagrangean at Aµ = 0 has only global U (1) invariance, and by the condition
to make the invariance local (to ”gauge it”), we need to introduce the gauge field Aµ (x) with
the transformation law above.
In Euclidean space, since as usual iSM → −S, we get for the Euclidean fermion lagrangean
LE [ψ] = ψ̄(γ µ ∂µ + m)ψ (22.3)
where now {γµ , γν } = 2δµν . We also saw that for the gauge field we have
LE[A] = +(1/4) F⁽ᴱ⁾μν F^{μν(E)}   (22.4)
Therefore the QED Lagrangean in Euclidean space is

LE(A, ψ, ϕ) = +(1/4)F²μν + ψ̄(D̸ + m)ψ + (Dμϕ)* D^μϕ + V(ϕ*ϕ)   (22.5)
Path integral
As we saw, defining the path integral by quantizing, we use a procedure like the Faddeev-Popov one, leading to a gauge fixing term, with

L_eff = L(A) + L_gauge fix = (1/4)F²μν + (1/(2α))(∂^μ Aμ)²   (22.6)
The action with gauge fixing term can then be written as

S_eff[A] = (1/2) ∫ d^dx Aμ (−∂² δμν + (1 − 1/α) ∂μ∂ν) Aν = (1/2) ∫ d^dx Aμ G⁽⁰⁾⁻¹μν Aν   (22.7)
Then in momentum space the photon propagator is

G⁽⁰⁾μν(k) = (1/k²) (δμν − (1 − α) kμkν/k²)   (22.8)
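One can check directly that (22.8) inverts the kinetic operator appearing in (22.7) for any α; a small numerical sketch (not in the notes), with an arbitrary Euclidean momentum:

```python
alpha = 0.7                        # arbitrary gauge parameter
k = [1.0, 2.0, 3.0, 4.0]           # arbitrary Euclidean momentum
k2 = sum(x * x for x in k)

# kinetic operator of (22.7) in momentum space: k^2 delta - (1 - 1/alpha) k k
Ginv = [[k2 * (i == j) - (1 - 1 / alpha) * k[i] * k[j] for j in range(4)]
        for i in range(4)]
# candidate propagator (22.8)
G = [[((i == j) - (1 - alpha) * k[i] * k[j] / k2) / k2 for j in range(4)]
     for i in range(4)]

prod = [[sum(Ginv[i][a] * G[a][j] for a in range(4)) for j in range(4)]
        for i in range(4)]
assert all(abs(prod[i][j] - (i == j)) < 1e-12 for i in range(4) for j in range(4))
```

The α-dependent pieces cancel between the two factors, which is why any gauge choice gives the same physics.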

From now on, we talk about pure QED, without ϕ.
The Euclidean space partition function is

Z[Jμ, ξ̄, ξ] = ∫ DA Dψ Dψ̄ e^{−S_eff[A,ψ,ψ̄,J,ξ,ξ̄]}   (22.9)

where in the presence of sources

S_eff = ∫ d^dx [L_eff(A, ψ, ψ̄) − Jμ Aμ − ξ̄ψ − ψ̄ξ]   (22.10)

The VEV of an operator O(A, ψ̄, ψ) is

<O(A, ψ̄, ψ)> = ∫ DA Dψ̄ Dψ O e^{−S_eff} / ∫ DA Dψ̄ Dψ e^{−S_eff}   (22.11)

The free energy F is defined by

Z[J, ξ, ξ̄] = e^{−F[J,ξ,ξ̄]}   (22.12)

The effective action is the Legendre transform of the free energy

Γ[A^cl, ψ̄^cl, ψ^cl] = F[J, ξ, ξ̄] + ∫ d^dx [Jμ A^cl_μ + ξ̄ψ^cl + ψ̄^cl ξ]   (22.13)

By taking derivatives with respect to Jμ, ξ, ξ̄ and A^cl_μ, ψ^cl, ψ̄^cl of the Legendre transform relation, we obtain

δF/δJμ = −A^cl_μ;  δF/δξ̄ = −ψ^cl;  δF/δξ = ψ̄^cl
δΓ/δA^cl_μ = Jμ;  δΓ/δψ^cl = −ξ̄;  δΓ/δψ̄^cl = ξ   (22.14)
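The structure of (22.13)-(22.14) can be illustrated in a zero-dimensional toy model (a sketch, not in the notes): for S = kϕ²/2 − Jϕ, the free energy is F(J) = −J²/(2k) up to a constant, its Legendre transform is Γ(A) = kA²/2, and δΓ/δA = J as in (22.14):

```python
k_, J = 3.0, 1.7                       # arbitrary toy values
F = lambda J: -J**2 / (2 * k_)         # free energy of S = k*phi^2/2 - J*phi
Gamma = lambda A: k_ * A**2 / 2        # its Legendre transform, Gamma = F + J*A

h = 1e-6                               # central numerical derivatives
A_cl = -(F(J + h) - F(J - h)) / (2 * h)             # A_cl = -dF/dJ = J/k
dGamma = (Gamma(A_cl + h) - Gamma(A_cl - h)) / (2 * h)
assert abs(dGamma - J) < 1e-6          # dGamma/dA_cl = J, as in (22.14)
```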

Feynman rules for Green’s functions in Euclidean momentum space


It is easier to calculate Green’s functions in Euclidean space, hence we first write the
Feynman rules in Euclidean space (see Fig.63a):

• ψ propagator from α to β (arrow from α to β), with momentum p is

(1/(ip̸ + m))αβ = (−ip̸ + m)αβ / (p² + m²)   (22.15)

• photon propagator, from μ to ν is

(1/k²) (δμν − (1 − α) kμkν/k²)   (22.16)


Figure 63: Relevant Feynman rules for Green’s functions and S-matrices.

• vertex +ie(γ^μ)αβ for a photon with index μ coming out of a fermion line going from α to β. This is so because the interaction term in the action is S_int = −ie ∫ ψ̄γ^μAμψ, and we know that for an interaction term +g ∏i ϕi, the vertex is −g.

• a minus sign for a fermion loop


Feynman rules for S-matrices in Minkowski space
S-matrices can only be defined in Minkowski space, since we need to have external states,
satisfying p2 + m2 = 0, but that equation has no real solution in Euclidean space, only in
Minkowski space. See Fig.63b,c.
• The fermion propagator can be found from the Euclidean space one by replacing the Euclidean ip̸ + m with its Minkowski counterpart, p² + m² → p² + m² − iϵ, and an extra −i in front (from the x space continuation of t in the action), giving

−i(−ip̸ + m)/(p² + m² − iϵ) = −(p̸ + im)/(p² + m² − iϵ)   (22.17)
• photon propagator. The same continuation rules give

−(i/(k² − iϵ)) (gμν − (1 − α) kμkν/(k² − iϵ))   (22.18)

• vertex +e(γ µ )αβ (there is an i difference between the Euclidean and Minkowski ver-
tices).

• a minus sign for fermion loops.

• external fermion lines: for the ψ|⃗p, s > contraction, = us (p), for the ψ̄|⃗p, s > contrac-
tion, = v̄ s (p), and for the bars, similar: < p⃗, s|ψ̄ = ūs (p) and < p⃗, s|ψ = v s (p).

• external photon lines, Aμ|p⃗> = ϵμ(p) and <p⃗|Aμ = ϵ*μ(p).

• We can also see that if we exchange the order of the external fermion line contractions,
we get a minus sign, so we deduce that if we exchange identical external fermion lines
in otherwise identical diagrams, we get a minus sign.

Ward-Takahashi identities
These are a type of Ward identities for QED, i.e. a local version of the global Ward
identities we studied before. Consider an infinitesimal local gauge invariance transformation

δψ(x) = ieϵ(x)ψ(x)
δAµ (x) = ∂µ ϵ(x) (22.19)

If we consider this as just a change (renaming) of integration variables in the path integral,
the path integral should not change. On the other hand, assuming that the regularization
of the path integral respects gauge invariance, we also expect that there is no nontrivial
Jacobian, and δ(DADψDϕ) = 0.
Note that this assumption is in principle nontrivial. There are classical symmetries
that are broken by the quantum corrections. In the path integral formalism, the way this
happens is that the regularization of the path integral measure (i.e. defining better Dϕ's as ∏i dϕ(xi)'s) does not respect the symmetries (the only other ingredient of the path integral
is the classical action, which is invariant). These are called quantum anomalies. But as we
will see in QFT II, anomalies in gauge symmetries are actually bad, and make the theory
sick, so should be absent from a well-defined regularization. Therefore we will assume that
the regularization of the measure respects gauge invariance.
Since as we said, the change is just a redefinition of the path integral, we also have
δZ[J, ξ, ξ̄] = 0   (22.20)

Using the form of the partition function, and the fact that both the classical action and
the measure are invariant, we only need to evaluate the change due to the gauge fixing and
source terms. This gives
0 = δZ = ∫ DA Dψ Dψ̄ e^{−S_eff} ∫ d^dx [Jμ ∂μϵ + ieϵ(ξ̄ψ − ψ̄ξ) − (1/α)(∂^μ Aμ)∂²ϵ]   (22.21)
where the last term comes from varying the gauge fixing term, and the others from varying
the source terms.

Taking the derivative with respect to ϵ, and then expressing A, ψ, ψ̄ as derivatives of S_eff (e.g., Aμ e^{−S_eff} = δ/δJμ e^{−S_eff}), taking out of the path integral the remaining objects that are not integrated over, we are left with an operator that acts on Z, namely

0 = δZ/δϵ(x) = −(1/α)∂²∂μ (δZ/δJμ) − (∂μJμ)Z + ie (ξ̄ δZ/δξ̄ + ξ δZ/δξ)   (22.22)

Dividing by Z and using that ln Z = −F, we get

0 = (1/α)∂²∂μ (δF/δJμ) − ∂μJμ − ie (ξ̄ δF/δξ̄ + ξ δF/δξ)   (22.23)

Then replacing the equations (22.14) in the above, we finally get

−(1/α)∂²∂μ A^cl_μ − ∂μ (δΓ/δA^cl_μ) − ie ((δΓ/δψ^cl) ψ^cl + ψ̄^cl (δΓ/δψ̄^cl)) = 0   (22.24)

These are called generalized Ward-Takahashi identities, or (for the 1PI functions, as we will
shortly derive some examples) the Lee-Zinn-Justin identities.
Example 1. Photon propagator
By taking a derivative of the LZJ identities with respect to A^cl_ν(y) and putting A^cl, ψ^cl, ψ̄^cl to zero, i.e.

(δ/δA^cl_ν(y)) (LZJ)|_{A^cl=ψ^cl=ψ̄^cl=0}   (22.25)

we get

−(1/α)∂²_x ∂^x_ν δ(x − y) − ∂^x_μ δ²Γ/(δA^cl_μ(x) δA^cl_ν(y))|_{A^cl=ψ^cl=ψ̄^cl=0} = −(1/α)∂²_x ∂^x_ν δ(x − y) − ∂^x_μ G⁻¹μν(x, y) = 0   (22.26)

where G⁻¹μν is the inverse of the full connected propagator. Indeed, as we saw before, this equals the two-point effective action, i.e. the second derivative of the effective action. Taking the Fourier transform, we obtain
k^μ [ (k²/α) δ_{μν} − G⁻¹_{μν}(k) ] = 0   (22.27)

On the other hand, we have

G⁻¹_{μν}(k) = Γ_{μν}(k) = G^{(0)−1}_{μν} + Π_{μν}   (22.28)

and since
G^{(0)−1}_{μν}(k) = k² δ_{μν} − k_μ k_ν + (1/α) k_μ k_ν   (22.29)
we have
k^μ G^{(0)−1}_{μν}(k) = k² k_ν / α   (22.30)

Replacing it in (22.27), we get
k µ Πµν (k) = 0 (22.31)
That is, the 1PI corrections to the photon propagator are transverse (perpendicular to the
momentum k µ ). That means that we can write them in terms of a single (scalar) function,

Πµν (k) = (k 2 δµν − kµ kν )Π(k) (22.32)
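Transversality is easy to check numerically. The following sketch (an addition to these notes, not part of the original text; the momentum components and the stand-in value for the scalar Π are arbitrary) builds a tensor of the form (22.32) in Euclidean signature and confirms that k^μ Π_{μν} = 0 for every ν:

```python
# Numerical sanity check: Pi_{mu nu} = (k^2 delta_{mu nu} - k_mu k_nu) Pi(k)
# is automatically transverse, k^mu Pi_{mu nu} = 0, for any momentum k.
# Euclidean signature assumed, matching the delta_{mu nu} in (22.32).

def pi_tensor(k, pi_scalar):
    """Build (k^2 delta_{mu nu} - k_mu k_nu) * Pi(k)."""
    k2 = sum(c * c for c in k)
    return [[(k2 * (1.0 if m == n else 0.0) - k[m] * k[n]) * pi_scalar
             for n in range(4)] for m in range(4)]

k = [0.3, -1.2, 0.7, 2.0]          # an arbitrary (Euclidean) momentum
Pi = pi_tensor(k, pi_scalar=0.42)  # 0.42 stands in for the scalar Pi(k)

# contract k^mu with Pi_{mu nu}: should vanish for every nu
contraction = [sum(k[m] * Pi[m][n] for m in range(4)) for n in range(4)]
print(all(abs(c) < 1e-9 for c in contraction))  # True
```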

Example 2. n-photon vertex function for n ≥ 3.


We now take n − 1 derivatives with respect to A^cl_μ, i.e. put

[δ^{n−1}/(δA^cl_{μ₂}(x₂) ⋯ δA^cl_{μₙ}(xₙ))] (LZJ) |_{A^cl=ψ^cl=ψ̄^cl=0}   (22.33)

and now find

[∂/∂x₁^{μ₁}] δ^{(n)}Γ/(δA^cl_{μ₁}(x₁) ⋯ δA^cl_{μₙ}(xₙ)) |_{A^cl=ψ^cl=ψ̄^cl=0} = 0   (22.34)

Going to momentum space we then find

k₁^{μ₁} Γ^{(n)}_{μ₁⋯μₙ}(k^{(1)}, ..., k^{(n)}) = 0   (22.35)

That is, the n-point 1PI photon vertex functions (from the effective action) are also trans-
verse, like the 1PI 2-point function above.
Example 3. Original Ward-Takahashi identity
We take the derivatives with respect to ψ cl and ψ̄ cl , i.e.

[δ²/(δψ_β^cl(z) δψ̄_α^cl(y))] (LZJ) |_{A^cl=ψ^cl=ψ̄^cl=0}   (22.36)

to obtain
0 = [ −(∂/∂x^μ) δ³Γ/(δψ_β^cl(z) δψ̄_α^cl(y) δA_μ^cl(x)) − ie δ^{(d)}(x − y) δ²Γ/(δψ_β^cl(z) δψ̄_α^cl(x)) + ie δ^{(d)}(x − z) δ²Γ/(δψ_β^cl(x) δψ̄_α^cl(y)) ] |_{A^cl=ψ^cl=ψ̄^cl=0}   (22.37)

Replacing the derivatives with the corresponding n-point effective action terms, we get

−(∂/∂x^μ) Γ_{μ;αβ}(x; y, z) = −ie δ^{(d)}(x − z)(S_F⁻¹)_{αβ}(y − x) + ie δ^{(d)}(x − y)(S_F⁻¹)_{αβ}(x − z)   (22.38)
where SF−1 is the inverse of the full connected fermion propagator, i.e. the 1PI 2-point
effective action.
In momentum space, this relation becomes

pµ Γµαβ (p; q2 , q1 ) = e(SF−1 (q2 )αβ − SF−1 (q1 )αβ ) (22.39)

Figure 64: Diagrammatic representation of the original Ward-Takahashi identity.

which is the original Ward-Takahashi identity: the photon-fermion-antifermion vertex contracted with the momentum gives the difference of the inverse fermion propagator (or 1PI 2-point function) at q₂ minus the one at q₁, where q_{1,2} are the fermion momenta, as in Fig.64.

Important concepts to remember

• QED is a gauge field (electromagnetic field) coupled to a fermion (generalized to include


a coupling to a complex scalar)

• The generalized Ward-Takahashi identities or Lee-Zinn-Justin identities are identities


between n-point effective actions, or 1PI n-point functions, resulting from local versions
of Ward identities.

• The 1PI corrections to the photon propagator are transverse, pµ Πµν (k) = 0.

• The 1PI n-photon vertex functions for n ≥ 3 are also transverse, k µ1 Γµ1 ...µn = 0.

• The original Ward-Takahashi identity relates the photon-fermion-antifermion vertex, contracted with the photon momentum, to the difference of the inverse fermion propagator (2-point effective action) at the two fermion momenta.

Further reading: See chapters 4.8 in [2], 6.3 in [4] and 8.1 in [1].

Exercises, Lecture 22

1) Write down the expression for the QED Feynman diagram in Fig.65 for the S-matrix of e⁻(p₁)e⁺(p₂) → e⁻(p₃)e⁺(p₄).


Figure 65: Feynman diagram for the S-matrix in QED.

2) Using the Lee-Zinn-Justin identities, derive a Ward identity for the vertex Γµ1 µ2 αβ .

23 Lecture 23. Nonrelativistic processes: Yukawa po-
tential, Coulomb potential and Rutherford scatter-
ing
In this lecture we will rederive some formulas for classical, nonrelativistic processes using
the quantum field theory formalism we have developed, to check that our formalism works,
and understand how it is applied. In particular, we will calculate the Yukawa and Coulomb
potentials from our formalism, and derive the differential cross section for Rutherford scat-
tering.
Yukawa potential
Even though we are more interested in the QED example of the Coulomb potential, we
will start with the simplest example, the Yukawa potential, as a warm-up.
In Yukawa theory, the interaction of two fermions, ψ(p) + ψ(k) → ψ(p′ ) + ψ(k ′ ), happens
via the exchange of a scalar particle ϕ, of nonzero mass mϕ . We want to see that this gives
indeed the well-known Yukawa potential.


Figure 66: The two Feynman diagrams for the potential in the case of Yukawa interaction.
For distinguishable particles, only the diagram a) contributes.

There are two diagrams that in principle contribute to iM. One where the scalar with
momentum q = p − p′ is exchanged between the two fermions, see Fig.66a, and one where
the final fermions are interchanged (the fermion with momentum p′ comes out of the one
with k, and the fermion with momentum k ′ comes out of the one with p), see Fig.66b. But
we will work with distinguishable fermions, i.e. we will assume that there is some property
that we can measure that will distinguish between them (for instance, we could consider one
to be an e− , and another to be a µ− ). In this case, the second diagram will not contribute.
Since this is what happens classically, this is a good assumption in order to get a classical
potential.
In the nonrelativistic limit, we have

p ≃ (m, p⃗);  p′ ≃ (m, p⃗′);  k ≃ (m, k⃗);  k′ ≃ (m, k⃗′)
(p − p′)² ≃ (p⃗ − p⃗′)²
u^s(p) ≃ √m (ξ^s ; ξ^s)ᵀ   (23.1)
where ξ¹ = (1, 0)ᵀ and ξ² = (0, 1)ᵀ for s = 1, 2. The last property means that

ū^{s′}(p′)u^s(p) = u^{†s′}(iγ⁰)u^s = m (ξ^{†s′}  ξ^{†s′}) (0 1; 1 0) (ξ^s ; ξ^s)ᵀ = 2m ξ^{†s′}ξ^s = 2m δ^{ss′}   (23.2)

Then, writing both terms in iM for completeness, we have


iM = (−ig)ū(p′)u(p) · [−i/((p − p′)² + m_ϕ²)] · (−ig)ū(k′)u(k)
   − (−ig)ū(p′)u(k) · [−i/((p − k′)² + m_ϕ²)] · (−ig)ū(k′)u(p)   (23.3)

but as we said, we will drop the second term since we consider distinguishable fermions.
Some comments are in order here. We have considered a Yukawa coupling g, which means
a vertex factor −ig in Minkowski space. The fermions are contracted along each of the
fermion lines separately. The overall sign for the fermions is conventional, only the relative
sign between diagrams is important, and that can be found from the simple rule that when
we interchange two fermions between otherwise identical diagrams (like interchanging u(p)
with u(k) in the two terms above), we add a minus sign.
But we can choose a convention. We can define

|⃗p, ⃗k >∼ a†p⃗ a⃗†k |0 >; < p⃗′ , ⃗k ′ | = (|⃗p′ , ⃗k ′ >)† ∼< 0|a⃗k′ ap⃗′ (23.4)

so that in the matrix element

< p⃗′ , ⃗k ′ |(ψ̄ψ)x (ψ̄ψ)y |⃗p, ⃗k >∼< 0|a⃗k′ ap⃗′ (ψ̄ψ)x (ψ̄ψ)y a†p⃗ a⃗†k |0 > (23.5)

the contraction of ⃗k ′ with ψ̄x , p⃗′ with ψ̄y , p⃗ with ψy and ⃗k with ψx , corresponding to the
first diagram (the one we keep), gives a plus sign (we can count how many jumps we need
for the fermions to be able to contract them, and it is an even number). Then we can easily
see that the second diagram, where we now contract ⃗k ′ with ψ̄y and p⃗′ with ψ̄x instead, has
a minus sign. As we said, the simple Feynman rule above accounts for this relative sign.
We then finally have for the contribution of the first diagram (for distinguishable fermions)

iM ≃ + [ig²/(|p⃗ − p⃗′|² + m_ϕ²)] (2m δ^{ss′})(2m δ^{rr′})   (23.6)

But we then can compare with the Born approximation to the scattering amplitude in
nonrelativistic quantum mechanics, written in terms of the potential in momentum space,

< p⃗′ |iT |⃗p >= −iV (⃗q)2πδ(Ep′ − Ep ) (23.7)

where ⃗q = p⃗′ − p⃗. When comparing, we should remember a few facts: the 2m factors are
from the relativistic normalization we used for the fermions, whereas the Born approximation
above uses nonrelativistic normalization, so we should drop them in the comparison. Then
iT = 2πδ(Ef − Ei )iM, so we drop the 2πδ(Ep′ − Ep ) in the comparison; and also the formula
is at s = s′ , r = r′ . We finally get
−g 2
V (⃗q) = (23.8)
|⃗q|2 + m2ϕ
We should also comment on the logic: it is impossible to directly measure the potential
V (⃗q) in the scattering, since this is a classical concept, so we can only measure things
that depend on it. But for expedience’s sake, we used the comparison with nonrelativistic
quantum mechanics, where we have a quantum mechanics calculation, yet in a classical
potential, in order to directly extract the result for V (⃗q).
The x-space potential is (here |⃗x| ≡ r)

V(x⃗) = ∫ [d³q/(2π)³] e^{iq⃗·x⃗} (−g²)/(|q⃗|² + m_ϕ²)
     = −[g²/(2π)³] 2π ∫₀^∞ q² dq [(e^{iqr} − e^{−iqr})/(iqr)] · 1/(q² + m_ϕ²)
     = −[g²/(4π²ir)] ∫_{−∞}^{+∞} dq · q e^{iqr}/(q² + m_ϕ²)   (23.9)
Here in the second equality we have used ∫₀^π sin θ dθ e^{iqr cos θ} = ∫_{−1}^{1} d(cos θ) e^{iqr cos θ} = (e^{iqr} − e^{−iqr})/(iqr), and in the last equality we have rewritten the e^{−iqr} term as an integral from −∞ to 0. The integrand has complex poles at q = ±im_ϕ, so we can close the contour over the real axis with an infinite semicircle in the upper-half plane, since there e^{iqr} decays exponentially at infinity. We then pick up the residue of the pole at q = +im_ϕ, giving finally
V(r) = −[g²/(4πr)] e^{−m_ϕ r}   (23.10)
the well-known attractive Yukawa potential.
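The contour result can be cross-checked numerically. The sketch below is an addition to these notes (the values of g, m and r are arbitrary sample inputs): it evaluates the radial integral from (23.9) in pure Python, using the known subtraction ∫₀^∞ sin(qr)/q dq = π/2 to tame the slowly decaying oscillatory tail, and compares with the closed form (23.10).

```python
# Check: I(r) = int_0^inf q sin(qr)/(q^2 + m^2) dq = (pi/2) e^{-m r},
# which via V(r) = -g^2/(2 pi^2 r) I(r) reproduces the Yukawa potential.
import math

def yukawa_integral(r, m, qmax=200.0, n=40000):
    """I(r) via q/(q^2+m^2) = 1/q - m^2/(q(q^2+m^2)); the first piece
    integrates exactly to pi/2, the remainder converges absolutely."""
    h = qmax / n
    def f(q):  # sin(qr)/(q (q^2+m^2)), with its q -> 0 limit r/m^2
        return r / m**2 if q == 0.0 else math.sin(q * r) / (q * (q**2 + m**2))
    s = 0.5 * (f(0.0) + f(qmax)) + sum(f(i * h) for i in range(1, n))
    return math.pi / 2 - m**2 * h * s   # trapezoid rule for the remainder

r, m, g = 1.0, 1.0, 0.5   # sample values in natural units
V_numeric = -g**2 / (2 * math.pi**2 * r) * yukawa_integral(r, m)
V_exact = -g**2 / (4 * math.pi * r) * math.exp(-m * r)
print(abs(V_numeric / V_exact - 1) < 1e-3)  # True
```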
Coulomb potential
We can now move to the case of nonrelativistic scattering in QED, as in Fig.67. The
difference is now that we have a vector exchanged between the two fermions, with a (massless)
vector propagator, in the Feynman gauge just gµν times the KG propagator, and a vertex
factor +eγ µ in Minkowski space.
We then get
iM = (+e)ū(p′)γ^μ u(p) · [−ig_{μν}/(p − p′)²] · (+e)ū(k′)γ^ν u(k)   (23.11)
But in the nonrelativistic limit we can check that the only nonzero components are the zero
ones (as expected), so we only need to calculate
ū(p′)γ⁰u(p) = u†(p′) i(γ⁰)² u(p) ≃ −im (ξ^{†s′}  ξ^{†s′}) (ξ^s ; ξ^s)ᵀ = −2im δ^{ss′}   (23.12)


Figure 67: Feynman diagram for the Coulomb interaction between distinguishable fermions.

(since (γ 0 )2 = −1, from {γ µ , γ ν } = 2g µν ), finally obtaining

iM ≃ [ie² g₀₀/|p⃗ − p⃗′|²] (2m δ^{ss′})(2m δ^{rr′}) = −[ie²/|p⃗ − p⃗′|²] (2m δ^{ss′})(2m δ^{rr′})   (23.13)

Thus there is a sign change with respect to the Yukawa potential, meaning that the Coulomb
potential is repulsive. Also, the mass is now zero. To go to the configuration space, it is
easier to take the zero mass limit of the Yukawa potential, getting

V(r) = +e²/(4πr) = α/r   (23.14)
Note that α = e2 /(4π) ≃ 1/137 is the electromagnetic coupling, which is very weak.
Particle-antiparticle scattering


Figure 68: Feynman diagram for the Yukawa potential interaction between particle and
antiparticle.

Yukawa
We now consider scattering a fermion off an antifermion in Yukawa theory. Since now
the second fermion line is antifermionic, with momentum k in and momentum k ′ out, as in
Fig.68, we have a few changes from the calculation before. First, we replace the u’s with v’s

for the second line, i.e.
ū^{s′}(k′)u^s(k) → v̄^s(k)v^{s′}(k′) ≃ m (ξ^{†s}  −ξ^{†s}) (0 1; 1 0) (ξ^{s′} ; −ξ^{s′})ᵀ = −2m δ^{ss′}   (23.15)

which gives a minus sign with respect to the fermion-fermion calculation. But also changing
the fermion into antifermion amounts to exchanging the fermion lines with momenta k and
k ′ , as we can check. Another way to say this is that if we still do the contraction like in the
fermion-fermion case, i.e. such that the order is: k ′ first, then k, we obtain v(k ′ )v̄(k), and in
order to obtain the Lorentz invariant v̄(k)v(k ′ ) we need to change the order of the fermions.
All in all, we obtain two minus signs, i.e. a plus. Thus the potential between fermion
and antifermion is the same as between two fermions: scalar particle exchange gives an
universally attractive potential.
Coulomb potential
Considering now fermion-antifermion scattering in QED, we again have the same minus
sign for exchanging the fermions, and exchanging the u’s with v’s in the second fermion line
gives now
v̄(k)γ⁰v(k′) = v†(k) i(γ⁰)² v(k′) = −im (ξ^{†s}  −ξ^{†s}) (ξ^{s′} ; −ξ^{s′})ᵀ = −2im δ^{ss′}   (23.16)

instead of

ū(k)γ 0 u(k ′ ) = −2imδ ss (23.17)
so no change here. Thus overall, we have a minus sign with respect to fermion-fermion
scattering, i.e. the fermion-antifermion potential is now attractive. Indeed, e+ e− attract,
whereas e− e− repel.
In conclusion, for the exchange of a vector particle (electromagnetic field in this case),
like charges repel, and opposite charges attract.
We note that the repulsive nature of the fermion-fermion potential was due to the presence
of g00 in (23.13).
We can now guess also what happens for an exchange of a tensor, i.e. spin 2, particle
(the graviton), without doing the full calculation. Since a tensor has two indices, we will
have propagation between a point with indices µν and one with indices ρσ, which in some
appropriate gauge (the equivalent of the Feynman gauge) should be proportional to gµρ gνσ +
gµσ gνρ , and in the nonrelativistic limit only the 00 components should contribute, giving
∝ (g00 )2 = +1.
Therefore we can guess that gravity is attractive, and moreover universally attractive,
like we know from experiments to be the case.
Rutherford scattering
We now calculate an example of nonrelativistic scattering cross section, namely the case
of Rutherford scattering, of a charged particle (”electron”) off a fixed electromagnetic field
(”field of a nucleus”).
We should remember the classic experiment of Rutherford, which determined that atoms are made up of a positively charged nucleus surrounded by electrons (as opposed to a "raisin pudding," a ball of constant positive charge density filled with electrons). In it, charged particles were scattered on a fixed target made of a metal foil, and Rutherford made a classical calculation for the scattering cross section, which was confirmed by experiment, thus proving the assumption.
We will consider the scattering of an electron off the constant electromagnetic field of a
fixed nucleus. We will thus treat the electromagnetic field classically, and just the fermion
quantum mechanically.
The interaction Hamiltonian is

H_I = ∫ d³x e ψ̄ iγ^μ ψ A_μ   (23.18)

Then the S-matrix contribution coming from ∼ e^{−i∫H_I dt}, at first order, is

⟨p′|iT|p⟩ = ⟨p′| T{ −i ∫ d⁴x ψ̄ ieγ^μ ψ A_μ } |p⟩
  = −i ū(p′) ieγ^μ u(p) ∫ d⁴x e^{ip·x} e^{−ip′·x} A_μ(x)
  = +e ū(p′)γ^μ u(p) A_μ(p − p′)   (23.19)

But if we consider that Aµ (x) is time-independent, meaning that

Aµ (p − p′ ) = Aµ (⃗p − p⃗′ )2πδ(Ef − Ei ) (23.20)

and since the M matrix is defined by

< p′ |iT |p >= iM2πδ(Ef − Ei ) (23.21)

we see that in the Feynman rules for iM we can add a rule for the interaction with a classical
field, namely add +eγ µ Aµ (⃗q), with ⃗q = p⃗ − p⃗′ , with a diagram: off the fermion line starts a
wavy photon line that ends on a circled cross, as in Fig.69.

Figure 69: Feynman diagram representation for the interaction of a fermion with a classical
potential.

We now calculate the cross section for scattering of the electron off the classical field
centered at a point situated at an impact parameter ⃗b from the line of the incoming electron
(or rather, the impact parameter is the distance of the incoming electron from the nucleus),

as in Fig.70. This is in principle a 2 → 2 scattering, even though we treat the nucleus and
its electromagnetic field classically. As such, we must consider a situation similar to the one
in 2 → 2 scattering, by writing incoming wavefunctions with impact parameter ⃗b, as
ϕ_i = ∫ [d³k_i/(2π)³] [ϕ(k_i)/√(2E_i)] e^{ik⃗_i·b⃗}   (23.22)


Figure 70: The impact parameter for scattering of an electron off a nucleus (fixed classical
object).

As usual, the probability for i → f is

dP(i → f) = [d³p_f/((2π)³ 2E_f)] |_out⟨p_f|ϕ_A⟩_in|²   (23.23)

and the cross section element is



dσ = d²b⃗ dP(b⃗)
  = ∫ d²b ∫ [d³p_f/((2π)³ 2E_f)] ∫ [d³k_i/(2π)³] [ϕ(k_i)/√(2E_i)] ∫ [d³k̄_i/(2π)³] [ϕ*(k̄_i)/√(2Ē_i)] e^{ib⃗·(k⃗_i − k̄⃗_i)} _out⟨p_f|k_i⟩_in (_out⟨p_f|k̄_i⟩_in)*   (23.24)

We then use the relations

_out⟨p_f|k_i⟩_in = iM(i → f) 2πδ(E_f − E_i)
(_out⟨p_f|k̄_i⟩_in)* = −iM*(i → f) 2πδ(E_f − Ē_i)
∫ d²b e^{ib⃗·(k⃗_i − k̄⃗_i)} = (2π)² δ²(k_i^⊥ − k̄_i^⊥)
δ(E_f − E_i) δ(E_f − Ē_i) = δ(E_f − E_i) δ(k̄_i^z − k_i^z)/|k̄_i^z/Ē_i|
∫ [d³p_f/((2π)³ 2E_f)] 2πδ(E_f − E_i) = ∫ dΩ ∫ [dp_f p_f²/((2π)³ 2E_f)] 2πδ(E_f − E_i) = ∫ [dΩ/(8π²)] ∫ dp_f p_f δ(p_f − p_i)   (23.25)

where we have used δ(Ē_i − E_i) = δ(k̄_i^z − k_i^z)/|k̄_i^z/Ē_i| and δ(E_f − E_i) = δ(p_f − p_i)/|p_f/E_f|. Using the wavefunctions peaked on the value p_i, i.e. |ϕ(k_i)|² = (2π)³δ³(k_i − p_i), and since k̄_i^z = v_i Ē_i, putting everything together, we finally have

dσ/dΩ = (1/16π²) [1/(v_i E_i)] ∫ dp_f p_f δ(p_f − p_i) |M(p_i → p_f)|²   (23.26)
where Ei ≃ m.
We now apply this to the case of the electric field of a nucleus of charge +Ze, i.e. take

A₀ = +Ze/(4πr)   (23.27)

with Fourier transform

A₀(q⃗) = Ze/|q⃗|²   (23.28)
inside the matrix element

iM = eū(pf )γ µ u(pi )Aµ (⃗pf − p⃗i ) (23.29)

In the nonrelativistic limit, as we saw (and summing over s′ , final state helicities),
∑ ′
|ū(pf )γ µ u(pi )|2 ≃ |2m(ξ †s ξ s )|2 δ µ0 = 4m2 δ µ0 (23.30)
s′

giving

|M|² = e² 4m² |A₀(p⃗_f − p⃗_i)|² = e² 4m² Z²e²/|p⃗_f − p⃗_i|⁴   (23.31)
Since Ef = Ei (energy conservation), we have |⃗pf | = |⃗pi |, and if they have an angle θ between
them, |⃗pf − p⃗i | = 2pi sin θ/2 ≃ 2mvi sin θ/2, see Fig.71.


Figure 71: The scattering angle.

Putting everything together, we find


dσ/dΩ = [1/(8π² v_i)] · [m v_i/(2m)] · (e⁴ 4m² Z²)/(16 m⁴ v_i⁴ sin⁴(θ/2)) = Z²α²/(4m² v_i⁴ sin⁴(θ/2))   (23.32)
which is the Rutherford formula, originally found by Rutherford with a classical calculation.
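As a quick numerical illustration (an addition to these notes; Z, m, v below are arbitrary sample inputs in natural units, not data), the Rutherford formula (23.32) can be coded and its characteristic behavior checked: the cross section falls monotonically with angle and diverges as θ → 0.

```python
# dsigma/dOmega = Z^2 alpha^2 / (4 m^2 v^4 sin^4(theta/2)), eq. (23.32).
import math

ALPHA = 1 / 137.036  # fine-structure constant

def rutherford(theta, Z, m, v):
    """Differential cross section at scattering angle theta (radians)."""
    return (Z * ALPHA)**2 / (4 * m**2 * v**4 * math.sin(theta / 2)**4)

Z, m, v = 79, 1.0, 0.1   # e.g. a gold nucleus, a slow projectile of mass m
angles = [0.1, 0.5, 1.0, 2.0, math.pi]
values = [rutherford(t, Z, m, v) for t in angles]

# monotonically decreasing in theta, diverging as theta -> 0
print(all(a > b for a, b in zip(values, values[1:])))  # True
```

The strong forward peak (the sin⁻⁴(θ/2) divergence) is what makes the total cross section divergent, as explored in exercise 2 below.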

Important concepts to remember

• For distinguishable fermions, fermion-fermion scattering has only one diagram, the
exchange of the force particle, which can be scalar for Yukawa interaction, vector for
Coulomb interaction, tensor for gravitational interaction (etc.)

• In the nonrelativistic limit, the scalar exchange diagram gives rise to the Yukawa
potential. Since the potential is not directly measurable, we calculate it by matching
with the nonrelativistic quantum mechanical calculation of the Born approximation
for scattering in a given potential. The Yukawa potential is attractive.

• In the Coulomb case, due to the g00 = −1 in the vector propagator, we have a repulsive
process for like charges (fermion-fermion scattering).

• For fermion-antifermion scattering, we find the opposite sign in the potential for
Coulomb, and the same sign in the potential for Yukawa.

• To derive the classical Rutherford formula for nonrelativistic scattering of an electron


(or charged particle in general) off a fixed nucleus, we treat the electromagnetic field
generated by the nucleus as a time-independent classical field, interacting with the
quantum field of the incoming electron.

Further reading: See chapters 4.7, 4.8 and exercises in [2].

Exercises, Lecture 23

1) Consider a Yukawa interaction with 2 scalars, ϕ1 , ϕ2 with masses m1 , m2 . Using the


quantum mechanical argument in the text, calculate the potential V (⃗x). What happens for
m1 ≪ m2 ? How is the above result for V (⃗x) consistent with the linear principle of quantum
mechanics (wave functions add, not probabilities)?

2) Integrate dσ/dΩ for Rutherford scattering to find σ_tot (comment on the result). The dσ/dΩ formula is the same as for classical scattering in a Coulomb potential (that's how Rutherford computed it). Yet we used a quantum field theory calculation. Argue how the various steps of the QFT calculation would turn into steps of a classical calculation.

24 Lecture 24. e⁺e⁻ → l l̄ unpolarized cross section
In the previous lecture we re-derived some formulas for classical, nonrelativistic processes using the formalism of quantum field theory, in order to gain some confidence in the formalism. In this lecture we turn to the first new calculation, the e⁺e⁻ → l l̄ scattering at first order (tree level). This is a relativistic calculation, and it is genuinely quantum field theoretic, i.e. it has no classical or (nonrelativistic) quantum mechanics counterpart, since in this process an electron and a positron annihilate to create a lepton-antilepton pair.
The leptons are e− , µ− , τ − and their antiparticles, together with the corresponding neu-
trinos (νe , νµ , ντ and their antiparticles), but the neutrinos have no charge, so within QED
(interaction only via photons, coupling to charged particles) there is no process creating
neutrinos. The case when l = e− , of Bhabha scattering, is special, and there we have more
diagrams, but for l = µ− or τ − we have only one diagram: e− (p)e+ (p′ ) annihilate into a
photon, who then creates l(k)¯l(k ′ ). For definiteness we will assume that we have a µ− .
Note that we need sufficient energy to create the lepton pair, so only for ECM > 2ml is
the process possible. Therefore, for ECM = 2Ee,CM < 2mµ , we can create only e+ e− , for
2mµ < ECM < 2mτ we can create e+ e− or µ+ µ− , and for ECM > 2mτ we can create all,
e+ e− , µ+ µ− , τ + τ − .

Figure 72: Feynman diagram for e+ e− → µ+ µ− scattering at tree level.

Using the Feynman rules for Fig.72, we can write the amplitude (in the Feynman gauge)

iM = v̄^{s′}(p′)(+eγ^μ)u^s(p) · [−ig_{μν}/q²] · ū^r(k)(+eγ^ν)v^{r′}(k′)
   = −i (e²/q²) (v̄^{s′}(p′)γ^μ u^s(p)) (ū^r(k)γ_μ v^{r′}(k′))   (24.1)

Note that the vector propagator is −igµν /(q 2 − iϵ), but in this tree level process q is not
integrated, but rather q = p + p′ , so q 2 ̸= 0, therefore we can drop the iϵ.

For the large brackets, since they are numbers, their complex conjugates (equal to the
adjoint for a number) give

(v̄iγ µ u)∗ = u† (iγ µ )† (iγ 0 )† v = u† iγ 0 iγ µ v = ūiγ µ v (24.2)

where we have used (iγ 0 )† = iγ 0 and (γ µ )† iγ 0 = −iγ 0 (γ µ ). Therefore without the i, we have
(v̄γ µ u)∗ = −ūγ µ v.
Then we can calculate |M|² as

|M|² = (−1)² (e⁴/q⁴) [(v̄^{s′}(p′)γ^μ u^s(p))(ū^s(p)γ^ν v^{s′}(p′))] [(ū^r(k)γ_μ v^{r′}(k′))(v̄^{r′}(k′)γ_ν u^r(k))]   (24.3)
But in many experiments, we don’t keep track of spins (helicities), usually because it is
difficult to do so. Therefore, we must average over the initial spins (since we don’t know the
values of the initial spins), and sum over the final spins (since we measure the final particles
no matter what their spin is). We call the resulting probabilities the unpolarized scattering
cross section.
It is then obtained from

(1/2 ∑_s)(1/2 ∑_{s′}) ∑_r ∑_{r′} |M(s, s′ → r, r′)|² = (1/4) ∑_spins |M|²   (24.4)

But in order to do the sums over spins, we use the results derived in lecture 14, that
∑_s u^s_α(p) ū^s_β(p) = (−ip̸)_α{}^β + m δ_α^β ;   ∑_s v^s_α(p) v̄^s_β(p) = (−ip̸)_α{}^β − m δ_α^β   (24.5)

That means that


∑_{ss′} v̄^{s′}_α(p′)(γ^μ)^α{}_β u^s_β(p) ū^s_δ(p)(γ^ν)^δ{}_ϵ v^{s′}_ϵ(p′) = (−ip̸′ − m_e)^ϵ{}_α (γ^μ)^α{}_β (−ip̸ + m_e)^β{}_δ (γ^ν)^δ{}_ϵ
  = Tr[(−ip̸′ − m_e)γ^μ(−ip̸ + m_e)γ^ν]   (24.6)

Then similarly we obtain


∑_{rr′} [(ū^r(k)γ_μ v^{r′}(k′))(v̄^{r′}(k′)γ_ν u^r(k))] = Tr[(−ik̸ + m_μ)γ_μ(−ik̸′ − m_μ)γ_ν]   (24.7)

so that finally

(1/4) ∑_spins |M|² = (e⁴/4q⁴) Tr[(−ip̸′ − m_e)γ^μ(−ip̸ + m_e)γ^ν] Tr[(−ik̸ + m_μ)γ_μ(−ik̸′ − m_μ)γ_ν]   (24.8)

To calculate this further, we need to use gamma matrix identities, therefore we stop here
and now write all possible gamma matrix identities which we will need here and later on.
Gamma matrix identities

First, we remember our conventions for gamma matrices, with the representation (in 2×2 block form)

γ⁰ = −i (0 1; 1 0);   γⁱ = −i (0 σⁱ; −σⁱ 0);   γ⁵ = (1 0; 0 −1)   (24.9)

We then see that at least in this representation, we have

Tr[γ µ ] = Tr[γ 5 ] = 0 (24.10)

But this result is more general, since, using the fact that in general (γ 5 )2 = 1, and {γ µ , γ 5 } =
0, we have

Tr[γ µ ] = Tr[(γ 5 )2 γ µ ] = − Tr[γ 5 γ µ γ 5 ] = − Tr[(γ 5 )2 γ µ ] = − Tr[γ µ ] (24.11)

and therefore Tr[γ µ ] = 0. Here we have used the anticommutation of γ 5 with γ µ , and then
the cyclicity of the trace, to put back the γ 5 from the end of the trace, to the beginning.
Using the same cyclicity of the trace, we have
Tr[γ^μγ^ν] = Tr[(1/2){γ^μ, γ^ν}] = g^{μν} Tr[1] = 4g^{μν}   (24.12)

We can also calculate that the trace of an odd number of gammas gives zero, by the same
argument as for the single gamma above:

Tr[γ µ1 ...γ µ2n+1 ] = Tr[(γ 5 )2 γ µ1 ...γ µ2n+1 ] = (−1)2n+1 Tr[γ 5 γ µ1 ...γ µ2n+1 γ 5 ]
= − Tr[(γ 5 )2 γ µ1 ...γ µ2n+1 ] = − Tr[γ µ1 ...γ µ2n+1 ] (24.13)

and therefore
Tr[γ µ1 ...γ µ2n+1 ] = 0 (24.14)
In general, the method for calculating the traces of products of gamma matrices is to expand
in the complete basis of 4 × 4 matrices,

OI = {1, γ 5 , γ µ , γ µ γ 5 , γ µν } (24.15)

since for OI ̸= 1, we can check that Tr[OI ] = 0. Here and always, a gamma matrix with more
than one index means that the indices are totally antisymmetrized, i.e. γ µν = 1/2[γ µ , γ ν ],
γ µνρ = 1/6[γ µ γ ν γ ρ − 5 terms], etc. Indeed, we have
Tr[γ^{μν}] = (1/2) Tr[[γ^μ, γ^ν]] = 4g^{[μν]} = 0   (24.16)
(the antisymmetric part of g µν is zero), and also by gamma matrix anticommutation,

Tr[γ µ γ 5 ] = − Tr[γ 5 γ µ ] (24.17)

therefore
Tr[γ µ γ 5 ] = 0 (24.18)

We also have the relations
γ^{μν}γ⁵ = −(i/2) ϵ^{μνρσ} γ_{ρσ}
γ^{μνρ} = −i ϵ^{μνρσ} γ_σ γ₅
γ^{μνρ}γ₅ = −i ϵ^{μνρσ} γ_σ
γ^{μνρσ} = i ϵ^{μνρσ} γ₅   (24.19)

To prove this, it suffices it to show them for some particular indices. Consider the first
relation for µ = 0, ν = 1. Note that for matrices gamma with several indices, the indices
must be different (the totally antisymmetric part is nonzero only if the indices are different).
But for different indices, the product of gamma matrices is already antisymmetrized, due to
the anticommutation property of the gamma matrices, γ µ γ ν = −γ ν γ µ , if µ ̸= ν. Therefore
we have e.g. γ 01 = γ 0 γ 1 , etc. On the left hand side (lhs) of the first relation, we have

γ⁰γ¹γ⁵ = γ⁰γ¹(−i)γ⁰γ¹γ²γ³ = −iγ²γ³   (24.20)

(where we have used γ⁰γ¹ = −γ¹γ⁰ and (γ⁰)² = −1, (γ¹)² = +1, which follow from {γ^μ, γ^ν} = 2g^{μν}), and on the right hand side (rhs) we have −i/2 × 2ϵ^{0123}γ_{23} = −iγ₂γ₃ = −iγ²γ³, therefore the same result. (The factor of 2 is because the sum runs over both (23) and (32).)
The second relation is proved choosing, e.g. µ = 0, ν = 1, ρ = 2, in which case on the lhs
we have γ 0 γ 1 γ 2 , and on the rhs we have

−iϵ0123 γ3 (−i)γ 0 γ 1 γ 2 γ 3 = −γ 3 γ 0 γ 1 γ 2 γ 3 = +γ 0 γ 1 γ 2 (γ 3 )2 = +γ 0 γ 1 γ 2 (24.21)

The third relation follows from multiplication with γ 5 and using (γ 5 )2 = 1. The fourth
relation can be proven by choosing the only nonzero case, µ = 0, ν = 1, ρ = 2, σ = 3 (of
course, we can also have permutations of these), in which case the lhs is γ 0 γ 1 γ 2 γ 3 and the
rhs is
iϵ0123 (−i)γ 0 γ 1 γ 2 γ 3 = γ 0 γ 1 γ 2 γ 3 (24.22)
To calculate the trace of a product of 4 gamma matrices, we decompose in the basis
OI , which has Lorentz indices, times a corresponding constant Lorentz structure that gives
a total of 4 Lorentz indices. But in 4 dimensions the only possibilities are g µν , ϵµνρσ , and
products of them. That means that there are no constant Lorentz tensors with odd number
of indices, and correspondingly in the product of 4 gamma matrices we cannot have γ µ and
γ µ γ5 in the OI decomposition, since it would need to be multiplied by a 3-index constant
Lorentz structure. We can only have 1, γ5 or γ µν .
For 1 and γ5 we can multiply with two possible Lorentz structures, ϵµνρσ , and g µν g ρσ
and permutations. But we already know that γ µνρσ , which should certainly appear in the
product of gammas, can be written as iϵµνρσ γ5 , and therefore the g µν g ρσ ’s should multiply
the 1 term. Then finally we can write the general formula

γ µ γ ν γ ρ γ σ = c1 ϵµνρσ γ5 + c2 g µν γ ρσ + c3 g µρ γ νσ + c4 g µσ γ νρ + c5 g νρ γ µσ
+c6 g νσ γ µρ + c7 g ρσ γ µν + c8 g µν g ρσ + c9 g µρ g νσ + c10 g µσ g νρ (24.23)

Since Tr[γ₅] = Tr[γ_{μν}] = 0, we have

Tr[γ^μγ^νγ^ργ^σ] = (c₈ g^{μν}g^{ρσ} + c₉ g^{μρ}g^{νσ} + c₁₀ g^{μσ}g^{νρ}) Tr[1] = 4(c₈ g^{μν}g^{ρσ} + c₉ g^{μρ}g^{νσ} + c₁₀ g^{μσ}g^{νρ})   (24.24)
To determine c8 , c9 , c10 , we consider indices for which only one structure gives a nonzero
result. We can choose µ = ν = 1, ρ = σ = 2. Then c1 does not contribute, and also
c2 − c7 , since they have symmetric×antisymmetric indices, unlike here, whereas c9 , c10 do
not contribute due to the fact that ρ = σ ̸= µ = ν. Then the lhs is γ 1 γ 1 γ 2 γ 2 = +1, and the
rhs is +c8 , meaning c8 = 1. If we choose µ = ρ = 1, ν = σ = 2 instead, in the same way,
we isolate c9 . Then the lhs is γ 1 γ 2 γ 1 γ 2 = −1, and the rhs is +c9 , meaning c9 = −1. If we
choose µ = σ = 1, ν = ρ = 2, we isolate c10 . Then the lhs is γ 1 γ 2 γ 2 γ 1 = +1, and the rhs is
+c10 , meaning c10 = +1. Therefore

Tr[γ µ γ ν γ ρ γ σ ] = 4(g µν g ρσ − g µρ g νσ + g µσ g νρ ) (24.25)

This relation can be derived in another way as well, by commuting one of the gamma matrices
past all the other ones, and then using cyclicity of the trace at the end:

Tr[γ^μγ^νγ^ργ^σ] = Tr[(2g^{μν} − γ^νγ^μ)γ^ργ^σ] = ...
  = Tr[2g^{μν}γ^ργ^σ − 2g^{μρ}γ^νγ^σ + 2g^{μσ}γ^νγ^ρ] − Tr[γ^μγ^νγ^ργ^σ]   (24.26)

which implies the same relation as above. We have calculated it via decomposition in OI
since the method generalizes easily to other cases. In particular, now we can directly use
it to calculate Tr[γ µ γ ν γ ρ γ σ γ5 ], by multiplying (24.23) with γ5 . Then again c2 − c7 do not
contribute, since γµν γ5 ∝ γρσ as we proved above, so it has zero trace. But now c8 − c10 are
multiplied by γ5 , so also have zero trace, and the only nonzero result in the trace comes from
c1 , since it multiplies (γ 5 )2 = 1. Then

Tr[γ µ γ ν γ ρ γ σ γ5 ] = 4c1 ϵµνρσ (24.27)

To calculate c₁, we isolate indices that only contribute to c₁, namely μ = 0, ν = 1, ρ = 2, σ = 3. Then the lhs of (24.23) is γ⁰γ¹γ²γ³, and the rhs is c₁ϵ^{0123}γ₅ = c₁ϵ^{0123}(−i)γ⁰γ¹γ²γ³ = −ic₁γ⁰γ¹γ²γ³, and therefore c₁ = i, giving finally

Tr[γ µ γ ν γ ρ γ σ γ5 ] = 4iϵµνρσ (24.28)

We will also use contractions of ϵ tensors, so we write the results for them here:

ϵαβγδ ϵαβγδ = 4!ϵ0123 ϵ0123 = −24


ϵαβγµ ϵαβγν = 3!δνµ ϵ0123 ϵ0123 = −6δνµ
ϵαβµν ϵαβρσ = 2!(δρµ δσν − δσµ δρν )ϵ0123 ϵ0123 = −2(δρµ δσν − δσµ δρν ) (24.29)

In the first relation we have 4! terms, for the permutations of 0123, in the second we have
µ ̸= αβγ and also ν ̸= αβγ, which means that µ = ν, and for given µ = ν, αβγ run over the
remaining 3 values, with 3! permutations. For the third relation, we have µν ̸= αβ, and also

ρσ ̸= αβ, meaning µν = ρσ, or δρµ δσν − δσµ δρν , and for given µν, αβ run over 2 values, giving
2! terms.
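These ϵ contractions can be verified by brute force. The sketch below is an addition to these notes; it assumes the convention implied above, namely ϵ^{0123} = +1 with the mostly-plus metric, so that lowering all four indices gives ϵ_{0123} = −1 and ϵ^{0123}ϵ_{0123} = −1.

```python
# Brute-force check of the epsilon-tensor contractions (24.29).
def levi_civita(idx):
    """Totally antisymmetric symbol, +1 on (0,1,2,3)."""
    if len(set(idx)) != 4:
        return 0
    sign, idx = 1, list(idx)
    for i in range(4):           # count swaps via selection sort
        j = idx.index(min(idx[i:]), i)
        if j != i:
            idx[i], idx[j] = idx[j], idx[i]
            sign = -sign
    return sign

g = [-1, 1, 1, 1]  # mostly-plus metric, diagonal entries
def eps_up(*i):   return levi_civita(i)                                # eps^{...}
def eps_down(*i): return levi_civita(i) * g[i[0]]*g[i[1]]*g[i[2]]*g[i[3]]  # eps_{...}

R = range(4)
full = sum(eps_up(a, b, c, d) * eps_down(a, b, c, d)
           for a in R for b in R for c in R for d in R)
print(full)  # -24

one = [[sum(eps_up(a, b, c, m) * eps_down(a, b, c, n) for a in R for b in R for c in R)
        for n in R] for m in R]
print(all(one[m][n] == (-6 if m == n else 0) for m in R for n in R))  # True
```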
We also have

Tr[γ^μγ^ν ...] = Tr[Cγ^μC⁻¹ Cγ^νC⁻¹ ...] = (−1)ⁿ Tr[(γ^μ)ᵀ(γ^ν)ᵀ ...] = (−1)ⁿ Tr[(... γ^νγ^μ)ᵀ] = (−1)ⁿ Tr[... γ^νγ^μ]   (24.30)

but since in any case for odd n the trace is zero, it means that the trace of many gammas is
equal to the trace of the gammas in the opposite order

Tr[γ µ γ ν ...] = Tr[...γ ν γ µ ] (24.31)

Finally, we write the contractions γ^μ(...)γ_μ with some other gammas in the middle. First, since {γ^μ, γ^ν} = 2g^{μν}, it means that

γ^μγ_μ = δ^μ_μ = 4   (24.32)

and further, using this, we get

γ^μγ^νγ_μ = −γ^νγ^μγ_μ + 2g^{μν}γ_μ = −2γ^ν   (24.33)

In turn, using this, we can calculate the contraction with two gammas in the middle,

γ^μγ^νγ^ργ_μ = −γ^νγ^μγ^ργ_μ + 2g^{μν}γ^ργ_μ = 2γ^νγ^ρ + 2γ^ργ^ν = 4g^{νρ}   (24.34)

And finally, using this, we can calculate the contraction with three gammas in the middle,

γ^μγ^νγ^ργ^σγ_μ = −γ^νγ^μγ^ργ^σγ_μ + 2g^{μν}γ^ργ^σγ_μ = −4γ^ν g^{ρσ} + 2(−γ^σγ^ρ + 2g^{ρσ})γ^ν = −2γ^σγ^ργ^ν   (24.35)
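All of the trace and contraction identities above can be checked numerically in the explicit representation (24.9). The pure-Python sketch below is an addition to these notes; it assumes the mostly-plus metric g = diag(−1, 1, 1, 1) used throughout, with ϵ^{0123} = +1.

```python
# Verify key gamma-matrix identities: {gamma^mu, gamma^nu} = 2 g^{mu nu},
# Tr[gamma^mu gamma^nu] = 4 g^{mu nu} (24.12),
# Tr[gamma^0..gamma^3 gamma^5] = 4i (24.28), and (24.33).
def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
def add(A, B):   return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
def scale(c, A): return [[c * x for x in r] for r in A]
def tr(A):       return sum(A[i][i] for i in range(len(A)))
def block(A, B, C, D):  # 4x4 matrix from 2x2 blocks [[A, B], [C, D]]
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
sig = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices

gam = [scale(-1j, block(Z2, I2, I2, Z2))]                          # gamma^0
gam += [scale(-1j, block(Z2, s, scale(-1, s), Z2)) for s in sig]   # gamma^i
g5 = block(I2, Z2, Z2, scale(-1, I2))                              # gamma^5
g = [-1, 1, 1, 1]                                                  # mostly-plus metric

for m in range(4):             # anticommutators: {gamma^m, gamma^n} = 2 g^{mn}
    for n in range(4):
        anti = add(mul(gam[m], gam[n]), mul(gam[n], gam[m]))
        t = 2 * g[m] if m == n else 0
        assert all(abs(anti[i][j] - (t if i == j else 0)) < 1e-12
                   for i in range(4) for j in range(4))

assert all(abs(tr(mul(gam[m], gam[n])) - (4 * g[m] if m == n else 0)) < 1e-12
           for m in range(4) for n in range(4))          # (24.12)

prod = mul(mul(gam[0], gam[1]), mul(gam[2], gam[3]))
assert abs(tr(mul(prod, g5)) - 4j) < 1e-12               # (24.28): 4i eps^{0123}

for n in range(4):                                       # (24.33)
    S = [[0] * 4 for _ in range(4)]
    for m in range(4):
        S = add(S, scale(g[m], mul(mul(gam[m], gam[n]), gam[m])))
    assert all(abs(S[i][j] + 2 * gam[n][i][j]) < 1e-12 for i in range(4) for j in range(4))

print("gamma identities verified")
```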
Cross section for unpolarized scattering
We can now go back to the calculation of the cross section. Since the trace of three
gamma matrices gives zero, in the two traces in (24.8), only the traces of two and four
gamma matrices contribute.
We then have for the first trace

Tr[(−ip̸′ − m_e)γ^μ(−ip̸ + m_e)γ^ν] = −4p′_ρ p_σ (g^{ρμ}g^{νσ} − g^{ρσ}g^{μν} + g^{νρ}g^{μσ}) − 4m_e² g^{μν}
  = 4[−p′^μp^ν − p′^νp^μ − g^{μν}(−p·p′ + m_e²)]   (24.36)

and similarly for the second trace

Tr[(−ik̸ + m_μ)γ_μ(−ik̸′ − m_μ)γ_ν] = −4k^ρ k′^σ (g_{μρ}g_{νσ} − g_{ρσ}g_{μν} + g_{ρν}g_{μσ}) − 4m_μ² g_{μν}
  = 4[−k_μk′_ν − k_νk′_μ − g_{μν}(−k·k′ + m_μ²)]   (24.37)

But since m_e/m_μ ≃ 1/200, we neglect m_e and only keep m_μ. Then we obtain for the unpolarized |M|²,

(1/4) ∑_spins |M|² = (4e⁴/q⁴)[p′^μp^ν + p′^νp^μ − p·p′ g^{μν}][k_μk′_ν + k_νk′_μ + g_{μν}(−k·k′ + m_μ²)]
  = (4e⁴/q⁴)[2(p·k)(p′·k′) + 2(p·k′)(p′·k) + 2p·p′(−k·k′ + m_μ²) + 2k·k′(−p·p′) + 4(−p·p′)(−k·k′ + m_μ²)]
  = (8e⁴/q⁴)[(p·k)(p′·k′) + (p·k′)(p′·k) − m_μ² p·p′]   (24.38)
CM frame cross section
Since muons are created, we need to have 2Ee > 2mµ ∼ 400me , therefore the electrons
are ultrarelativistic, and can be considered as massless. Therefore for the electron and
positron we have p = (E, Eẑ) and p' = (E, −Eẑ). Since the µ⁺µ⁻ have the same mass, E_e = E_µ ≡ E, but they are not necessarily ultrarelativistic, so k = (E, k⃗) and k' = (E, −k⃗), with |k⃗| = √(E² − m_µ²), see Fig.73. We also define the angle θ between the electrons and the muons, i.e. k⃗·ẑ = |k⃗| cos θ.

Figure 73: The kinematics of center of mass scattering for e+ e− → µ+ µ− .

Then also
q 2 = (p + p′ )2 = −4E 2 = 2p · p′ ⇒ p · p′ = −2E 2 (24.39)
We also obtain

p · k = p′ · k ′ = −E 2 + E|⃗k| cos θ
p · k ′ = p′ · k = −E 2 − E|⃗k| cos θ (24.40)

Then substituting these formulas in |M|2 , we get

(1/4)Σ_spins |M|² = (8e⁴/16E⁴)[E²(E − |k⃗| cos θ)² + E²(E + |k⃗| cos θ)² + 2E² m_µ²]
= (e⁴/E²)[E² + k² cos² θ + m_µ²]
= e⁴[1 + m_µ²/E² + (1 − m_µ²/E²) cos² θ]     (24.41)

The differential cross section is
(dσ/dΩ)_CM = (1/(2E_A 2E_B |v_A − v_B|)) (|p⃗₁|/((2π)² 4E_CM)) (1/4)Σ_spins |M|²     (24.42)

But 2EA = 2EB = 2E = ECM , and ⃗vA = p⃗A /EA = ẑ, and ⃗vB = p⃗B /EB = −ẑ, so
|vA − vB | = 2. Then
(dσ/dΩ)_CM = (|k⃗|/(2E²_CM)) (1/(16π²)) (1/(2E)) (1/4)Σ_spins |M|² = (α²/(4E²_CM)) √(1 − m_µ²/E²) [1 + m_µ²/E² + (1 − m_µ²/E²) cos² θ]
(24.43)
In the ultrarelativistic limit E ≫ mµ , it becomes
(dσ/dΩ)_{CM, ultrarel.} = (α²/(4E²_CM))(1 + cos² θ)     (24.44)
The total cross section is found by integrating over the solid angle, with dΩ = 2π sin θ dθ = −2π d(cos θ), and using ∫₋₁¹ d(cos θ) = 2 and ∫₋₁¹ d(cos θ) cos² θ = 2/3, we get

σ_tot = 2π ∫₋₁¹ d(cos θ) (dσ/dΩ)
= 2π (α²/(4E²_CM)) √(1 − m_µ²/E²) [8/3 + (4/3)(m_µ²/E²)]
= (4π/3)(α²/E²_CM) √(1 − m_µ²/E²) [1 + m_µ²/(2E²)]     (24.45)
In the ultrarelativistic limit, E ≫ mµ , it becomes
σ_tot → (4π/3)(α²/E²_CM)     (24.46)
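As a quick consistency check, the angular integral leading to (24.45) can be done numerically and compared with the closed form. A minimal sketch (the parameter values E and E_CM are illustrative, not from the text):

```python
import math

# Integrate the differential cross section (24.43) over the solid angle with
# composite Simpson's rule and compare with the closed form (24.45).
alpha = 1/137.035999           # fine-structure constant
m_mu, E = 0.1057, 0.5          # GeV; E = beam energy, so E_CM = 2E (illustrative)
Ecm = 2*E
beta = math.sqrt(1 - m_mu**2/E**2)

def dsigma_dOmega(ct):         # eq. (24.43), ct = cos(theta)
    return (alpha**2/(4*Ecm**2))*beta*(1 + m_mu**2/E**2
                                       + (1 - m_mu**2/E**2)*ct**2)

N = 2000                       # even number of Simpson intervals on [-1, 1]
h = 2.0/N
S = sum((1 if i in (0, N) else 4 if i % 2 else 2)*dsigma_dOmega(-1 + i*h)
        for i in range(N + 1))
sigma_num = 2*math.pi*S*h/3    # sigma = 2*pi * integral d(cos theta) dsigma/dOmega

sigma_exact = (4*math.pi/3)*(alpha**2/Ecm**2)*beta*(1 + m_mu**2/(2*E**2))
assert abs(sigma_num - sigma_exact) < 1e-10*sigma_exact
```

Simpson's rule is exact for the quadratic integrand in cos θ, so the two results agree to rounding.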

Important concepts to remember


• The process e⁺e⁻ → l l̄ is intrinsically quantum field theoretic, and for l = µ, τ there is only one Feynman diagram.
• Complex conjugation changes (ūγv) into v̄γu, so the sum over spins in |M|² generates sums of the type Σ_s v^s(p)v̄^s(p), allowing us to convert all the u's and v's into traces of gamma matrices.
• In calculations with gamma matrices, we use the Clifford algebra, anticommutation, Lorentz-invariant structures, the complete set O_I, etc.
• The total unpolarized e⁺e⁻ → l l̄ cross section is finite, and in the ultrarelativistic limit and in the CM frame it is given by (4π/3) α²/E²_CM.
Further reading: See chapters 5.1 of [2] and 8.1 of [1].

Exercises, Lecture 24

1) Consider Bhabha scattering, e+ e− → e+ e− . Write down the Feynman diagrams, and


find the expression for |M|2 in terms of us (p)’s and v s (p)’s.

2) Consider the gamma matrices in 3 Euclidean dimensions, γ i , satisfying the Clifford


algebra {γ i , γ j } = 2δ ij . Calculate Tr[γ i γ j ], Tr[γ i γ j γ k ] and Tr[γ i γ j γ k γ l ]. You can use a
specific representation for γ i if you want (though it’s not needed).

25 Lecture 25. e⁺e⁻ → l l̄ polarized cross section; crossing symmetry
In this lecture we show how to calculate the same process as last lecture, but for given
spin, i.e. the polarized cross section. Then we will explore an important symmetry of QFT
processes called crossing symmetry and how it appears in the Mandelstam variables (which
we will introduce).
For simplicity we will work in the ultrarelativistic case me , mµ → 0.
We have seen that in the Weyl representation for gamma matrices, we have
γ₅ = ( 1   0
       0  −1 )     (25.1)

That means that P_L = (1 + γ₅)/2 projects a fermion ψ onto its upper two components, called ψ_L, and P_R = (1 − γ₅)/2 projects the fermion onto its lower two components, called ψ_R, i.e. P_L ψ = ψ_L, P_R ψ = ψ_R. P_L, P_R are projectors, i.e. P_L² = P_L, P_R² = P_R and P_L P_R = P_R P_L = 0, so P_R ψ_L = 0 and P_L ψ_R = 0.
Here ψL and ψR have given helicities, i.e. eigenvalues of the projection of the spin onto
the direction of the momentum, i.e. h = S ⃗ · p⃗/|⃗p| = ±1/2. Therefore measuring a given spin,
or more precisely a given helicity, means considering only ψL or ψR fermions.
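The projector algebra just stated can be checked directly in the Weyl representation; a minimal numerical sketch:

```python
import numpy as np

# gamma_5 in the Weyl representation, eq. (25.1): diag(1, 1, -1, -1)
g5 = np.diag([1., 1., -1., -1.])
PL = (np.eye(4) + g5)/2        # P_L = (1 + gamma_5)/2, the text's convention
PR = (np.eye(4) - g5)/2        # P_R = (1 - gamma_5)/2

assert np.allclose(PL @ PL, PL) and np.allclose(PR @ PR, PR)  # idempotent
assert np.allclose(PL @ PR, 0) and np.allclose(PR @ PL, 0)    # orthogonal
assert np.allclose(PL + PR, np.eye(4))                        # complete
psi = np.array([1., 2., 3., 4.])
assert np.allclose(PL @ psi, [1, 2, 0, 0])   # keeps the upper two components
assert np.allclose(PR @ psi, [0, 0, 3, 4])   # keeps the lower two components
```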
Note that if ψ_L = P_L ψ = ((1 + γ₅)/2)ψ, then (since γ₅† = γ₅ and γ₅ anticommutes with the other gammas)

ψ̄_L = (((1 + γ₅)/2)ψ)† iγ⁰ = ψ† ((1 + γ₅)/2) iγ⁰ = ψ† iγ⁰ ((1 − γ₅)/2) = ψ̄ ((1 − γ₅)/2) = (ψ̄)_R     (25.2)

so for instance ū_L = (ū)_R.


Consider a Lorentz invariant object like the one we calculated before, v̄(p')γ^µ u(p). We want to calculate v̄^{s'}(p')γ^µ u^s(p) with s, s' corresponding to u_R and to (v̄)_L (the bar of v_R). Since P_R u_R = u_R, we can introduce for free a P_R in front of u_R. Then, also on the bar side we have a nonzero element, since

v̄(p')γ^µ u(p) → v̄(p')γ^µ ((1 − γ₅)/2) u(p) = v̄(p') ((1 + γ₅)/2) γ^µ u(p)     (25.3)

But since P_R u_L = 0, and also (ψ̄)_R P_L = 0, we can for free add the terms with the other spins (u_L and/or (v̄)_R), since they give zero when we have P_R inside, and therefore calculate the sum over spins for the e⁺e⁻ factor in |M|² (note that now it is a sum over spins, not an average, because now we do know the initial spin: we don't average over it, we just choose to add contributions of spins which give zero in the sum)

Σ_spins |v̄(p') γ^µ ((1 − γ₅)/2) u(p)|² = Σ_spins v̄(p') γ^µ ((1 − γ₅)/2) u(p) ū(p) γ^ν ((1 − γ₅)/2) v(p')     (25.4)

Using the sums over spins, Σ_s u^s(p)ū^s(p) = −ip̸ + m ≃ −ip̸ and Σ_s v^s(p)v̄^s(p) = −ip̸ − m ≃ −ip̸, we obtain

− Tr[ p̸' γ^µ ((1 − γ₅)/2) p̸ γ^ν ((1 − γ₅)/2) ] = − Tr[ p̸' γ^µ p̸ γ^ν ((1 − γ₅)/2) ]     (25.5)

where we have commuted the first PR past two gammas, thus giving PR again, and then used
PR2 = PR . Now we have two terms, one is the same as in the unpolarized cross section, with
4 gammas inside the trace, and a new one that has an extra γ5 (the trace with 4 gammas
and a γ5 was calculated last lecture also). Substituting the result of the traces, we get
−(4/2)[p'^µ p^ν + p'^ν p^µ − g^{µν} p·p'] + (1/2)·4i p'_ρ p_σ ϵ^{ρµσν} = −2[p'^µ p^ν + p'^ν p^µ − g^{µν} p·p' − i ϵ^{ρµσν} p'_ρ p_σ]     (25.6)
Similarly, we can calculate the sum over spins in the µ⁺µ⁻ factor in |M|², which gives

Σ_spins |ū(k) γ_µ ((1 − γ₅)/2) v(k')|² = − Tr[ k̸ γ_µ ((1 − γ₅)/2) k̸' γ_ν ((1 − γ₅)/2) ]
= −2[ k_µ k'_ν + k_ν k'_µ − g_{µν} k·k' − i k^α k'^β ϵ_{αµβν} ]     (25.7)

Then we get for |M|² (where besides the two factors calculated above we have the factor e⁴/q⁴)

|M(e⁻_R e⁺_L → µ⁻_R µ⁺_L)|² = (4e⁴/q⁴)[2(p·k)(p'·k') + 2(p·k')(p'·k) − ϵ^{ρµσν} ϵ_{αµβν} p'_ρ p_σ k^α k'^β]
= (4e⁴/q⁴)[2(p·k)(p'·k') + 2(p·k')(p'·k) + 2((p·k')(p'·k) − (p·k)(p'·k'))]
= (16e⁴/q⁴)(p·k')(p'·k)     (25.8)

where we used ϵ^{ρµσν} ϵ_{αµβν} = −2(δ^ρ_α δ^σ_β − δ^ρ_β δ^σ_α). Using also (from last lecture) q² = −4E² and
p·k' = p'·k = −E(E + |k⃗| cos θ), and in the ultrarelativistic limit E = |k⃗|, we obtain

|M(e⁻_R e⁺_L → µ⁻_R µ⁺_L)|² = e⁴(1 + cos θ)²     (25.9)

Then, since in the center of mass frame, in the ultrarelativistic limit, we have (as we saw before)

dσ/dΩ|_{CM, m_i→0} = |M|²/(64π² E²_CM)     (25.10)

we finally obtain

dσ/dΩ (e⁻_R e⁺_L → µ⁻_R µ⁺_L)|_{CM, m_i→0} = (α²/(4E²_CM))(1 + cos θ)²     (25.11)
We can similarly calculate the cross sections for other polarizations. For the process
e⁻_R e⁺_L → µ⁻_L µ⁺_R, for instance, only one of the two factors has γ₅ → −γ₅, meaning that there

is a minus sign in front of the ϵρµσν in the first term, therefore changing the relative sign in
the next to last line in (25.8). The details are left as an exercise, but it follows that we have
dσ/dΩ (e⁻_R e⁺_L → µ⁻_L µ⁺_R)|_{CM, m_i→0} = (α²/(4E²_CM))(1 − cos θ)²     (25.12)
It is then clear that the other two nonzero processes give
dσ/dΩ (e⁻_L e⁺_R → µ⁻_R µ⁺_L)|_{CM, m_i→0} = (α²/(4E²_CM))(1 − cos θ)²
dσ/dΩ (e⁻_L e⁺_R → µ⁻_L µ⁺_R)|_{CM, m_i→0} = (α²/(4E²_CM))(1 + cos θ)²     (25.13)
All the other helicity combinations give zero by the fact that PL PR = PR PL = 0.
There is another way of calculating the polarized cross section that we will just sketch,
but not follow through. Namely, we can use explicit forms for the spinors:
[√ ( 3) √ ( 3 )] 
(√ ) (1 )
−p · σξ E+p 3 1−σ
+ E − p 1+σ3 ξ E→∞ √ (1 + p̂ · ⃗σ )ξ
u(p) = √ = [√ ( 3) √ ( 3 )]  →
2 2
2E 12
−p · σ̄ξ E + p3 1+σ + E − p3 1−σ ξ 2
(1 − p̂ · ⃗σ )ξ
2 2
(25.14)
and similarly (√ ) ( 1 )
−p · σξ E→∞ √ (1 + p̂ · ⃗σ )ξ
v(p) = √ → 2E 2 (25.15)
− −p · σ̄ξ − 12 (1 − p̂ · ⃗σ )ξ
One can then use the explicit form of the gamma matrices in the Weyl representation together
with the above formulae to compute the |M|2 ’s. We will not do it here, but the details can
be found for instance in Peskin and Schroeder.
Crossing symmetry
Feynman diagrams contain more information than the specific processes we analyze, since
by simply rotating the diagrams (considering time running in a different direction) we obtain
a new process, related to the old one in a simple way, by a transformation called crossing.
We will find that we can transform the momenta as well in order to obtain the same value for
the amplitude M, obtaining crossing symmetry. Of course, if we apply the transformation
of momenta on the functional form of a given amplitude, in general we will not obtain a
symmetry. We will discuss this more later.
In the case of the e⁺e⁻ → µ⁺µ⁻ process, the "crossed diagram" ("crossed process"), obtained by a 90° rotation, or by time running horizontally instead of vertically in the diagram, is e⁻(p₁)µ⁻(p₂) → e⁻(p'₁)µ⁻(p'₂), as in Fig.74, giving
iM = ū(p'₁)(+eγ^µ)u(p₁) (−i g_{µν}/q²) ū(p'₂)(+eγ^ν)u(p₂) = −i (e²/q²) ū(p'₁)γ^µ u(p₁) ū(p'₂)γ_µ u(p₂)     (25.16)
Then the unpolarized |M|² gives

(1/4)Σ_spins |M|² = (e⁴/4q⁴) Tr[(−ip̸₁ + m_e)γ^µ(−ip̸'₁ + m_e)γ^ν] Tr[(−ip̸'₂ + m_µ)γ_µ(−ip̸₂ + m_µ)γ_ν]     (25.17)

Figure 74: Feynman diagram for e− µ− → e− µ− scattering at tree level.

Here q = p1 − p′1 . This is the same result as for e+ e− → µ+ µ− , except for renaming the
momenta, and some signs. More precisely, when we change antiparticles to particles, we change the v^s(p)'s to u^s(p)'s, and since Σ_s u^s(p)ū^s(p) = −ip̸ + m but Σ_s v^s(p)v̄^s(p) = −ip̸ − m, and we do this for two terms, we change −ip̸ + m → −ip̸ − m = −(ip̸ + m), which amounts to flipping the sign of the corresponding momentum. Thus we finally see that the rules for replacing the original diagram with the crossed diagram are

p → p₁ ;  p' → −p'₁ ;  k → p'₂ ;  k' → −p₂     (25.18)

Thus we obtain

(1/4)Σ_spins |M(e⁻µ⁻ → e⁻µ⁻)|² = (8e⁴/q⁴)[(p₁·p'₂)(p'₁·p₂) + (p₁·p₂)(p'₁·p'₂) + m_µ² p₁·p'₁]     (25.19)

Figure 75: Kinematics of e− µ− → e− µ− scattering in the center of mass frame.

In this process, in the center of mass frame, we have for the initial electron p1 = (k, kẑ),
and for the initial muon we have opposite initial momentum (but higher energy), p2 =
(E, −kẑ), with E 2 = k 2 + m2µ . For the final electron we have p′1 = (k, ⃗k) and for the final
muon p′2 = (E, −⃗k), see Fig.75. The angle between the initial and final directions being θ,
we have ⃗k · ẑ = k cos θ. We also have the center of mass energy ECM = E + k and for the

various invariants

p1 · p2 = p′1 · p′2 = −k(E + k)


p′1 · p2 = p1 · p′2 = −k(E + k cos θ)
p1 · p′1 = −k 2 (1 − cos θ)
q 2 = −2p1 · p′1 = 2k 2 (1 − cos θ) (25.20)

Then the sum over spins in the center of mass frame is

(1/4)Σ_spins |M|² = (2e⁴/(k²(1 − cos θ)²))[(E + k)² + (E + k cos θ)² − m_µ²(1 − cos θ)]     (25.21)
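The substitution of the invariants (25.20) into (25.19) can be checked numerically; a minimal sketch with illustrative values of k, m_µ and θ:

```python
import math

# Check (25.19) -> (25.21): evaluate the invariants of (25.20) for CM
# kinematics and compare with the closed-form angular expression.
k, m_mu, th = 0.8, 0.1057, 1.2          # illustrative values (GeV, radians)
E = math.sqrt(k*k + m_mu*m_mu)
ct = math.cos(th)

p1p2  = -k*(E + k)        # p1·p2 = p1'·p2'
p1p_2 = -k*(E + k*ct)     # p1·p2' = p1'·p2
p1p1p = -k*k*(1 - ct)     # p1·p1'
q4 = (2*k*k*(1 - ct))**2  # (q^2)^2, with q^2 = -2 p1·p1'

# eq. (25.19), in units of e^4
lhs = 8/q4*(p1p_2*p1p_2 + p1p2*p1p2 + m_mu**2*p1p1p)
# eq. (25.21), same units
rhs = 2/(k*k*(1 - ct)**2)*((E + k)**2 + (E + k*ct)**2 - m_mu**2*(1 - ct))
assert abs(lhs - rhs) < 1e-12*abs(rhs)
```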

and the differential cross section is

(dσ/dΩ)_CM = |M|²/(64π² E²_CM) = (α²/(2k²(1 − cos θ)²(E + k)²))[(E + k)² + (E + k cos θ)² − m_µ²(1 − cos θ)]
(25.22)
In the ultrarelativistic limit, E ≫ m_µ, we have

(dσ/dΩ)_CM → (α²/(2E²_CM)) [4 + (1 + cos θ)²]/(1 − cos θ)²  ∝ 1/θ⁴  as θ → 0     (25.23)

The fact that dσ/dΩ → 1/θ⁴ as θ → 0 means that there is a strong divergence at θ = 0 (since near θ = 0, ∫ dΩ dσ/dΩ ∼ ∫ dθ θ/θ⁴ → ∞ in the massless limit m → 0; remember that the exchanged photon is massless, as is the electron in the ultrarelativistic limit). This is indicative of something that will be dealt with in the second semester, infrared (IR) divergences in the scattering of massless particles.
To complete this portion, we now can define the general crossed diagram by generalizing
the case e+ e− → µ+ µ− into e− µ− → e− µ− , and the corresponding transformation that leaves
the amplitude invariant as

M(ϕ(p) + ... → ...) = M(... → ... + ϕ̄(k)) (25.24)

Here φ is a particle, and φ̄ is the corresponding antiparticle, i.e. we change a particle into its antiparticle, exchange in and out, and map the momenta as k = −p, see Fig.76. This equality defines crossing symmetry. Note that since k = −p, we can't have k⁰ > 0 and p⁰ > 0 simultaneously, which means that when one particle is real, the crossed particle is virtual, and vice versa. Thus crossing symmetry really also corresponds to an analytic continuation in energy.
Mandelstam variables
Crossing symmetry is easier to describe in Mandelstam variables, so we will now define them properly.
For a 2 → 2 scattering process, with incoming momenta p, p′ and outgoing momenta
k, k ′ , as in Fig.77, we define the Mandelstam variables s, t, u by

s = −(p + p')² = −(k + k')² = E²_CM


Figure 76: Crossing symmetry (includes a charge conjugation).


Figure 77: Mandelstam variables.

t = −(k − p)2 = −(k ′ − p′ )2


u = −(k ′ − p)2 = −(k − p′ )2 (25.25)
Note that s = E²_CM. Also, in the more interesting case where the particles with momenta p and k are the same (elastic scattering), E_k = E_p, meaning that t = −(k⃗ − p⃗)² is minus the invariant momentum transfer squared (k⃗ − p⃗ = −(k⃗' − p⃗') is the momentum transferred between the scattering particles).
Now consider again the process e+ e− → µ+ µ− . Then
s = −(p + p′ )2 = −q 2
t = −(k − p)2 = 2p · k = 2p′ · k ′
u = −(p − k ′ )2 = 2p · k ′ = 2p′ · k (25.26)
and the sum over spins is

(1/4)Σ_spins |M|² = (8e⁴/s²)[(t/2)² + (u/2)²]     (25.27)


Figure 78: 90° rotation followed by mirror image of a), giving crossed diagram b).

Consider the 90° rotation of e⁻(p)e⁺(p') → µ⁺(k)µ⁻(k') (followed by a mirror image exchanging left with right, which doesn't change the physics), as in Fig.78. By the above crossing rules, it gives for e⁻µ⁻ → e⁻µ⁻, i.e. e⁻(p)µ⁻(−k) → e⁻(−p')µ⁻(k'), the Mandelstam variables

s = −(p − k)² = t̄
t = −(p − (−p'))² = s̄
u = −(p − k')² = ū     (25.28)
where the quantities with bar are defined using the original (uncrossed) diagram. Therefore
if we have a functional form M(s, t) for the original amplitude, the form of the amplitude
for the crossed process is obtained by exchanging s with t and keeping u unchanged. This
in general will not give the same result. In particular, in our case we obtain
(1/4)Σ_spins |M|² = (8e⁴/t²)[(s/2)² + (u/2)²]     (25.29)

which is not the same as for the original process. If nevertheless we obtain the same formula,
we say the amplitude is crossing symmetric.
The Mandelstam variables are not independent. In fact, a simple counting argument tells us there are only 2 independent variables. There are 4 momenta, each with 3 independent components (since E² = p⃗² + m²), for a total of 12 components. But there are 4 momentum conservation conditions, leaving 8 components. There are also Lorentz transformations Λ^µ_ν of the reference frame, generated by an antisymmetric 4×4 matrix (µ, ν = 1, ..., 4), or equivalently 3 rotations (around the 3 coordinate axes) and 3 Lorentz boosts (the frame velocity in 3 directions), for a total of 6 frame-dependent components, leaving only 2 Lorentz invariants. In fact, we can write a simple relation between s, t, u:
s + t + u = −(p + p')² − (k − p)² − (k' − p)²
= −3p² − p'² − k² − k'² − 2p·p' + 2p·k + 2p·k'
= 2p·(k + k' − p') − 3p² − p'² − k² − k'² = −p² − p'² − k² − k'²
= Σ_{i=1}^{4} m_i²     (25.30)
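The relation (25.30) can be checked numerically for explicit CM kinematics; a minimal sketch in the mostly-plus conventions of the text (p² = −m², s = −(p + p')², etc.), with illustrative values:

```python
import math

# Check s + t + u = sum of m_i^2 for e+e- -> mu+mu- CM kinematics.
def dot(a, b):                       # a.b = -a0*b0 + avec.bvec (mostly-plus)
    return -a[0]*b[0] + sum(x*y for x, y in zip(a[1:], b[1:]))
def minus_sq(v):                     # -(v)^2, e.g. s = minus_sq(p + p')
    return -dot(v, v)

E, m_mu, th = 1.0, 0.1057, 0.7       # beam energy, muon mass, angle (illustrative)
km = math.sqrt(E*E - m_mu*m_mu)      # |k| of the outgoing muons
p  = [E, 0, 0,  E]                   # massless electron
pp = [E, 0, 0, -E]                   # massless positron
k  = [E,  km*math.sin(th), 0,  km*math.cos(th)]
kp = [E, -km*math.sin(th), 0, -km*math.cos(th)]

s = minus_sq([a + b for a, b in zip(p, pp)])
t = minus_sq([a - b for a, b in zip(k, p)])
u = minus_sq([a - b for a, b in zip(kp, p)])

assert abs(s - 4*E*E) < 1e-12              # s = E_CM^2
assert abs(s + t + u - 2*m_mu**2) < 1e-12  # sum m_i^2 = 2 m_mu^2 here
```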


Figure 79: In 2 to 2 scattering, we can have the s-channel, the t-channel and the u-channel,
depending on whether the physical pole of the S-matrix is in s, t or u, corresponding to
effective Feynman diagrams shown here.

In the case of 2 → 2 scattering and a single virtual particle being exchanged between
the two, we talk of ”channels”. Consider a particle-antiparticle pair of momenta p, p′ annihi-
lating, and creating a particle ϕ which then creates a different particle-antiparticle pair.
Since the momentum on φ is then p + p', the diagram has a propagator for φ with a 1/[(p + p')² + m_φ²] = −1/[s − m_φ²] factor, meaning

M ∝ 1/(s − m_φ²)     (25.31)

i.e. it has a pole in s. We thus refer to this as the ”s-channel”, see Fig.79 for details. Consider
now the crossed diagram with the particle with momentum p going into the particle with
momentum k, and the ϕ particle being exchanged with the other particle. In this case, the
propagator for φ has a factor of 1/[(p − k)² + m_φ²] = −1/[t − m_φ²], so now

M ∝ 1/(t − m_φ²)     (25.32)

i.e. it has a pole in t. We refer to this as the ”t-channel”. Finally, consider a different crossing,
where the particle of momentum p goes into the particle with momentum k ′ instead, and ϕ
exchanged with the other particle. Then the φ propagator gives a factor of 1/[(p − k')² + m_φ²] = −1/[u − m_φ²], and therefore

M ∝ 1/(u − m_φ²)     (25.33)

i.e. it has a pole in u. We call this the ”u-channel”.
We now define a relativistically invariant differential cross section for elastic scattering (the same particles are in the initial and final states). We have defined dσ/dΩ before, even a relativistically invariant formula that becomes the usual dσ/dΩ in the center of mass frame. But now we can define a formula only in terms of the independent relativistic invariants s and t: we can write a dσ/dt. Consider t = −(k − p)², and since p = (E, p⃗_CM) and k = (E, k⃗_CM) (if p and k correspond to the same particle, it has the same initial and final energy),

t = −(p⃗_CM − k⃗_CM)² = −(p²_CM + k²_CM − 2 k_CM p_CM cos θ)     (25.34)
That means that

dt/d(cos θ) = 2|k_CM||p_CM|     (25.35)
Consider then that the solid angle element defined by a fixed θ would be dφ d(cos θ), and integrating over the angle φ at fixed θ between the initial and final momenta, we have 2π d(cos θ), therefore

dt/dΩ = dt/(2π d(cos θ)) = |k_CM||p_CM|/π     (25.36)
We already defined the relativistically invariant formula

(dσ/dΩ)_{rel.inv.} = (1/√((p₁·p₂)² − m₁²m₂²)) (|k⃗_CM|/(64π² E_CM)) |M|²     (25.37)

where p₁ → p, p₂ → p', and since

s = −(p₁ + p₂)² = −p₁² − p₂² − 2p₁·p₂ = m₁² + m₂² − 2p₁·p₂
⇒ (p₁·p₂)² − m₁²m₂² = (1/4)(s − m₁² − m₂²)² − m₁²m₂²     (25.38)
On the other hand, in the center of mass frame,

(p₁·p₂)² − m₁²m₂² = (E₁E₂ + p²_CM)² − m₁²m₂² = 2p⁴_CM + p²_CM(m₁² + m₂²) + 2E₁E₂ p²_CM
= p²_CM(2p²_CM + 2E₁E₂ + m₁² + m₂²) = p²_CM s     (25.39)

where we have used s = −(p₁ + p₂)² = m₁² + m₂² − 2p₁·p₂ = m₁² + m₂² + 2E₁E₂ + 2p⃗²_CM.
Then we have

dσ/dt = (π/(|k_CM||p_CM|)) (dσ/dΩ)_{rel.inv.} = (1/(64π|p_CM|√s)) (1/√((p₁·p₂)² − m₁²m₂²)) |M|²
= |M|²/(64π[(p₁·p₂)² − m₁²m₂²])     (25.40)

We can write for the total cross section

σ_tot = ∫ dt (dσ/dt)(s, t) = σ_tot(s)     (25.41)

Note that t is the momentum transfer, and as such it can in principle be anything, though of
course we expect that for very large t (in which case at least one of the outgoing 3-momenta,
and therefore also its energy, will be very large) the amplitude for this process would be very
small or zero.
In the case of equal masses, m₁ = m₂ = m, (25.38) gives s(s − 4m²)/4 and therefore finally

dσ/dt = |M(s, t)|²/(16π s(s − 4m²))     (25.42)
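The equal-mass reduction of the flux factor quoted here is easy to verify; a minimal sketch with illustrative values:

```python
# Check that the flux factor (25.38) reduces to s(s - 4m^2)/4 for m1 = m2 = m,
# as used in (25.42).
def flux_sq(s, m1, m2):
    # (p1.p2)^2 - m1^2 m2^2, with p1.p2 = (m1^2 + m2^2 - s)/2 from (25.38)
    p1p2 = (m1*m1 + m2*m2 - s)/2
    return p1p2*p1p2 - (m1*m2)**2

s, m = 7.3, 0.8   # illustrative values
assert abs(flux_sq(s, m, m) - s*(s - 4*m*m)/4) < 1e-12
```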

Important concepts to remember

• In order to calculate polarized cross section, i.e. for given helicities, we can introduce
projectors onto helicities PL , PR for free, since PL ψL = ψL , PR ψR = ψR , and then sum
over spins since the other spins have zero contributions, because of PL ψR = PR ψL = 0.

• The crossed diagrams are diagrams where we rotate the direction of time, and reinterpret the diagram by changing particles into antiparticles.

• By changing the particle momenta into minus antiparticle momenta when crossing, we
obtain the same value for the amplitude, hence crossing symmetry means M(ϕ(p) +
... → ...) = M(... → ...ϕ̄(k)), with k = −p.

• If we keep the functional form of the amplitude fixed, crossing symmetry corresponds
in Mandelstam variables to exchanging s with t and keeping u fixed. If we obtain the
same amplitude after the exchange, we say that the amplitude is crossing symmetric.

• In 2 → 2 scattering, there are only two independent Lorentz invariants, s and t. We have s + t + u = Σ_{i=1}^4 m_i².

• In diagrams with only an intermediate particle φ, we talk of the s-channel, t-channel and u-channel for the diagrams which contain poles in s, t and u respectively, due to the φ propagator, whose −p² equals s, t or u.

• In 2 → 2 scattering, we can define the relativistically invariant differential cross section dσ/dt(s, t), which integrates to σ_tot(s).

Further reading: See chapters 5.2, 5.4 in [2] and 8.1 in [1].

Exercises, Lecture 25

1) Check that

dσ/dΩ (e⁻_L e⁺_R → µ⁻_R µ⁺_L) = (α²/(4E²_CM))(1 − cos θ)²     (25.43)
as in the text (check all the details).

2) Consider the amplitude

Γ(−α(s))Γ(−α(t))
M(s, t) = (25.44)
Γ(−α(s) − α(t))

(Veneziano), where α(x) = a + bx (a, b > 0). Show that it is crossing symmetric. Write
it as an (infinite) sum of s-channel contributions (processes), and equivalently as a sum of
t-channel contributions (processes).

26 Lecture 26. (Unpolarized) Compton scattering
In this lecture we will study the most (calculationally) complicated process so far, the last example of applications of the quantum field theory formalism to QED, namely Compton scattering, e⁻γ → e⁻γ. After that, in the last 3 lectures, we will preview the treatment of divergences in the next semester, by showing what the divergences are and how we regularize them. Also, in this lecture we will have a new ingredient compared with the previous lectures, the sum over photon polarizations. It is also the first example we calculate in which we have a sum of diagrams. The scattering we will calculate is the unpolarized one, where we can't measure the spin of the particles.
We have an electron of momentum p scattering with a photon of momentum k, resulting
in a final electron of momentum p′ and a final photon of momentum k ′ . There are 2 possible
Feynman diagrams, with the photon being absorbed by the electron line before the final
photon is emitted from the electron line (or we can write it as the intermediate electron line
being vertical in the diagrams), or first the final photon is emitted, and then the incoming
photon is absorbed (or we can write it as the intermediate electron line being horizontal
in diagram), see Fig.80. In the first diagram the intermediate electron line has momentum
p + k, in the second one it has momentum p − k ′ .


Figure 80: Feynman diagrams for Compton scattering, shown in two ways: above as scat-
terings off an electron line, and below as s-channel and t-channel diagrams.

Using the Feynman rules, we can write the expression for the amplitude as (note that
the two fermion lines are identical, so there is no relative minus sign between the diagrams)

iM = ū(p')(+eγ^µ)ϵ*_µ(k') (−(p̸ + k̸ + im)/((p + k)² + m²)) (+eγ^ν)ϵ_ν(k) u(p)
+ ū(p')(+eγ^ν)ϵ_ν(k) (−(p̸ − k̸' + im)/((p − k')² + m²)) (+eγ^µ)ϵ*_µ(k') u(p)
= −e² ϵ*_µ(k')ϵ_ν(k) ū(p')[ γ^µ(p̸ + k̸ + im)γ^ν/((p + k)² + m²) + γ^ν(p̸ − k̸' + im)γ^µ/((p − k')² + m²) ] u(p)     (26.1)

But note that (p+k)2 +m2 = p2 +k 2 +2p·k+m2 = 2p·k (since k 2 = 0 for the external photon,
and p2 +m2 = 0 for the external electron), and similarly (p−k ′ )2 +m2 = p2 +k ′2 −2p·k ′ +m2 =
−2p · k ′ .
We now also use the fact that u(p) and v(p) satisfy the Dirac equation, i.e. (p̸ − im)u(p) = 0 and (p̸ + im)v(p) = 0. Then we have

(p̸ + im)γ^ν u(p) = (−γ^ν p̸ + 2p^ν + imγ^ν)u(p) = 2p^ν u(p) − γ^ν(p̸ − im)u(p) = 2p^ν u(p)     (26.2)

where we have used γ µ γ ν = −γ ν γ µ + 2g µν and the Dirac equation above. Substituting this
formula in the square bracket in M, we find
iM = −e² ϵ*_µ(k')ϵ_ν(k) ū(p')[ (γ^µ k̸ γ^ν + 2γ^µ p^ν)/(2p·k) + (−γ^ν k̸' γ^µ + 2γ^ν p^µ)/(−2p·k') ] u(p)     (26.3)
As we saw last lecture,

(ū γ^µ ũ)* = −\bar{ũ} γ^µ u     (26.4)

due to the fact that for a number, complex conjugation is the same as the adjoint, and then we can write the transposed matrices in the opposite order, and use (γ^µ)† iγ⁰ = −iγ⁰ γ^µ (where ū = u† iγ⁰). Here u, ũ can be either u or v. Then we can generalize this to

(ū γ^{µ₁} ... γ^{µₙ} ũ)* = (−1)ⁿ \bar{ũ} γ^{µₙ} ... γ^{µ₁} u     (26.5)

We then find

−iM* = +e² ϵ_ρ(k')ϵ*_σ(k) ū(p)[ (γ^σ k̸ γ^ρ + 2γ^ρ p^σ)/(2p·k) + (−γ^ρ k̸' γ^σ + 2γ^σ p^ρ)/(−2p·k') ] u(p')     (26.6)

Photon polarization sums
We note that if we sum over polarizations, in |M|² we have Σ_pol. ϵ*_µ(k)ϵ_ν(k). We then have the theorem that in |M|², such sums over polarizations can be done with the replacement

Σ_pol. ϵ*_µ(k)ϵ_ν(k) → g_{µν}     (26.7)

More precisely, if

iM = ϵ*_µ(k)M^µ(k)     (26.8)

then

Σ_pol. MM* = Σ_pol. ϵ*_µ(k)ϵ_ν(k) M^µ(k)M*^ν(k) = g_{µν} M^µ(k)M*^ν(k)     (26.9)

Indeed, the Ward identities imply that k_µ M^µ(k) = 0: we have seen from the Ward-Takahashi identities that the same holds for the 1PI functions, k^µ Π_{µν}(k) = 0 and k^{µ₁} Γ_{µ₁...µₙ} = 0. The S-matrix corresponds to a connected, amputated n-point function, contracted with external lines for the states, e.g. for the photons with ϵ^µ(k). As it is reasonable to assume that the connected and amputated n-point functions have the same property, we should have k_µ M^µ = 0. Another approximate way to see this is that M^µ is the Fourier transform of an electromagnetic current between some initial and final state,

M^µ(k) = ∫ d⁴x e^{ik·x} <f| j^µ(x) |i>     (26.10)

where j^µ = ψ̄γ^µψ. The reason is that external photons are created by the interaction term e∫d⁴x j^µ A_µ. We can check this fact for our amplitude, where each half of a diagram (the initial and final halves) has this property.
Then, consider an on-shell photon momentum (k 2 = 0), for instance k µ = (k, 0, 0, k)
(moving in the 3 direction). The two physical polarizations are then transverse to the
momentum, and of unit norm, i.e. ϵµ(1) = (0, 1, 0, 0) and ϵµ(2) = (0, 0, 1, 0). Then kµ Mµ (k) = 0
is written as
−kM0 (k) + kM3 (k) = 0 ⇒ M3 (k) = M0 (k) (26.11)
Thus finally the sum over photon polarizations gives

Σ_pol. ϵ*_µ(k)ϵ_ν(k) M^µ(k)M*^ν(k) = |M¹|² + |M²|² = −|M⁰|² + |M¹|² + |M²|² + |M³|²
= g_{µν} M^µ(k)M*^ν(k)     (26.12)

q.e.d.
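The argument above can be illustrated numerically: for any "amplitude" M^µ satisfying the Ward identity with k along the 3 direction, the physical polarization sum equals the g_{µν} contraction of (26.7). A minimal sketch (the random amplitude is purely illustrative):

```python
import numpy as np

# Check the replacement (26.7) for k = (k,0,0,k): if k_mu M^mu = 0
# (i.e. M^3 = M^0, eq. (26.11)), then the sum over the two physical
# transverse polarizations equals g_{mu nu} M^mu M*^nu.
g = np.diag([-1., 1., 1., 1.])            # mostly-plus metric of the text
rng = np.random.default_rng(0)
M = rng.normal(size=4) + 1j*rng.normal(size=4)   # arbitrary complex M^mu
M[3] = M[0]                               # impose the Ward identity

eps1 = np.array([0., 1., 0., 0.])         # transverse, unit-norm polarizations
eps2 = np.array([0., 0., 1., 0.])
phys = sum(abs(e @ M)**2 for e in (eps1, eps2))
full = np.einsum('mn,m,n->', g, M, M.conj())
assert abs(full.imag) < 1e-12
assert abs(phys - full.real) < 1e-12
```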
We now go back to the Compton scattering calculation, and find for the unpolarized |M|² (doing also the sum over electron spins in the usual manner)

(1/4)Σ_spins |M|² = −(e⁴/4) g_{µρ} g_{νσ} Tr[ (−ip̸' + m) ( (γ^µ k̸ γ^ν + 2γ^µ p^ν)/(2p·k) + (−γ^ν k̸' γ^µ + 2γ^ν p^µ)/(−2p·k') ) (−ip̸ + m) ( (γ^σ k̸ γ^ρ + 2γ^ρ p^σ)/(2p·k) + (−γ^ρ k̸' γ^σ + 2γ^σ p^ρ)/(−2p·k') ) ]
≡ −(e⁴/4)[ I/(2p·k)² + II/((2p·k)(2p·k')) + III/((2p·k')(2p·k)) + IV/(2p·k')² ]     (26.13)
But we note that II = III, and that IV = I(k ↔ −k'), so we only need to calculate I and II.
We have

I = Tr [(−ip/′ + m)(γ µ k/γ ν + 2γ µ pν )(−ip/ + m)(γν k/γµ + 2γµ pν )] (26.14)

and we can split it into 8 nonzero traces. Indeed, there are 2⁴ = 16 terms, but as we already saw, only the trace of an even number of gammas is nonzero, so half of the terms are zero. We calculate each of the 8 traces separately.
First, the longest trace is

I₁ = (−i)² Tr[p̸' γ^µ k̸ γ^ν p̸ γ_ν k̸ γ_µ] = − Tr[p̸' γ^µ k̸ (−2p̸) k̸ γ_µ] = −4 Tr[p̸' k̸ p̸ k̸]
= −4 Tr[p̸' k̸ (−k̸ p̸ + 2p·k)] = −8 p·k Tr[p̸' k̸] = −32 (p·k)(p'·k)     (26.15)

Here we have used the fact that γ^µ γ^ν γ_µ = −2γ^ν, thus γ^ν p̸ γ_ν = −2p̸; that the trace is cyclically symmetric; that γ^µ γ^ν = −γ^ν γ^µ + 2g^{µν}, thus p̸ k̸ = −k̸ p̸ + 2p·k, and k̸ k̸ = k² = 0 (the photon is massless); and finally Tr[γ^µ γ^ν] = 4g^{µν}, thus Tr[p̸' k̸] = 4p'·k.

I2 = (−i)2 Tr[p/′ γ µ k/2pν γν /pγµ ] = −2 Tr[γµ /p′ γ µ k/(−m2 )] = −4m2 Tr[p/′ k/] = −16m2 p′ · k
I3 = (−i)2 Tr[p/′ γ µ 2pν γν /pk/γµ ] = −2 Tr[γµ /p′ γ µ (−m2 )k/] = −4m2 Tr[p/′ k/] = −16m2 p′ · k
I4 = (−i)2 (2pν 2pν ) Tr[p/′ γ µ /pγµ ] = +4m2 Tr[p/(−2p /)] = −32m2 p · p′ (26.16)

Here we have used also p2 = −m2 for the external electron, besides the others, already used.
These are all the terms with the (−ip/′ ) and (−ip/) factors. There are also terms with (−ip
/′ )m
and with m(−ip/), but these are zero, since they have an odd number of gammas in the trace.
Therefore we only have the 4 terms with m2 left, which give

I₅ = m² Tr[γ^µ k̸ γ^ν γ_ν k̸ γ_µ] = 4m² Tr[γ^µ γ_µ k̸ k̸] = 0
I₆ = 2m² Tr[γ^µ k̸ p^ν γ_ν γ_µ] = 8m² Tr[k̸ p̸] = 32m² k·p
I₇ = 2m² Tr[γ^µ p̸ k̸ γ_µ] = 8m² Tr[p̸ k̸] = 32m² p·k
I₈ = 4m² p^ν p_ν Tr[γ^µ γ_µ] = −64m⁴     (26.17)

Then for the term I we have

I = Σ_{i=1}^{8} I_i = −32(p·k)(p'·k) − 32m² p'·k − 32m² p·p' + 64m² p·k − 64m⁴
= −32[(p·k)(p'·k) + m² p'·k + m² p·p' − 2m² p·k + 2m⁴]     (26.18)
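The trace I of (26.18), and its Mandelstam form given below in (26.22), can be checked by brute-force numerical evaluation. A minimal sketch in the conventions of the text ({γ^µ, γ^ν} = 2g^{µν}, g = diag(−1,1,1,1), p² = −m²); the gamma representation, the mass and the kinematic values are our illustrative choices:

```python
import numpy as np

# Gammas: i times the standard Weyl matrices (a representation choice),
# giving the mostly-plus Clifford algebra of the text.
sig = [np.eye(2, dtype=complex), np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]]), np.array([[1, 0], [0, -1]], complex)]
z = np.zeros((2, 2), complex)
gam = [1j*np.block([[z, sig[0]], [sig[0], z]])] + \
      [1j*np.block([[z, sig[i]], [-sig[i], z]]) for i in (1, 2, 3)]
g = np.diag([-1., 1., 1., 1.])

def slash(p):   # p-slash = g_{mu nu} gamma^mu p^nu
    return sum(g[m, m]*p[m]*gam[m] for m in range(4))
def dot(a, b):
    return -a[0]*b[0] + a[1:] @ b[1:]

m = 0.4                       # illustrative electron mass (not the real value)
w, th = 0.3, 1.1              # CM photon energy and scattering angle
E = np.sqrt(w*w + m*m)
p  = np.array([E, 0, 0, -w]);  k = np.array([w, 0, 0, w])
kp = np.array([w, w*np.sin(th), 0, w*np.cos(th)])
pp = p + k - kp               # momentum conservation; on-shell in this frame

s = m*m - 2*dot(p, k)
u = m*m + 2*dot(kp, p)

one = np.eye(4)
I = 0
for mu in range(4):
    for nu in range(4):
        A = gam[mu] @ slash(k) @ gam[nu] + 2*p[nu]*gam[mu]
        B = (g[nu, nu]*gam[nu]) @ slash(k) @ (g[mu, mu]*gam[mu]) \
            + 2*(g[mu, mu]*gam[mu])*(g[nu, nu]*p[nu])
        I += np.trace((-1j*slash(pp) + m*one) @ A @ (-1j*slash(p) + m*one) @ B)

I_formula = -16*(-0.5*(s - m*m)*(u - m*m) + m*m*(s - m*m) + 2*m**4)
assert abs(I - I_formula) < 1e-9
```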

Similarly, we find (the details are left as an exercise, similar to the I term above)

II = III = −16[−2(k·k')(p·p') + 2(k·p)(p'·p) + m² p'·k − 2(k'·p)(p·p') − m² p'·k' + m² p·p' − m² k·k' + 2m² p·k − 2m² p·k' − m⁴]     (26.19)

We now translate into Mandelstam variables. We have

s = −(p + k)2 = −p2 − k 2 − 2p · k = m2 − 2p · k


= −(p′ + k ′ )2 = −p′2 − k ′2 − 2p′ · k ′ = m2 − 2p′ · k ′
t = −(p − p′ )2 = −p2 − p′2 + 2p · p′ = 2m2 + 2p · p′
= −(k − k ′ )2 = −k 2 − k ′2 + 2k · k ′ = 2k · k ′
u = −(k ′ − p)2 = −k ′2 − p2 + 2k ′ · p = m2 + 2k ′ · p
= −(k − p′ )2 = −k 2 − p′2 + 2k · p′ = m2 + 2k · p′ (26.20)

which allow us to write all the inner products of external momenta in terms of s, t, u. We also have to remember that

s + t + u = Σ_{i=1}^4 m_i² = 2m²     (26.21)

As well, to calculate IV we need to replace k ↔ −k ′ , which we easily see to imply exchanging
s ↔ u.
We then find

I = −32[ ((m² − s)/2)((u − m²)/2) + m²(u − m²)/2 + m²(t − 2m²)/2 − 2m²(m² − s)/2 + 2m⁴ ]
= −32[ −su/4 + m²(3u/4 + t/2 + 5s/4) − (3/4)m⁴ ]
= −32[ −su/4 + (m²/4)(3s + u) + m⁴/4 ]
= −16[ −(1/2)(s − m²)(u − m²) + m²(s − m²) + 2m⁴ ]     (26.22)

where in the next to last line we used (s + t + u)/2 = m².
Then we immediately find the expression for IV, by just switching u ↔ s,

IV = −16[ −(1/2)(s − m²)(u − m²) + m²(u − m²) + 2m⁴ ]     (26.23)
Similarly, after a similar algebra left as an exercise, we find

II = III = 16[ (m²/2)(s − m²) + (m²/2)(u − m²) + 2m⁴ ]     (26.24)
Finally then, we have

(1/4)Σ_spins |M|² = (e⁴/4){ (16/(s − m²)²)[ −(1/2)(s − m²)(u − m²) + m²(s − m²) + 2m⁴ ]
+ (16/(u − m²)²)[ −(1/2)(s − m²)(u − m²) + m²(u − m²) + 2m⁴ ]
+ (2·16/((s − m²)(u − m²)))[ (m²/2)(s − m²) + (m²/2)(u − m²) + 2m⁴ ] }
= 4e⁴{ −(s − m²)/(2(u − m²)) − (u − m²)/(2(s − m²)) + 2m²/(u − m²) + 2m²/(s − m²) + 2m⁴(1/(u − m²) + 1/(s − m²))² }
(26.25)
The relativistically invariant differential cross section is

dσ/dt = (1/(64π[(p₁·p₂)² − m₁²m₂²])) (1/4)Σ_spins |M|²     (26.26)

But in our case, with p₁ = p, p₂ = k, we have

(p₁·p₂)² − m₁²m₂² = ((s − m²)/2)² − 0 = (s − m²)²/4     (26.27)

giving

dσ/dt = (1/(16π(s − m²)²)) (1/4)Σ_spins |M|²
= (4πα²/(s − m²)²){ −(s − m²)/(2(u − m²)) − (u − m²)/(2(s − m²)) + 2m²/(u − m²) + 2m²/(s − m²) + 2m⁴(1/(u − m²) + 1/(s − m²))² }     (26.28)

As we see, this formula is not that simple.


We can also analyze this formula in the lab frame, obtaining the spin-averaged Klein-Nishina formula

dσ/d(cos θ) = (πα²/m²)(ω'/ω)²[ ω'/ω + ω/ω' − sin² θ ]     (26.29)

where ω and ω' are the angular frequencies of the incoming and outgoing photons in the lab frame (where the electron target is fixed), related to each other by

ω'/ω = 1/(1 + (ω/m)(1 − cos θ))     (26.30)

We will not prove the Klein-Nishina formula here.
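The frequency relation (26.30) is equivalent to the familiar Compton-shift form 1/ω' − 1/ω = (1 − cos θ)/m, which is easy to verify numerically. A minimal sketch (photon energy value is illustrative):

```python
import math

m_e = 0.511e6        # electron mass in eV (natural units, hbar = c = 1)
omega = 100e3        # incoming photon energy, 100 keV (illustrative)

def omega_out(theta):
    # eq. (26.30): lab-frame outgoing photon energy
    return omega/(1 + (omega/m_e)*(1 - math.cos(theta)))

# equivalent Compton-shift form: 1/omega' - 1/omega = (1 - cos theta)/m
for theta in (0.0, 0.7, math.pi/2, 2.5, math.pi):
    lhs = 1/omega_out(theta) - 1/omega
    assert abs(lhs - (1 - math.cos(theta))/m_e) < 1e-18
```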

Important concepts to remember

• For sums over photon polarizations, if iM = ϵ*_µ(k)M^µ(k), we replace Σ_pol ϵ*_µ(k)ϵ_ν(k) by g_{µν} in |M|².

• When we have fermion propagators in a Feynman diagram, we can use the fact that u(p) and v(p) satisfy the Dirac equation, (p̸ − im)u(p) = 0 and (p̸ + im)v(p) = 0.

Further reading: See chapters 5.5 in [2] and 8.2 in [1].

Exercises, Lecture 26

1) Show that

Tr[ (−ip̸′ + m)(γ^μ k̸ γ^ν + 2γ^μ p^ν)(−ip̸ + m)(γ_μ k̸′ γ_ν − 2γ_ν p_μ) ]
  = 16 [ (m^2/2)(s − m^2) + (m^2/2)(u − m^2) + 2m^4 ]                  (26.31)

2) Using the formulas in the text, calculate the total relativistically invariant cross section,
σtot (s).

27 Lecture 27. One-loop determinants, vacuum energy
and zeta function regularization
We now start discussing infinities that appear in quantum field theory, with the simplest
example, that we already saw a bit before, namely the infinite zero point energy.
For a harmonic oscillator, the vacuum has a ”zero-point energy” (vacuum energy) E_0 = ħω/2, and we have seen that analogously, for a free bosonic field, for instance a real scalar field, we have a sum over all the harmonic oscillator modes of the field, i.e. (in D − 1 space dimensions)

E_0 = ∫ d^{D−1}k/(2π)^{D−1} (ħω_k/2) (2π)^{D−1} δ^{(D−1)}(0) → E_0 = Σ_{k⃗} ħω_{k⃗}/2 = ∞                  (27.1)

where ∫ d^{D−1}k/(2π)^{D−1} → (1/V) Σ_{k⃗} and δ(p⃗ − k⃗) → V δ_{p⃗k⃗}.
In the free theory, we have removed this infinity using normal ordering. But the issue
remains, is this something physical, that has to do with quantum field theory, or just a pure
artifact of the formalism? The answer is that it does have in fact a physical significance. Not
the infinite result of course, but we can measure changes in it due to changes in geometry.
This is known as the Casimir effect, and has already been experimentally observed, meaning
the existence of the zero-point energy has been experimentally confirmed.

Figure 81: Parallel plates at a distance d, inside a box of length L.

Since the result is not only formally infinite (an infinite sum), but also continuous for
a system of infinite size, we will consider a space that has a finite size (length) L, see
Fig.81 and we will take L → ∞ at the end of the calculation. Also, in order to consider the
geometry dependence, consider two parallel perfectly conducting plates situated at a distance
d. Strictly speaking, this would be for the physical case of fluctuation of the electromagnetic
field, the only long range field other than gravity. The conducting plates would impose
constant potential on them, i.e. A0 = V =const., thus more generally, Dirichlet boundary
conditions for the fluctuating field Aµ . In our simple case then, consider a scalar field with
Dirichlet boundary conditions at the plates, such that the eigenmodes between the plates
have
k_n = πn/d                  (27.2)
One of the plates can be considered to be a boundary of the system without loss of generality,

so the system is separated into d and L − d. (We can consider the more general case where
there is space on both sides, but it will not change the final result).
If the y and z directions are infinite, then the energy per unit area (area in the y, z directions) of fluctuations between the plates is given by

E/A = (ħ/2) Σ_n ∫ dk_y dk_z/(2π)^2 √( (πn/d)^2 + k_y^2 + k_z^2 )                  (27.3)

and we could do this calculation, which is related to the real (experimentally measured) case, but it would be more involved. It is simpler to work directly with a system in 1+1 dimensions, meaning there is no k_y, k_z.
The total vacuum energy of the system is the energy of fluctuations between the plates
(boundary conditions) at distance d and the energy outside the plates. If the space is
considered to be periodic, the boundary at x = L is the same (after the equivalence) as the
first plate at x = 0, and the other boundary of the system is at x = d (the second plate), just
that now the space in between the plates is the ”outside” one, of length L − d. Therefore
the energy is
E = f (d) + f (L − d) (27.4)
where
f(d) = (ħπ/2d) Σ_{n=1}^∞ n                  (27.5)
As we already noted, f (d) is infinite, but let us consider a regularization. This is so, since
at n → ∞, modes have infinite energy, so they clearly need to be suppressed. If for nothing
else, when the energy reaches the Planck energy, we expect that the spacetime will start
getting curved, so this naive notion of perturbation modes in flat space cannot be a good
approximation anymore. In any case, it is physically clear that the contribution of modes
of infinite energy should be suppressed. We consider the regulator e−aωn , i.e. the regulated
function is
f̃(d)/ħ = (π/2d) Σ_{n≥1} n e^{−aω_n} = (π/2d) Σ_{n≥1} n e^{−aπn/d} = −(1/2)(∂/∂a) Σ_{n≥1} (e^{−aπ/d})^n
       = −(1/2)(∂/∂a) [1/(1 − e^{−aπ/d})] = (π/2d) e^{−aπ/d}/(1 − e^{−aπ/d})^2                  (27.6)

Now taking a → 0, we obtain

f̃(d)/ħ ≃ d/(2πa^2) − π/(24d) + ...                  (27.7)
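The regulated sum (27.6) and its small-a expansion (27.7) can be checked numerically; a minimal sketch (function names are ours, ħ set to 1):

```python
import math

def f_reg(d, a):
    """Closed form (27.6): f~(d)/hbar = (pi/2d) e^{-a pi/d} / (1 - e^{-a pi/d})^2."""
    x = a * math.pi / d
    return (math.pi / (2 * d)) * math.exp(-x) / (1 - math.exp(-x))**2

def finite_part(d, a):
    """Subtract the divergent d/(2 pi a^2) piece of (27.7); the remainder
    should approach -pi/(24 d) as the regulator a -> 0."""
    return f_reg(d, a) - d / (2 * math.pi * a**2)

d = 1.0
for a in (0.1, 0.01):
    print(a, finite_part(d, a))   # -> approaches -pi/24 ≈ -0.1309
```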
Adding now the contributions of the two pieces (inside and outside the plates) to the Casimir
effect, we obtain
E_0 = f(d) + f(L − d) = ħ[ L/(2πa^2) − π/(24d) ] + O(1/L^2; a)                  (27.8)
and now the infinite term is constant (d-independent), but there is a finite d-dependent term.
In other words, now by varying the geometry, i.e. by varying d, we obtain a force on the
plates,
F = ∂E/∂d = + ħπ/(24d^2) + ...                  (27.9)
We now consider a generalization of this case to an interacting quantum field theory for
a scalar ϕ with potential U (ϕ), with minimum at ϕ = ϕ0 and expand around it. We keep
in the expansion of the action and the associated potential energy only the constant and
quadratic terms (since the linear term is zero by the equations of motion). The potential
energy of the system is then
V[ϕ] ≃ V[ϕ_0] + ∫ d^3x⃗ (1/2) δϕ(x⃗, t) [ −∇⃗^2 + (d^2U/dϕ^2)|_{ϕ=ϕ_0} ] δϕ(x⃗, t) + ...                  (27.10)

We then need to find the eigenvalues (and eigenfunctions) of the operator in square brackets, namely

[ −∇⃗^2 + (d^2U/dϕ^2)|_{ϕ=ϕ_0} ] η_i(x⃗) = ω_i^2 η_i(x⃗)                  (27.11)
The eigenfunctions are orthonormal,

∫ d^3x η_i^*(x⃗) η_j(x⃗) = δ_ij                  (27.12)

We can then expand the fluctuation in the eigenfunctions,

δϕ(x⃗, t) = Σ_i c_i(t) η_i(x⃗)                  (27.13)

Thus we obtain the total energy

E = ∫ d^4x (1/2)(ϕ̇(x⃗, t))^2 + ∫ dt V[ϕ]
  ≃ ∫ dt Σ_i [ (1/2) ċ_i(t)^2 + (1/2) ω_i^2 c_i(t)^2 ]                  (27.14)

which is a sum of harmonic oscillators, thus the energy of the vacuum of the system is

E_0 = V[ϕ_0] + (ħ/2) Σ_i ω_i + ...                  (27.15)

where we ignored corrections, which can be identified as higher loop corrections (whereas
the calculated term is only one-loop, as we will see shortly). Note that this formula has
many subtleties. In particular, for nonzero interactions, there will be renormalization of the
parameters of the interaction (as we will show in the next semester). But at a formal level,
the above formula is correct.


As an example, let's see what happens for the free theory, with U = 0. Then η_i = e^{ik⃗_i·x⃗}, so

(−∇⃗^2) η_i = k⃗_i^2 η_i ≡ ω_i^2 η_i                  (27.16)

therefore

E_0 = (ħ/2) Σ_i √(k⃗_i^2)                  (27.17)

Moreover, for U = m^2ϕ^2/2 (just a mass term),

(−∇⃗^2 + m^2) η_i = (k⃗_i^2 + m^2) η_i ≡ ω_i^2 η_i                  (27.18)

therefore

E_0 = (ħ/2) Σ_i √(k⃗_i^2 + m^2)                  (27.19)
Let us now understand better the regularization that we used and its generality (or the uniqueness of the nontrivial piece). The quantity we want is Σ_{n≥1} n. It is of course infinite, but in mathematics there is a well-defined finite value associated with it by analytical continuation.
Indeed, one can define the Riemann zeta function,

ζ(s) = Σ_{n≥1} 1/n^s                  (27.20)

For real s, it is convergent for s > 1. However, one can define it in the complex plane C (Riemann). In this case, away from the real axis, the function ζ(s) is well-defined close to −1. One can then uniquely define ζ(−1) by analytical continuation as ζ(s → −1).∗ We then have

ζ(−1) = −1/12                  (27.21)

Then we can write directly for the Casimir effect with two plates,

E_0 ≃ ( ħπ/(2d) + ħπ/(2(L − d)) ) Σ_{n≥1} n = −ħπ/(24d)  (for L → ∞)                  (27.22)

the same as we obtained before, but with the infinite constant already removed. Therefore
the zeta function procedure is a good regularization.
Zeta function regularization
We now generalize the above regularization procedure using the zeta function to a general
operator.

∗ We can write an integral representation for the zeta function, by writing an integral representation for each term in the sum, as ζ(s) = (1/Γ(s)) ∫_0^∞ dt t^{s−1} Σ_{n≥1} e^{−nt} = (1/Γ(s)) ∫_0^∞ dt t^{s−1} e^{−t}/(1 − e^{−t}), which form is shown to be well-defined (analytic) over the whole complex plane, with the only pole at s = +1. For s = −2m, with m a positive integer, it has in fact (trivial) zeroes. The Riemann hypothesis, one of the most important unsolved (unproven) problems in mathematics, is that all the nontrivial zeroes of the Riemann zeta function (other than the trivial ones above) lie on the line Re(s) = +1/2.
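As a sanity check of this integral representation, one can evaluate it numerically for s > 1, where the defining sum also converges; a minimal sketch (midpoint rule, our function name), checked against ζ(2) = π^2/6:

```python
import math

def zeta_integral(s, tmax=50.0, n=200000):
    """Integral representation of the zeta function (valid for s > 1):
    zeta(s) = (1/Gamma(s)) * integral_0^inf t^{s-1} e^{-t}/(1 - e^{-t}) dt,
    evaluated with a simple midpoint rule (which avoids the t = 0 endpoint)."""
    h = tmax / n
    total = 0.0
    for i in range(1, n + 1):
        t = (i - 0.5) * h
        total += t**(s - 1) * math.exp(-t) / (1.0 - math.exp(-t))
    return total * h / math.gamma(s)

print(zeta_integral(2.0))   # -> ≈ 1.644934 = pi^2/6
```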

Consider an operator with positive, real and discrete eigenvalues a_1, ..., a_n and corresponding eigenfunctions f_n,

A f_n(x) = a_n f_n(x)                  (27.23)

Then we define

ζ_A(s) = Σ_{n≥1} 1/a_n^s                  (27.24)

Then

(d/ds) ζ_A(s)|_{s=0} = −Σ_n ln a_n e^{−s ln a_n}|_{s→0} = −Σ_n ln a_n                  (27.25)

So finally,

e^{Tr ln A} = det A = Π_n a_n = e^{−ζ′_A(0)}                  (27.26)

As we will shortly see, the object that we want to calculate is det A = eTr ln A , so we can
calculate it if we know ζA (s).
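For a finite spectrum no regularization is actually needed, which makes (27.24)–(27.26) easy to illustrate numerically; a minimal sketch with a made-up three-eigenvalue ”operator”:

```python
import math

# Toy operator with eigenvalues a_n = 1, 2, 3 (a finite spectrum, so the sums
# are trivially convergent); this just illustrates det A = e^{-zeta_A'(0)}.
eigenvalues = [1.0, 2.0, 3.0]

def zeta_A(s):
    """Generalized zeta function (27.24) for the toy spectrum."""
    return sum(a**(-s) for a in eigenvalues)

# Numerical derivative of zeta_A at s = 0 (central difference), as in (27.25)
h = 1e-6
zeta_prime_0 = (zeta_A(h) - zeta_A(-h)) / (2 * h)

det_A = math.exp(-zeta_prime_0)   # (27.26): det A = e^{-zeta_A'(0)}
print(det_A)                      # -> approximately 6.0 = 1*2*3
```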
Heat kernel regularization
Sometimes we can't calculate directly det A or ζ_A(s), but it may be easier to calculate another object, called the associated ”heat kernel”, defined as

G(x, y; τ) ≡ Σ_n e^{−a_n τ} f_n(x) f_n^*(y)                  (27.27)

It satisfies the generalized heat equation

A_x G(x, y; τ) = −(∂/∂τ) G(x, y; τ)                  (27.28)

as we can easily check. We can also easily check that, since the f_n(x) are orthonormal, the heat kernel satisfies the boundary condition

G(x, y; τ = 0) = δ(x − y)                  (27.29)

Sometimes it is easier to solve the generalized heat equation with the above boundary con-
dition, and then we can find ζA (s) from it as
ζ_A(s) = (1/Γ(s)) ∫_0^∞ dτ τ^{s−1} ∫ d^4x G(x, x; τ)                  (27.30)

since ∫ d^4x |f_n(x)|^2 = 1.
If we find ζA (s), then we can calculate det A, as we saw.
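A simple case where everything is explicit is A = −d^2/dx^2 on [0, π] with Dirichlet boundary conditions (a_n = n^2, f_n = √(2/π) sin nx); the sketch below builds the heat kernel from this data and checks the generalized heat equation by finite differences (the setup is our illustration, not from the text):

```python
import math

def G(x, y, tau, nmax=400):
    """Heat kernel (27.27) for A = -d^2/dx^2 on [0, pi] with Dirichlet b.c.;
    eigenvalues a_n = n^2, orthonormal eigenfunctions f_n = sqrt(2/pi) sin(n x)."""
    return sum(math.exp(-n * n * tau) * (2.0 / math.pi) * math.sin(n * x) * math.sin(n * y)
               for n in range(1, nmax + 1))

# Check the generalized heat equation A_x G = -dG/dtau at a sample point,
# approximating both sides by finite differences.
x, y, tau, h = 1.0, 0.7, 0.1, 1e-3
lhs = -(G(x + h, y, tau) - 2 * G(x, y, tau) + G(x - h, y, tau)) / h**2   # A_x G = -d^2 G/dx^2
rhs = -(G(x, y, tau + h) - G(x, y, tau - h)) / (2 * h)                   # -dG/dtau
print(lhs, rhs)   # the two sides agree to finite-difference accuracy
```

The Dirichlet boundary condition G(0, y; τ) = 0 is also manifest, since every eigenfunction vanishes at the endpoints.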
Saddle point evaluation
We now see how to calculate the sum over frequencies from the path integral formulation. For that, we need to make a quadratic approximation around a minimum (sometimes it is only a local minimum, not a global one) of the action.
This is based on the idea of saddle point evaluation of an integral. If we write the integrand as an exponential of minus something that has a minimum,

I ≡ ∫ dx e^{−a(x)}                  (27.31)

with

a(x) ≃ a(x_0) + (1/2)(x − x_0)^2 a″(x_0)                  (27.32)

with a″(x_0) > 0, then we can approximate the integral as a gaussian around the minimum, i.e. as

I ≃ e^{−a(x_0)} ∫ dx e^{−(1/2)(x − x_0)^2 a″(x_0)} = e^{−a(x_0)} √(2π/a″(x_0))                  (27.33)
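A classic check of the saddle point estimate is Stirling's formula, obtained by applying it to Γ(n + 1) = ∫_0^∞ e^{−t + n ln t} dt, i.e. a(t) = t − n ln t with minimum at t_0 = n and a″(t_0) = 1/n; a minimal sketch (our function name):

```python
import math

def laplace_approx(a, a2, x0):
    """Saddle point (Laplace) estimate (27.33): I ≈ e^{-a(x0)} sqrt(2 pi / a''(x0))."""
    return math.exp(-a(x0)) * math.sqrt(2 * math.pi / a2(x0))

# Gamma(n+1) = integral_0^inf e^{-t + n ln t} dt, so a(t) = t - n*ln(t),
# a'(t) = 1 - n/t vanishes at t0 = n, and a''(t0) = n/t0^2 = 1/n.
n = 20.0
estimate = laplace_approx(lambda t: t - n * math.log(t),
                          lambda t: n / t**2, n)
exact = math.gamma(n + 1)          # = 20!
print(estimate / exact)            # -> close to 1 (Stirling error ~ 1/(12 n))
```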
Path integral formulation
Consider the case of field theory in Euclidean space, with partition function

Z[J] = N ∫ Dϕ e^{−S[ϕ]+J·ϕ} ≡ N ∫ Dϕ e^{−S_E[ϕ,J]}                  (27.34)

We can again make a quadratic approximation around a minimum,

S_E[ϕ, J] = S_E[ϕ_0, J] + (1/2) ⟨ (ϕ − ϕ_0)_1 (δ^2S_E/δϕ_1δϕ_2) (ϕ − ϕ_0)_2 ⟩_{1,2} + ...                  (27.35)

where ϕ_1 = (ϕ − ϕ_0)_1 and ϕ_2 = (ϕ − ϕ_0)_2, with 1, 2 corresponding to different variables of integration (x, y), and ⟨⟩_{1,2} to integrating over them.
Then the saddle point evaluation of the partition function gives

Z[J] ≃ N′ e^{−S_E[ϕ_0]+ϕ_0·J} ∫ Dϕ exp{ −(1/2) ⟨ ϕ_1 (δ^2S_E/δϕ_1δϕ_2) ϕ_2 ⟩_{1,2} }
     = N″ e^{−S_E[ϕ_0]+ϕ_0·J} det[ (−∂_μ∂^μ + U″(ϕ_0)) δ_{1,2} ]^{−1/2}                  (27.36)

Since det A = eTr ln A , we have
−W[J] = ln Z[J] = −S[ϕ_0] + ϕ_0·J − (1/2) Tr ln[ (−∂_μ∂^μ + U″(ϕ_0)) δ_{1,2} ]                  (27.37)

and as we saw, we can calculate

−(1/2) Tr ln A = (1/2) (dζ_A/ds)|_{s=0}                  (27.38)

To make contact with what we had before, we continue to Minkowski space, where the
eigenvalue-eigenfunction problem is
( −∂^2/∂t^2 + ∇⃗^2 − U″(ϕ_0) ) ϕ_k(x⃗, t) = a_k ϕ_k(x⃗, t)                  (27.39)

We separate variables, writing

ϕ_k(x⃗, t) = η_r(x⃗) f_n(t)                  (27.40)

(i.e. k = (rn)), with

(−∇⃗^2 + U″(ϕ_0)) η_r(x⃗) = ω_r^2 η_r                  (27.41)

Then we have

( −∂^2/∂t^2 − ω_r^2 ) sin(nπt/T) = ( n^2π^2/T^2 − ω_r^2 ) sin(nπt/T)                  (27.42)

So that finally

det[ ∂_μ∂^μ − U″(ϕ_0) ]^{−1/2} = Π_{n≥1} ( n^2π^2/T^2 − ω_r^2 )^{−1/2}                  (27.43)

One can then calculate the partition function, and obtain

Z = Σ_n e^{−iE_nT/ħ}                  (27.44)

and as before
E_0 = (ħ/2) Σ_r ω_r                  (27.45)
The first observation is that E0 ∝ ~, and it is obtained by quantum fluctuations around
the classical minimum of the action, so it is the first quantum correction, i.e. one-loop.
It is one-loop since in the quadratic (free) approximation we used, calculating Z_0[J], the only Feynman diagrams present are vacuum bubbles, i.e. circles, which form one loop.
Moreover, we have seen that the result is given by determinants of the operator acting on
fluctuations. Hence one refers to these quantum corrections as one-loop determinants.
Fermions
In the case of fermions, as we know, gaussian integration gives the determinant in the
numerator instead of the denominator, i.e.
∫ d^n x e^{x^T A x} = 2^{n/2} √(det A) = N e^{+(1/2) Tr ln A}                  (27.46)

Moreover, in the operator formalism we saw that the energy is
H = Σ_i (ħω_i/2)(b_i^† b_i − b_i b_i^†) = Σ_i ħω_i ( N_i − 1/2 )                  (27.47)

so in both ways we obtain
E_0 = −(ħ/2) Σ_i ω_i                  (27.48)
which are opposite in sign to the bosonic corrections. In supersymmetric theories, the bosonic
and fermionic zero-point energies cancel each other.

Important concepts to remember

• We can measure changes in the infinite zero point energy of a field due to geometry,
but not the infinite constant piece. This is known as the Casimir effect.

• We obtain the relevant finite piece by using a regularization, either by introducing an e^{−aπn/d} factor with a → 0, or better, by using the finite number associated to Σ_n n by the zeta function, ζ(−1) = −1/12.

• We generalize to an arbitrary field theory by calculating the eigenvalue-eigenfunction problem for the potential energy operator, and summing over ħω_i/2.

• In the path integral formulation, the saddle point evaluation gives a determinant of
fluctuations around the minimum of the action. It leads to the same form of the
zero-point energy. This corresponds to one-loop corrections, namely vacuum bubbles.
Higher loop corrections appear from higher order terms in the action (neglected here).

• One can define zeta function regularization by generalizing the harmonic oscillator case via ζ_A(s) = Σ_{n≥1} 1/(a_n)^s.

• The determinant of an operator is then det A = e^{−ζ′_A(0)}.

• We can define the heat kernel associated with an operator, that satisfies the generalized
heat equation, with a boundary condition G(x, y; τ = 0) = δ(x − y), which sometimes
is easier to solve. Then we can find ζA (s) from the heat kernel.

• Fermions give (det A)+1/2 instead of (det A)−1/2 , and a negative sign in front of the
zero point energy, so they can in principle cancel against the bosonic corrections.

Further reading: See chapters 3.4, 3.5 in [3].

Exercises, Lecture 27

1) Consider two scalar fields with potential (λ1 , λ2 > 0)

U (ϕ, χ) = λ1 (m1 ϕ − χ2 )2 + λ2 (m2 χ − ϕ2 )2 (27.49)

Write down the infinite sum for the zero-point energy in 1+1 dimensions for this model (no
need to calculate the sum) around the vacuum at ϕ = χ = 0.

2) Write down an expression for the zeta function regularization for the above sum, and
for the heat kernel.

28 Lecture 28. One-loop divergences for scalars; power
counting
In this lecture we will analyze possible divergences in loop integrals, in particular at one loop, and how to determine whether a theory contains divergences (using power counting). In the final lecture we will see how to regularize these divergences in order to calculate something finite.

Figure 82: One-loop divergence in ϕ4 theory for the 2-point function.

Up to now, we have only calculated explicitly tree processes, which are finite, and we
have ignored the fact that loop integrals can be divergent. For example, in λϕ4 theory in
Euclidean space, consider the unique one-loop O(λ) diagram, a loop connected to the free
line (propagator) at a point, see Fig.82. It is
−λ ∫ d^D q/(2π)^D 1/(q^2 + m^2)                  (28.1)

Since the integral is

≃ −λ (Ω_{D−1}/(2π)^D) ∫ q^{D−1} dq/(q^2 + m^2) ∼ ∫ dq q^{D−3}                  (28.2)

it is divergent in D ≥ 2 and convergent only for D < 2. In particular, in D = 4 it is quadratically divergent,

∼ ∫^Λ q dq ∼ Λ^2                  (28.3)

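The quadratic growth in (28.3) can be seen numerically by doing the radial integral of (28.1) with an explicit cutoff Λ; a minimal sketch (midpoint rule, our function name):

```python
import math

def loop_radial(Lam, D=4, m=1.0, n=100000):
    """Radial part of the one-loop integral: int_0^Lam dq q^{D-1}/(q^2 + m^2),
    evaluated with a simple midpoint rule."""
    h = Lam / n
    total = 0.0
    for i in range(1, n + 1):
        q = (i - 0.5) * h
        total += q**(D - 1) / (q**2 + m**2)
    return total * h

# D = 4: quadratic UV divergence, I(Lam) ≈ Lam^2/2 - (m^2/2) ln(1 + Lam^2/m^2)
for Lam in (10.0, 100.0):
    print(Lam, loop_radial(Lam) / (Lam**2 / 2))   # ratio -> 1 as Lam grows
```

For D = 1 (below the D < 2 convergence bound) the same integral saturates at the finite value arctan(Λ/m) → π/2.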
We call this kind of divergence ”ultraviolet”, or UV divergence, from the fact that it is at
large energies (4-momenta), or large frequencies.
Note also that we had one more type of divergence for loop integrals that was easily dealt
with, the fact that when integrating over loop momenta in Minkowski space, the propagators
can go on-shell, leading to a pole, which needed to be regulated. But the iϵ prescription
dealt with that. Otherwise, we can work in Euclidean space and then analytically continue
to Minkowski space at the end of the calculation. This issue did not appear until now,
because we only calculated tree level processes, when the propagators have fixed momenta,
and are not on-shell.
Let us now consider a one-loop diagram for the 2k-point function in λϕk+2 theory with
k momenta in, then two propagators forming a loop, and then k lines out, as in Fig.83. The


Figure 83: One-loop diagram in ϕk+2 theory, for the 2k-point function.


incoming momenta are called p_1, ..., p_k and sum to p = Σ_{i=1}^k p_i. Then the two propagators have momenta q (loop variable) and q − p, giving

(λ^2/2) ∫ d^D q/(2π)^D 1/((q^2 + m^2)((q − p)^2 + m^2))                  (28.4)
D 2 2

Again, we can see that at large q, it behaves as


∫ q^{D−1} dq / q^4                  (28.5)
so is convergent only for D < 4. In particular, in D = 4 it is (log) divergent, and this again
is a UV divergence. From this example we can see that various diagrams are divergent in
various dimensions.
But this diagram also has another type of divergence, namely at low q (q → 0). This
divergence appears only if we have m2 = 0 AND p2 = 0. Thus only if we have massless
particles, and all the particles that are incoming on the same vertex sum up to something
on-shell (in general, the sum of on-shell momenta is not on-shell). Then the integral is
∼ ∫ dΩ ∫ q^3 dq / ( q^2 (q^2 − 2q·p) )                  (28.6)
and in the integral over angles, there will be a point where the unit vector on q, q̂, satisfies
q̂ · p̂ = 0 with respect to the (constant) unit vector on p, p̂. Then we obtain

∫ dq/q                  (28.7)
i.e., log divergent. We call these divergences ”infrared” or IR divergences, since they occur at low energies (low 4-momenta), i.e. low frequencies.
Thus we have two kinds of potential divergences, UV and IR divergences. The UV
divergences are an artifact of perturbation theory, i.e. of the fact that we were forced
to introduce asymptotic states as states of the free theory, and calculate using Feynman

diagrams. As such, they can be removed by redefining the parameters of the theory (like
masses, couplings, etc.), a process known as renormalization, which will be studied next
semester. A nonperturbative definition is not in general available, in particular for scattering
processes it isn’t. But for things like masses and couplings of bound states (like the proton
mass in QCD, for instance), one can define the theory nonperturbatively, for instance on
the lattice, and then we always obtain finite results. The infinities of perturbation theory
manifest themselves only in something called the renormalization group, which will also be
studied next semester.
By contrast, the IR divergences are genuine divergences from the point of view of the
Feynman diagram (can’t be reabsorbed by redefining the parameters). But they arise because
the Feynman diagram we are interested in, in the case of a theory with massless external
states, and with external states that are on-shell at the vertex, are not quantities that can
be experimentally measured. Indeed, for a massless external state (m = 0), of energy E,
experimentally we cannot distinguish between the process with this external state, or with
it and another emitted ”soft and/or collinear particle”, namely one of m = 0 and E ≃ 0
and/or parallel to the first. If we include the tree level process for that second process, and
sum it together with the first (loop level), we obtain a finite differential cross section (which
can be experimentally measured), for a given cut-off in energy and/or angle of resolution
between two particles.
Thus the physical processes are always finite, in spite of the infinities in the Feynman
diagram.
Analytical continuation
A question which one could have already asked is: is Wick rotation of the final result the same as Wick rotation of the integral to Minkowski space, followed by evaluation?
Let us look at the simplest one-loop diagram in Euclidean space (in λϕ4 , already discussed
above),

∫ d^D q/(2π)^D 1/(q^2 + m^2)                  (28.8)

In Minkowski space it becomes

−i ∫ d^D q/(2π)^D 1/(q^2 + m^2 − iϵ) = +i ∫ d^{D−1}q/(2π)^{D−1} ∫ dq_0/(2π) 1/(q_0^2 − q⃗^2 − m^2 + iϵ)                  (28.9)

where now q^2 = −q_0^2 + q⃗^2.


Then the poles are at q̃_0 − iϵ and −q̃_0 + iϵ, where q̃_0 = √(q⃗^2 + m^2). The Minkowski space integration contour is along the real axis in the q_0 plane, in the increasing direction, called C_R. On the other hand, the Euclidean space integration contour C_I is along the imaginary axis, in the increasing direction, see Fig.84. As there are no poles in between C_R and C_I (in the quadrants I and III of the complex plane; the poles are in quadrants II and IV), the integral along C_I is equal to the integral along C_R (since we can close the contour at infinity, with no poles inside). Therefore, along C_I, q_0 = iq_D, with q_D real and increasing, and therefore dq_0 = i dq_D, so

∫_{C_R} dq_0 (...) = ∫_{C_I} dq_0 (...) = (−i) i ∫ d^{D−1}q dq_D/(2π)^D 1/(q⃗^2 + q_D^2 + m^2 − iϵ)                  (28.10)


Figure 84: Wick rotation of the integration contour.

which gives the same result as the Euclidean space integral, after we drop the (now unnec-
essary) −iϵ.
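The equality of the two contours can be checked in closed form for the q_0 integral at fixed q⃗, using the elementary result ∫ dx/(x^2 + b^2) = π/b (valid for Re b > 0, so that no pole sits on the real axis); a minimal sketch with cmath tracking the iϵ:

```python
import cmath
import math

E = 1.3        # E^2 = qvec^2 + m^2, any positive value
eps = 1e-8     # the i*epsilon prescription

# Minkowski side: -i int dq0/(q^2 + m^2 - i eps) = i int dq0/(q0^2 - E^2 + i eps).
# Writing q0^2 - E^2 + i eps = q0^2 + b^2 with b = sqrt(-E^2 + i eps) (Re b > 0),
# the real-line integral is pi/b, so the full expression is i*pi/b.
b = cmath.sqrt(-E**2 + 1j * eps)
minkowski = 1j * math.pi / b

# Euclidean side (after q0 = i qD): int dqD/(qD^2 + E^2) = pi/E
euclidean = math.pi / E
print(minkowski, euclidean)   # agree as eps -> 0
```

Here b ≈ iE + eps/(2E), so the tiny positive real part of b is exactly what keeps the poles off the integration contour.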
However, in general it is not true that we can easily analytically continue. Instead, we
must define the Euclidean space integral and Wick rotate the final result, since in general this
will be seemingly different than the continuation of the Minkowski space integral (rather, it
means that the Wick rotation of the integrals is subtle). But the quantum field theory per-
turbation in Euclidean space is well-defined, unlike the Minkowski space one, as we already
saw, so is a good starting point.
Let’s see an example of this situation. Consider the second integral we analyzed, now in
Minkowski space,

−(λ^2/2) ∫ d^D q/(2π)^D 1/(q^2 + m^2 − iϵ) · 1/((q − p)^2 + m^2 − iϵ)                  (28.11)

We deal with the two different propagators in the loop integral using the Feynman trick.
We will study it in more detail next class, but this time we will just use the result for two
propagators. The Feynman trick for this case is the observation that
∫ 1
1
= dx[xA + (1 − x)B]−2 (28.12)
AB 0

which allows one to turn the two propagators with different momenta into a single propagator
squared. Indeed, now we can write
−(λ^2/2) ∫ d^D q/(2π)^D ∫_0^1 dx [x(q − p)^2 + (1 − x)q^2 + (x + 1 − x)(m^2 − iϵ)]^{−2}                  (28.13)
The square bracket equals q 2 + xp2 − 2xq · p + m2 − iϵ. Changing variables to q ′µ = q µ − xpµ
allows us to get rid of the term linear in q. We can change the integration variable to
q ′ , since the Jacobian for the transformation is 1, and then the square bracket becomes
q ′2 + x(1 − x)p2 + m2 − iϵ. Finally, the integral is
((−iλ)^2/2) ∫ d^D q′/(2π)^D ∫_0^1 dx [q′^2 + x(1 − x)p^2 + m^2 − iϵ]^{−2}                  (28.14)

which has poles at
q̃_0^2 = q⃗^2 + m^2 − iϵ + x(1 − x)p^2                  (28.15)
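The Feynman trick (28.12) is also easy to verify numerically for sample positive A and B; a minimal sketch (midpoint rule, our function name):

```python
def feynman_rhs(A, B, n=200000):
    """Right-hand side of (28.12), int_0^1 dx [x A + (1-x) B]^{-2},
    evaluated with a simple midpoint rule; should equal 1/(A B)."""
    h = 1.0 / n
    total = 0.0
    for i in range(1, n + 1):
        x = (i - 0.5) * h
        total += h / (x * A + (1 - x) * B)**2
    return total

A, B = 2.7, 0.4
print(feynman_rhs(A, B), 1 / (A * B))   # -> both ≈ 0.9259
```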
If p^2 > 0, this is the same as in the previous example, we just redefine m^2: the poles
are outside quadrants I and III, so we can make the Wick rotation of the integral without
problem. However, if p2 < 0 and sufficiently large in absolute value, we can have q02 < 0, so
the poles are now in quadrants I and III, and we cannot simply rotate the contour CR to
the contour CI , since we encounter poles along the way. So in this case, the Wick rotation
is more subtle: apparently, the Minkowski space integral gives a different result from the
Euclidean space result, Wick rotated. However, the latter is better defined, so we can use it.
Power counting
We now want to understand how we can figure out if a diagram, and more generally a
theory, contains UV divergences. We do this by power counting. We consider here scalar
λn ϕn theories.
Consider first just a (Euclidean space) diagram, with L loops, E external lines, I internal lines and V vertices. The loop integral will be

I_D(p_1, ..., p_E; m) = ∫ Π_{α=1}^L d^d q_α/(2π)^d Π_{j=1}^I 1/(q_j^2 + m^2)                  (28.16)

where q_j = q_j(p_i, q_α) are the momenta of the internal lines (which have propagators 1/(q_j^2 + m^2)). More precisely, they are linear combinations of the loop momenta and external momenta,

q_j = Σ_{α=1}^L c_{jα} q_α + Σ_{i=1}^E c_{ji} p_i                  (28.17)

As we already mentioned in Lecture 11, L = I − V + 1, since there are I momentum


variables, constrained by V delta functions (one at each vertex), but one of the delta functions
is the overall (external) momentum conservation.
If we scale the momenta and masses by the same multiplicative factor t, we can also change the integration variables (loop momenta q_α) by the same factor t, getting Π_{α=1}^L d^d q_α → t^{dL} Π_{α=1}^L d^d q_α, as well as q_j → tq_j and q^2 + m^2 → t^2(q^2 + m^2), giving finally

I_D(tp_i; tm) = t^{ω(D)} I_D(p_i; m)                  (28.18)

where

ω(D) = dL − 2I                  (28.19)

is called the superficial degree of divergence of the diagram D, since it is the overall dimension under the scaling above.
Theorem This gives rise to the following theorem: ω(D) < 0 is necessary for the con-
vergence of ID . (Note: but is not sufficient!)
Proof: We have
Π_{i=1}^I (q_i^2 + m^2) ≤ ( Σ_{i=1}^I q_i^2 + m^2 )^I                  (28.20)

Then for large enough q_α, there is a constant C such that

Σ_{i=1}^I (q_i^2 + m^2) = Σ_{i=1}^I [ ( Σ_{α=1}^L c_{iα} q_α + Σ_{j=1}^E c_{ij} p_j )^2 + m^2 ] ≤ C Σ_{α=1}^L q_α^2                  (28.21)

as we can easily see. Then we have

I_D > (1/C^I) ∫_{Σ_α q_α^2 > Λ^2} Π_{α=1}^L d^d q_α/(2π)^d 1/( Σ_α q_α^2 )^I > ∫_{r>Λ} r^{dL−1} dr / r^{2I}                  (28.22)

where we used the fact that Σ_{α=1}^L q_α^2 ≡ Σ_{M=1}^{dL} q_M^2 is a sum of dL terms, which we can consider as a dL-dimensional space, and the condition Σ_α q_α^2 > Λ^2, stated before as q_α being large enough, now becomes the fact that the modulus of the dL-dimensional q_M is bounded from below. We finally see that if ω(D) = dL − 2I ≥ 0, I_D is divergent. The opposite statement is that if I_D is convergent, then ω(D) < 0, i.e. ω(D) < 0 is a necessary condition for convergence. q.e.d.
for convergence. q.e.d.
As we said, the condition is necessary, but not sufficient. Indeed, we can have subdiagrams that are superficially divergent (ω(D_s) ≥ 0), and therefore divergent; then the full diagram is also divergent, in spite of having ω(D) < 0.


Figure 85: Power counting example: diagram is power counting convergent, but subdiagram
is actually divergent.

We can take an example in λϕ^3 theory in D = 4, the one in Fig.85: a circle with 3 external lines connected with it, with a subdiagram D_s connected to the inside of the circle: a propagator line that has a loop made out of two propagators in the middle. The diagram has I_D = 9, V_D = 7, therefore L_D = I_D − V_D + 1 = 9 − 7 + 1 = 3, and then ω(D) = dL_D − 2I_D = 4·3 − 2·9 = −6 < 0. However, the subdiagram has I_{D_s} = 2, V_{D_s} = 2, therefore L_{D_s} = 2 − 2 + 1 = 1, and then ω(D_s) = 4·1 − 2·2 = 0, therefore we have a logarithmically divergent subdiagram, and therefore the full diagram is also logarithmically divergent.
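The bookkeeping of this example is easily automated; a minimal sketch (our function name) reproducing the numbers above:

```python
def superficial_degree(d, I, V):
    """Superficial degree of divergence (28.19): omega = d*L - 2*I,
    with the number of loops L = I - V + 1."""
    L = I - V + 1
    return d * L - 2 * I

# Example from the text (lambda phi^3 in D = 4, Fig. 85):
#   full diagram: I = 9, V = 7  ->  L = 3, omega = -6 (superficially convergent)
#   subdiagram:   I = 2, V = 2  ->  L = 1, omega =  0 (log divergent!)
print(superficial_degree(4, 9, 7), superficial_degree(4, 2, 2))   # -> -6 0
```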
We can then guess that we have the following
Theorem (which we will not prove here) ω(Ds ) < 0∀Ds 1PI subdiagrams of D ⇔
ID (p1 , ..., pE ) is an absolutely convergent integral.
We note that the ⇒ implication should obviously be true, and moreover is valid for any
field theory. But the ⇐ implication is true only for scalar theories. If there are spin 1/2 and
spin 1 fields, then ω(D) < 0 is not even necessary, since there can be cancellations between
different spins, giving a zero result for a superficially divergent diagram (hence the name
superficial degree of divergence, it is not necessarily the actual degree of divergence).
We can now write a formula from which we derive a condition on the type of divergences we can have.
We note that each internal line connects to two vertices, and each external line connects to only one vertex. In a theory with Σ_n λ_n ϕ^n/n! interactions, we can have a number n = n_v of legs at each vertex v, meaning we have

2I + E = Σ_{v=1}^V n_v                  (28.23)

Then the superficial degree of divergence is

ω(D) ≡ dL − 2I = (d − 2)I − dV + d = d − ((d − 2)/2) E + Σ_{v=1}^V ( ((d − 2)/2) n_v − d )                  (28.24)

where in the second equality we used L = I − V + 1 and in the last equality we have used 2I + E = Σ_v n_v.
Since the kinetic term for a scalar is −∫ d^d x (∂_μϕ)^2/2, and it has to be dimensionless, we need the dimension of ϕ to be [ϕ] = (d − 2)/2. Then since the interaction term is −∫ d^d x λ_n ϕ^n/n!, we have

[λ_{n_v}] = d − n_v[ϕ] = d − n_v (d − 2)/2                  (28.25)

meaning that finally

ω(D) = d − ((d − 2)/2) E − Σ_{v=1}^V [λ_v]                  (28.26)

Thus we find that if [λ_v] ≥ 0, there is only a finite (and in fact small) number of divergent n-point functions. Indeed, first, we note that by increasing the number of external lines, we get to ω(D) < 0. For instance, consider the limiting case of [λ_v] = 0 and d = 4. Then ω(D) = 4 − E, and only E = 0, 1, 2, 3, 4 give divergent results, irrespective of V. Since E = 0, 1 are not physical (E = 0 is a vacuum bubble, and E = 1 should be zero in a good theory), we have only E = 2, 3, 4, corresponding to 3 physical parameters (physical parameters are defined by the number of external lines, which define physical objects like n-point functions). For [λ_v] > 0, any vertex lowers ω(D), so we could have an even smaller number of E's for divergent n-point functions, since we need at least one vertex for a loop
diagram. In higher dimensions, we will have a slightly higher number of divergent n-point
functions, but otherwise the same idea applies.
Such theories, where only a finite number of n-point functions have divergent diagrams, are called renormalizable, since we can absorb the infinities in the redefinition of the (finite number of) parameters of the theory.
By contrast, if [λ_v] < 0, we can make divergent diagrams for any n-point function (any
E) just by increasing V . Therefore there are an infinite number of divergent n-point func-
tions that would need redefinition, so we can’t make this by redefining the parameters of the
theory. Such a theory is called nonrenormalizable. Note that a nonrenormalizable theory
can be so only in perturbation theory, there exist examples of theories that are perturba-
tively nonrenormalizable, but the nonperturbative theory is well-defined. Also note that we
can work with nonrenormalizable theories in perturbation theory, just by introducing new
parameters at each loop order. Therefore we can compute quantum corrections, though the
degree of complexity of the theory quickly increases with the loop order.
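The classification by coupling dimension can likewise be coded directly from (28.25) and (28.26); a minimal sketch (our function names):

```python
def coupling_dimension(d, n):
    """Mass dimension (28.25): [lambda_n] = d - n (d-2)/2 for a phi^n vertex
    in d spacetime dimensions."""
    return d - n * (d - 2) / 2

def omega(d, E, vertices):
    """Superficial degree (28.26): omega(D) = d - (d-2)E/2 - sum_v [lambda_v],
    where `vertices` lists the number of legs n_v of each vertex."""
    return d - (d - 2) * E / 2 - sum(coupling_dimension(d, n) for n in vertices)

# phi^4 in d = 4: [lambda_4] = 0, so omega = 4 - E independently of V;
# only E <= 4 n-point functions can diverge -> renormalizable.
print(coupling_dimension(4, 4))        # -> 0.0
print(omega(4, 6, [4, 4, 4]))          # -> -2.0 (6-point function convergent)
# phi^6 in d = 4: [lambda_6] = -2 < 0, so omega grows with the number of
# vertices -> nonrenormalizable.
print([omega(4, 4, [6] * V) for V in (1, 2, 3)])   # -> [2.0, 4.0, 6.0]
```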

Important concepts to remember

• Loop diagrams can contain UV divergences (at high momenta), divergent in diagram-
dependent dimensions, and IR divergences, which appear only for massless theories
and for on-shell total external momenta at vertices.

• UV divergences can be absorbed in a redefinition of the parameters of the theory (renormalization), and IR divergences can be cancelled by adding the tree diagrams for emission of low momentum (E ≃ 0) particles, perhaps parallel to the original external particle.

• Wick rotation of the result of the Euclidean integral can in general not be the same as direct evaluation of the Wick rotated (Minkowski space) integral, since there can be poles in between the Minkowskian and Euclidean contours for the loop energy integration. We can work in Euclidean space and continue the final result, since the Euclidean theory is better defined.

• Power counting gives the superficial degree of divergence of a diagram as ω(D) = dL − 2I.

• In a scalar theory, ω(D) < 0 is necessary for convergence of the integral ID , but in
general not sufficient.

• In a scalar theory, ω(D_s) < 0 for any 1PI subdiagram D_s of a diagram D ⇔ I_D is absolutely convergent.

• Theories with couplings satisfying [λv ] ≥ 0 are renormalizable, i.e. one can absorb the
infinities in redefinitions of the parameters of the theory, while theories with [λv ] < 0
are nonrenormalizable, since we can’t (there are an infinite number of different infinities
to be absorbed).

Further reading: See chapters 5.1,5.2 in [4], 9.1 in [1] and 4.2 in [3].

Exercises, Lecture 28

1) Consider the one-loop diagram for arbitrary masses of the various lines (see class).
Check whether there are any divergences.

Figure 86: One-loop Feynman diagram with external momenta p1, p2, p3. Check for divergences.

2) Check whether there are any UV divergences in the D = 3 diagram in λϕ4 theory in
Fig.87.

Figure 87: Check for UV divergences in this Feynman diagram.

29 Lecture 29. Regularization, definitions: cut-off,
Pauli-Villars, dimensional regularization
In this lecture we will preview methods of regularization that will be used in the second
semester for renormalization. We have already seen a few methods of regularization, i.e.,
ways of making the integrals finite.
The simplest is cut-off regularization, which means just putting upper and lower bounds
on the integral over the modulus of the momenta, i.e. a |p|max = Λ for the UV divergence,
and a |p|min = ϵ for the IR divergence. It has to be over the modulus only (the integral over
angles is not divergent), and the procedure works best in Euclidean space (since then
we don't need to worry about the fact that −(p0)² + p⃗² = Λ² has a continuum of solutions in
Minkowski space). Note that having a |p|max = Λ is more or less the same as considering a
lattice of size Λ⁻¹ in Euclidean space, which breaks Euclidean ("Lorentz") invariance (since
translational and rotational invariance are thus broken). For this reason, we very seldom
use cut-off regularization.
There are many regularizations possible, and in general we want to consider a regulariza-
tion that preserves all symmetries that play an important role at the quantum level. If there
are several that preserve the symmetries we want, all of them can in principle be used (we
could even consider cut-off regularization, just that then we would have a hard time showing
that our results are consistent with the symmetries we want to preserve).

Figure 88: Examples of one-loop divergent diagrams: a) loop with two equal propagators between two external points; b) tadpole with loop momentum q; c) one-loop propagator correction with external momentum p and loop momenta q and p + q.

Let's see the effect of cut-off regularization on the simplest diagram we can write, a loop
for a massless field, with no external momentum, but two external points, i.e. two equal
loop propagators inside (see Fig.88a):
$$\int \frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2)^2} = \frac{\Omega_3}{(2\pi)^4}\int_\epsilon^\Lambda \frac{p^3\,dp}{p^4} = \frac{1}{8\pi^2}\ln\frac{\Lambda}{\epsilon} \qquad(29.1)$$

As we said, we see that this has both UV and IR divergences.
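As a quick numerical sanity check (our addition, not part of the notes; plain Python with the standard library, and the function name `radial_loop` is ours), we can integrate the same radial integrand with the IR end regulated by a small mass m instead of the cut-off ϵ; substituting u = p² gives the closed form $\frac{1}{16\pi^2}\left[\ln\frac{\Lambda^2+m^2}{m^2} + \frac{m^2}{\Lambda^2+m^2} - 1\right]$, which the quadrature reproduces:

```python
import math

def radial_loop(m, lam, n=200_000, p_min=1e-8):
    """(1/8 pi^2) * integral of p^3/(p^2+m^2)^2 from p_min to lam,
    i.e. the angular-integrated form of int d^4p/(2pi)^4 1/(p^2+m^2)^2
    with UV cutoff lam; the mass m regulates the IR end instead of eps.
    Midpoint rule on a logarithmic grid in p."""
    t0, t1 = math.log(p_min), math.log(lam)
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        p = math.exp(t0 + (i + 0.5) * h)
        total += p**4 / (p**2 + m**2) ** 2   # extra p from dp = p d(log p)
    return total * h / (8 * math.pi**2)

m, lam = 1.0, 100.0
closed = (math.log((lam**2 + m**2) / m**2)
          + m**2 / (lam**2 + m**2) - 1.0) / (16 * math.pi**2)
print(radial_loop(m, lam), closed)  # agree; grow like ln(Lambda^2/m^2)/(16 pi^2)
```

Doubling Λ shifts the result by ln 2/(8π²), the logarithmic UV growth of (29.1).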


We also already saw that sometimes we can turn integrals into infinite sums, like in the
case of the zero-point energy, where a first regularization, putting the system in a box of
size L, allowed us to get a sum instead of an integral: $\sum_n \hbar\omega_n/2$. The sum however was still
divergent. Then there were various ways of regularizing the sum. We can deduce that if the
result depends on the method of regularization, there are two possibilities:
• perhaps the result is unphysical (can't be measured). This is what happened to the
full zero-point energy: certainly the infinite constant (which in the $e^{-a\omega_n}$ regularization
was something depending only on L) was not measurable.
• perhaps some of the methods of regularization are unphysical, so we have to choose
the physical one.
An example of a physical calculation is the difference of two infinite sums. We saw the
example of the Casimir effect, i.e. the difference between two geometries (two values of d).
In general,
$$\sum_n \frac{\hbar\omega_n^{(1)}}{2} - \sum_n \frac{\hbar\omega_n^{(2)}}{2} \qquad(29.2)$$
Another physical case of this type arises when we consider quantum corrections to masses
of solitons. Then we can consider the difference between the quantum fluctuations ($\sum_n \hbar\omega_n/2$)
in the presence of the soliton and in the vacuum (without the soliton), and this would give
the physical calculation of the quantum correction to the mass of the soliton.
Note that it would seem that we could write
$$\sum_n \frac{\hbar\omega_n^{(1)} - \hbar\omega_n^{(2)}}{2} \qquad(29.3)$$

and calculate this, but this amounts to a choice of regularization, called mode number (n)
regularization. Indeed, we now have ∞ − ∞, so unlike the case of finite integrals, taking
the $\sum_n$ operator as a common factor is only possible if we choose the same number N as
the upper bound in both sums (if one is N and the other N + a, say, with a ∼ O(1), then
we obviously obtain a different result for the difference of the two sums).
This may seem natural, but there is more than one other way to calculate: for instance,
we can turn the sums into integrals in the usual way, and then take the same upper limit
in the integral (i.e., in energy), obtaining energy/momentum cut-off regularization. The
difference of the two integrals gives a result differing from mode number cut-off by a finite
piece. ∑
For $\sum_n \hbar\omega_n/2$ we also saw other regularizations: zeta function regularization, heat
kernel regularization, and $\sum_n \omega_n \to \sum_n \omega_n e^{-a\omega_n}$.
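As an aside (our addition, in plain Python), the $e^{-a\omega_n}$ regularization can be checked numerically against the zeta function value: for $\omega_n = n$, removing the divergent $1/a^2$ piece from $\sum_n n\,e^{-an}$ leaves $\zeta(-1) = -1/12$ as $a \to 0$:

```python
import math

def regulated_sum(a, n_max=2000):
    # e^{-a omega_n} regularization of "sum over n of n",
    # i.e. sum_{n>=1} n e^{-a n} (frequencies omega_n = n in some units)
    return sum(n * math.exp(-a * n) for n in range(1, n_max + 1))

for a in (0.5, 0.1, 0.02):
    print(a, regulated_sum(a) - 1.0 / a**2)
# after removing the divergent 1/a^2 piece, the remainder approaches
# zeta(-1) = -1/12 = -0.0833... as a -> 0
```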
Returning to the loop integrals appearing in Feynman diagrams, we have other ways to
regulate. One of the oldest ways used is Pauli-Villars regularization, and its generalizations.
These fall under the category of modifications to the propagator that cut off the high mo-
mentum modes in a smoother way than the hard cut-off |p|max = Λ. The generalized case
corresponds to making the substitution

$$\frac{1}{q^2+m^2} \to \frac{1}{q^2+m^2} - \sum_{i=1}^{N} \frac{c_i(\Lambda; m^2)}{q^2+\Lambda_i^2} \qquad(29.4)$$

and we can adjust the ci such that at large momentum q, the redefined propagator behaves
as
$$\sim \frac{\Lambda^{2N}}{q^{2N+2}} \qquad(29.5)$$
In other words, if in a loop integral, we need to have the propagator at large momentum
behave as 1/q 2N +2 in order for the integral to become finite, we choose a redefinition with
the desired N , and with ci ’s chosen for the required high momentum behaviour.
In particular, the original Pauli-Villars regularization is

$$\frac{1}{q^2+m^2} \to \frac{1}{q^2+m^2} - \frac{1}{q^2+\Lambda^2} = \frac{\Lambda^2-m^2}{(q^2+m^2)(q^2+\Lambda^2)} \sim \frac{\Lambda^2}{q^4} \qquad(29.6)$$

We can easily see that it cannot be obtained from a normal modification of the action,
because of the minus sign; however, it corresponds to subtracting the contribution of a very
heavy particle. Indeed, physically it is clear that a very heavy particle cannot modify anything
in low-energy physics (for instance, a Planck-mass particle cannot influence Standard Model physics). But
it is equally obvious that subtracting its contribution will cancel the high momentum modes
in the loop integral, cancelling the unphysical infinities of the loop.
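A minimal numerical illustration (our addition, in plain Python, with illustrative values of the masses) of the improved falloff in (29.6): multiplying the subtracted propagator by q⁴ approaches Λ² − m² at large q²:

```python
# High-momentum falloff of the Pauli-Villars subtracted propagator, eq. (29.6):
# q^4 * [1/(q^2+m^2) - 1/(q^2+Lambda^2)] -> Lambda^2 - m^2 as q^2 -> infinity,
# i.e. the subtraction improves the 1/q^2 falloff to ~ Lambda^2/q^4.
m2, L2 = 1.0, 100.0   # illustrative values of m^2 and Lambda^2

def pv_prop(q2):
    return 1.0 / (q2 + m2) - 1.0 / (q2 + L2)

for q2 in (1e2, 1e4, 1e6):
    print(q2, q2**2 * pv_prop(q2))  # approaches L2 - m2 = 99
```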
However, there is a simple modification that has the same result as the above Pauli-Villars
subtraction at high momentum, and has a simple physical interpretation as the effect of a
higher derivative term in the action. Specifically, consider the replacement of the propagator
$$\frac{1}{q^2+m^2} \to \frac{1}{q^2+m^2+q^4/\Lambda^2} \qquad(29.7)$$

The usual propagator comes from


$$\int d^4x\, \frac{1}{2}(\partial_\mu\phi)^2 = \int \frac{d^4p}{(2\pi)^4}\,\frac{1}{2}\phi(p)\,p^2\,\phi(-p) = \int \frac{d^4p}{(2\pi)^4}\,\frac{1}{2}\phi(p)\,\Delta^{-1}(p)\,\phi(-p) \qquad(29.8)$$

so the above is obtained by adding to the action a higher derivative term:


$$\int d^4x \left[\frac{1}{2}(\partial_\mu\phi)^2 + \frac{(\partial^2\phi)^2}{2\Lambda^2}\right] = \int \frac{d^4p}{(2\pi)^4}\,\frac{1}{2}\phi(p)\left[p^2 + \frac{(p^2)^2}{\Lambda^2}\right]\phi(-p) \qquad(29.9)$$

Now consider a non-Pauli-Villars, but similar, modification of the loop integral, that
is strictly speaking not a modification of the propagator, but of its square. Consider the
same simplest loop integral, with two equal propagators, i.e.
$$I = \int \frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2+m^2)^2} \to \int \frac{d^4p}{(2\pi)^4}\left[\left(\frac{1}{p^2+m^2}\right)^2 - \left(\frac{1}{p^2+\Lambda^2}\right)^2\right] = I(m^2) - I(\Lambda^2) \qquad(29.10)$$
The new object in the square brackets is

$$\frac{2p^2(\Lambda^2-m^2) + \Lambda^4 - m^4}{(p^2+m^2)^2(p^2+\Lambda^2)^2} \sim \frac{2\Lambda^2}{p^6} \qquad(29.11)$$

so is now UV convergent. Since the object is UV convergent, we can use any method to
calculate it. In particular, we can take a derivative ∂/∂m2 of it, and since I(Λ2 ) doesn’t
contribute, we get for the integral
$$\frac{\partial}{\partial m^2}\left[I(m^2) - I(\Lambda^2)\right] = \int \frac{d^4p}{(2\pi)^4}\,\frac{-2}{(p^2+m^2)^3} = -2\,\frac{\Omega_3}{(2\pi)^4}\int_0^\infty \frac{p^3\,dp}{(p^2+m^2)^3} \qquad(29.12)$$

where $\Omega_3 = 2\pi^2$ is the volume of the 3-dimensional unit sphere. Substituting $p^2 + m^2 = x$, so that $p^3\,dp = (x-m^2)\,dx/2$, we get
$$\frac{\partial}{\partial m^2} I(m^2, \Lambda^2) = -\frac{1}{8\pi^2}\int_{m^2}^\infty \frac{(x-m^2)\,dx}{x^3} = -\frac{1}{8\pi^2}\,\frac{1}{2m^2} \qquad(29.13)$$
Integrating this over $m^2$, we obtain $I(m^2)$, and then
$$I(m^2, \Lambda^2) = I(m^2) - I(\Lambda^2) = \frac{1}{16\pi^2}\ln\frac{\Lambda^2}{m^2} \qquad(29.14)$$
This object is UV divergent, as Λ → ∞, and also divergent as m → 0 (IR divergent).
However, note that in the way we calculated, we really introduced another type of regu-
larization. It was implicit, since we first found a finite result by subtracting the contribution
with m → Λ, and then calculated this finite result using what was a simple trick.
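The result (29.14) can be checked numerically (our addition, assuming Python with only the standard library; the function name `regulated_I` is ours): integrating the subtracted radial integrand of (29.10) in the variable u = p² reproduces $\ln(\Lambda^2/m^2)/(16\pi^2)$:

```python
import math

def regulated_I(m2, L2, n=400_000, u_max=1e9):
    """I(m^2) - I(Lambda^2)
       = (1/16 pi^2) int_0^inf du u [1/(u+m^2)^2 - 1/(u+Lambda^2)^2],
    with u = p^2 (the angular factor Omega_3/(2 pi)^4 = 1/(8 pi^2) and the
    Jacobian du = 2 p dp give the overall 1/16 pi^2).
    Midpoint rule on a log grid; the subtracted integrand falls like 1/u^2."""
    t0, t1 = math.log(1e-6), math.log(u_max)
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        u = math.exp(t0 + (i + 0.5) * h)
        total += u * u * (1.0/(u + m2)**2 - 1.0/(u + L2)**2)  # extra u: du = u dt
    return total * h / (16 * math.pi**2)

print(regulated_I(1.0, 1e4), math.log(1e4) / (16 * math.pi**2))  # should agree
```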
However, if we keep the original integral and do the same derivative on it, after the
derivative we obtain a finite result,
$$\frac{\partial}{\partial m^2}\int \frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2+m^2)^2} = -2\int \frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2+m^2)^3} = -\frac{1}{16\pi^2 m^2} = \text{finite.} \qquad(29.15)$$
and now the integral (and its result) is UV convergent, despite the integral before the deriva-
tive being UV divergent. Hence the derivative with respect to the parameter m2 is indeed a
regularization. Both the initial and final results are however still IR divergent as m2 → 0.
Now integrating, we obtain
$$\int \frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2+m^2)^2} = -\frac{1}{16\pi^2}\ln\frac{m^2}{\epsilon} \qquad(29.16)$$
which is still IR divergent as ϵ → 0.
However, all the regularizations we have analyzed until now fail to respect a very important
invariance, namely gauge invariance. Therefore 't Hooft and Veltman introduced a new
regularization to deal with spontaneously broken gauge theories, namely dimensional
regularization, rather late (in the early seventies), since it is a rather strange concept.
Dimensional regularization
Dimensional regularization means that we analytically continue in D, the dimension of
spacetime. This seems like a strange thing to do, given that the dimension of spacetime is
an integer, so it is not clear what a real (non-integer) dimension could physically mean, but we nevertheless
choose D = 4 ± ϵ. The sign has some significance as well, but here we will just consider
D = 4 + ϵ.

We already saw a way to continue a result defined for integers to a result for real numbers,
in the case of the Riemann zeta function: defined initially for integers, but extended for
complex numbers by writing an integral formula for it. A more relevant example for us is
the case of the Euler gamma function, which is an extension of the factorial, n!, defined for
integers, to the complex plane. Again this is done by writing an integral formula,
$$\Gamma(z) = \int_0^\infty d\alpha\, \alpha^{z-1} e^{-\alpha} \qquad(29.17)$$

Indeed, one easily shows that Γ(n) = (n − 1)!, for n ∈ N∗ , but the integral formula can be
extended to the complex plane, defining the Euler gamma function. The gamma function
satisfies
$$z\,\Gamma(z) = \Gamma(z+1), \qquad(29.18)$$
an extension of the factorial property. But that means that we can find the behaviour at
z = ϵ → 0, which is a simple pole, since
$$\epsilon\,\Gamma(\epsilon) = \Gamma(1+\epsilon) \simeq \Gamma(1) = 1 \;\Rightarrow\; \Gamma(\epsilon) \simeq \frac{1}{\epsilon} \qquad(29.19)$$
We can repeat this process:
$$(-1+\epsilon)\,\Gamma(-1+\epsilon) = \Gamma(\epsilon) \;\Rightarrow\; \Gamma(-1+\epsilon) \simeq -\frac{1}{\epsilon}$$
$$(-2+\epsilon)\,\Gamma(-2+\epsilon) = \Gamma(-1+\epsilon) \;\Rightarrow\; \Gamma(-2+\epsilon) \simeq \frac{1}{2\epsilon} \qquad(29.20)$$
etc. We see then that the gamma function has simple poles at z = −n, for all n ∈ N. In fact,
these poles are exactly the ones we obtain in dimensional regularization, as we now show.
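These pole positions and residues are easy to verify numerically (our addition, using Python's standard `math.gamma`):

```python
import math

# Poles of the Euler gamma function at z = 0, -1, -2 (eqs. 29.19-29.20):
# Gamma(eps) ~ 1/eps, Gamma(-1+eps) ~ -1/eps, Gamma(-2+eps) ~ 1/(2 eps)
eps = 1e-6
print(eps * math.gamma(eps))          # ~  1
print(eps * math.gamma(-1.0 + eps))   # ~ -1
print(eps * math.gamma(-2.0 + eps))   # ~  0.5
```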
Consider first the simplest case, the tadpole diagram, with a single loop, of momentum
q, connected at a point to a propagator line, as in Fig.88b:

$$I = \int \frac{d^Dq}{(2\pi)^D}\,\frac{1}{q^2+m^2} \qquad(29.21)$$

We now write
$$\frac{1}{q^2+m^2} = \int_0^\infty d\alpha\, e^{-\alpha(q^2+m^2)} \qquad(29.22)$$
and then
$$I = \int_0^\infty d\alpha\, e^{-\alpha m^2}\int \frac{d^Dq}{(2\pi)^D}\, e^{-\alpha q^2} = \int_0^\infty d\alpha\, e^{-\alpha m^2}\,\frac{\Omega_{D-1}}{(2\pi)^D}\int_0^\infty dq\, q^{D-1} e^{-\alpha q^2}$$
$$= \int_0^\infty d\alpha\, e^{-\alpha m^2}\,\frac{\Omega_{D-1}}{(2\pi)^D}\,\frac{1}{2\alpha^{D/2}}\int_0^\infty dx\, x^{\frac{D}{2}-1} e^{-x} \qquad(29.23)$$
and we use the fact that $\int_0^\infty dx\, x^{D/2-1} e^{-x} = \Gamma(D/2)$, and that the volume of the D-dimensional unit sphere is
$$\Omega_D = \frac{2\pi^{\frac{D+1}{2}}}{\Gamma\left(\frac{D+1}{2}\right)}, \qquad(29.24)$$

which we can easily test on a few examples: $\Omega_1 = 2\pi/\Gamma(1) = 2\pi$, $\Omega_2 = 2\pi^{3/2}/\Gamma(3/2) = 2\pi^{3/2}/(\sqrt{\pi}/2) = 4\pi$, $\Omega_3 = 2\pi^2/\Gamma(2) = 2\pi^2$. Then we have $\Omega_{D-1}\Gamma(D/2) = 2\pi^{D/2}$, so
$$I = \int_0^\infty d\alpha\, e^{-\alpha m^2}\,(4\pi\alpha)^{-\frac{D}{2}} = \frac{(m^2)^{\frac{D}{2}-1}}{(4\pi)^{D/2}}\,\Gamma\left(1-\frac{D}{2}\right) \qquad(29.25)$$

Taking derivatives $(\partial/\partial m^2)^{n-1}$ on both sides (both on the definition of I and on the result), we
obtain in general
$$\int \frac{d^Dq}{(2\pi)^D}\,\frac{1}{(q^2+m^2)^n} = \frac{\Gamma(n-D/2)}{(4\pi)^n\,\Gamma(n)}\left(\frac{m^2}{4\pi}\right)^{\frac{D}{2}-n} \qquad(29.26)$$
We see that in D = 4 this formula has a pole at n = 1, 2, as expected from the integral
form. In these cases, the divergent part is contained in the gamma function, namely Γ(−1)
and Γ(0).
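As a sanity check of (29.26) (our addition, in plain Python; the function name `dimreg_formula` is ours), we can evaluate the formula at a convergent point, D = 3 and n = 2, where the integral is known in closed form:

```python
import math

def dimreg_formula(D, n, m2):
    # right-hand side of (29.26)
    return (math.gamma(n - D / 2) / ((4 * math.pi)**n * math.gamma(n))
            * (m2 / (4 * math.pi)) ** (D / 2 - n))

# At D = 3, n = 2 the integral converges and has the known closed form
# int d^3q/(2pi)^3 1/(q^2+m^2)^2 = 1/(8 pi m):
m = 2.0
print(dimreg_formula(3, 2, m * m), 1.0 / (8 * math.pi * m))  # should match
```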
We now move to a more complicated integral, which we will solve with Feynman parametriza-
tion, cited last lecture. Specifically, we consider the diagram for a one-loop correction to the
propagator in ϕ3 theory, with momentum p on the propagator and q and p + q in the loop,
as in Fig.88c, i.e.
$$\int \frac{d^Dq}{(2\pi)^D}\,\frac{1}{q^2+m^2}\,\frac{1}{(q+p)^2+m^2} \qquad(29.27)$$

We now prove the Feynman parametrization in this case of two propagators. We do the trick
used in the first integral (tadpole) twice, obtaining
$$\frac{1}{\Delta_1\Delta_2} = \int_0^\infty d\alpha_1 \int_0^\infty d\alpha_2\, e^{-(\alpha_1\Delta_1+\alpha_2\Delta_2)} \qquad(29.28)$$

We then change variables in the integral as $\alpha_1 = t(1-\alpha)$, $\alpha_2 = t\alpha$, with Jacobian
$$\det\begin{pmatrix} 1-\alpha & \alpha \\ -t & t \end{pmatrix} = t,$$
so
$$\frac{1}{\Delta_1\Delta_2} = \int_0^1 d\alpha \int_0^\infty dt\; t\, e^{-t[(1-\alpha)\Delta_1 + \alpha\Delta_2]} = \int_0^1 \frac{d\alpha}{[(1-\alpha)\Delta_1 + \alpha\Delta_2]^2} \qquad(29.29)$$
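The two-propagator Feynman parametrization (29.29) is easy to verify numerically (our addition, in plain Python, for illustrative values of the two denominators):

```python
# Check 1/(D1*D2) = int_0^1 dalpha / [(1-alpha) D1 + alpha D2]^2, eq. (29.29)
D1, D2 = 2.0, 5.0   # illustrative values of the two denominators
n = 100_000
h = 1.0 / n
integral = sum(h / ((1 - (i + 0.5) * h) * D1 + (i + 0.5) * h * D2) ** 2
               for i in range(n))
print(integral, 1.0 / (D1 * D2))  # both ~ 0.1
```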

We finally want to write the square bracket as a new propagator, so we redefine $q^\mu = \tilde q^\mu - \alpha p^\mu$,
obtaining
$$(1-\alpha)\Delta_1 + \alpha\Delta_2 = (1-\alpha)q^2 + \alpha(q+p)^2 + m^2 = \tilde q^2 + m^2 + \alpha(1-\alpha)p^2 \qquad(29.30)$$

Finally, we obtain for the integral
$$I = \int_0^1 d\alpha \int \frac{d^D\tilde q}{(2\pi)^D}\,\frac{1}{[\tilde q^2 + (\alpha(1-\alpha)p^2 + m^2)]^2} = \frac{\Gamma(2-D/2)}{(4\pi)^{D/2}}\int_0^1 d\alpha\,\left[\alpha(1-\alpha)p^2 + m^2\right]^{\frac{D}{2}-2} \qquad(29.31)$$
and again we obtained the divergence as just an overall factor coming from the simple pole
of the gamma function at D = 4.

The Feynman parametrization can in fact be generalized: instead of a product
of propagators, one writes a single "propagator" raised to the corresponding power, with integrals over
parameters α left to do. Then we can use formula (29.26) to calculate the momentum
integral, and we are left with the integration over the parameters. So the dimensional regularization
procedure described above is general, and we see that the divergence always appears
as the simple pole of the gamma function at D = 4.
So we saw that loop integrals of the above type are fine to continue dimensionally,
but is it OK to dimensionally continue the Lagrangeans?
For scalars it is OK, but we have to be careful. The Lagrangean is
$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 + \frac{m^2}{2}\phi^2 + \frac{\lambda_n}{n!}\phi^n \qquad(29.32)$$
and since the action $S = \int d^Dx\, \mathcal{L}$ must be dimensionless, the dimension of the scalar is
$[\phi] = (D-2)/2$ and thus the dimension of the coupling is $[\lambda_n] = D - n(D-2)/2$. For
instance, for D = 4 and n = 4 we have $[\lambda_4] = 0$, but for $D = 4+\epsilon$ we have $[\lambda_4] = -\epsilon$. That
means that outside D = 4 we must redefine the coupling with a factor $\mu^\epsilon$, where µ is some
scale that appears dynamically. This process is called dimensional transmutation, but we will
not say more about it here.
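The dimension counting above can be packaged in a two-line function (our addition, in plain Python; exact arithmetic via `fractions`):

```python
from fractions import Fraction

def coupling_dim(D, n):
    """Mass dimension [lambda_n] = D - n (D-2)/2 of the phi^n coupling in D
    dimensions (from [phi] = (D-2)/2 and a dimensionless action)."""
    D, n = Fraction(D), Fraction(n)
    return D - n * (D - 2) / 2

print(coupling_dim(4, 4))   # 0: phi^4 is classically marginal in D = 4
print(coupling_dim(4, 3))   # 1: phi^3 is super-renormalizable in D = 4
print(coupling_dim(6, 3))   # 0: phi^3 is marginal in D = 6
print(coupling_dim(Fraction(401, 100), 4))  # -1/100, i.e. -eps for D = 4 + eps
```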
For higher spins, however, we must be more careful. The number of components of a field
depends on the dimension, which is a subtle issue. We must then use the dimensional continuation
of various gamma matrix formulae, like
$$g^{\mu\nu}g_{\mu\nu} = D, \qquad \gamma_\mu \slashed{p}\, \gamma^\mu = (2-D)\slashed{p} \qquad(29.33)$$

etc. On the other hand, the gamma matrices still satisfy the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$.
But the dimension of the (spinor) representation of the Clifford algebra depends on the spacetime dimension
in an unusual way, $n = 2^{[D/2]}$, which means it is 2-dimensional in D = 2, 3 and 4-dimensional
in D = 4, 5. That means that we cannot continue n dimensionally to D = 4 + ϵ. Instead, we
must still consider the gamma matrices as 4 × 4 even in D = 4 + ϵ, and thus we still have

$$\mathrm{Tr}[\gamma_\mu\gamma_\nu] = 4g_{\mu\nu} \qquad(29.34)$$
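The D = 4 identities can be verified numerically (our addition, assuming Python with numpy), using the Dirac representation of the gamma matrices and metric g = diag(+1, −1, −1, −1) as a representation choice:

```python
import numpy as np

# Dirac representation in D = 4, metric g = diag(+1, -1, -1, -1)
I2, Z = np.eye(2), np.zeros((2, 2))
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], dtype=complex)]
g0 = np.block([[I2, Z], [Z, -I2]]).astype(complex)
gam = [g0] + [np.block([[Z, s], [-s, Z]]) for s in sig]  # gamma^mu, upper index
g = np.diag([1.0, -1.0, -1.0, -1.0])

print(np.trace(g @ g))  # g^{mu nu} g_{mu nu} = D = 4

# Tr[gamma^mu gamma^nu] = 4 g^{mu nu}
assert all(np.isclose(np.trace(gam[m] @ gam[n]), 4 * g[m, n])
           for m in range(4) for n in range(4))

# gamma_mu gamma^nu gamma^mu = (2 - D) gamma^nu = -2 gamma^nu in D = 4
for nu in range(4):
    lhs = sum(g[mu, mu] * gam[mu] @ gam[nu] @ gam[mu] for mu in range(4))
    assert np.allclose(lhs, -2 * gam[nu])
print("identities verified")
```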

This is not a problem; however, there is another fact that is still a problem. The definition
of γ5 is
$$\gamma_5 = \frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma = -i\gamma^0\gamma^1\gamma^2\gamma^3 = i\gamma_0\gamma_1\gamma_2\gamma_3 \qquad(29.35)$$
and that cannot be easily dimensionally continued. Since chiral fermions, i.e. fermions that
are eigenstates of the chiral projectors $P_{L,R} = (1\pm\gamma_5)/2$, appear in the Standard Model, we
would need to be able to continue chiral fermions dimensionally. But that is very difficult
to do.
Therefore we can say that there is no perfect regularization procedure: there is always
something that does not work easily, but for particular cases we might prefer one or another.

Next semester we will see that the divergences that we have regularized in this lecture
can be absorbed in a redefinition of the parameters of the theory, leaving only a finite piece
giving quantum corrections. But for now, this is the only thing we will say.

Important concepts to remember

• We must regularize the infinities appearing in loop integrals, and the infinite sums.

• Cut-off regularization, imposing upper and lower limits on |p| in Euclidean space,
regulates integrals, but is not much used because it breaks Euclidean
("Lorentz") invariance, as well as gauge invariance.

• Often the difference of infinite sums is a physical observable, and then the result is
regularization-dependent. In particular, we can have mode-number cut-off (giving the
sum operator as a common factor), or energy cut-off (giving a resulting energy integral
as a common factor). We must choose one that is more physical.

• The choice of regularization scheme for integrals is dictated by what symmetries we want to preserve. If several respect the wanted symmetries, they are equally good.

• (Generalized) Pauli-Villars regularization removes the contribution of high energy
modes from the propagator, by subtracting the propagator of a very massive particle
from it. A related version of it is obtained from a higher derivative term in the action.

• By taking derivatives with respect to a parameter (e.g. m²), we obtain derivative regularization, which also reduces the degree of divergence of integrals.

• Dimensional regularization respects gauge invariance, and corresponds to analytically continuing the dimension, as D = 4 + ϵ. It is based on the fact that we can continue n! away from the integers to the Euler gamma function.

• In dimensional regularization, the divergences are the simple poles of the gamma func-
tion at Γ(−n), and appear as a multiplicative 1/ϵ.

• For scalars, dimensional regularization of the action is OK, if we remember that couplings have extra mass dimensions away from D = 4. For higher spins, we must also continue the number of components, including things like $g^{\mu\nu}g_{\mu\nu}$ and gamma matrix identities.

• The dimension of the gamma matrices away from D = 4 is still 4, so traces of gamma matrices still give a factor of 4, and γ5 cannot be continued away from D = 4, which means analytical continuations in dimension that involve chiral fermions are very hard.

Further reading: See chapters 5.3 in [4], 4.3 in [3] and 9.3 in [1].

Exercises, Lecture 29

1) Calculate
$$\int \frac{d^4p}{(2\pi)^4}\,\frac{1}{p^2+m_1^2}\,\frac{1}{p^2+m_2^2} \qquad(29.36)$$

in Pauli-Villars regularization.

2) Calculate
$$\int \frac{d^Dq}{(2\pi)^D}\,\frac{1}{(q^2+m_1^2)^2}\,\frac{1}{(q+p)^2+m_2^2} \qquad(29.37)$$

in dimensional regularization (it is divergent in D = 6).

References
[1] George Sterman, "An Introduction to Quantum Field Theory" [GS]

[2] M.E. Peskin and D.V. Schroeder, "An Introduction to Quantum Field Theory" [PS]

[3] Pierre Ramond, "Field Theory: A Modern Primer" [PR]

[4] Jan Ambjorn and Jens Lyng Petersen, "Quantum Field Theory", Niels Bohr Institute
lecture notes, 1994. [NBI]

[5] P.A.M. Dirac, "Lectures on Quantum Mechanics"

[6] Thomas Banks, "Modern Quantum Field Theory".
